Headings of UCAV Based on Nash Equilibrium

Li DAI, Zheng XIE

Journal of Systems Science and Information ›› 2018, Vol. 6 ›› Issue (3) : 269-276.

PDF(209 KB)
PDF(209 KB)
Journal of Systems Science and Information ›› 2018, Vol. 6 ›› Issue (3) : 269-276. DOI: 10.21078/JSSI-2018-269-08
 

Headings of UCAV Based on Nash Equilibrium

Author information +
History +

Abstract

Given n vertices in a plane and UCAV going through each vertex once and only once and then coming back, the objective is to find the direction (heading) of motion in each vertex to minimize the smooth path of bounded curvature. This paper studies the headings of UCAV. First, the optimal headings for two vertices were given. On this basis, an n-player two-strategy game theoretic model was established. In addition, in order to obtain the mixed Nash equilibrium efficiently, n linear equations were set up. The simulation results demonstrated that the headings given in this paper are effective.

Key words

UCAV / Dubins path / headings / Nash equilibrium / game theory

Cite this article

Download Citations
Li DAI , Zheng XIE. Headings of UCAV Based on Nash Equilibrium. Journal of Systems Science and Information, 2018, 6(3): 269-276 https://doi.org/10.21078/JSSI-2018-269-08

1 Introduction

The trajectory of machines like UCAV (unmanned combat aircraft vehicles) or Robots is constrained by their kinematics. A suitable and useful mathematical model is DTSP (Dubins traveling salesman problem). Let P={p1,p2,,pn} be a set of n vertices in a plane. The well-known TSP is to find the shortest tour (closed path) visiting every vertex once and only once. Depending on the definition of the distance between any pair of n vertices, TSP can be divided into two branches. One is the Euclidean traveling salesman problem (ETSP), in which the distance is defined as the Euclidean distance. The other is the Dubins traveling salesman problem (DTSP), in which the distance is defined as the Dubins distance. See Figure 1: The red line is the tour of ETSP, and the blue dotted line is the DTSP. DTSP differs from ETSP with respect to the constraint on the curvature of the tour; that is, the tour of the DTSP is smooth enough that the tour curvature is limited by 1/ρ, where ρ is the minimal turning radius.
Figure 1 ETSP and DTSP

Full size|PPT slide

It is well known that the ETSP is NP-complete[1, 2] and that the DTSP is also NP-complete[3].
Let ETSP(P) denote the length of the shortest tour of the ETSP over P. Correspondingly, let DTSPρ(P) denote the length of the shortest Dubins tour of the DTSP over P with the minimal turning radius ρ. Note that both ETSP(P) and DTSPρ(P) are in connection with the visiting order, and the optimal ETSP ordering of P might be quite different with the optimal DTSP ordering. So, we denote LETSP(i1,i2,,in)(P) as the Euclidean distance, and LDTSP(i1,i2,,in),ρ(P) as the Dubins distance over P with the visiting order (i1,i2,,in). Without confusion, they are abbreviated as LETSP(P) and LDTSPρ(P).
If the visiting order (i1,i2,,in) is given, then LETSP(P) can be calculated as
LETSP(P)=|pi1pi2|+|pi2pi3|++|pin1pin|+|pinpi1|,
but the LDTSPρ(P) is still not easy to calculate because of unknown optimal headings. Let LDTSPρ,θ(P) be the Dubins distance with the visiting order (i1,i2,in) and the headings θ=(θi1,θi2,,θin) are given. Obviously,
LDTSPρ,θ(P)=k=1nLDρ,(θik,θik+1)(pik,pik+1),
where LDρ,(θik,θik+1)(pik,pik+1) is the Dubins distance between pik and pik+1, and θin+1=θi1, pin+1=pi1.
To calculate LDρ,(θik,θik+1)(pik,pik+1), we can use the famous results given by Dubins in 1957[4]. Dubins gave a sufficient set of paths (the Dubins set) that always contains the shortest path between any two vertices with given headings. The famous Dubins set is
{LSL,RSR,LSR,RSL,RLR,LRL},
which includes six admissible paths, where L and R are arcs of the minimal allowed turning radius ρ turning left or turning right, respectively, and S is a segment. In 2001, Shkel and Lumelsky considered the logical classification scheme, which allows one to extract the shortest path from the Dubins set directly, without explicitly calculating the candidate paths for the 'long path case'[5]. In [5], the initial vertex is (0,0;α), the terminal vertex is (d,0;β) and the radius ρ=1.
For DTSP, there are n vertices and cases with any initial and terminal vertices need to be considered. In 2015, Dai and Xie provided the formula to directly calculate the length of the Dubins distance with any initial vertex (x0,y0;α), terminal vertex (x1,y1;β) and minimal turning radius ρ>0[6].
Obviously, there are two key problems in the DTSP: Determining the optimal visiting order (i1,i2,,in) and the optimal headings (θi1,θi2,,θin). The majority of papers focus on the optimal visiting order rather than the optimal headings. Savla, et al.[7] presented a way to calculate the headings of each vertex, and based on this type of heading, they provided an algorithm called Alternating Algorithm (AA), and they proved that the upper bound of DTSP is
DTSPρ(P)ETSP(P)+κn2πρ,
(1)
where κ[2.657,2.658]. In 2013, Kim and Cheong proved in their paper[8] that κ=73.
We focus on determining the optimal headings. Note that the fitness of one vertex's head depends on both its own and the neighbor's as well. So, we modeled this problem based on Game Theory, which has received an increasing amount of attention as a promising technique for formulating action strategies for agents in complex situations. The priority of Game Theory in solving control and decision-making problems has been shown in many studies[9-15].
Finding the Nash Equilibrium in an n-player (n3) game has been proved to be PPAD-complete[16, 17]. So, the problem of computing Nash Equilibria in games is computationally extremely difficult, if not impossible. But, for a two-player game, it is much easier to solve.
The main contribution of this paper is the development of a game theoretic approach to the DTSP. The headings of vertices are handled from a game theoretic perspective and obtained efficiently by solving the mixed Nash equilibrium of an n-player two-strategy game.
The remainder of the paper is organized as follows: Section 2 is the main body of this paper, it describes the Game Theory model for determining better headings and introduces the theory of the Nash Equilibrium. Section 3 concludes our conclusions.

2 Game Theory model

Let p1,p2,,pn be n vertices in a plane. Without loss of generality, let the optimal visiting order be p1p2pnp1. So, there are n players p1,p2,,pn in the game. Each player pi choose it's heading θi[0,2π] (i=1,2,,n). The heading θi is seen as a pure strategy in the game. We defined the payoff of player pi as the Dubins distance LDρ,(θi,θi+1)(pi,pi+1) between pi and pi+1 for all i{1,2,,n} (where pn+1=p1).
The Dubins distance changes greatly with the headings. So, which heading θi to choose is the key problem. When θi ranges from 0 to 2π, there is an infinitely pure strategy for each vertex pi. The problem is difficult to solve because the Dubins distance is not a continuous function.

2.1 Pure Strategies and Payoffs

In this section, we consider the length of the optimal Dubins tour with two vertices. Let A(0,0) be the initial vertex and B(d,0) be the terminal vertex. DTSPρ(A,B) denotes the length of the optimal Dubins tour. We have the following Proposition 1, which suggests that there are only two pure strategies that need to be considered for each vertex.
Proposition 1 For two vertices, A(0,0) and B(d,0),
(ⅰ) If d2ρ, DTSPρ(A,B)=2πρ+2(d2ρ).
(ⅱ) If d2ρ, DTSPρ(A,B)=2πρ.
Proof (ⅰ) Based on the Dubins result in 1957, the minimum length between any two vertices must lie in the following Dubins set {RSR,RSL,LSR,LSL,RLR,LRL}. By noting that these six types of Dubins paths all start and end with an arc with radius ρ, let A and B be the center and ρ be the radius; we draw two circles A and B. See the circles with dotted lines in Figure 2.
Figure 2 The Dubins tour for two vertices with ρd/2

Full size|PPT slide

It can be observed that given any arc with radius ρ and which A lies on, its center must lie in the circle A. This is also the case with B. Then, the Dubins tour depends on the position of the center. Without loss of generality, we set the centers as OA and OB. See Figure 2. The length of the Dubins tour is 2πρ+2OAOB. And the minimum length of OAOB is d2ρ.
(ⅱ) If d2ρ, A and B must intersect at two places. See Figure 3. Let A and B intersect at OA and OB. Then, both A and B lie in the circle OA(or OB) with radius ρ. So the minimum length of the Dubins tour must not be larger than 2πρ.
Figure 3 The Dubins tour for two vertices with ρd/2

Full size|PPT slide

On the other hand, when we travel from the initial vertex A with the minimal turning radius ρ and come back to A, the minimal length needed is 2πρ regardless of whether B is visited.
Proposition 1 gives the optimal Dubins tour for two vertices, and it also tells us the optimal headings θA of initial vertex A and θB of the terminal vertex B, that is,
θA={π/2,ρd2,π/2+arccosd2ρ,ρd2.θB={π/2,ρd2,π+arcsind2ρ,ρd2.
(2)
Further, for any initial vertex A(xA,yA) and terminal vertex B(xB,yB), let θAB be the angle of vector AB. Ihe optimal headings θ^A and θ^B would be
θ^A=θA+θAB,θ^B=θB+θAB.
(3)
There are n players p1,p2,,pn in the game. Let Si be the strategy set of pi then, by Proposition 1, we have
Si={θi1,θi2},
where θi1 and θi2 are both determined by (3), and θi1 is the heading of pi when pi1 is the initial vertex and pi is the terminal vertex, θi2 is determined by pi is the initial vertex and pi+1 is the terminal vertex.
The payoff of pi is defined as
ui=LDρ,(θi,θi+1)(pi,pi+1)+LDρ,(θi1,θi)(pi1,pi),i=1,2,,n.
(4)
From a game theory point of view, each vertex tries to minimize its own payoff by calculating a pure strategy Nash equilibrium or a mixed Nash Equilibrium. In this way, we get an n-players two strategies game.

2.2 Nash Equilibrium

Let (θ1i1,θ2i2,,θnin) be a pure strategy Nash equilibrium, Note that each vertex tries to minimize its own payoff for the length of DTSP, so we have
uj(θjij,θj)uj(θjik,θj),θjikSj,j{1,2,,n}.
(5)
If player pi has a pure strategy in a Nash equilibrium, from (5), we need to make 2n1 comparisons for each ui.
If player pi does not have a pure strategy, we need to consider its mixed strategy.
Let xi=(xi,1xi) be the mixed strategy of player pi, that is, the heading θi1 of vertex pi is selected with a probability of xi and θi2 is selected with a probability of 1xi (i=1,2,,n), where xi(0,1). Let x=(x1,x2,,xn) be the mixed Nash equilibrium. We have the following proposition 2.
Proposition 2 Let x=(x1,x2,,xn) be the mixed Nash equilibrium, where xi=(xi,1xi) and xi0 (i=1,2,,n), then, for any i{1,2,,n},
[LDρ,(θi1,1,θi,2)(pi1,pi)LDρ,(θi1,2,θi,2)(pi1,pi)LDρ,(θi1,1,θi,1)(pi1,pi)+LDρ,(θi1,2,θi,1)(pi1,pi)]xi1+[LDρ,(θi,2,θi+1,1)(pi,pi+1)LDρ,(θi,2,θi+1,2)(pi,pi+1)LDρ,(θi,1,θi+1,1)(pi,pi+1)+LDρ,(θi,1,θi+1,2)(pi,pi+1)]xi+1+[LDρ,(θi1,2,θi,2)(pi1,pi)+LDρ,(θi,2,θi+1,2)(pi,pi+1)LDρ,(θi1,2,θi,1)(pi1,pi)LDρ,(θi,1,θi+1,2)(pi,pi+1)]=0.
Proof Since x is a mixed Nash equilibrium, we have
xi(Ei(x)Ei(θi1,xi))=0,  (1xi)(Ei(x)Ei(θi2,xi))=0,i{1,2,,n}.
Because of xi0, then Ei(θi1,xi)=Ei(x)=Ei(θi2,xi). By the definition of expected payoff,
Ei(θ,xi)=(LDρ,(θi1,1,θ)(pi1,pi)+LDρ,(θ,θi+1,1)(pi,pi+1))xi1xi+1                +(LDρ,(θi1,1,θ)(pi1,pi)+LDρ,(θ,θi+1,2)(pi,pi+1))xi1(1xi+1)                +(LDρ,(θi1,2,θ)(pi1,pi)+LDρ,(θ,θi+1,1)(pi,pi+1))(1xi1)xi+1                +(LDρ,(θi1,2,θ)(pi1,pi)+LDρ,(θ,θi+1,2)(pi,pi+1))(1xi1)(1xi+1)              =LDρ,(θi1,1,θ)(pi1,pi)xi1+LDρ,(θi1,2,θ)(pi1,pi)(1xi1)                +LDρ,(θ,θi+1,1)(pi,pi+1)xi+1+LDρ,(θ,θi+1,2)(pi,pi+1)(1xi+1).              =[LDρ,(θi1,1,θ)(pi1,pi)LDρ,(θi1,2,θ)(pi1,pi)]xi1                +[LDρ,(θ,θi+1,1)(pi,pi+1)LDρ,(θ,θi+1,2)(pi,pi+1)]xi+1                +[LDρ,(θi1,2,θ)(pi1,pi)+LDρ,(θ,θi+1,2)(pi,pi+1)].
then by Ei(θi1,xi)=Ei(θi2,xi), we have
    [LDρ,(θi1,1,θi,2)(pi1,pi)LDρ,(θi1,2,θi,2)(pi1,pi)]xi1    +[LDρ,(θi,2,θi+1,1)(pi,pi+1)LDρ,(θi,2,θi+1,2)(pi,pi+1)]xi+1    +[LDρ,(θi1,2,θi,2)(pi1,pi)+LDρ,(θi,2,θi+1,2)(pi,pi+1)]=[LDρ,(θi1,1,θi,1)(pi1,pi)LDρ,(θi1,2,θi,1)(pi1,pi)]xi1    +[LDρ,(θi,1,θi+1,1)(pi,pi+1)LDρ,(θi,1,θi+1,2)(pi,pi+1)]xi+1    +[LDρ,(θi1,2,θi,1)(pi1,pi)+LDρ,(θi,1,θi+1,2)(pi,pi+1)].
That is,
[LDρ,(θi1,1,θi,2)(pi1,pi)LDρ,(θi1,2,θi,2)(pi1,pi)  LDρ,(θi1,1,θi,1)(pi1,pi)+LDρ,(θi1,2,θi,1)(pi1,pi)]xi1+[LDρ,(θi,2,θi+1,1)(pi,pi+1)LDρ,(θi,2,θi+1,2)(pi,pi+1)  LDρ,(θi,1,θi+1,1)(pi,pi+1)+LDρ,(θi,1,θi+1,2)(pi,pi+1)]xi+1+[LDρ,(θi1,2,θi,2)(pi1,pi)+LDρ,(θi,2,θi+1,2)(pi,pi+1)  LDρ,(θi1,2,θi,1)(pi1,pi)LDρ,(θi,1,θi+1,2)(pi,pi+1)]=0.
By Proposition 2, we will get n linear equations. The solution of the equations will be the mixed Nash equilibrium.
Above all, we set up a Game Theory model for headings of DTSP. The vertice p1,p2,,pn are the players, and each player has two strategies given by (3), and the payoff defined as (4). For the Nash equilibrium, we first discuss whether it has a pure strategy Nash equilibrium. If it does not have a pure strategy for some players, we obtain its mixed strategy from Proposition 2 by solving linear equations.

2.3 An Example

Let p1(0,0),p2(1,1),p3(3,0) be three vertices in a plane and ρ=1. By (3), we get the strategies Si of each player pi,
S1={3.1416,1.5708},S2={4.7124,1.1071},S3={4.2488,4.7124}.
The payoffs are as follows:
LDρ,(θ11,θ21)(p1,p2)=4.7124,LDρ,(θ11,θ22)(p1,p2)=6.7325,LDρ,(θ12,θ21)(p1,p2)=5.7778,LDρ,(θ12,θ22)(p1,p2)=7.6407,LDρ,(θ21,θ31)(p2,p3)=8.2035,LDρ,(θ21,θ32)(p2,p3)=8.5193,LDρ,(θ22,θ31)(p2,p3)=3.3776,LDρ,(θ22,θ32)(p2,p3)=3.2406,LDρ,(θ31,θ11)(p3,p1)=3.2911,LDρ,(θ31,θ12)(p3,p1)=3.8706,LDρ,(θ32,θ11)(p3,p1)=3.8578,LDρ,(θ32,θ12)(p3,p1)=4.1416.
By the formula of payoff (4), we have Table 1, and from this Table, we know that for all i,j{1,2},
u1(θ11,θ2i,θ3j)<u1(θ12,θ2i,θ3j),u2(θ1i,θ21,θ3j)>u2(θ1i,θ22,θ3j),u3(θ1i,θ2j,θ31)<u3(θ1i,θ2j,θ32),
Table 1 Payoff of the three players
u1 u2 u3
θ11,θ21,θ31 8.0035 12.9159 11.4946
θ11,θ21,θ32 8.5702 13.2317 12.3771
θ11,θ22,θ31 10.0236 10.1101 6.6687
θ11,θ22,θ32 10.5903 9.9731 7.0984
θ12,θ21,θ31 9.6484 13.9813 12.0741
θ12,θ21,θ32 9.9194 14.2971 12.6609
θ12,θ22,θ31 11.5113 11.0183 7.2482
θ12,θ22,θ32 11.7823 10.8813 7.3822
So, there exists a pure strategy Nash equilibrium (θ11,θ22,θ31). The length of the DTSP is
LDTSPρ(P)=12(u1(θ11,θ22,θ31)+u2(θ11,θ22,θ31)+u3(θ11,θ22,θ31))=13.4012.
The tour curvature is given in Figure 4.
Figure 4 DTSP with heading given by a pure strategy Nash equilibrium

Full size|PPT slide

The length of ETSP is ETSP(P)=|p1p2|+|p2p3|+|p3p1|=6.6503, and
ETSP(P)+73n2πρ21.3111.
So, the length of DTSP with headings given by the Nash equilibrium is much smaller than the upper bound given by (1).

3 Conclusions

In this paper, we study the headings of the DTSP, which is widely used in the field, e.g., UCAV and Robots. It is important to note that we design the headings based on Game Theory. It is the Nash equilibrium that makes our headings quite different.
The simple example in Subsection 2.3 shows that the n-player two-strategy Game Theory model provides fairly good headings. In addition, the Nash equilibrium in this model is a pure strategy. The mixed Nash equilibrium can be easily obtained by solving the linear equations given in Proposition 2.
All vertices p1,p2,,pn are in a plane. So, how to determine the headings when the points are located in the three-dimensional space is worth further study. Since any three vertices in a three-dimensional space are coplanar, the presented results in this paper also give a new interesting insight into the three-dimensional DTSP problem.

References

1
Papadimitriou C. The Euclidean traveling salesman problem is NP-complete. Theor. Comp. Sci., 1977, 4 (3): 237- 244.
2
Garey M R, Graham R L, Johnson D S. Some NP-complete geometric problems. Proc. 8th Annu. ACM Symp. Theory Comp., 1976, 10- 22.
3
Ny J L, Feron E, Frazzoli E. On the Dubins traveling salesman problem. IEEE Transactions on Autonatic Control, 2012, 57, 265- 270.
4
Dubins L E. On curves of minimal length with a constraint on average curvature and with prescribed initial and terminal positions and Tangents. American Journal of Mathematics, 1957, 79, 497- 516.
5
Shkel A M, Lumelsky V. Classification of the Dubins set. Robotics and Automonous Systems, 2001, 34, 179- 202.
6
Dai L, Xie Z. On the length of dubins path with any initial and terminal configurations. Pure and Applied Mathematics Journal, 2015, 4, 248- 254.
7
Savla K, Frazzoli E, Bullo F. Traveling salesman problems for the Dubins vehicle. IEEE Transections on Automatic Control, 2008, 53, 1378- 1391.
8
Kim H S, Cheong O. The cost of bounded curvature. Computational Geometry: Theory and Applications, 2013, 46, 648- 672.
9
Dixon W. Optimal adaptive sontrol and differential games by reinforcement learning principles. Journal of Guidance, Control, and Dynamics, 2014, 37 (3): 1048- 1049.
10
Gu D. A game theory approach to target tracking in sensor networks. IEEE Transactions Systems, Man and Cybernetics, Part B: Cybernetices, 2011, 41 (1): 2- 13.
11
Duan H, Wei X, Dong Z. Multiple UCVAs cooperative air combat simulation platform based on PSO, ACO, and game theory. IEEE Aerospace and Electronic Systems Magazine, 2013, 28 (11): 12- 19.
12
Duan H, Pei L, Yuan Y X. A predator-prey particle swarm optimization approach to multiple UCAV air combat modeled by dynamic game theory. IEEE Journal of Automatica Sinica, 2015, 2 (1): 11- 18.
13
Wang M, Du Z, Duan H. Study on participant behavior game of electronic products reverse supply chain based on ECP. Journal of Systems Science and Information, 2017, 5 (5): 441- 434.
14
Wu J, Yang H, Cheng Y. Domino effect analysis, assessment and prevention in process industries. Journal of Systems Science and Information, 2015, 3 (6): 481- 498.
15
Dai Y, Gao Y. Real-time pricing decision based on leader-follower game smart grid. Journal of Systems Science and Information, 2015, 3 (6): 481- 498.
16
Porter R, Nudelman E, Shoham Y. Simple search methods for finding a Nash equilibrium. Games and Economic Behavior, 2008, 63 (2): 642- 662.
17
Chen X, Deng X, Teng S H. Settling the complexity of computing two-player Nash equilibrium. Journal of the ACM, 2009, 56 (3): Article No.14.

Funding

Research Programme of National University of Defense Technology(JC14-02-10)
PDF(209 KB)

186

Accesses

0

Citation

Detail

Sections
Recommended

/