• No results found

Game Theory

N/A
N/A
Protected

Academic year: 2023

Share "Game Theory"

Copied!
10
0
0

Loading.... (view fulltext now)

Full text

(1)

OPERATIONS RESEARCH

Chapter 2

Game Theory

Prof. Bibhas C. Giri

Department of Mathematics Jadavpur University

Kolkata, India

Email: bcgiri.jumath@gmail.com

(2)

1.0 Introduction

Game theory was developed for decision making under conflicting situations where there are one or more opponents (players). The games like chess, poker, etc. have the characteristics of competition and are played according to some definite rules. Game theory provides optimal solutions to such games, assuming that each of the players wants to maximize his profit or minimize his loss.

Game theory has applications in a variety of areas including business and eco- nomics. In 1994, the Nobel Prize for Economic Sciences was won by John F. Nash, Jr., John C. Harsanyi, and Reinhard Selton for their analysis of equilibria in the theory of noncooperative games. Later, in 2005, Robert J. Aumann and Thomas C. Schelling won the Nobel Prize for Economic Sciences for enhancing our understanding of con- flict and cooperation through game theory analysis.

(3)

MODULE - 1: Basic Concept and Terminologies, Two-person Zero-sum Game, and Game with Pure and Mixed Strategies

In this Module, we will discuss some basic terminologies used in Game Theory, two- person zero-sum game and games with pure and mixed strategies.

1.1 Basic Terminologies

The following terminologies are commonly used in Game theory.

Player : Each participant (interested party) of a game is called a player.

Strategy :The strategy of a player is the predetermined rule by which a player decides his course of action from the list of courses of action during the game. A strategy may be of two types:

Pure strategy -It is a decision, in advance of all plays, always to choose a partic- ular course of action.

Mixed strategy - It is a decision, in advance of all plays, to choose a course of action for each play in accordance with some particular probability distribution.

Optimal strategy : The course of action which maximizes the profit of a player or minimizes his loss is called an optimal strategy.

Payoff:The outcome of playing a game is called payoff.

Payoff matrix : When the players select their particular strategies, the payoffs (gains or losses) can be represented in the form of a matrix called the payoffmatrix.

3

(4)

Saddle point : A saddle point is an element of the payoff matrix, which is both the smallest element in its row and the largest element in its column. Furthermore, the saddle point is also regarded as an equilibrium point in the theory of games.

Value of the game : It refers to the expected outcome per play when players follow their optimal strategy.

1.2 Two-Person Zero-Sum Game

A game with only two players is called a two-person zero-sum game if the losses of one player are equivalent to the gains of the other so that the sum of their net gains is zero. This game also known asrectangular game.

In a two-person game, suppose that the playerAhasmactivities and the playerBhas nactivities. Then a payoffmatrix can be formed by adopting the following rules:

(i) Row designations for each matrix are activities available to the playerA.

(ii) Column designations for each matrix are activities available to the playerB.

(iii) Cell entryvij is the payment to the playerAinA’s payoffmatrix whenAchooses the activityi andBchooses the activityj.

(iv) For a zero-sum game, the cell entry in the playerB’s payoffmatrix will be nega- tive corresponding to the cell entryvij in the playerA’s payoffmatrix so that the sum of payoff matrices for the playersAandB is ultimately zero, see Tables 1.1 and 1.2.

PlayerB

1 2 ··· n

PlayerA

1 v11 v12 ··· v1n

2 v21 v22 ··· v2n ... ... ... ... ... m vm1 vm2 ··· vmn

Table 1.1:PlayerA’s payoffmatrix

PlayerB

1 2 ··· n

PlayerA

1 −v11 −v12 ··· −v1n

2 −v21 −v22 ··· −v2n

... ... ... ... ... m −vm1 −vm2 ··· −vmn Table 1.2:PlayerB’s payoffmatrix

Consider a two-person coin tossing game. Each player tosses an unbiased coin simul- taneously. Each player selects either a head H or a tail T. If the outcomes match (i.e., (H, H) or (T, T)) then A wins Rs. 4 from B; otherwise, B wins Rs. 3 from A. Player A’s

(5)

payoff matrix is given in Table 1.3. This game is a two-person zero-sum game, since the winning of one player is taken as losses for the other. Each player has his choice from amongst two pure strategies H and T.

Player B

H T

Player A H 4 -3

T -3 4

Table 1.3

1.3 Pure Strategies (Minimax and Maximin Criterion)

The simplest type of game is one where the best strategies for both players are pure strategies. This is the case if and only if, the payoffmatrix contains a saddle point.

Theorem 1.1: Let (vij) be the m×npayoff matrix for a two-person zero-sum game. If v denotes the maximin value andv¯the minimax value of the game, thenv¯≥v ¯

¯. That is, minj [max

i {vij}]≥max

i [min

j {vij}].

Proof: We have

maxi {vij} ≥vij f or all j = 1,2, ..., n and min

j {vij} ≤vij f or all i= 1,2, ..., m

Let the above maximum and minimum values be attained ati =i1 and j=j1, respec- tively, i.e.,

maxi {vij}=vi1j and min

j {vij}=vij1 Then, we must have

vi1jvijvij1 f or all j = 1,2, ..., n; f or all i= 1,2, ..., m.

From this, we get

minj vi1jvijmax

i vij1 f or all j = 1,2, ..., n; i= 1,2, ..., m.

Therefore, min

j [max

i {vij}]≥max

i [min

j {vij}].

(6)

Note: A game is said to be fair, if v

¯ = 0 = ¯v and it is said to be strictly determinable if

¯v=v= ¯v.

Example 1.1: Consider a two-person zero-sum game matrix which represents payoff to the player A, see Table 1.4. Find the optimal strategy, if any.

Player B

Player A

I II III IV V

I -2 0 0 5 3

II 4 2 1 3 2

III -4 -3 0 -2 6

IV 5 3 -4 2 -6

Table 1.4:Payoffmatrix for Example 1.1

Solution: We use the maximin (minimax) principle to determine the optimal strategy.

The playerAwishes to obtain the largest possiblevij by choosing one of his activities (I, II, III, IV), while the playerBis determined to makeA’s gain the minimum possible by choice of activities from his list (I, II, III, IV, V). The playerAis called themaximiz- ing playerandB, theminimizing player. If playerAchooses the activity I then it could

Player B

Player A

I II III IV V Row minimum

I -2 0 0 O5 3 -2

II 4 2 O1 3 2 1←Maximin

III -4 -3 0 -2 O6 -4

IV O5 O3 -4 2 -6 -6

Column maximum O5 O3 O1 O5 O6

↑Minimax

Table 1.5:Player A’s payoffmatrix

happen that the playerBalso chooses his activity I. In this case, the playerBcan guar- antee a gain of at least−2 to playerA, i.e.,min{−2,0,0,5,3}=−2. Similarly, for other choices of playerA, i.e., activities II, III and IV,B can force the playerAto gain only 1, −4 and−6, respectively, by proper choices from (II, III, IV) i.e.,min{4,2,1,3,2}= 1, min{−4,−3,0,−2,6}=−4 andmin{5,3,−4,2,−6}=−6. For playerA, minimum value in each row represents the least gain to him if he chooses his particular strategy. These

(7)

are written in Table 1.5 by row minimum. Player A will select the strategy that maxi- mizes his minimum gains, i.e.,max{−2,1,−4,−6}= 1 i.e., playerAchooses the strategy II. This choice of playerAis called the maximin principle, and the corresponding gain (here 1) is called the maximin value of the game. In general, the playerAshould try to maximize his least gains or to findmax

i min

j vij=v

¯.

For player B, on the other hand, likes to minimize his losses. The maximum value in each column represents the maximum loss to him if he chooses his particular strat- egy. These are written in Table 1.5 by column maximum. Player B will then select the strategy that minimizes his maximum losses. This choice of playerBis called the minimax principle, and the corresponding loss is the minimax value of the game. In this case, the value is also 1 and playerBchooses the strategy III. In general, the player Bshould try to minimize his maximum loss or to findmin

j max

i vij= ¯v.

If the maximin value equals the minimax value then the game is said to have a saddle point (here (II, III) cell) and the corresponding strategies are called optimum strategies. The amount at the saddle point is known as the value of the game.

Example 1.2: Solve the game whose payoffmatrix is given below:

Player B

Player A

I II III I -2 15 -2 II -5 -6 -4 III -5 20 -8

Table 1.6:A’s payoffmatrix

Solution: We use the maximin (minimax) principle to determine the optimal strategy.

The game has two saddle points at positions (1, 1) and (1, 3).

Player B

Player A

I II III Row minimum

I -2O 15 -2O -2←Maximin

II -5 -6O -4 -6

III -5 20 -8O -8

Column maximum -2O 20O -2O

↑Minimax ↑Minimax

(8)

(i) The best strategy for playerAis I.

(ii) The best strategy for playerBis either I or III.

(iii) The value of the game is−2 for playerAand +2 for playerB.

1.4 Mixed Strategy: Game without A Saddle Point

Ifmaxmin valueis not equal tominimax valuethen the game is said to have no saddle point. In such a case, both the players must determine an optimal mixture of strategies to find an equilibrium point. The optimal strategy mixture for each player may be de- termined by assigning to each strategy its probability of being chosen. The strategies so determined are calledmixed strategies.

The value of game obtained by the use of mixed strategies represents the least payoffwhich playerAcan expect to win and the least payoffwhich playerBcan expect to lose. The expected payoff to a player in a game with payoffmatrix [vij]m×ncan be defined as

E(p,q) =

m

i=1

n

j=1

pivijqj =pvqT

wherep= (p1, p2, ..., pm) andq= (q1, q2, ..., qn) denote probabilities or relative frequency with which a strategy is chosen from the list of strategies associated with m strate- gies of player A and n strategies of player B, respectively. Obviously, pi ≥ 0 (i = 1,2, ...m), qj≥0(j= 1,2, ..., n) andp1+p2+...+pm= 1;q1+q2+...+qn= 1.

Theorem 1.2: For any2×2two-person zero-sum game without any saddle point having

the payoff matrix for player A given by





B1 B2 A1 v11 v12 A2 v21 v22



, the optimal mixed strategies SA=



 A1 A2

p1 p2



 andSB =



 B1 B2

q1 q2



 are determined by pp1

2 = vv22−v21

11−v12, qq1

2 = vv22−v12

11−v21 where p1+p2= 1andq1+q2= 1. The valuevof the game toAis given byv=v v11v22−v21v12

11+v22−(v12+v21). Proof: Let a mixed strategy for playerAbe given bySA=



 A1 A2

p1 p2



, wherep1+p2= 1.

Thus, if playerBmovesB1then the net expected gain ofAwill beE1(p) =v11p1+v21p2 and ifBmovesB2, the net expected gain ofAwill beE2(p) =v12p1+v22p2.

Similarly, ifBplays his mixed strategySB =



 B1 B2 q1 q2



, whereq1+q2 = 1, thenB’s net expected loss will beE1(q) =v11q1+v12q2ifAplaysA1, andE2(q) =v21q1+v22q2ifA

(9)

playsA2. The expected gain of playerA, whenBchooses his moves with probabilities q1 and q2, is given by E(p,q) =q1[v11p1+v21p2] +q2[v12p1+v22p2]. Player A would always try to mix his moves with such probabilities so as to maximize his expected gain.

Now, E(p,q) =q1[v11p1+v21(1−p1)] + (1−q1)[v12p1+v22(1−p1)]

= [v11+v22−(v12+v21)]p1q1+ (v12v22)p1+ (v21v22)q1+v22

=λ (

p1v22v21 λ

)(

q1v22v12 λ

)

+v11v22v12v21 λ

whereλ=v11+v22−(v12+v21).

We see that ifAchoosesp1= v22−vλ 21, he ensures an expected gain of at least (v11v22v12v21)/λ. Similarly, ifBchoosesq1 = v22−vλ 12, then he can limit his expected loss to at most (v11v22v12v21)/λ. These choices ofp1 and q1 will thus be optimal to the two players. Thus, we get

p1=v22v21

λ = v22v21

v11+v22−(v12+v21) andp2= 1−p1= v11v12 v11+v22−(v12+v21) q1=v22v12

λ = v22v12

v11+v22−(v12+v21) andq2= 1−q1= v11v21 v11+v22−(v12+v21) andv= v11v22v12v21

v11+v22−(v12+v21) Hence, we have

p1

p2 = v22v21 v11v12, q1

q2 =v22v12

v11v21; and v= v11v22v21v12

v11+v22−(v12+v21). (1.1) Note:The above formula forp1,p2,q1,q2andvare valid only for 2×2 games without saddle point.

Example 1.3: Suppose that in a game of matching coins with two players, one player wins Rs. 2 when there are 2 heads, and gets nothing when there are 2 tails and looses Re 1 when there are one head and one tail. Determine the best strategies for each player and the value of the game.

Solution: The payoffmatrix for playerAis given in Table 1.7. The game has no saddle point. Let the playerAplaysH with probabilityxandT with probability 1−x. Then, if the playerBplaysH, thenA’s expected gain is

E(A, H) =x(2) + (1x)(−1) = 3x−1.

If the playerBplaysT,A’s expected gain is

E(A, T) =x(−1) + (1−x).0 =−x.

(10)

Player B

H T

Player A H 2 -1

T -1 0

Table 1.7:Player A’s payoffmatrix

If the player Achoosesx such that E(A, H) =E(A, T) =E(A) say, then this will deter- mine best strategy for him. Thus we have 3x−1 =−x orx= 1/4. Therefore, the best strategy for the playerAis to playH andT with probability 1/4 and 3/4, respectively.

Therefore, the expected gain for playerAis E(A) =1

4(2) +3

4(−1) =−1 4.

The same procedure can be applied for playerB. If the probability ofB’s choice ofH isy and that ofT is 1−ythen for the best strategy of the playerB,

E(B, H) =E(B, T) which givesy= 1/4. Therefore, 1−y= 3/4.

Thus A’s optimal strategy is (1/4,3/4) and B’s optimal strategy is (1/4,3/4). The expected value of the game is−1/4 to the playerA.

This result can also be obtained directly by using the formulae (1.1).

References

Related documents

Daystar Downloaded from www.worldscientific.com by INDIAN INSTITUTE OF ASTROPHYSICS BANGALORE on 02/02/21.. Re-use and distribution is strictly not permitted, except for Open

In Section 3, duality theory for linear programming problems with fuzzy parameters is introduced, while the main result, that a two person zero sum matrix game with fuzzy pay-o(s

With such a huge class of potentially interesting S-matrices, a natural question is which of these boundary S-matrices exhibit linear Regge trajectories (we will refer to this

Red circles denote the sources with H − K > 1, sources with 1 H − K > 0.6 are shown with magenta crosses, blue circles denote the Class II-like objects detected from our

The background values at 1100 Å, or rather the best-fit model values at 1100 Å assuming a B star template, are plotted in Figure 4, highlighting both the faintest observations

wide companions to TYC 2930, Section 6 describes our search for photometric variability and any potential transits of the inner companion, Section 7 discusses the tidal evolution of

This rotation period agrees with the one derived above (13.16 days) using the SuperWASP photometry data.. However, we should note here that this chromospheric activity–rotation

* In the event of any discrepancy detected by any candidate, he/she has to bring it to the notice of the Controller of Examinations within 1 (one) month from the date of declaration