A Generalization of Quantal Response Equilibrium via Perturbed Utility

We present a tractable generalization of quantal response equilibrium via non-expected utility preferences. In particular, we introduce concave perturbed utility games in which an individual has strategy-specific utility indices that depend on the outcome of the game and an additively separable preference to randomize. We generalize logit best responses in three directions. First, the desire to randomize can depend on opponents' strategies. Second, we show how to derive a nested logit best response function. Lastly, we present tractable quadratic perturbed utility games that allow complementarity.


Introduction
This paper generalizes quantal response equilibrium (QRE) using the standard Nash equilibrium concept via the class of concave perturbed utility games (PUGs). QRE, as studied in McKelvey and Palfrey [1995] and McKelvey and Palfrey [1998], assumes that individuals maximize expected utility, but that there is an unobserved additive random shock to a base utility index that drives deviations from Nash equilibrium behavior. In contrast, PUGs specify a deterministic non-expected utility function with a preference for randomization that is additively separable from a base utility index. Nonetheless, any QRE can be modeled as a Nash equilibrium of a concave perturbed utility game. Using the Nash equilibrium concept with the more general concave PUGs is useful compared to QRE since it is often difficult to select a distribution of tractable additive random shocks. PUGs facilitate estimation of model parameters via standard maximum likelihood methods (without numerical integration). Moreover, PUGs allow complementarity between strategies, which is not allowed in any QRE. PUGs can also easily incorporate the possibility that the probability an individual chooses a strategy as a best response depends on how likely opponents are to play their strategies. Importantly, any concave PUG has a Nash equilibrium as an immediate corollary of Debreu [1952]. This existence result allows us to generate different flavors of the logit best response, derive a nested logit best response function, and discuss quadratic perturbed utility games.
The result that QRE can be represented as Nash equilibria of a non-expected utility game matters for interpretation and has practical implications. Concerning interpretation, violations of standard Nash play can be interpreted either as errors of individual perception or as the manifestation of a non-expected utility preference. These two interpretations have different implications for welfare analysis. For example, one may not want perceptual errors to enter welfare calculations, but may want to include all facets of a non-expected utility preference in welfare calculations. On the practical side, it may be easier to place restrictions on perturbation functions than restricting the additive error term of QRE. Analytical results for error distributions are known only in some special cases. In contrast, rich classes of analytical perturbation functions have been studied in Fosgerau et al. [2019] and Monardo [2019].
We mention a non-extensive summary of related work. A perturbed utility game is similar to the control costs approach developed in van Damme [1987] whereby there is a cost associated with the ability to control a "tremble." Stahl [1990] looks at a game with trembles and an entropic cost to control the trembles. Mattsson and Weibull [2002] give axioms for entropic control functions for individual decisions and deal with a continuum of alternatives. With regard to QRE, the regular quantal response equilibrium of Goeree et al. [2005] places additional assumptions on QRE and a textbook treatment of QRE is available in Goeree et al. [2016]. 2 We also briefly describe some history regarding non-expected utility games and QRE. Shortly after the study of individual non-expected utility functions in Machina [1982], there was some interest in studying non-expected utility in strategic games following Nash [1951] and Von Neumann and Morgenstern [1953]. For example, equilibrium concepts, existence of equilibria, and dynamic consistency properties when individuals do not satisfy the independence axiom were studied in Crawford [1990] and Dekel et al. [1991]. However, little applied work resulted from this research. In contrast, quantal response equilibrium was developed a few years later in McKelvey and Palfrey [1995] and McKelvey and Palfrey [1998], and has been extensively used in applications to account for deviations from Nash equilibrium. By linking these two approaches, we show applied researchers that Nash equilibrium with non-expected utility preferences provides a rich and tractable avenue to account for deviations from Nash equilibrium with expected utility preferences. Finally, PUGs model individuals with a deterministic preference for randomization following Machina [1985] and relate more broadly to the stochastic choice literature. 3
We say an individual has a preference for randomization when the individual plays the game as if they randomize their play according to a most-preferred distribution of pure strategies. We assume throughout that individuals commit to randomize according to their most-preferred distribution. While a preference for randomization may seem foreign, there is growing experimental evidence that supports this interpretation [Mosteller and Nogee, 1951, Sopher and Narramore, 2000, Agranov and Ortoleva, 2017, Dwenger et al., 2018, Burghart, 2019, Feldman and Rehbeck, 2020, Agranov and Ortoleva, 2020, Agranov et al., 2020]. The most important finding for our purposes is that of Agranov et al. [2020], which finds that individuals who randomize choices in individual decision problems also randomize their choices in strategic environments. Therefore, it may be important to account for a preference for randomization in games.
2 Goeree et al. [2016] also link the development of QRE to games with decision errors, which date back to Selten [1975].
3 Perturbed utility preferences are tractable and have been studied in general for individual stochastic choice [McFadden and Fosgerau, 2012, Fudenberg et al., 2015, Allen and Rehbeck, 2019a], population games [Iijima, 2014], consumer choice [McFadden and Fosgerau, 2012, Allen and Rehbeck, 2019b], and general equilibrium [Ma, 2017].
The rest of this paper is organized as follows. Section 2 describes the structure of a concave perturbed utility game and shows existence of equilibria for all such games. Section 3 derives the logit best response function from a concave perturbed utility game with entropy costs and discusses various flavors of the logit best response. Section 4 examines other forms of concave perturbed utility games. In particular, we examine nested logit equilibria and discuss quadratic perturbed utility games. We give our final remarks in Section 5. The proofs are mathematically simple, but are included in an appendix for completeness.

Concave Perturbed Utility Games and Existence
Consider a finite $N$-player game with a set $\{1, \dots, N\}$ of players. Each player $n \in \{1, \dots, N\}$ has a set of pure strategies given by $S_n = \{s_{n,1}, \dots, s_{n,J_n}\}$ where $J_n$ is the number of pure strategies for the $n$th player. Let a pure strategy profile be defined by $s = (s_1, \dots, s_N) \in S = \prod_{n=1}^{N} S_n$ where $S$ is the set of all pure strategy profiles. Occasionally, it is useful to represent a strategy profile by $s = (s_n, s_{-n})$ where $s_n$ is the strategy of the $n$th player while $s_{-n}$ contains the pure strategies taken by all other players where $s_{-n} \in S_{-n} = \prod_{m \neq n} S_m$.
Let $\Delta_n$ be the set of probability measures on $S_n$. We represent elements of $\Delta_n$ by $p_n \in \Delta_n = \{p_n \in \mathbb{R}^{J_n} \mid p_{n,j} \geq 0 \text{ for all } j \in \{1, \dots, J_n\} \text{ and } \sum_{j=1}^{J_n} p_{n,j} = 1\}$. The element $p_n \in \Delta_n$ is a mixed strategy for the $n$th player. Here $p_{n,j}$ is the probability the $n$th agent plays their $j$th strategy $s_{n,j}$. Further, let all mixed strategy profiles be given by $\Delta = \prod_{n=1}^{N} \Delta_n$ where a mixed strategy profile is given by $p = (p_1, \dots, p_N) \in \Delta$. We use the shorthand $p = (p_n, p_{-n})$ to denote a mixed strategy profile where the $n$th player plays the mixed strategy $p_n$ and all other agents play their corresponding mixed strategy in $p_{-n} \in \Delta_{-n} = \prod_{m \neq n} \Delta_m$.
We assume there are observed outcomes for each player $n \in \{1, \dots, N\}$ that depend on the strategy profile $s \in S$, denoted by $x_{n,s} \in X_n$. For example, $X_n$ could be monetary outcomes that depend on the strategies of all players. If an individual has social preferences and cares about other individuals, then $X_n$ could be the monetary allocations to all individuals. This means that a motive for cooperation can be modeled in this framework. Finally, $X_n$ could be a consumption bundle. For example, if each pure strategy is to bring an item to a picnic, then $X_n$ could be the space of ordered tuples of consumption goods. Thus, we view the outcome of a game as covariates similar to Bresnahan and Reiss [1990] and Tamer [2003].
We let $x_n = (x_{n,s})_{s \in S}$ denote the vector of the $n$th player's outcomes for all strategy profiles. Let $x = (x_1, \dots, x_N)$ denote the collection of all individual observable outcomes. We use the notation $x_n^j = (x_{n,(s_{n,j}, s_{-n})})_{s_{-n} \in S_{-n}}$ to denote the vector of outcomes the $n$th player can obtain for all combinations of opponent pure strategies when their $j$th strategy is played. The different values of the outcomes are mapped into a utility index that depends on opponent mixed strategies.
We consider non-expected utility preferences for each player $n \in \{1, \dots, N\}$ given by the class of concave perturbed utility preferences. In particular, the $n$th player has preferences represented by the non-expected utility given by
$$\sum_{j=1}^{J_n} p_{n,j} U_{n,j}(p_{-n}, x_n^j) + D_n(p_n, p_{-n}).$$
Here, $U_{n,j} : \Delta_{-n} \times X_n \to \mathbb{R}$ is a utility index that captures the attractiveness of the $n$th player's $j$th strategy that is assumed continuous in $p_{-n}$ and depends on the outcomes associated with the $n$th player's $j$th strategy. The function $D_n : \Delta_n \times \Delta_{-n} \to \mathbb{R}$ is assumed concave in $p_n$ for every $p_{-n}$ and jointly continuous in $(p_n, p_{-n})$. We call $D_n$ the $n$th individual's perturbation function. The perturbed utility approach differs from the control function approach of van Damme [1987] since the attractiveness of a pure strategy can depend on the play of opponents and the preference for randomization can also depend on the play of opponents.
In general, the utility indices do not need to satisfy the usual expected utility conditions. Nonetheless, the $n$th individual can evaluate the value of the $j$th strategy following EU when the utility is the conditional expected utility (CEU) given by
$$U^{CEU}_{n,j}(p_{-n}, x_n^j) = \sum_{s_{-n} \in S_{-n}} p_{-n}(s_{-n}) \, u_n(x_{n,(s_{n,j}, s_{-n})})$$
where $u_n : X_n \to \mathbb{R}$ is a sub-utility index that maps outcomes directly to utility numbers for the $n$th player. Here, $p_{-n}(s_{-n})$ is the probability, under the mixed strategy $p_{-n}$, that all players except the $n$th play the pure strategies $s_{-n} = (s_{1,j_1}, \dots, s_{n-1,j_{n-1}}, s_{n+1,j_{n+1}}, \dots, s_{N,j_N}) \in S_{-n}$, so $p_{-n}(s_{-n}) = \prod_{m \neq n} p_{m,j_m}$.
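As an illustration of the conditional expected utility index, the following minimal sketch enumerates opponent pure strategy profiles and weights each sub-utility by the product probability $p_{-n}(s_{-n})$. The function name `ceu_index`, the zero-based player and strategy indices, and the payoff array layout are assumptions made for this example only.

```python
import itertools
import numpy as np

def ceu_index(payoff, n, j, p):
    """Conditional expected utility of player n's j-th pure strategy.

    payoff: array of shape (J_1, ..., J_N) holding u_n(x_{n,s}) for each profile s
            (hypothetical layout chosen for this sketch).
    p: list with one mixed strategy (1-D array) per player; p[n] is not used.
    """
    others = [m for m in range(len(p)) if m != n]
    total = 0.0
    for js in itertools.product(*(range(len(p[m])) for m in others)):
        # p_{-n}(s_{-n}) is the product of the opponents' probabilities
        prob = np.prod([p[m][jm] for m, jm in zip(others, js)])
        profile = list(js)
        profile.insert(n, j)  # put player n's own strategy back in position n
        total += prob * payoff[tuple(profile)]
    return total

# Matching pennies from the row player's point of view.
payoff_row = np.array([[1.0, -1.0], [-1.0, 1.0]])
p = [np.array([0.5, 0.5]), np.array([0.25, 0.75])]
print(ceu_index(payoff_row, 0, 0, p), ceu_index(payoff_row, 0, 1, p))
```

The enumeration is exponential in the number of opponents, which is acceptable for small games; larger games would need a different representation.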
For the $n$th player, let $U_n = (U_{n,1}, \dots, U_{n,J_n})$ be the vector of the utility index functions for all $J_n$ pure strategies. Let $U = (U_1, \dots, U_N)$ be the vector of all utility indices for all players. We also let the collection of all perturbation functions be given by $D = (D_1, \dots, D_N)$. When all individuals have concave perturbed utility preferences, we call this a concave perturbed utility game. This setup nests expected utility when for every player $n \in \{1, \dots, N\}$ the perturbation function satisfies $D_n(p_n, p_{-n}) = 0$ and the utility index for every pure strategy is given by the conditional expected utility (CEU) $U^{CEU}_{n,j}(p_{-n}, x_n^j) = \sum_{s_{-n} \in S_{-n}} p_{-n}(s_{-n}) \, u_n(x_{n,(s_{n,j}, s_{-n})})$ where $u_n : X_n \to \mathbb{R}$ is a sub-utility index that maps outcomes directly to utility numbers for the $n$th individual. We later show that Nash equilibria exist for any continuous utility indices. Recall that utility indices can depend on the mixed strategies of opponents. Thus, concave perturbed utility games can apply the Nash equilibrium concept beyond the conditional expected utility restriction that has been common following Nash [1951] and Von Neumann and Morgenstern [1953].
We focus on the standard definition of Nash equilibrium when studying concave perturbed utility games.
Definition 2. A mixed strategy profile $p^* \in \Delta$ is a Nash equilibrium of a concave perturbed utility game if for all $n \in \{1, \dots, N\}$ it holds that
$$p_n^* \in \arg\max_{p_n \in \Delta_n} \sum_{j=1}^{J_n} p_{n,j} U_{n,j}(p_{-n}^*, x_n^j) + D_n(p_n, p_{-n}^*).$$
The definition of Nash equilibrium requires that mixed strategies be a best response. The above definition is exactly this condition translated to a concave perturbed utility game. We now state that Nash equilibria exist for every concave perturbed utility game.
Corollary 1 (Existence). For every concave perturbed utility game $(N, S, x, U, D)$ there exists a Nash equilibrium.
The above result is an immediate corollary of the main theorem in Debreu [1952]. The Debreu [1952] result was also used in Crawford [1990] to show a Nash equilibrium exists for any concave and jointly continuous non-expected utility function. We view the contribution of this paper as providing a tractable model that can be taken to data. In addition, the model can separate how an individual values outcomes from playing a particular pure strategy and the preference for randomization. As we show in the next section, this class of models can produce the logit best response function that is popular in applied work following the quantal response equilibria of McKelvey and Palfrey [1995]. Thus, it is a natural springboard to explore other forms of strategic behavior.
We note that Nash equilibria may not exist when $D_n$ is not concave in $p_n$ for every $p_{-n} \in \Delta_{-n}$. Crawford [1990] provides one example of a game with quasi-convex non-expected utility preferences that has no Nash equilibrium. While we focus on concave perturbed utility games, there are equilibrium concepts which can be used for non-concave perturbed utility games and more generally any non-expected utility preference. In particular, Crawford [1990] defines a notion of equilibrium in beliefs for any non-expected utility game and shows existence of equilibrium without requiring the concavity assumption.

Entropy Perturbations and Logit Best Response
We now consider a concave perturbed utility game when each player has entropy perturbation functions. For this game, every player $n \in \{1, \dots, N\}$ has a perturbation function of Shannon entropy so that
$$D^E_n(p_n, p_{-n}) = -\lambda_n \sum_{j=1}^{J_n} p_{n,j} \ln(p_{n,j})$$
with $\lambda_n > 0$. Here, $p_{n,j} \ln(p_{n,j})$ is 0 when $p_{n,j} = 0$. Stahl [1990] uses Shannon entropy in a control function approach with trembles. Cominetti et al. [2010] study how the limit of certain learning procedures can be represented with entropy costs. Outside of game theory, the Shannon entropy function is used extensively in discrete choice analysis [Anderson et al., 1992], information economics [Sims, 2003], to motivate games with learning [Matejka and McKay, 2014], and for route choice [Jiang et al., 2020]. The function $D^E_n$ is concave and continuous, and thus Nash equilibria exist. When all individuals have entropy perturbations in a concave perturbed utility game, we call it an entropy perturbed utility game. Below we characterize the best response function of individuals for entropy perturbed utility games.
Proposition 1. The best response function of the $n$th agent in an entropy perturbed utility game is given by $p^E_n(p_{-n}) = (p^E_{n,1}(p_{-n}), \dots, p^E_{n,J_n}(p_{-n}))$ where
$$p^E_{n,j}(p_{-n}) = \frac{\exp\left(\frac{U_{n,j}(p_{-n}, x_n^j)}{\lambda_n}\right)}{\sum_{k=1}^{J_n} \exp\left(\frac{U_{n,k}(p_{-n}, x_n^k)}{\lambda_n}\right)}.$$
When the utility index takes the conditional expected utility form $U^{CEU}_{n,j}(p_{-n}, x_n^j) = \sum_{s_{-n} \in S_{-n}} p_{-n}(s_{-n}) \, u_n(x_{n,(s_{n,j}, s_{-n})})$, the best response function in Proposition 1 is the same as that from logit equilibrium in McKelvey and Palfrey [1995] when all $\lambda_n$ take the same value. The Nash equilibria thus have the same comparative statics as quantal response equilibria with respect to the $\lambda_n$ term. For example, as $\lambda_n \to \infty$ an individual will uniformly randomize among all of their pure strategies. When $\lambda_n = 0$ for all individuals, we return to the standard Nash equilibria for a normal form game with expected utility preferences when utility indices follow $U^{CEU}_{n,j}$. A convenient feature of representing the logit equilibrium in this format is that it bypasses integrating over a distribution of random shocks. Instead, this best response is found by solving a constrained optimization problem.
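To make the logit best response concrete, the sketch below implements the formula of Proposition 1 and searches for an equilibrium of a two-player game with CEU indices by damped fixed-point iteration. The iteration scheme, the function names, the step size, and the example payoffs are assumptions of this sketch rather than a method described in the paper, and convergence is not guaranteed in general.

```python
import numpy as np

def logit_best_response(U, lam):
    """Proposition 1: p_j is proportional to exp(U_j / lam), with lam > 0."""
    z = np.asarray(U, dtype=float) / lam
    z -= z.max()                      # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def logit_equilibrium(A, B, lam, iters=5000, step=0.1):
    """Damped fixed-point iteration for a two-player game (illustrative only).

    A[i, j] and B[i, j] are the row and column players' sub-utilities when the
    row player uses strategy i and the column player uses strategy j.
    """
    p = np.full(A.shape[0], 1.0 / A.shape[0])
    q = np.full(A.shape[1], 1.0 / A.shape[1])
    for _ in range(iters):
        p_new = logit_best_response(A @ q, lam)    # CEU indices of the row player
        q_new = logit_best_response(B.T @ p, lam)  # CEU indices of the column player
        p = (1 - step) * p + step * p_new
        q = (1 - step) * q + step * q_new
    return p, q

# Hypothetical asymmetric matching-pennies payoffs.
A = np.array([[2.0, -1.0], [-1.0, 1.0]])
B = np.array([[-1.0, 1.0], [1.0, -1.0]])
p_eq, q_eq = logit_equilibrium(A, B, lam=2.0)
print(p_eq, q_eq)
```

At a fixed point of the iteration each mixed strategy is a logit best response to the other, which is the Nash equilibrium condition of the entropy perturbed utility game.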

Variations of Entropy Perturbation
There are several variations of the logit best response function that are similar to logit quantal response equilibria. First, one could consider different utility indices $U_{n,j}(p_{-n}, x_n^j)$. For example, when the utility is over money, one could use rank dependent preferences [Quiggin, 1982]. Alternatively, one could use the geometric mean rather than the arithmetic mean to aggregate probabilities and the outcome into a utility index.
Variations of logit best response can also occur by introducing unique weighting terms for each pure strategy. For example, one can consider the class of weighted entropy (WE) perturbed utility games where each individual has a continuous weighting function $w_{n,j} : \Delta_{-n} \to \mathbb{R}_+$ and the perturbation function takes the form
$$D^{WE}_n(p_n, p_{-n}) = -\sum_{j=1}^{J_n} w_{n,j}(p_{-n}) \, p_{n,j} \ln(p_{n,j}).$$
Here $w_{n,j}(p_{-n})$ weights how desirable it is for the $n$th player to play strategy $j$ when opponents play $p_{-n}$. Note that $-w_{n,j}(p_{-n}) \, p_{n,j} \ln(p_{n,j}) \geq 0$ since $p_{n,j} \in [0, 1]$. Thus, a higher weight means the player can potentially obtain more utility by choosing this strategy with a probability close to $1/e$, i.e. the value that maximizes $-p_{n,j} \ln p_{n,j}$. Since the weights are all nonnegative, the perturbation is concave and equilibria exist by Corollary 1.
We consider some examples of weighting functions. Suppose an individual has ex-ante beliefs about opponent play given by $\mu_n \in \Delta S_{-n}$. One example of a continuous weighting function that is everywhere nonnegative is
$$w_{n,j}(p_{-n}) = \alpha_{n,j} \, d_{n,j}(p_{-n}, \mu_n) \qquad (1)$$
where $d_{n,j} : \Delta S_{-n} \times \Delta S_{-n} \to \mathbb{R}_+$ is a jointly continuous distance function for the $n$th player and $j$th strategy and $\alpha_{n,j} \in \mathbb{R}_+$ describes the weight of the discrepancy. This means that an individual only has a preference to randomize when opponents play strategies that differ from the player's beliefs. Another natural weighting function is $w_{n,j}(p_{-n}) = \lambda_{n,j} \geq 0$. In this case, the weighting function on the randomization term for the $n$th player's $j$th strategy does not depend on what others are playing.
Lastly, we can specialize so that $w_{n,j}(p_{-n}) = w_n(p_{-n})$ for every $n \in \{1, \dots, N\}$ and for all $j \in \{1, \dots, J_n\}$. This makes the preference for randomization symmetric in own-probabilities. One example is
$$w_n(p_{-n}) = -\sum_{s_{-n} \in S_{-n}} p_{-n}(s_{-n}) \ln(p_{-n}(s_{-n})).$$
When $w_n$ does not depend on $j$, the best response has a simple analytic form following the logit equilibria, except that the desire to randomize depends on the probability with which opponents play various strategies.
Proposition 2. Suppose for every $n \in \{1, \dots, N\}$ and for all $j \in \{1, \dots, J_n\}$ that $w_{n,j}(p_{-n}) = w_n(p_{-n})$ and $w_n : \Delta_{-n} \to \mathbb{R}_{++}$. The best response function of the $n$th agent in a weighted entropy perturbed utility game is given by $p^{WE}_n(p_{-n}) = (p^{WE}_{n,1}(p_{-n}), \dots, p^{WE}_{n,J_n}(p_{-n}))$ where
$$p^{WE}_{n,j}(p_{-n}) = \frac{\exp\left(\frac{U_{n,j}(p_{-n}, x_n^j)}{w_n(p_{-n})}\right)}{\sum_{k=1}^{J_n} \exp\left(\frac{U_{n,k}(p_{-n}, x_n^k)}{w_n(p_{-n})}\right)}.$$
To the best of our knowledge, a logit best response where weights depend on opponents' mixed strategies has not been studied. However, it seems intuitively sensible. For example, an individual may have a higher desire to randomize when other individuals choose disparate pure strategies with low probability.
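The following sketch illustrates Proposition 2 with the opponent-entropy weight given above as an example. The function names and the numerical inputs are hypothetical; the closed form is only applied when the weight is strictly positive, as the proposition requires.

```python
import numpy as np

def opponent_entropy(q):
    """w_n(p_{-n}) = -sum q ln q, with 0 ln 0 = 0.

    q is the distribution p_{-n}(s_{-n}) over opponent pure strategy profiles.
    """
    q = np.asarray(q, dtype=float)
    nz = q > 0
    return float(-(q[nz] * np.log(q[nz])).sum())

def weighted_entropy_best_response(U, w):
    """Proposition 2: logit best response with scale w = w_n(p_{-n}) > 0."""
    if w <= 0:
        raise ValueError("the closed form requires a strictly positive weight")
    z = np.asarray(U, dtype=float) / w
    z -= z.max()                      # numerical stability
    e = np.exp(z)
    return e / e.sum()

U = [1.0, 0.0]                        # hypothetical utility indices
p_spread = weighted_entropy_best_response(U, opponent_entropy([0.5, 0.5]))
p_concentrated = weighted_entropy_best_response(U, opponent_entropy([0.99, 0.01]))
print(p_spread, p_concentrated)
```

In this example the player randomizes more when opponent play is dispersed (higher entropy) and plays close to the pure best response when opponent play is concentrated.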

Other Perturbed Utility Games
While the analysis above shows how entropy perturbation functions are related to logit quantal response equilibria, there are other games of interest. In particular, one of the features of concave perturbed utility games is that the desirability of mixing with one strategy can depend on opponents' probabilistic play. We consider two types of games that have this feature.

Mixed Entropy Perturbations and Nested Logit Best Response
Here we derive a nested logit best response using a mixed entropy perturbation function. The nested logit model of discrete choice is treated at a textbook level in Train [2009]. The main idea of nested logit models in discrete choice is that there are alternatives that share similar qualities. For example, when choosing between a car, a red bus, and a blue bus one might group the two buses together into a nest. This kind of similarity is also plausible for strategies in games. For example, consider a prisoner's dilemma game with two players where each player can say nothing, deny involvement, or accuse the other player of the crime. Here a natural partition of actions is to treat saying nothing and denying involvement as having the feature of loyalty to the co-conspirator, while accusing the other player of the crime has the feature of disloyalty.
To formalize the mixed entropy perturbation function, we require a partition of the pure strategies of each agent. For every agent $n$ with $J_n > 2$, we partition the pure strategies into $L_n$ sets given by $R_{n,1}, R_{n,2}, \dots, R_{n,L_n}$. Thus, for $\ell \in \{1, \dots, L_n\}$ this means $R_{n,\ell} \subseteq S_n$, for all $\ell \neq \tilde{\ell}$ it follows that $R_{n,\ell} \cap R_{n,\tilde{\ell}} = \emptyset$, and $\bigcup_{\ell=1}^{L_n} R_{n,\ell} = S_n$. We refer to the sets which define the partition as nests. Each nest $R_{n,\ell}$ is assigned a weight $\eta_{n,\ell} \in [0, 1]$.
We consider the mixed entropy (ME) perturbation function for each player given by p n,j lnpp n,j q`p1´η n, q ÿ jPR n, p n,j ln¨ÿ kPR n, p n,k‚ fi fl .
The first summation term is a weighted entropy function while the second summation term is an entropy-like cost function that now depends on the probability that all items in a nest are chosen. Note that the above is a sum of functions that are all concave in $p_n$ since $\eta_{n,\ell} \in [0, 1]$. 6 Thus, existence of equilibria is immediate from Corollary 1. We characterize the nested logit best response function below.
Proposition 3. For the $n$th player let $\ell(j)$ be the nest associated to the $j$th strategy so that $R_{n,\ell(j)}$ is the nest that contains the strategy $s_{n,j}$ and $\eta_{n,\ell(j)}$ is the corresponding nesting parameter. The best response function of the $n$th agent in a mixed entropy perturbed utility game is given by $p^{ME}_n(p_{-n}) = (p^{ME}_{n,1}(p_{-n}), \dots, p^{ME}_{n,J_n}(p_{-n}))$ where
$$p^{ME}_{n,j}(p_{-n}) = \frac{\exp\left(\frac{U_{n,j}(p_{-n}, x_n^j)}{\eta_{n,\ell(j)}}\right)}{\sum_{k \in R_{n,\ell(j)}} \exp\left(\frac{U_{n,k}(p_{-n}, x_n^k)}{\eta_{n,\ell(j)}}\right)} \cdot \frac{\left(\sum_{k \in R_{n,\ell(j)}} \exp\left(\frac{U_{n,k}(p_{-n}, x_n^k)}{\eta_{n,\ell(j)}}\right)\right)^{\eta_{n,\ell(j)}}}{\sum_{\ell=1}^{L_n} \left(\sum_{k \in R_{n,\ell}} \exp\left(\frac{U_{n,k}(p_{-n}, x_n^k)}{\eta_{n,\ell}}\right)\right)^{\eta_{n,\ell}}}.$$
To the best of our knowledge, nested logit equilibria have not been considered in the literature. The result above can also be further generalized to allow weighting functions that depend on the opponents' mixed strategies. For example, consider the weighted mixed entropy (WME) perturbation function
$$D^{WME}_n(p_n, p_{-n}) = -\sum_{\ell=1}^{L_n} \left[ \sum_{j \in R_{n,\ell}} w_{n,j}(p_{-n}) \, p_{n,j} \ln(p_{n,j}) + \sum_{j \in R_{n,\ell}} w_{n,\ell,j}(p_{-n}) \, p_{n,j} \ln\left( \sum_{k \in R_{n,\ell}} p_{n,k} \right) \right]$$
where $w_{n,j} : \Delta_{-n} \to \mathbb{R}_+$ is a continuous weighting function for the $j$th strategy that does not depend on the nest and $w_{n,\ell,j} : \Delta_{-n} \to \mathbb{R}_+$ is a continuous weighting function for the $j$th strategy that depends on its nest. This is a concave perturbation function so equilibria exist by Corollary 1. Now consider the restrictions that the weights for every player $n \in \{1, \dots, N\}$ satisfy that $w_{n,j} : \Delta_{-n} \to [0, 1]$, for all $j, k \in R_{n,\ell}$ that $w_{n,j} = w_{n,k} = w_{n,\ell}$, and for all $\ell \in \{1, \dots, L_n\}$ and $j \in R_{n,\ell}$ that $w_{n,\ell,j} = 1 - w_{n,j} = 1 - w_{n,\ell}$. Under these conditions, the best response function takes the same form as the nested logit best response except with weights that depend on opponents' mixed strategies.
Proposition 4. In a weighted mixed entropy game, let the weights for every player $n \in \{1, \dots, N\}$ satisfy that $w_{n,j} : \Delta_{-n} \to [0, 1]$, for all $j, k \in R_{n,\ell}$ that $w_{n,j} = w_{n,k} = w_{n,\ell}$, and for all $\ell \in \{1, \dots, L_n\}$ and $j \in R_{n,\ell}$ that $w_{n,\ell,j} = 1 - w_{n,j} = 1 - w_{n,\ell}$. For the $n$th player, let $\ell(j)$ be the nest associated to the $j$th strategy so that $R_{n,\ell(j)}$ is the nest that contains the strategy $s_{n,j}$ and $w_{n,\ell(j)}$ is the corresponding weighting function. The best response function of the $n$th agent in a weighted mixed entropy perturbed utility game is given by $p^{WME}_n(p_{-n}) = (p^{WME}_{n,1}(p_{-n}), \dots, p^{WME}_{n,J_n}(p_{-n}))$ where
$$p^{WME}_{n,j}(p_{-n}) = \frac{\exp\left(\frac{U_{n,j}(p_{-n}, x_n^j)}{w_{n,\ell(j)}(p_{-n})}\right)}{\sum_{k \in R_{n,\ell(j)}} \exp\left(\frac{U_{n,k}(p_{-n}, x_n^k)}{w_{n,\ell(j)}(p_{-n})}\right)} \cdot \frac{\left(\sum_{k \in R_{n,\ell(j)}} \exp\left(\frac{U_{n,k}(p_{-n}, x_n^k)}{w_{n,\ell(j)}(p_{-n})}\right)\right)^{w_{n,\ell(j)}(p_{-n})}}{\sum_{\ell=1}^{L_n} \left(\sum_{k \in R_{n,\ell}} \exp\left(\frac{U_{n,k}(p_{-n}, x_n^k)}{w_{n,\ell}(p_{-n})}\right)\right)^{w_{n,\ell}(p_{-n})}}.$$
6 We also mention that even when $\eta_{n,\ell} > 0$ but not in $[0, 1]$ the best response function conditional on $p_{-n}$ is a singleton as shown in Allen and Rehbeck [2019a], so a Nash equilibrium exists by Debreu [1952]. When $\eta_{n,\ell} > 1$ this allows complementarity within a nest following Allen and Rehbeck [2019b] and cannot be imitated by any additive random error used to generate quantal response equilibria.
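A minimal sketch of the nested logit best response follows, written as the product of a within-nest logit and a nest-level probability. The function name, the representation of nests as lists of zero-based strategy indices, and the restriction to strictly positive nesting parameters are assumptions of this example; the same code covers the weighted case when the values in `eta` are computed from opponents' mixed strategies.

```python
import numpy as np

def nested_logit_best_response(U, nests, eta):
    """Nested logit best response for one player (illustrative sketch).

    U: utility indices, one per pure strategy.
    nests: list of lists of strategy indices that partition the strategies.
    eta: one nesting parameter per nest, assumed strictly positive here.
    """
    U = np.asarray(U, dtype=float)
    p = np.zeros_like(U)
    within, log_terms = [], []
    for idx, e in zip(nests, eta):
        z = U[idx] / e
        m = z.max()
        s = np.exp(z - m)
        within.append(s / s.sum())                   # probability of j given its nest
        log_terms.append(e * (m + np.log(s.sum())))  # ln of (sum_k exp(U_k/eta))^eta
    log_terms = np.array(log_terms)
    nest_prob = np.exp(log_terms - log_terms.max())
    nest_prob /= nest_prob.sum()                     # probability of each nest
    for idx, cond, P in zip(nests, within, nest_prob):
        p[idx] = cond * P
    return p

# Hypothetical three-strategy example: strategies 0 and 1 share a nest.
print(nested_logit_best_response([1.0, 0.5, 0.0], [[0, 1], [2]], [0.5, 1.0]))
```

When every nesting parameter equals one, the expression collapses to the ordinary logit best response with scale one, which gives a quick consistency check.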

Quadratic Perturbations
Finally, we consider concave quadratic perturbed utility games where the perturbation function takes the form
$$D_n(p_n, p_{-n}) = -(p_n - r_n)' A_n(p_{-n}) (p_n - r_n)$$
where $A_n : \Delta_{-n} \to \mathbb{R}^{J_n \times J_n}$ is continuous, $A_n(p_{-n})$ is positive semidefinite for every $p_{-n}$, and $r_n \in \Delta S_n$ is a reference probability. This specification makes the utility obtained from the perturbation lower for probabilities further away from the reference probability. We know that equilibria of concave quadratic perturbed utility games exist from Corollary 1. While these games do not yield analytical solutions in general, one can quickly compute the best response with quadratic programming for each $p_{-n}$. One example of a quadratic perturbation function is the diagonal weighting (DW) perturbation function, where for all $j \in \{1, \dots, J_n\}$ entries on the diagonal are given by $A^{DW}_n(p_{-n})_{j,j} = w_{n,j}(p_{-n})$ where $w_{n,j} : \Delta_{-n} \to \mathbb{R}_+$ is a continuous weighting function and $A^{DW}_n(p_{-n})_{j,k} = 0$ for $j \neq k$. Thus, one can model best response functions that are linear in the utility index, but non-linear in their opponents' strategies. We also note that quadratic perturbations are flexible enough to allow individuals to express complementarities between strategies when $J_n > 2$ following Allen and Rehbeck [2019b].
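For the diagonal weighting case, the best response can be sketched without a general quadratic programming solver. The example below assumes strictly positive diagonal weights and uses bisection on the multiplier of the adding-up constraint, a simple substitute for quadratic programming chosen for this illustration; the function name and inputs are hypothetical.

```python
import numpy as np

def diagonal_quadratic_best_response(U, w, r, iters=200):
    """Maximize sum_j p_j U_j - sum_j w_j (p_j - r_j)^2 over the simplex.

    Assumes w_j > 0 for all j (an assumption of this sketch).
    """
    U, w, r = (np.asarray(a, dtype=float) for a in (U, w, r))
    if np.any(w <= 0):
        raise ValueError("this sketch assumes strictly positive diagonal weights")

    def p_of(theta):
        # KKT conditions give p_j = max(0, r_j + (U_j - theta) / (2 w_j))
        return np.maximum(0.0, r + (U - theta) / (2.0 * w))

    lo = np.min(U - 2.0 * w)   # every p_j >= 1 here, so the sum is at least 1
    hi = np.max(U + 2.0 * w)   # every p_j = 0 here, so the sum is 0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if p_of(mid).sum() > 1.0:
            lo = mid
        else:
            hi = mid
    return p_of(0.5 * (lo + hi))

# With equal weights and a uniform reference, the interior solution is linear in U.
print(diagonal_quadratic_best_response([1.0, 0.0], [1.0, 1.0], [0.5, 0.5]))
```

In the interior the probabilities shift linearly with differences in the utility indices, while large differences push the solution to a corner of the simplex, where the nonnegativity constraints bind.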
This paper shows existence of Nash equilibria in concave perturbed utility games, relates the approach to quantal response equilibria, develops the nested logit equilibrium, and introduces quadratic perturbed utility games. Thus, we link the literature on Nash equilibrium without expected utility, which has not been used in applications, to quantal response equilibrium (QRE), which has been extensively used in applications. We also provide the reader with several classes of perturbations that allow flexibility in best responses and can be used in the study of games. Other perturbations that may be useful are developed for discrete choice analysis in Fosgerau et al. [2019] and Monardo [2019]. Although many of the models presented have many parameters, one can simplify estimation by making homogeneity assumptions across individuals. 7 By presenting a tractable class of games, we hope this paper is able to re-introduce games of non-expected utility preferences to those unfamiliar with the earlier work of Crawford [1990] and Dekel et al. [1991] in light of the new experimental evidence that supports a preference for randomization.

Appendix
Proof of Corollary 1. For every individual n P t1, . . . , N u, continuity of the utility indices and perturbation function ensures that the value function is continuous in p´n. Moreover, continuity and concavity of the perturbation function guarantees the set of best responses is convex and nonempty, and hence contractible. The Corollary now follows from the main Theorem in Debreu [1952].
Proof of Proposition 1. This follows from Proposition 2 with each $w_n(p_{-n}) = \lambda_n \in \mathbb{R}_{++}$.
Proof of Proposition 2. We consider perturbed utility functions with weighted entropy perturbations. The utility function is given by
$$\sum_{j=1}^{J_n} p_{n,j} U_{n,j}(p_{-n}, x_n^j) - w_n(p_{-n}) \sum_{j=1}^{J_n} p_{n,j} \ln(p_{n,j}).$$
7 For example, letting $\alpha_{n,j} = \alpha$ for each individual and strategy in Equation (1) creates a one parameter model when $U_{n,j}$ and $d_{n,j}$ are pre-specified.
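One can find the best response conditional on $p_{-n}$ by solving
$$\max_{p_n \in \mathbb{R}^{J_n}_+} \sum_{j=1}^{J_n} p_{n,j} U_{n,j}(p_{-n}, x_n^j) - w_n(p_{-n}) \sum_{j=1}^{J_n} p_{n,j} \ln(p_{n,j}) \quad \text{s.t.} \quad \sum_{j=1}^{J_n} p_{n,j} = 1.$$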
Setting up the Lagrangian for this problem we have that
$$L(p_n, \theta) = \sum_{j=1}^{J_n} p_{n,j} U_{n,j}(p_{-n}, x_n^j) - w_n(p_{-n}) \sum_{j=1}^{J_n} p_{n,j} \ln(p_{n,j}) + \theta \left( 1 - \sum_{j=1}^{J_n} p_{n,j} \right)$$
where $\theta \in \mathbb{R}$ is the Lagrange multiplier on the constraint. Note we do not need to consider Lagrange multipliers on the nonnegativity constraints of the mixed strategies since the marginal utility of placing positive probability on a pure strategy goes to infinity as $p_{n,j} \to 0$. More formally, the nonnegativity constraints will automatically be satisfied as we show below.
Examining the first order conditions on $p_{n,j}$ we have that
$$\frac{\partial L}{\partial p_{n,j}} : \quad U_{n,j}(p_{-n}, x_n^j) - w_n(p_{-n}) \ln(p_{n,j}) - w_n(p_{-n}) - \theta = 0.$$
We also have the following complementary slackness condition that
$$\theta \left( 1 - \sum_{j=1}^{J_n} p_{n,j} \right) = 0.$$
Note that setting the first order conditions with respect to $p_{n,j}$ and $p_{n,k}$ equal to one another we obtain that
$$U_{n,j}(p_{-n}, x_n^j) - w_n(p_{-n}) \ln(p_{n,j}) = U_{n,k}(p_{-n}, x_n^k) - w_n(p_{-n}) \ln(p_{n,k}).$$
Rearranging the above equation gives
$$\ln(p_{n,j}) - \ln(p_{n,k}) = \frac{U_{n,j}(p_{-n}, x_n^j)}{w_n(p_{-n})} - \frac{U_{n,k}(p_{-n}, x_n^k)}{w_n(p_{-n})}.$$
Finally, applying exponentiation to both sides of the equality yields
$$\frac{p_{n,j}}{p_{n,k}} = \frac{\exp\left(\frac{U_{n,j}(p_{-n}, x_n^j)}{w_n(p_{-n})}\right)}{\exp\left(\frac{U_{n,k}(p_{-n}, x_n^k)}{w_n(p_{-n})}\right)}. \qquad (2)$$
Choosing a $\tilde{j} \in \{1, \dots, J_n\}$ we can use Equation 2 and the fact that probabilities sum to one to obtain that
$$\sum_{j=1}^{J_n} p_{n,j} = p_{n,\tilde{j}} + p_{n,\tilde{j}} \sum_{k \neq \tilde{j}} \frac{\exp\left(\frac{U_{n,k}(p_{-n}, x_n^k)}{w_n(p_{-n})}\right)}{\exp\left(\frac{U_{n,\tilde{j}}(p_{-n}, x_n^{\tilde{j}})}{w_n(p_{-n})}\right)} = 1.$$
This simplifies to the best response function
$$p^{WE}_{n,\tilde{j}}(p_{-n}) = \frac{\exp\left(\frac{U_{n,\tilde{j}}(p_{-n}, x_n^{\tilde{j}})}{w_n(p_{-n})}\right)}{\sum_{k=1}^{J_n} \exp\left(\frac{U_{n,k}(p_{-n}, x_n^k)}{w_n(p_{-n})}\right)}.$$
The above holds for any $\tilde{j} \in \{1, \dots, J_n\}$ and does not depend on the individual since all individuals have weighted entropy functions that satisfy the conditions in the statement of Proposition 2.
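The derivation can be checked numerically. The sketch below, with hypothetical utility indices and weight, verifies that the closed form satisfies the first order conditions and that no randomly drawn mixed strategy attains a higher perturbed utility. It also compares the maximized value with $w_n(p_{-n}) \ln \sum_k \exp(U_{n,k}/w_n(p_{-n}))$, which follows from substituting the closed form into the objective.

```python
import numpy as np

def perturbed_utility(p, U, w):
    """sum_j p_j U_j - w sum_j p_j ln p_j, with the convention 0 ln 0 = 0."""
    p = np.asarray(p, dtype=float)
    nz = p > 0
    return float(p @ U - w * (p[nz] * np.log(p[nz])).sum())

def closed_form(U, w):
    """Best response from the proof: p_j proportional to exp(U_j / w)."""
    z = np.asarray(U, dtype=float) / w
    e = np.exp(z - z.max())
    return e / e.sum()

rng = np.random.default_rng(0)
U = np.array([1.0, 0.2, -0.5])     # hypothetical utility indices
w = 0.7                            # hypothetical value of w_n(p_{-n})
p_star = closed_form(U, w)
best = perturbed_utility(p_star, U, w)

# First order conditions: U_j - w ln p_j is the same for every j.
foc = U - w * np.log(p_star)
foc_spread = foc.max() - foc.min()

# Largest improvement over the closed form among random mixed strategies.
draws = rng.dirichlet(np.ones(3), size=2000)
max_gain = max(perturbed_utility(p, U, w) - best for p in draws)

log_sum_exp = w * np.log(np.exp(U / w).sum())
print(foc_spread, max_gain, best - log_sum_exp)
```

A random search of this kind is only a sanity check; strict concavity of the entropy term is what guarantees that the interior stationary point is the unique maximizer.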
Proof of Proposition 3. This follows from Proposition 4 with each $w_{n,\ell}(p_{-n}) = \eta_{n,\ell} \in (0, 1)$.
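Proof of Proposition 4. The proof is similar to the related derivation in discrete choice of Allen and Rehbeck [2019a]. Note that $w_{n,\ell} > 0$ for $\ell = 1, \dots, L_n$ ensures we have an interior solution. To see this, we will show that the marginal utility for the $n$th player associated with the probability of playing any $j$th strategy increases to $+\infty$ as $p_{n,j} \to 0$. [...] right hand side. By exponentiation, we see that the ratio $p_{n,j} / p_{n,k}$ gives
$$\frac{p_{n,j}}{p_{n,k}} = \frac{\exp\left(\frac{U_{n,j}(p_{-n}, x_n^j)}{w_{n,\ell(j)}(p_{-n})}\right) \Big/ \exp\left( (1 - w_{n,\ell(j)}(p_{-n})) \ln\left( \sum_{\tilde{j} \in R_{n,\ell(j)}} \exp\left(\frac{U_{n,\tilde{j}}(p_{-n}, x_n^{\tilde{j}})}{w_{n,\ell(j)}(p_{-n})}\right) \right) \right)}{\exp\left(\frac{U_{n,k}(p_{-n}, x_n^k)}{w_{n,\ell(k)}(p_{-n})}\right) \Big/ \exp\left( (1 - w_{n,\ell(k)}(p_{-n})) \ln\left( \sum_{\tilde{k} \in R_{n,\ell(k)}} \exp\left(\frac{U_{n,\tilde{k}}(p_{-n}, x_n^{\tilde{k}})}{w_{n,\ell(k)}(p_{-n})}\right) \right) \right)}. \qquad (4)$$
Similar to solving the weighted entropy problem, since all choice probabilities sum to one, one gets that
$$p_{n,j} + \sum_{k \neq j} p_{n,k} = 1.$$
Using Equation 4 to write each $p_{n,k}$ as a function of $p_{n,j}$ and substituting in terms one arrives at
$$p^{WME}_{n,j}(p_{-n}) = \frac{\exp\left(\frac{U_{n,j}(p_{-n}, x_n^j)}{w_{n,\ell(j)}(p_{-n})}\right)}{\sum_{k \in R_{n,\ell(j)}} \exp\left(\frac{U_{n,k}(p_{-n}, x_n^k)}{w_{n,\ell(j)}(p_{-n})}\right)} \cdot \frac{\left(\sum_{k \in R_{n,\ell(j)}} \exp\left(\frac{U_{n,k}(p_{-n}, x_n^k)}{w_{n,\ell(j)}(p_{-n})}\right)\right)^{w_{n,\ell(j)}(p_{-n})}}{\sum_{\ell=1}^{L_n} \left(\sum_{k \in R_{n,\ell}} \exp\left(\frac{U_{n,k}(p_{-n}, x_n^k)}{w_{n,\ell}(p_{-n})}\right)\right)^{w_{n,\ell}(p_{-n})}}.$$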