Karma Games

A self-contained Karma Economy for the dynamic allocation of shared resources
Ezzat Elokda, Andrea Censi, Saverio Bolognani, Florian Dörfler, and Emilio Frazzoli, with Carlo Cenedese, Kenan Zhang, and John Lygeros

Menu
• Motivation
• A "simple" resource sharing problem
• Dynamic population game model
• Application to traffic congestion management
• Conclusions

Shared Resources

Engineers play an active role in managing shared resources ⇒ we must be mindful of the social implications of our designs
Examples: energy, transportation, Internet

Fairness, efficiency, scalability
Three conflicting objectives
Scalability, Efficiency*, Fairness*
*exact definition to follow

Scalability, Efficiency, Fairness*
*maximum fairness possible if encoded in cost function

Scalability, Efficiency, Fairness

Scalability, Efficiency*, Fairness
*willingness to pay ≠ need

Two AVs meet at an unsignalled intersection. Who goes first?
• Each agent has a private urgency u
• The daily urgency follows an exogenous Markov chain φ

Reputation works in a small community
• Just tell each other the urgency
• In a small community, agents are inclined to be truthful because freeloaders will be punished.
• How can reputation work if you never see the same car again?

Tossing a coin is inefficient
• Efficiency := -𝔼[sum of costs]
  Cost of going first = 0
  Cost of going last = urgency
• Max efficiency: highest urgency goes first
• Coin toss is agnostic to private needs
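The efficiency definition above can be checked on a single two-agent encounter. The sketch below is a minimal illustration: the function names and the example urgencies are hypothetical, and it assumes a fair coin and the cost model from the slide (going first costs 0, going last costs the agent's own urgency).

```python
def coin_toss_efficiency(u_r, u_b):
    """Efficiency (negative expected total cost) when a fair coin decides."""
    # Each agent goes last with probability 1/2 and then pays its own urgency.
    return -(0.5 * u_r + 0.5 * u_b)


def urgency_first_efficiency(u_r, u_b):
    """Efficiency when the more urgent agent goes first."""
    # The less urgent agent goes last and pays its own urgency.
    return -min(u_r, u_b)


# Hypothetical urgencies: R has urgency 1, B has urgency 5.
print(coin_toss_efficiency(1, 5))
print(urgency_first_efficiency(1, 5))
```

Since min(u_R, u_B) never exceeds the average of the two urgencies, letting the more urgent agent go first is at least as efficient as the coin toss for any pair of urgencies.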

Tossing a coin is not always fair
• Fairness def. 1: "equal opportunity/access"
  Prob[R goes first] = Prob[B goes first]
• Fairness def. 2: "equal outcome"
  𝔼[cost(R)] = 𝔼[cost(B)]
• Fairness def. 3: "reparation"
  𝔼[future cost | past cost] ∝ -past cost
• Coin toss satisfies equal opportunity and equal outcome (if players are homogeneous), but not reparation

Monetary solutions are unfair [1]
• Simple reason: not everyone has the same money
• Would be fair if: bank account = social credit
[1] Carlino et al., "Auction-based autonomous intersection management", ITSC (2013)

Monetary solutions are inefficient
• … if efficiency is defined based on the true private urgency
• Willingness to pay ≠ urgency: it depends on the balance in the bank [1]
[1] Börjesson et al., "On the income elasticity of the value of travel time", Transportation Research Part A (2011)

Karma mechanism (pairwise interactions)
1. Each agent bids karma to access the shared resource
2. Whoever bids more gets the resource and pays the bid to the other
Karma balances reflect the access to shared resources (figure: balances today and tomorrow)
There are many variations: pay to peer or pay to society; pay the full bid or the difference. All give very similar results.

Placing karma bids is not that simple
Unlike money, karma has no value a priori
Money world: bid 'too much', 'too little', or the 'right amount' for 'go first'
Karma world: the choice is between 'go first now' and 'go first later'
• Players face a dynamic optimization affected by:
  Urgency process φ
  Future awareness α ∈ [0,1): discounts future rewards
  Bids of others

The karma game is a dynamic population game
• Individual time-varying state: x = [urgency u, karma k]
• Individual action: a = [karma bid b ≤ k]
• Social state: (state distribution d, policy π)
  d[u, k] - distribution of individual states
  π[b | u, k] - map from individual state to probabilities of individual actions
• Individual Markov Decision Processes, but coupled through (d, π)
  Immediate reward: ζ[u, b](d, π)
  Karma transition probabilities: κ[k+ | k, b](d, π)
(figure: example policy)
Notation
[·] for discrete quantities
(·) for continuous quantities
[a | b] probability of a given b

Best response in dynamic population games
Defined per state w.r.t. single-stage deviation rewards
• Expected immediate reward: R[x](d, π) = Σ_a π[a | x] ζ[x, a](d, π)
• State stochastic matrix: P[x+ | x](d, π) = Σ_a π[a | x] ρ[x+ | x, a](d, π)
• Infinite-horizon value function: V[x](d, π) = R[x](d, π) + α Σ_{x+} P[x+ | x](d, π) V[x+](d, π)
• Single-stage deviation rewards: Q[x, a](d, π) = ζ[x, a](d, π) + α Σ_{x+} ρ[x+ | x, a](d, π) V[x+](d, π)
• Best response per state: B[x](d, π) := set of randomizations of a that maximize Q[x, a](d, π)
Notation
x - individual state
a - individual action
d - state distribution
π - policy
ζ - immediate reward
ρ - state transition probabilities
α - future discount factor
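The quantities R, P, V, Q and the best response can be computed directly once the social state (d, π) is held fixed, so that ζ and ρ become plain tables. The sketch below is a minimal illustration under that assumption: the function names, the nested-list layout, and the toy numbers are all hypothetical, and V is obtained by simple fixed-point iteration rather than by any method named in the slides.

```python
def evaluate_policy(pi, zeta, rho, alpha, tol=1e-12, max_iter=100_000):
    """Return (V, Q) for a fixed social state, so zeta and rho are tables.

    pi[x][a]     - probability of action a in state x
    zeta[x][a]   - immediate reward
    rho[x][a][y] - probability of moving from x to y under action a
    """
    n, m = len(zeta), len(zeta[0])
    # R[x] = sum_a pi[a | x] * zeta[x, a]
    R = [sum(pi[x][a] * zeta[x][a] for a in range(m)) for x in range(n)]
    # P[y | x] = sum_a pi[a | x] * rho[y | x, a]
    P = [[sum(pi[x][a] * rho[x][a][y] for a in range(m)) for y in range(n)]
         for x in range(n)]
    # Fixed-point iteration on V = R + alpha * P V (a contraction for alpha < 1).
    V = [0.0] * n
    for _ in range(max_iter):
        V_new = [R[x] + alpha * sum(P[x][y] * V[y] for y in range(n))
                 for x in range(n)]
        converged = max(abs(V_new[x] - V[x]) for x in range(n)) < tol
        V = V_new
        if converged:
            break
    # Q[x, a] = zeta[x, a] + alpha * sum_y rho[y | x, a] * V[y]
    Q = [[zeta[x][a] + alpha * sum(rho[x][a][y] * V[y] for y in range(n))
          for a in range(m)] for x in range(n)]
    return V, Q


def best_actions(q_row, tol=1e-9):
    """Actions that maximize Q in one state; B[x] randomizes over these."""
    best = max(q_row)
    return [a for a, q in enumerate(q_row) if q >= best - tol]


# Toy example with 2 states and 2 actions (all numbers are made up).
pi = [[0.5, 0.5], [0.2, 0.8]]
zeta = [[0.0, -1.0], [-2.0, 0.0]]
rho = [[[1.0, 0.0], [0.0, 1.0]],
       [[0.5, 0.5], [1.0, 0.0]]]
V, Q = evaluate_policy(pi, zeta, rho, alpha=0.9)
print([best_actions(row) for row in Q])
```

A useful sanity check is that V[x] equals Σ_a π[a | x] Q[x, a] at convergence: following π is the same as taking a single-stage "deviation" to π itself.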

Nash equilibria in dynamic population games
The game played every day is different. A best response fixed point is not enough.
• Stationary Nash Equilibrium (SNE): social state (d*, π*) where
  the state distribution is stationary: d* = P(d*, π*) d*
  the policy is a best response at all states: π*[· | x] ∈ B[x](d*, π*)
Assumption 1: Continuity (required in general DPGs)
ζ(d, π) and ρ(d, π) are continuous.
Assumption 2: Karma preservation (specific to karma DPGs)
Karma is preserved in expectation, i.e., 𝔼[k+] = 𝔼[k].
Theorem 1 (Existence of SNE): Let Assumptions 1 and 2 hold. Then an SNE is guaranteed to exist.
Notation
d - state distribution
π - policy
P - state stochastic matrix
B - best response per state
ζ - immediate reward
ρ - state transition probabilities

Dynamic vs. static population game
Theorem 2 (Reduction to population game): For every dynamic population game (DPG), there exists a static population game with augmented population p' whose NE coincide with the SNE of the DPG.
SNE in dynamic population game (x - individual state):
  d* = P(d*, π*) d*
  π*[· | x] ∈ B[x](d*, π*)
NE in static population game (p - individual (static) population):
  π*[· | p] ∈ B[p](π*)
Trick: define an augmented population p'
• Mixed strategy: π[· | p'] := d
• Payoff: F[p', ·](d, π) := P(d, π) d - d
Intuition: population p' "plays the dynamics"

Corollary: evolutionary dynamics for SNE computation
Remark: Projection dynamic for p' = continuous-time state dynamics ḋ = P(d, π) d - d
Sandholm, "Population games and evolutionary dynamics", MIT Press (2010)
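The state dynamics ḋ = P d - d in the remark can be integrated numerically to see how d settles at a stationary distribution. The sketch below makes a strong simplifying assumption: P is frozen to a constant column-stochastic matrix, whereas in the karma game P depends on (d, π) and the policy evolves as well. The function name, step size, and matrix entries are hypothetical.

```python
def integrate_state_dynamics(P, d0, step=0.1, n_steps=2000):
    """Forward-Euler integration of d_dot = P d - d for a constant matrix P.

    P[i][j] is the probability of moving from state j to state i, so every
    column of P sums to 1 and the total mass of d is preserved.
    """
    n = len(d0)
    d = list(d0)
    for _ in range(n_steps):
        # Matrix-vector product P d.
        Pd = [sum(P[i][j] * d[j] for j in range(n)) for i in range(n)]
        # Euler step: d <- d + step * (P d - d).
        d = [d[i] + step * (Pd[i] - d[i]) for i in range(n)]
    return d


# Made-up two-state chain; its stationary distribution is [5/6, 1/6].
P = [[0.9, 0.5],
     [0.1, 0.5]]
d = integrate_state_dynamics(P, [0.5, 0.5])
print(d)
```

The rest point of these dynamics satisfies d = P d, which is the stationarity condition in the SNE definition.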

Here is a karma Stationary Nash Equilibrium
It consists of an equilibrium policy π* and a stationary state distribution d*
(figure: π* and d*)
• Urgency process φ[u+ | u]
• Future awareness α = 0.98

And here are the efficiency and fairness of karma
Performance shown for α-sweep and karma schemes PBP & PBS
• Results from agent-based simulations with 200 agents and 1000 days
• Efficiency = -AVG[costs]
• Fairness = -STD[# of times went first]
  Captures equal opportunity and reparation
Legend
PBP - Karma: Pay Bid to Peer
PBS - Karma: Pay Bid to Society
COIN - Baseline coin toss
TURN - Simple turn-taking
MONEY - Truthful monetary mechanism*
*Optimal efficiency of MONEY under the assumption that money is an accurate measure of urgency
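The two performance measures can be computed from per-agent simulation logs as sketched below. The function names and sample data are hypothetical, and the use of the population standard deviation (rather than the sample standard deviation) is an assumption, since the slide only writes STD.

```python
from statistics import mean, pstdev


def efficiency(costs):
    """Efficiency = -AVG[costs]; higher (closer to 0) is better."""
    return -mean(costs)


def fairness(times_went_first):
    """Fairness = -STD[# of times went first] across agents.

    Uses the population standard deviation (an assumption); 0 means every
    agent went first equally often.
    """
    return -pstdev(times_went_first)


# Made-up logs for four agents.
print(efficiency([0, 2, 0, 4]))
print(fairness([3, 3, 3, 3]))
print(fairness([1, 5, 1, 5]))
```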

And here are the efficiency and fairness of karma
Near-optimal efficiency and fairness for high α in both karma schemes

Heterogeneous future awareness
"Ants" are more patient than "Grasshoppers"

Heterogeneous future awareness
Closing the disparity with a small progressive karma tax

Heterogeneous urgency processes
Four user types with the same average urgency but different processes φ

Menu
• Dynamic population game model
• Application to traffic congestion management
• Conclusions

Let's turn to a persistent societal problem
For decades, experts have been seeking non-monetary solutions to traffic congestion. Nothing has been quite satisfactory.
• HOV lanes [1, 2]
  Limited controllability
• License-plate rationing [3, 4]
  Inefficient: can't travel on Wednesday instead of Monday
• Mobility credits [5, 6]
  Essentially monetary: credits are tradable for money [5] or used to pay for tolls [6]
[1] Dahlgren, "High occupancy vehicle lanes: Not always more effective than general purpose lanes" (1998)
[2] Wang et al., "Optimal capacity allocation for high occupancy vehicle (HOV) lane in morning commute" (2019)
[3] Wang et al., "Traffic rationing and short-term and long-term equilibrium" (2010)
[4] Han et al., "Efficiency of the plate-number-based traffic rationing in general networks" (2010)
[5] Verhoef et al., "Tradeable permits: their potential in the regulation of road transport externalities" (1997)
[6] Kalmanje et al., "Credit-based congestion pricing: travel, land value, and welfare impacts" (2004)

Karma for traffic management
CARMA: fair and efficient bottleneck congestion management with karma
• Commuters departing in the same discrete time window bid karma
• Regulated fast lane filled until free-flow capacity by highest bidders
• All other traffic goes to unregulated slow lane that can get congested
• PBS scheme: fast lane commuters pay bid to society (to be uniformly redistributed at end of day)
Ezzat Elokda, Carlo Cenedese, Kenan Zhang, Andrea Censi, John Lygeros, Emilio Frazzoli, Florian Dörfler, TRB Annual Meeting (2023)
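The allocation rule in the bullets above can be sketched for one departure time window. The function names, the dictionary layout, the tie-breaking by input order, and the choice that fast-lane commuters pay their full bid are assumptions made for illustration, not details confirmed by the slides.

```python
def allocate_lanes(bids, fast_lane_capacity):
    """Assign the highest bidders of one time window to the fast lane.

    bids: dict mapping commuter id -> karma bid.
    Returns (fast_lane, payments). Ties are broken by input order
    (an assumption; the slides do not specify tie-breaking).
    """
    ranked = sorted(bids, key=lambda c: bids[c], reverse=True)
    fast_lane = ranked[:fast_lane_capacity]
    # PBS: only fast-lane commuters pay, and they pay their full bid here.
    payments = {c: (bids[c] if c in fast_lane else 0) for c in bids}
    return fast_lane, payments


def redistribute(karma, payments):
    """End-of-day update: collect payments, share them uniformly."""
    share = sum(payments.values()) / len(karma)
    return {c: karma[c] - payments[c] + share for c in karma}


# Made-up example: three commuters, fast-lane capacity of two.
bids = {"a": 3, "b": 1, "c": 2}
karma = {"a": 5, "b": 5, "c": 5}
fast_lane, payments = allocate_lanes(bids, fast_lane_capacity=2)
new_karma = redistribute(karma, payments)
print(fast_lane, new_karma)
```

Because everything collected is handed back, the total amount of karma in the population stays constant from one day to the next.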

The bottleneck model in the classical vs. karma world
Classical model: (equation shown on slide; its terms are labeled Value of Time (VOT), queuing delay, schedule early delay, schedule late delay, monetary toll)
Notation
u - urgency / Value of Time
t - departure time
t* - desired arrival time
tq - queuing delay
β, γ - delay sensitivities

The bottleneck model in the classical vs. karma world
There is no monetary term in the karma cost function!
Classical model: (equation shown on slide; its terms are labeled Value of Time (VOT), queuing delay, schedule early delay, schedule late delay, monetary toll)
Karma model: (equation shown on slide)
Notation
u - urgency / Value of Time
t - departure time
t* - desired arrival time
tq - queuing delay
β, γ - delay sensitivities
d - state distribution
π - policy
(d, π) - social state
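The cost equations themselves are not reproduced in this transcript, so the sketch below assumes the textbook bottleneck-model form suggested by the term labels: VOT times queuing delay, plus early and late schedule-delay penalties, plus a toll in the classical case. The function name, the exact way u, β, and γ enter, and the example numbers are assumptions and may differ from the authors' formulation.

```python
def travel_cost(u, t, t_star, t_q, beta, gamma, toll=0.0):
    """Assumed textbook bottleneck cost for one commuter.

    u      - urgency / Value of Time
    t      - departure time
    t_star - desired arrival time
    t_q    - queuing delay experienced when departing at t
    beta   - sensitivity to arriving early
    gamma  - sensitivity to arriving late
    toll   - monetary toll (classical model only; 0 in the karma model)
    """
    arrival = t + t_q
    early = max(0.0, t_star - arrival)  # schedule early delay
    late = max(0.0, arrival - t_star)   # schedule late delay
    return u * t_q + beta * early + gamma * late + toll


# Made-up numbers: depart at 8.0, wish to arrive at 9.0, queue for 0.5.
classical = travel_cost(2.0, 8.0, 9.0, 0.5, beta=1.0, gamma=3.0, toll=4.0)
karma_world = travel_cost(2.0, 8.0, 9.0, 0.5, beta=1.0, gamma=3.0)
print(classical, karma_world)
```

Under this assumed form, the karma-world cost is the classical cost with the toll term removed; karma enters only through which lane, and hence which queuing delay, the commuter gets.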

Homogeneous commuters
CARMA is as efficient as TOLL: high VOT commuters enter fast lane
Legend
CARMA - Karma solution
TOLL - Optimal monetary toll
NOM - No policy intervention
ul - Low urgency/VOT
uh - High urgency/VOT

Low-income vs high-income groups
• TOLL: low income commuters never enter fast lane
• CARMA: equal opportunity of entering fast lane
Legend
CARMA - Karma solution
TOLL - Optimal monetary toll
NOM - No policy intervention
τ1 - Low income group
τ2 - High income group

Time-dependent karma redistribution
CARMA is more efficient than TOLL: less congestion in slow lane
Legend
CARMA - Karma solution
TOLL - Optimal monetary toll
NOM - No policy intervention
ul - Low urgency/VOT
uh - High urgency/VOT

Karma lies between reputation and money
Embedding reputation into large-scale resource sharing
Reputation, Karma, Money

Karma systems for socially responsible resource sharing
• Stop using money as a design tool when allocating shared resources!
• Karma mechanisms provide an alternative.
• Many open questions for karma:
  How to learn the Stationary Nash Equilibrium?
  Do different systems need different karma accounts?
  Can we compose karma systems and preserve fairness?
• We also realized that fairness is high-dimensional (equal access/opportunity, reparation, …)
  What is the right term to add to our cost functions?

Conclusions
• Karma creates economies of favors: updating the homo sapiens reciprocity advantage to large-scale, automation-mediated interactions
• Karma is fair because it is closed and regulated
  You cannot buy karma with other wealth
  Karma is exchanged based on strict rules
• Karma achieves high efficiency and fairness for a population of self-interested players
  … playing an equilibrium of a dynamic population game (no coordination)
  … yet players act as if they were altruistic and consider the reputation of other agents.
Legend
PBP - Karma: Pay Bid to Peer
PBS - Karma: Pay Bid to Society
COIN - Baseline coin toss
TURN - Simple turn-taking
MONEY - Monetary market

Andrea Censi, Saverio Bolognani, Florian Dörfler, Emilio Frazzoli, with Carlo Cenedese, Kenan Zhang, and John Lygeros
