Best response in dynamic population games
Defined per state w.r.t. single-stage deviation rewards

• Expected immediate reward: R[x](d, π) = Σa π[a | x] ζ[x, a](d, π)
• State stochastic matrix: P[x+ | x](d, π) = Σa π[a | x] ρ[x+ | x, a](d, π)
• Infinite-horizon value function: V[x](d, π) = R[x](d, π) + α Σx+ P[x+ | x](d, π) V[x+](d, π)
• Single-stage deviation rewards: Q[x, a](d, π) = ζ[x, a](d, π) + α Σx+ ρ[x+ | x, a](d, π) V[x+](d, π)
• Best response per state: B[x](d, π) := set of randomizations over a that maximize Q[x, a](d, π)

Notation: [·] for discrete quantities, (·) for continuous quantities
x - individual state
a - individual action
d - state distribution
π - policy
ζ - immediate reward
ρ - state transition probabilities
α - future discount factor
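As a concrete illustration of these definitions, here is a minimal NumPy sketch of the per-state best-response computation. It assumes ζ and ρ have already been evaluated at the given (d, π) and passed in as arrays; all names (zeta, rho, policy, alpha, tol) are illustrative choices, not notation from the slides. Since the rewards are linear in the randomization, the set B[x] of best-response randomizations is exactly the set of mixtures over the maximizing pure actions, so the sketch returns that support per state.

```python
import numpy as np

def best_response(zeta, rho, policy, alpha, tol=1e-9):
    """Per-state best response in a dynamic population game (sketch).

    zeta:   (X, A) immediate rewards zeta[x, a], evaluated at (d, pi)
    rho:    (X, A, X) transition probabilities rho[x+ | x, a], indexed [x, a, x+]
    policy: (X, A) randomized policy pi[a | x]
    alpha:  future discount factor in [0, 1)
    Returns Q-values, values V, and per-state maximizing action sets.
    """
    # Expected immediate reward: R[x] = sum_a pi[a|x] zeta[x, a]
    R = np.sum(policy * zeta, axis=1)
    # State stochastic matrix: P[x+|x] = sum_a pi[a|x] rho[x+|x, a]
    P = np.einsum("xa,xay->xy", policy, rho)
    # Infinite-horizon value: V = R + alpha P V, i.e. solve (I - alpha P) V = R
    n_states = R.shape[0]
    V = np.linalg.solve(np.eye(n_states) - alpha * P, R)
    # Single-stage deviation rewards: Q[x,a] = zeta[x,a] + alpha sum_{x+} rho[x+|x,a] V[x+]
    Q = zeta + alpha * np.einsum("xay,y->xa", rho, V)
    # Support of B[x]: actions within tol of the per-state maximum of Q[x, a];
    # any randomization over this support is a best response at x.
    B = [np.flatnonzero(Q[x] >= Q[x].max() - tol) for x in range(n_states)]
    return Q, V, B
```

Solving the linear system (I − αP)V = R directly exploits the fact that the infinite-horizon value function is the unique fixed point of the Bellman recursion for α < 1; value iteration would work equally well.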