No regret algorithms for online k-submodular function maximization

Tasuku Soma
University of Tokyo
@UBC


Motivation: Sensor placement
We have k types of sensors and n spots. What is an optimal placement of sensors to maximize the information gain?
$\implies$ k-submodular function maximization

Motivation: Sensor placement
Data
• $Z_1$, $Z_2$: targets (e.g., temperature and humidity)
• $A_1, \dots, A_n, B_1, \dots, B_n$: sensor data
Constraint
• At most one type of sensor can be used from $\{A_i, B_i\}$.
• Type A is used for $Z_1$, and type B is used for $Z_2$.
[Figure: spots with sensor types A and B, and the targets $Z_1$, $Z_2$]
$f(x) = I(\text{selected}; Z)$ is 2-submodular if A and B are independent given $Z$ (general k is similar).

k-submodular functions
Submodular function:
$$f(x) + f(y) \ge f(x \vee y) + f(x \wedge y) \quad (\forall x, y \in \{0, 1\}^n)$$
where $\vee$ is the coordinate-wise max and $\wedge$ is the coordinate-wise min.
k-submodular function:
$$f(x) + f(y) \ge f(x \sqcup y) + f(x \sqcap y) \quad (\forall x, y \in \{0, 1, \dots, k\}^n)$$
[Figure: example vectors $x$, $y$, $x \sqcup y$, $x \sqcap y$ with entries in $\{0, 1, \dots, k\}$]
Note: $k = 1$ ... submodular, $k = 2$ ... bisubmodular
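The figure defining the two operations did not survive extraction. The sketch below uses the standard definitions from the k-submodular literature (an assumption about what the slide drew): $x \sqcap y$ keeps a coordinate only where $x$ and $y$ agree, and $x \sqcup y$ additionally keeps a nonzero entry when the other side is 0. The helper names and the brute-force checker are illustrative only and are feasible just for tiny $n$ and $k$.

```python
from itertools import product

def meet(x, y):
    """x ⊓ y: keep a coordinate only where x and y agree, otherwise 0."""
    return tuple(a if a == b else 0 for a, b in zip(x, y))

def join(x, y):
    """x ⊔ y: common entry if they agree, the nonzero entry if one side is 0, else 0."""
    out = []
    for a, b in zip(x, y):
        if a == b or b == 0:
            out.append(a)
        elif a == 0:
            out.append(b)
        else:  # two different nonzero types conflict
            out.append(0)
    return tuple(out)

def is_k_submodular(f, n, k, tol=1e-12):
    """Brute-force check of f(x) + f(y) >= f(x ⊔ y) + f(x ⊓ y) on {0, ..., k}^n."""
    points = list(product(range(k + 1), repeat=n))
    return all(f(x) + f(y) + tol >= f(join(x, y)) + f(meet(x, y))
               for x in points for y in points)

def support_size(x):
    """Hypothetical example objective: number of occupied spots."""
    return sum(1 for v in x if v != 0)

if __name__ == "__main__":
    print(join((1, 0, 2), (1, 3, 1)), meet((1, 0, 2), (1, 3, 1)))
    print(is_k_submodular(support_size, n=2, k=3))
```

For $k = 1$ the two operations reduce to coordinate-wise max and min, which matches the note that $k = 1$ is ordinary submodularity.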

k-submodular function maximization
$f$: k-submodular function
$$\max_x f(x)$$
Known results
• NP-hard
• 1/2-approximation algorithm [Iwata, Tanigawa, and Yoshida 2016]

Online k-submod maximization
Objective $f$ can change day by day.
For Day 1 to Day $T$:
• the player sends $x_t$ to the adversary
• the adversary returns feedback $f_t$
We assume $f_t : \{0, 1, \dots, k\}^n \to [0, 1]$.

Online k-submod maximization
1/2-regret:
$$\mathrm{regret}_{1/2} = \frac{1}{2} \max_{x^*} \sum_{t \in [T]} f_t(x^*) - \sum_{t \in [T]} f_t(x_t)$$
Here $\max_{x^*} \sum_{t \in [T]} f_t(x^*)$ is the reward of the best fixed action and $\sum_{t \in [T]} f_t(x_t)$ is the actual reward of the player.
Goal: find $(x_t)_t$ s.t. $\mathbb{E}[\mathrm{regret}_{1/2}] = o(T)$ (no-regret).
No-regret $\implies$
$$\frac{1}{T}\, \mathbb{E}\left[ \frac{1}{2} \max_{x^*} \sum_{t \in [T]} f_t(x^*) - \sum_{t \in [T]} f_t(x_t) \right] \to 0.$$
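As a concrete reading of the 1/2-regret, the following minimal sketch evaluates it by brute force for a given sequence of objectives and plays. The function name `half_regret` and the tuple encoding of $x$ are assumptions made for illustration; enumeration over $\{0, \dots, k\}^n$ is only practical for tiny instances.

```python
from itertools import product

def half_regret(fs, xs, n, k):
    """(1/2) * max_x sum_t f_t(x) - sum_t f_t(x_t), by enumerating {0, ..., k}^n."""
    best_fixed = max(sum(f(x) for f in fs)
                     for x in product(range(k + 1), repeat=n))
    earned = sum(f(x) for f, x in zip(fs, xs))
    return 0.5 * best_fixed - earned

if __name__ == "__main__":
    # Two hypothetical rounds on a single spot with two sensor types.
    f1 = lambda x: 1.0 if x[0] == 1 else 0.0
    f2 = lambda x: 1.0 if x[0] == 2 else 0.0
    print(half_regret([f1, f2], [(1,), (1,)], n=1, k=2))
```

Note that the 1/2-regret can be negative: the player only has to compete with half of the best fixed action's reward.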

Related work and our result
            offline                                           online
k = 1       1/2-approx [Buchbinder et al.]                    1/2-regret $O(n\sqrt{T})$ [Roughgarden and Wang 2018]
k = 2       1/2-approx [Ward and Živný 2016]                  1/2-regret $O(n\sqrt{T})$
general k   1/2-approx [Iwata, Tanigawa, and Yoshida 2016]    1/2-regret $O(nk\sqrt{T})$

Outline
• Online k-submod max
• k-submod selection game: another online optimization
• Blackwell approachability: useful theorem in online optimization

Algorithm for online k-submod maximization

Marginal gain characterization [Buchbinder et al.]
For $x$ with $x(j) = 0$, define the marginal gain
$$\Delta_{j,i} f(x) := f(x + i e_j) - f(x).$$
$f$ is k-submodular $\iff$
• Pairwise monotonicity: $\Delta_{j,i} f(x) + \Delta_{j,i'} f(x) \ge 0$ ($i \ne i' \in [k]$, $j \notin \mathrm{supp}(x)$)
• Orthant submodularity: $\Delta_{j,i} f(x) \ge \Delta_{j,i} f(y)$ ($x \le y$, $i \in [k]$, $j \notin \mathrm{supp}(y)$)
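The two conditions can be checked mechanically on small instances. The sketch below is illustrative only: it assumes 0-based coordinates, encodes $x$ as a tuple, and reads $x \le y$ as "every nonzero entry of $x$ also appears in $y$", which is the usual order in this setting but is not spelled out on the slide.

```python
from itertools import product

def marginal_gain(f, x, j, i):
    """Delta_{j,i} f(x) = f(x + i e_j) - f(x); requires x[j] == 0."""
    assert x[j] == 0
    return f(x[:j] + (i,) + x[j + 1:]) - f(x)

def precedes(x, y):
    """x <= y: each nonzero entry of x is matched by the same entry of y."""
    return all(a == 0 or a == b for a, b in zip(x, y))

def check_characterization(f, n, k, tol=1e-12):
    """Return (pairwise monotonicity holds, orthant submodularity holds)."""
    points = list(product(range(k + 1), repeat=n))
    types = range(1, k + 1)
    pairwise = all(
        marginal_gain(f, x, j, i) + marginal_gain(f, x, j, i2) >= -tol
        for x in points for j in range(n) if x[j] == 0
        for i in types for i2 in types if i != i2)
    orthant = all(
        marginal_gain(f, x, j, i) >= marginal_gain(f, y, j, i) - tol
        for y in points for j in range(n) if y[j] == 0
        for x in points if precedes(x, y)
        for i in types)
    return pairwise, orthant

if __name__ == "__main__":
    support_size = lambda x: sum(1 for v in x if v != 0)
    print(check_characterization(support_size, n=2, k=2))
```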

Offline k-submodular maximization [Iwata, Tanigawa, and Yoshida 2016]
[Figure: starting from $x^{(0)}$, the coordinates are fixed one at a time; $x^{(j)}$ is obtained by sampling coordinate $j$ from $p^{(j)}$, a probability distribution on $[k]$, for $j = 1, \dots, n$.]

Offline k-submodular maximization [Iwata, Tanigawa, and Yoshida 2016]
  $x^{(0)} := 0$
  for $j = 1, \dots, n$:
    Compute a probability distribution $p^{(j)} \in \Delta_k$
    Sample $i \sim p^{(j)}$ and set $x^{(j)} \leftarrow x^{(j-1)} + i e_j$
  return $x = x^{(n)}$
Lemma ([Iwata, Tanigawa, and Yoshida 2016]) Assume that for $j = 1, \dots, n$, $p^{(j)}$ satisfies
$$\max_{i^* \in [k]} a(i^*) - \mathbb{E}_{i \sim p^{(j)}}[b(i) + a(i)] \le 0$$
for all $a \le b$ with $a(i) + a(i') \ge 0$ ($i \ne i'$), where $b(i) = \Delta_{j,i} f(x^{(j-1)})$ ($\forall i$). Then $x$ is a 1/2-approximation. Such $p^{(j)}$ can be found via only $b$.

Bisubmodular case: k = 2
Fix iteration $j$ and let $b(1) := \Delta_{j,1} f(x^{(j-1)})$ and $b(2) := \Delta_{j,2} f(x^{(j-1)})$. By pairwise monotonicity, $b(1) + b(2) \ge 0$. Assume that $b(1), b(2) > 0$ for simplicity.
Rule for $p^{(j)}$:
$$p^{(j)}(1) = \frac{b(1)}{b(1) + b(2)}, \qquad p^{(j)}(2) = \frac{b(2)}{b(1) + b(2)}$$
Then, for any $a \le b$ s.t. $a(1) + a(2) \ge 0$,
$$\mathbb{E}[b(i) + a(i)] = \frac{b(1)^2 + b(2)^2 + a(1)b(1) + a(2)b(2)}{b(1) + b(2)} \ge \frac{2b(1)b(2) + a(1)b(1) + a(2)b(2)}{b(1) + b(2)} \ge \frac{a(1)(b(1) + b(2)) + (a(1) + a(2))b(2)}{b(1) + b(2)} \ge a(1).$$
By symmetry, the same bound holds with $a(2)$ on the right-hand side.
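A minimal Python sketch of the offline procedure for $k = 2$ with the proportional rule follows. The slide assumes $b(1), b(2) > 0$ for simplicity; the branches that handle a nonpositive gain (play the other type with probability 1) are an assumption added here so the code runs on any input, not something stated on the slide. All function names are hypothetical, and `expected_value_k2` enumerates both branches at every step, so it is only meant for tiny $n$.

```python
import random

def gain(f, x, j, i):
    """Marginal gain of setting coordinate j (currently 0) of x to type i."""
    y = list(x)
    y[j] = i
    return f(tuple(y)) - f(tuple(x))

def rule_k2(b1, b2):
    """(p(1), p(2)) proportional to the gains; the b <= 0 branches are an assumption."""
    if b1 <= 0:
        return (0.0, 1.0)
    if b2 <= 0:
        return (1.0, 0.0)
    return (b1 / (b1 + b2), b2 / (b1 + b2))

def randomized_greedy_k2(f, n, rng=random):
    """One run of the offline procedure: fix coordinates 0, ..., n-1 in order."""
    x = [0] * n
    for j in range(n):
        p1, _ = rule_k2(gain(f, x, j, 1), gain(f, x, j, 2))
        x[j] = 1 if rng.random() < p1 else 2
    return tuple(x)

def expected_value_k2(f, n, x=None, j=0):
    """Exact E[f(x)] of the procedure, by recursing over both choices at each step."""
    x = [0] * n if x is None else x
    if j == n:
        return f(tuple(x))
    probs = rule_k2(gain(f, x, j, 1), gain(f, x, j, 2))
    total = 0.0
    for i, p in zip((1, 2), probs):
        if p > 0:
            total += p * expected_value_k2(f, n, x[:j] + [i] + x[j + 1:], j + 1)
    return total

if __name__ == "__main__":
    # Hypothetical separable objective: weights[j][i - 1] is the value of type i at spot j.
    weights = [(3, -1), (1, 2)]
    f = lambda x: sum(w[v - 1] for w, v in zip(weights, x) if v)
    print(randomized_greedy_k2(f, 2, random.Random(0)), expected_value_k2(f, 2))
```

On the separable example in the usage section the optimum is 5, so the expected value of the procedure can be compared against the 1/2 guarantee directly.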

Bisubmodular case: k = 2
Let $o$ be an optimal solution. Define $o^{(j)} := (o \sqcup x^{(j)}) \sqcup x^{(j)}$.
Let $a(i) = \Delta_{j,i} f(s^{(j-1)})$ and $b(i) = \Delta_{j,i} f(x^{(j-1)})$ for $i \in [k]$. Then $a \le b$ and $a(i) + a(i') \ge 0$ ($i \ne i'$).
[Figure: the vectors $x^{(j-1)}$, $o^{(j-1)}$, and $s^{(j-1)}$; $s^{(j-1)}$ is drawn as $o^{(j-1)}$ with one fewer nonzero entry.]
Analysis
The lemma yields
$$\mathbb{E}[f(o^{(j-1)}) - f(o^{(j)})] \le \mathbb{E}[f(x^{(j)}) - f(x^{(j-1)})] \quad \text{for } j = 1, 2, \dots, n.$$
Summing over $j$, we get
$$\mathbb{E}[f(o) - f(o^{(n)})] \le \mathbb{E}[f(x^{(n)}) - f(x^{(0)})].$$
Since $x = x^{(n)} = o^{(n)}$ and $f(x^{(0)}) \ge 0$, $f(o) \le 2\,\mathbb{E}[f(x)]$.
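The link between the lemma and the per-step inequality can be written out as below. This is a sketch under assumptions that the surviving text does not state explicitly: $s^{(j-1)}$ is taken to be $o^{(j-1)}$ with coordinate $j$ reset to 0 (as in the analysis of Iwata, Tanigawa, and Yoshida), $o$ is assumed to have full support, $i^* := o^{(j-1)}(j)$, $i$ is the type sampled in iteration $j$, and expectations are conditioned on $x^{(j-1)}$.

```latex
\begin{align*}
f(o^{(j-1)}) - f(o^{(j)})
  &= \bigl(f(s^{(j-1)}) + a(i^*)\bigr) - \bigl(f(s^{(j-1)}) + a(i)\bigr)
   = a(i^*) - a(i), \\
f(x^{(j)}) - f(x^{(j-1)}) &= b(i), \\
\mathbb{E}_{i \sim p^{(j)}}\bigl[f(o^{(j-1)}) - f(o^{(j)})\bigr]
  &= a(i^*) - \mathbb{E}_{i \sim p^{(j)}}[a(i)]
   \le \mathbb{E}_{i \sim p^{(j)}}[b(i)]
   = \mathbb{E}_{i \sim p^{(j)}}\bigl[f(x^{(j)}) - f(x^{(j-1)})\bigr].
\end{align*}
```

The inequality in the last line is exactly the lemma's condition $a(i^*) - \mathbb{E}[b(i) + a(i)] \le 0$, rearranged.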

Online k-submodular maximization
We cannot compute $p^{(j)}$ because we do not know $b$ ...
Use another online algorithm to predict $p^{(j)}$!
[Figure: in round $t$, the vectors $x_t^{(0)}, x_t^{(1)}, \dots, x_t^{(n)}$ are built by sampling from $p_t^{(1)}, \dots, p_t^{(n)}$, which are supplied by online algorithms $A_1, A_2, \dots, A_n$.]

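k-submodular selection game
In each round $t$, the player chooses $p_t \in \Delta_k$ and the adversary chooses $b_t \in [-1, 1]^k$.
"Regret": we aim for
$$\max_{i^* \in [k]} \sum_{t \in [T]} a_t(i^*) - \sum_{t \in [T]} \mathbb{E}_{i \sim p_t}[b_t(i) + a_t(i)] \le O(\sqrt{T})$$
for all $a_t \in [-1, 1]^k$ s.t. $a_t \le b_t$, $a_t(i) + a_t(i') \ge 0$, $b_t(i) + b_t(i') \ge 0$ ($i \ne i'$).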

Proposed algorithm
  Set up an online algorithm $A_j$ for the k-submod selection game for each $j \in [n]$
  for $t = 1, \dots, T$:
    $x_t^{(0)} := 0$
    for $j = 1, \dots, n$:
      Obtain a distribution $p_t^{(j)} \in \Delta_k$ from $A_j$
      Sample $i \sim p_t^{(j)}$ and set $x_t^{(j)} \leftarrow x_t^{(j-1)} + i e_j$
    Play $x_t = x_t^{(n)}$ and observe $f_t$.
    for $j = 1, \dots, n$:
      Feed back $b_t^{(j)}(\cdot) = \Delta_{j,\cdot} f_t(x_t^{(j-1)})$ to $A_j$
Lemma Each $A_j$ achieves $O(\sqrt{T})$ "regret" $\implies$ $(x_t)$ achieves $O(nk\sqrt{T})$ 1/2-regret.
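The wiring of the proposed algorithm can be sketched in Python as follows. The learner interface (`predict` / `update`) and the `UniformLearner` placeholder are assumptions made for illustration: the placeholder carries no regret guarantee and merely stands in for a selection-game algorithm $A_j$ such as the Blackwell-based one discussed later in the talk.

```python
import random

class UniformLearner:
    """Placeholder for A_j: always predicts the uniform distribution (no guarantee)."""
    def __init__(self, k):
        self.k = k
        self.history = []  # feedback vectors b_t^{(j)} received so far

    def predict(self):
        return [1.0 / self.k] * self.k

    def update(self, b):
        self.history.append(b)

def run_online(fs, n, k, learners, rng):
    """Play x_1, ..., x_T against the objective sequence fs = [f_1, ..., f_T]."""
    plays = []
    for f in fs:
        prefixes = [tuple([0] * n)]  # x_t^{(0)}, ..., x_t^{(n)}
        for j in range(n):
            p = learners[j].predict()
            i = rng.choices(range(1, k + 1), weights=p)[0]
            x = list(prefixes[-1])
            x[j] = i
            prefixes.append(tuple(x))
        plays.append(prefixes[-1])  # play x_t; f_t is only used from here on
        for j in range(n):
            base = prefixes[j]
            b = []
            for i in range(1, k + 1):
                y = list(base)
                y[j] = i
                b.append(f(tuple(y)) - f(base))  # marginal gain at x_t^{(j-1)}
            learners[j].update(b)
    return plays

if __name__ == "__main__":
    n, k, T = 2, 2, 3
    f = lambda x: sum(1 for v in x if v) / n  # hypothetical objective with values in [0, 1]
    learners = [UniformLearner(k) for _ in range(n)]
    print(run_online([f] * T, n, k, learners, random.Random(0)))
```

Each learner only ever sees the marginal gains at its own coordinate, which is what lets the analysis reduce the problem to $n$ independent selection games.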

Blackwell approachability

Blackwell approachability
von Neumann's minimax theorem:
$$\min_{y \in Y} \max_{x \in X} x^\top A y = \max_{x \in X} \min_{y \in Y} x^\top A y$$
"It doesn't matter which player plays first"
Blackwell approachability ... multiobjective generalization
[Figure: a closed convex set $S$ and a payoff vector $\ell(x, y)$]
$S$: closed convex set
$\ell$: vector-valued function
$\ell(x, y) \in S \iff$ X-player wins
Application: online learning with "nonstandard" regret (e.g., internal regret, swap regret, etc.)

Blackwell approachability
$X \subseteq \mathbb{R}^m$, $Y \subseteq \mathbb{R}^n$: convex sets
$\ell : X \times Y \to \mathbb{R}^k$: biaffine function
$S \subseteq \mathbb{R}^k$: closed convex set
• satisfiable $\iff \exists x \in X\ \forall y \in Y : \ell(x, y) \in S$.
• response-satisfiable $\iff \forall y \in Y\ \exists x \in X : \ell(x, y) \in S$.
• halfspace-satisfiable $\iff$ for every halfspace $H \supseteq S$, the instance with $H$ in place of $S$ is satisfiable.
• approachable $\iff \exists$ algorithm $A$ s.t. $\forall (y_t)_{t \in [T]} \subseteq Y$, $x_t := A(y_1, \dots, y_{t-1})$ satisfies
$$\mathrm{dist}\Bigl(\frac{1}{T} \sum_{t \in [T]} \ell(x_t, y_t),\ S\Bigr) \to 0 \quad \text{as } T \to \infty.$$
Theorem ([Blackwell 1956]) response-satisfiable $\iff$ halfspace-satisfiable $\iff$ approachable

Toy example
$X = Y = [0, 1]$, $\ell(x, y) := (x, y) \in \mathbb{R}^2$, $S = \{(x, x) : x \in [0, 1]\}$.
[Figure: the unit square with axes $X$ and $Y$, the diagonal $S$, and a halfspace $H \supseteq S$]
This instance is
• response-satisfiable (set $x = y$ after seeing $y$)
• halfspace-satisfiable
• but NOT satisfiable.
Define $x_t := y_{t-1}$ for $t = 2, 3, \dots, T$. Then
$$\frac{1}{T} \sum_{t=1}^{T} \ell(x_t, y_t) = \frac{1}{T} \begin{pmatrix} y_1 + \cdots + y_{T-1} \\ y_1 + \cdots + y_{T-1} \end{pmatrix} + \frac{1}{T} \begin{pmatrix} x_1 \\ y_T \end{pmatrix} \to S.$$
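The convergence in the toy example is easy to simulate. The sketch below (function names are illustrative) plays $x_t = y_{t-1}$ against an arbitrary sequence of $y_t \in [0, 1]$ and reports the distance from the average payoff to the diagonal. Since both averages stay in $[0, 1]$, the distance to the line $\{(x, x)\}$ coincides with the distance to $S$.

```python
import math

def dist_to_diagonal(u, v):
    """Euclidean distance from (u, v) to the line {(x, x)}."""
    return abs(u - v) / math.sqrt(2)

def simulate(ys, x1=0.0):
    """Play x_1 = x1 and x_t = y_{t-1} afterwards; return dist(average payoff, S)."""
    xs = [x1] + list(ys[:-1])
    T = len(ys)
    return dist_to_diagonal(sum(xs) / T, sum(ys) / T)

if __name__ == "__main__":
    for T in (10, 100, 1000):
        ys = [float(t % 2) for t in range(T)]  # an alternating adversary
        print(T, simulate(ys))
```

The two coordinate sums differ only by $x_1 - y_T$, so the distance is at most $1 / (\sqrt{2}\, T)$ for any sequence.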

Algorithmic Blackwell
Halfspace oracle
• Input: halfspace $H = \{z : a^\top z \le 0\}$ with $S \subseteq H$
• Output: $x \in X$ s.t. $\ell(x, y) \in H$ ($\forall y \in Y$)
$X$, $Y$: polyhedra $\implies$ the halfspace oracle is an LP (strong duality)
Algorithmic Blackwell [Abernethy, Bartlett, and Hazan]
Given a halfspace oracle, one can construct an efficient algorithm $A$ for producing an approaching sequence $(x_t)$ s.t.
$$\mathrm{dist}\Bigl(\frac{1}{T} \sum_{t \in [T]} \ell(x_t, y_t),\ S\Bigr) = O\Bigl(\frac{1}{\sqrt{T}}\Bigr).$$
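To make the oracle-based scheme concrete, here is a minimal sketch of Blackwell's classical strategy on the toy instance $X = Y = [0, 1]$, $\ell(x, y) = (x, y)$, $S$ = the diagonal. In each round it projects the current average payoff $(u, v)$ onto $S$, forms the halfspace $H = \{z : (u - v)(z_1 - z_2) \le 0\} \supseteq S$, and asks the oracle for an $x$ that forces $\ell(x, y) \in H$ for every $y$. The closed-form oracle is specific to this toy instance and stands in for the LP mentioned above; it is not the construction of Abernethy, Bartlett, and Hazan.

```python
import math

def halfspace_oracle(u, v):
    """Toy oracle for H = {z : (u - v) * (z[0] - z[1]) <= 0}.

    If u > v we need x <= y for every y in [0, 1], so x = 0;
    if u < v we need x >= y for every y, so x = 1 (any x works when u == v).
    """
    return 0.0 if u > v else 1.0

def blackwell_toy_distance(ys):
    """Run the strategy against ys and return dist(average payoff, S)."""
    sum_x = sum_y = 0.0
    for t, y in enumerate(ys):
        # x_t depends only on y_1, ..., y_{t-1} through the running averages.
        x = 0.0 if t == 0 else halfspace_oracle(sum_x / t, sum_y / t)
        sum_x += x
        sum_y += y
    T = len(ys)
    return abs(sum_x - sum_y) / (T * math.sqrt(2))

if __name__ == "__main__":
    for T in (10, 100, 1000):
        print(T, blackwell_toy_distance([1.0] * T))
```

The standard potential argument gives $\mathrm{dist} \le \sqrt{2/T}$ here, because every payoff lies within distance $\sqrt{2}$ of any point of $S$. This is consistent with the $O(1/\sqrt{T})$ rate stated above.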

Application to k-submod max
$X = \Delta_k$, $Y = \{\text{feasible } (a, b)\}$, $S = \mathbb{R}^k_{\le 0}$,
$$\ell(p, y)(i) = a(i) - \mathbb{E}_{i' \sim p}[b(i') + a(i')].$$
[Figure: the set $S$]
This instance is response-satisfiable! [Iwata, Tanigawa, and Yoshida 2016]
Lemma $p_t$: approaching $\implies$ sublinear "regret" in the k-submod selection game. In particular, $O(\sqrt{T})$ "regret" is achievable in polynomial time.
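Response-satisfiability can be sanity-checked numerically for $k = 2$. The sketch below samples feasible pairs $(a, b)$ and verifies that a response $p$ computed from $b$ alone puts $\ell(p, y)$ in the nonpositive orthant. The proportional rule with a deterministic fallback for nonpositive gains is an assumption carried over from the bisubmodular case, and the sampling scheme and function names are purely illustrative.

```python
import random

def payoff(p, a, b):
    """l(p, y)(i) = a(i) - E_{i' ~ p}[b(i') + a(i')] for y = (a, b)."""
    expected = sum(pi * (bi + ai) for pi, ai, bi in zip(p, a, b))
    return [ai - expected for ai in a]

def response_k2(b):
    """Response for k = 2 that uses b only; the b <= 0 branches are an assumption."""
    b1, b2 = b
    if b1 <= 0:
        return [0.0, 1.0]
    if b2 <= 0:
        return [1.0, 0.0]
    return [b1 / (b1 + b2), b2 / (b1 + b2)]

def random_feasible_k2(rng):
    """Rejection-sample a, b in [-1, 1]^2 with a <= b and a(1) + a(2) >= 0."""
    while True:
        b = [rng.uniform(-1, 1) for _ in range(2)]
        a = [rng.uniform(-1, bi) for bi in b]
        if a[0] + a[1] >= 0:  # together with a <= b this implies b(1) + b(2) >= 0
            return a, b

def worst_violation(trials=10000, seed=0):
    """Largest coordinate of l(p, y) seen over random feasible y; should be <= 0."""
    rng = random.Random(seed)
    worst = float("-inf")
    for _ in range(trials):
        a, b = random_feasible_k2(rng)
        worst = max(worst, max(payoff(response_k2(b), a, b)))
    return worst

if __name__ == "__main__":
    print(worst_violation() <= 1e-9)
```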

Summary
• Polytime algorithm for online k-submodular function maximization with sublinear 1/2-regret.
• Application of algorithmic Blackwell approachability
