Paper review: Supervised Principal Component Analysis

Slides presented at a study group.
Barshan, E., Ghodsi, A., Azimifar, Z., Zolghadri Jahromi, M., 2011. Supervised principal component analysis: Visualization, classification and regression on subspaces and submanifolds. Pattern Recognition 44, 1357–1371. https://doi.org/10.1016/j.patcog.2010.12.015

Takahiro Kawashima

November 25, 2019

Transcript

  1. Paper review: Supervised principal component analysis: Visualization, classification
     and regression on subspaces and submanifolds. Takahiro Kawashima, November 25, 2019. The University of Electro-Communications, Shouno Laboratory, M1 student.
  2. Supervised PCA in a nutshell
     - An extension of PCA to supervised learning; naive PCA can be viewed as a special case of Supervised PCA
     - Reduces the dimensionality of the explanatory variables while retaining as much information about the target variables as possible
     - The targets may be multi-dimensional or discrete
     - Built on RKHS theory, so it can be made nonlinear in a natural way
     - Turns out to be almost the same as Fukumizu's 2005 kernel dimensionality reduction... If one insists on listing the differences:
       - the objective is slightly different, and the method is explicitly shown to generalize PCA
       - the kernel no longer has to be Gaussian
       - a dual formulation reduces the computational cost when p ≫ n
  3. Sufficient dimensionality reduction
     Sufficient Dimensionality Reduction (SDR): let X be a p-dimensional random variable and Y an l-dimensional random variable, with (X, Y) ∼ P_{X,Y}.
     The goal is to find a p × q matrix U (q < p) with U⊺U = I_q such that p(Y | X) = p(Y | U⊺X).
  4. Cross-covariance operator
     Let F and G be the RKHSs associated with the random variables X and Y. The cross-covariance operator C_{X,Y} : G → F is defined, for f ∈ F and g ∈ G, by
       ⟨f, C_{X,Y} g⟩ = E_{X,Y}[f(X)g(Y)] − E_X[f(X)] E_Y[g(Y)].
     It is the natural extension of the ordinary covariance, but via the kernel trick it captures nonlinear correlation structure.
  5. Hilbert-Schmidt norm
     The Hilbert-Schmidt norm (HS norm) ∥C∥²_HS of a linear operator C : G → F is defined as
       ∥C∥²_HS := Σ_{i=1}^∞ Σ_{j=1}^∞ ⟨u_j, C v_i⟩²,
     where {u_j} and {v_i} are complete orthonormal systems of F and G.
     It is the infinite-dimensional counterpart of the Frobenius norm.
  6. Characteristic positive-definite kernels
     Suppose random variables X ∼ P_X and Y ∼ P_Y are mapped into an RKHS as Φ(X) and Φ(Y) by a positive-definite kernel k. If
       E_{X∼P_X}[Φ(X)] = E_{Y∼P_Y}[Φ(Y)]  ⇔  P_X = P_Y,
     then k is said to be characteristic. Examples: the Gaussian and Laplace kernels.
     Intuition: if all moments up to infinite order agree, the distributions are identical.
  7. Hilbert-Schmidt Independence Criterion (Gretton et al., 2005)
     When X and Y are mapped into feature spaces by characteristic positive-definite kernels k and l, respectively,
       ∥C_{X,Y}∥²_HS = 0  ⇔  X ⫫ Y.
     Given data {(x_i, y_i)}_{i=1}^n, the centering matrix H := I − n⁻¹ee⊺, and the Gram matrices K := (k(x_i, x_j))_{i,j} and L := (l(y_i, y_j))_{i,j},
     the empirical estimate of the cross-covariance operator's norm,
       HSIC := tr(KHLH) / (n − 1)²,
     is the (empirical) HSIC (Hilbert-Schmidt Independence Criterion); the smaller it is, the closer X and Y are to independent, so maximizing it preserves the shared information (see the sketch below).
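A minimal NumPy sketch of the empirical HSIC above (the function name is mine, and the Gram matrices K and L are assumed to be precomputed with some kernels):

```python
import numpy as np

def empirical_hsic(K, L):
    """Empirical HSIC = tr(K H L H) / (n - 1)^2 for precomputed Gram matrices K, L."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n   # centering matrix H = I - ee^T / n
    return np.trace(K @ H @ L @ H) / (n - 1) ** 2
```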
  8. Supervised Principal Component Analysis
     Let X := (x_i)_{i=1}^n ∈ R^{p×n} and Y := (y_i)_{i=1}^n ∈ R^{l×n}.
     Take the kernels to be the linear kernel on U⊺X (with U⊺U = I_q) and a kernel on Y → a linear dimensionality reduction of X:
       HSIC = tr(KHLH) / (n − 1)²
            = tr((U⊺X)⊺U⊺XHLH) / (n − 1)²
            = tr(U⊺XHLHX⊺U) / (n − 1)².
     The U maximizing this gives the principal axes, and it is obtained as the eigenvectors corresponding to the q largest eigenvalues of XHLHX⊺ (see the sketch below)!
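As a rough illustration, linear Supervised PCA could be computed along these lines (a sketch under the slide's convention that columns of X are samples; the function name is hypothetical):

```python
import numpy as np

def spca_components(X, L, q):
    """Linear Supervised PCA sketch: top-q eigenvectors of X H L H X^T.
    X: (p, n) data matrix (columns = samples), L: (n, n) target Gram matrix."""
    n = X.shape[1]
    H = np.eye(n) - np.ones((n, n)) / n      # centering matrix
    M = X @ H @ L @ H @ X.T                  # p x p matrix from the HSIC objective
    eigvals, eigvecs = np.linalg.eigh(M)     # eigenvalues in ascending order
    U = eigvecs[:, ::-1][:, :q]              # eigenvectors of the q largest eigenvalues
    return U                                 # project the data with U.T @ X
```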
  9. Supervised PCA and PCA
     Ordinary PCA is unsupervised, i.e. it is the special case L = I. In that case
       tr(U⊺XHLHX⊺U) / (n − 1)²
         = tr{(U⊺XH)(U⊺XH)⊺} / (n − 1)²
         = tr{(U⊺X(I − n⁻¹ee⊺))(U⊺X(I − n⁻¹ee⊺))⊺} / (n − 1)²
         = tr{(U⊺X − μ_{U⊺X}e⊺)(U⊺X − μ_{U⊺X}e⊺)⊺} / (n − 1)²
         ∝ tr{Cov(U⊺X)},
     so maximizing the objective is exactly PCA (a numerical check follows below).
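A quick numerical check of this equivalence, reusing the spca_components sketch above (illustrative only; with L = I the recovered axes should match ordinary PCA up to sign):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 100))                        # p = 5, n = 100 toy data
U_spca = spca_components(X, np.eye(100), q=2)        # Supervised PCA with L = I
Xc = X - X.mean(axis=1, keepdims=True)               # centered data
_, vecs = np.linalg.eigh(Xc @ Xc.T)
U_pca = vecs[:, ::-1][:, :2]                         # ordinary PCA axes
print(np.allclose(np.abs(U_spca.T @ U_pca), np.eye(2), atol=1e-6))  # True: same axes up to sign
```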
  10. Dual Supervised PCA
     XHLHX⊺ ∈ R^{p×p} is painful when p is large → for p ≫ n the computation can be reduced via a dual problem.
     Since L is positive semi-definite it factors as L = ∆⊺∆, so with Ψ := XH∆⊺ we have XHLHX⊺ = ΨΨ⊺.
     From the reduced SVD Ψ = Û Σ̂ V̂⊺ we get Û = Ψ V̂ Σ̂⁻¹ and
       Ψ⊺Ψ = (V̂ Σ̂ Û⊺)(Û Σ̂ V̂⊺) = V̂ Σ̂² V̂⊺ ∈ R^{n×n},
     so V̂ and Σ̂ come from the eigendecomposition of an n × n matrix, and the principal axes Û follow directly (see the sketch below).
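A sketch of the dual computation (the choice of a symmetric square root of L as ∆ and the function name are my own; only n × n problems are solved):

```python
import numpy as np

def dual_spca_components(X, L, q, eps=1e-12):
    """Dual Supervised PCA sketch for p >> n.
    X: (p, n) data matrix, L: (n, n) positive semi-definite target Gram matrix."""
    n = X.shape[1]
    H = np.eye(n) - np.ones((n, n)) / n
    w, V = np.linalg.eigh(L)                              # L = V diag(w) V^T, w >= 0
    Delta = (V * np.sqrt(np.clip(w, 0.0, None))) @ V.T    # symmetric square root: L = Delta^T Delta
    Psi = X @ H @ Delta.T                                 # p x n, so X H L H X^T = Psi Psi^T
    vals, vecs = np.linalg.eigh(Psi.T @ Psi)              # n x n eigenproblem gives V_hat, Sigma_hat^2
    idx = np.argsort(vals)[::-1][:q]
    U = Psi @ vecs[:, idx] / np.sqrt(np.maximum(vals[idx], eps))  # U_hat = Psi V_hat Sigma_hat^{-1}
    return U
```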
  11. Kernel Supervised PCA
     Given how HSIC was introduced, Supervised PCA kernelizes very naturally.
     When the data matrix X is mapped into a feature space H as Φ(X), the objective becomes
       max_U tr(U⊺Φ(X)HLHΦ(X)⊺U)  s.t.  U⊺U = I.
     By the representer theorem, the optimum of this U (with an L2 regularizer added) lies in the linear subspace of H spanned by the data points Φ(X),
     so we may write U = Φ(X)β for a suitable coefficient matrix β.
  12. Kernel Supervised PCA
     The constraint U⊺U = I can be rewritten as U⊺U = β⊺Φ(X)⊺Φ(X)β = β⊺Kβ, so the problem becomes
       max_β tr(β⊺Φ(X)⊺Φ(X)HLHΦ(X)⊺Φ(X)β)  ⇔  max_β tr(β⊺KHLHKβ)  s.t.  β⊺Kβ = I,
     which is solved as a generalized eigenvalue problem (see the sketch below)!
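A sketch of that generalized eigenproblem using scipy.linalg.eigh (the small ridge added to K to keep the constraint matrix positive definite is my own addition, not part of the paper):

```python
import numpy as np
from scipy.linalg import eigh

def kspca_coefficients(K, L, q, ridge=1e-8):
    """Kernel Supervised PCA sketch: solve (K H L H K) beta = lambda K beta.
    K: (n, n) input Gram matrix, L: (n, n) target Gram matrix."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    A = K @ H @ L @ H @ K                       # objective matrix (symmetric)
    B = K + ridge * np.eye(n)                   # constraint matrix, regularized to be positive definite
    vals, vecs = eigh(A, B)                     # generalized symmetric eigenproblem; vecs satisfy vecs.T @ B @ vecs = I
    beta = vecs[:, np.argsort(vals)[::-1][:q]]  # top-q generalized eigenvectors
    return beta                                 # embed a new point x via beta.T @ k(X, x)
```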
  13. Experiments
     A variety of experiments are reported, though overall the results of the kernel dimensionality-reduction baselines other than the proposed method look suspiciously poor...
     - SPCA: Supervised PCA
     - BPCA: Bair's Supervised PCA
     - KDR: Kernel Dimensionality Reduction
     - KSPCA: Kernel Supervised PCA
     - mKDR: manifold Kernel Dimensionality Reduction
  14. Experiment (Iris)
     Iris dataset (4 → 2 dimensions, 3 classes).
     (a): SPCA, (b): BPCA, (c): KDR, (d): KSPCA, (e): mKDR
  15. Experiment (XOR)
     XOR toy data (2 → 2 dimensions, 2 classes).
     (a): SPCA, (b): BPCA, (c): KDR, (d): KSPCA, (e): mKDR
  16. Experimental setup (regression)
     Noise: ε ∼ N(0, 1)
     (a) y = x_1 / (0.5 + (x_2 + 1.5)²) + (1 + x_2)² + 0.5ε,  x ∼ N(0, I_4)
     (b) y = sin²(πx_2 + 1) + 0.5ε,  x ∼ Uniform([0, 1]⁴ \ {x ∈ [0, 1]⁴ | x_i ≤ 0.7 (i = 1, ..., 4)})
     (c) y = (1/2) x_1² ε,  x ∼ N(0, I_10)
     A data-generation sketch for setting (a) follows below.
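For concreteness, setting (a) could be simulated as follows (a sketch based on my reading of the flattened formula above; the function name is hypothetical):

```python
import numpy as np

def make_regression_a(n, seed=0):
    """Toy regression setting (a): y = x1 / (0.5 + (x2 + 1.5)^2) + (1 + x2)^2 + 0.5*eps."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, 4))      # x ~ N(0, I_4)
    eps = rng.normal(size=n)         # eps ~ N(0, 1)
    y = X[:, 0] / (0.5 + (X[:, 1] + 1.5) ** 2) + (1 + X[:, 1]) ** 2 + 0.5 * eps
    return X, y
```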