Application to Color Transfer
Sliced Wasserstein projection of the source image X onto the color statistics Y of the style image yields the source image after color transfer. [J. Rabin, Wasserstein Regularization]
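The idea above can be sketched numerically. This is a minimal numpy illustration of one sliced-Wasserstein projection step, not the exact algorithm of the cited work: the helper name `sliced_ot_step` and its parameters are assumptions for illustration. Each step projects both point clouds (e.g. RGB pixel values) onto random directions, uses the fact that 1D optimal transport is a sorted matching, and averages the resulting displacements.

```python
import numpy as np

def sliced_ot_step(X, Y, n_dirs=50, step=1.0, rng=None):
    """One stochastic sliced-Wasserstein projection step (hypothetical helper).
    X, Y: (n, d) point clouds with the same number of points.
    For each random unit direction theta, 1D OT between the projections
    is the sorted matching; the signed 1D displacements are averaged."""
    rng = np.random.default_rng(rng)
    n, d = X.shape
    disp = np.zeros_like(X)
    for _ in range(n_dirs):
        theta = rng.normal(size=d)
        theta /= np.linalg.norm(theta)
        px, py = X @ theta, Y @ theta            # 1D projections
        ix, iy = np.argsort(px), np.argsort(py)  # 1D OT = sort matching
        delta = np.empty(n)
        delta[ix] = py[iy] - px[ix]              # signed 1D displacement
        disp += np.outer(delta, theta)
    return X + step * disp / n_dirs
```

Iterating this step moves the colors of X toward the palette of Y while keeping a smooth correspondence with the source pixels.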
Low dimension (z) ↔ high dimension (x), linked by a generative map g_θ and a discriminative map d_ξ.
Supervised (classification, z = class probability):
→ learn d_ξ from labeled data (x_i, z_i)_i.
Un-supervised:
→ learn (g_θ, d_ξ) from data (x_i)_i.
• Compression: z = d_ξ(x) is a representation.
• Generation: x = g_θ(z) is a synthesis.
• Density fitting: g_θ({z_i}_i) ≈ {x_i}_i.
• Auto-encoders: g_θ(d_ξ(x_i)) ≈ x_i.
Optimal transport map d_ξ?
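The auto-encoder relation g_θ(d_ξ(x_i)) ≈ x_i can be illustrated in its simplest instance, a minimal sketch under the assumption of linear maps: a linear auto-encoder, whose optimal encoder/decoder pair is given by PCA (the top principal directions). All variable names here are illustrative, not from the slides.

```python
import numpy as np

# Linear auto-encoder: encoder d_xi(x) = W.T @ (x - m), decoder g_theta(z) = W @ z + m.
# For linear maps, the optimal W spans the top-k principal components (PCA).
rng = np.random.default_rng(0)
n, d, k = 500, 10, 2
# Synthetic data lying near a k-dimensional subspace, plus small noise.
basis = np.linalg.qr(rng.normal(size=(d, k)))[0]
X = rng.normal(size=(n, k)) @ basis.T + 0.01 * rng.normal(size=(n, d))

m = X.mean(axis=0)
U, S, Vt = np.linalg.svd(X - m, full_matrices=False)
W = Vt[:k].T                       # (d, k): top principal directions

encode = lambda x: (x - m) @ W     # z = d_xi(x), low-dimensional code
decode = lambda z: z @ W.T + m     # x_hat = g_theta(z), synthesis

X_hat = decode(encode(X))
err = np.linalg.norm(X - X_hat) / np.linalg.norm(X)   # small reconstruction error
```

The compression and generation roles from the slide appear as `encode` and `decode`; nonlinear networks replace W in practice.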
Density fitting: maximum likelihood estimation (MLE). Observations: ν = (1/n) Σ_{i=1}^n δ_{x_i}. If the model has a density, dμ_θ(y) = f_θ(y) dy, then

min_θ K̂L(μ_θ | ν) def= −Σ_j log(f_θ(x_j)).

Generative model fit: μ_θ = g_{θ,♯} ζ (push-forward of a latent distribution ζ through g_θ). Such a μ_θ is singular (supported on a low-dimensional set), has no density, and K̂L(μ_θ | ν) = +∞
→ MLE undefined.
→ Need a weaker metric.
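When the density f_θ does exist, MLE is well posed. A minimal sketch for a 1D Gaussian family f_θ with θ = (μ, σ): minimizing −Σ_j log f_θ(x_j) has the closed-form solution μ = sample mean, σ² = sample variance. The helper name `neg_log_lik` is illustrative.

```python
import numpy as np

def neg_log_lik(mu, sigma, x):
    """-sum_j log f_theta(x_j) for a 1D Gaussian f_theta, theta = (mu, sigma)."""
    return (0.5 * np.sum(((x - mu) / sigma) ** 2)
            + len(x) * np.log(sigma * np.sqrt(2 * np.pi)))

rng = np.random.default_rng(0)
x = rng.normal(loc=2.0, scale=0.5, size=2000)

# The MLE has a closed form: sample mean and sample standard deviation.
mu_hat, sigma_hat = x.mean(), x.std()
```

Perturbing (mu_hat, sigma_hat) in any direction increases `neg_log_lik`, confirming it is a minimizer; for a singular μ_θ no such density-based objective exists, which is the failure mode the slide points at.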
→ images, vision, graphics and machine learning, . . .
• Probability distributions and histograms
• Optimal transport
→ well defined for discrete or singular distributions (“weak” metric).
[Figure: the optimal transport mean of two distributions vs. their L2 mean]
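The “weak metric” point can be made concrete in 1D, where optimal transport between empirical measures with uniform weights reduces to a sorted matching. This is a minimal sketch (the helper name `w1_1d` is an assumption; it requires the same number of samples on both sides): two Diracs δ_0 and δ_3 have disjoint supports, so their KL divergence is +∞, yet W1 is finite and varies smoothly as the points move.

```python
import numpy as np

def w1_1d(a, b):
    """W1 between two empirical measures (uniform weights, equal numbers of
    1D samples): in 1D the optimal coupling is the monotone (sorted) matching."""
    a, b = np.sort(np.asarray(a, float)), np.sort(np.asarray(b, float))
    return float(np.mean(np.abs(a - b)))

# Two Diracs: KL(delta_0 | delta_3) = +infinity, but W1(delta_0, delta_3) = 3.
w_diracs = w1_1d([0.0], [3.0])
# Empirical measures: each sorted pair is matched, displacement 0.5.
w_emp = w1_1d([0.0, 1.0, 2.0], [0.5, 1.5, 2.5])
```

The same quantity is also available as `scipy.stats.wasserstein_distance` for general 1D weights.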
W_p(μ, ν)^p def= min_π { ⟨d^p, π⟩ = ∫_{X×X} d(x, y)^p dπ(x, y) ; π ∈ Π(μ, ν) }

→ W_p is a distance on M_+(X).
→ W_p works for singular distributions: for two Diracs, W_p(δ_x, δ_y) = d(x, y) → 0 as x → y.

Minimum Kantorovitch Estimator: min_θ W_p(μ_θ, ν) [Bassetti et al, 06]; for a generative model, min_θ W_p(g_{θ,♯} ζ, ν).
Algorithms: [Montavon et al 16], [Bernton et al 17], [Genevay et al 17].
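The definition of W_p as a minimum over couplings π ∈ Π(μ, ν) can be checked directly on tiny discrete measures. A minimal sketch under the assumption of uniform weights and equal point counts (the helper name `wp_discrete` is illustrative): by Birkhoff's theorem the optimal coupling is then a permutation matrix, so for very small n it can be found by brute force.

```python
import numpy as np
from itertools import permutations

def wp_discrete(X, Y, p=2):
    """W_p between two uniform discrete measures mu = (1/n) sum_i delta_{x_i}
    and nu = (1/n) sum_j delta_{y_j}: the optimal coupling pi in Pi(mu, nu)
    is a permutation (Birkhoff), minimized here by brute force (tiny n only)."""
    X, Y = np.atleast_2d(X), np.atleast_2d(Y)
    n = len(X)
    D = np.linalg.norm(X[:, None, :] - Y[None, :, :], axis=2) ** p  # d(x_i, y_j)^p
    cost = min(sum(D[i, s[i]] for i in range(n)) for s in permutations(range(n)))
    return (cost / n) ** (1 / p)
```

On two single Diracs this recovers W_p(δ_x, δ_y) = d(x, y) exactly, matching the property stated above; practical solvers replace the brute force with a linear program or Sinkhorn iterations.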
Generative Adversarial Networks: [Goodfellow et al, 14]
→ also an OT-like problem! [Arjovsky et al, 17], arXiv:1701.07875
Open problems:
→ how far are VAE/GAN from “true” OT?
→ using OT to improve VAE/GAN training.