

Udon
December 15, 2020


Network-to-Network Translation with Conditional Invertible Neural Networks
Robin Rombach, Patrick Esser, Björn Ommer
NeurIPS 2020 (oral). Code at this https URL
https://arxiv.org/abs/2005.13580

Transcript

  1. 2020/12/15 @udoooom. Network-to-Network Translation with Conditional
     Invertible Neural Networks. Robin Rombach∗, Patrick Esser∗, Björn Ommer.
     IWR, HCI, Heidelberg University.
     Paper: https://papers.nips.cc/paper/2020/file/1cfa81af29c6f2d8cacb44921722e753-Paper.pdf
     Supplemental: https://papers.nips.cc/paper/2020/file/1cfa81af29c6f2d8cacb44921722e753-Supplemental.pdf
  2. Problems
     • Supervised models have achieved great success on tasks such as:
       image classification and segmentation (ResNet, the DeepLab series),
       question answering (BERT, GPT-3), and image generation and
       translation (BigGAN, StyleGAN)
     • We need to find new ways to reuse such expert models!
  3. Problems
     • Pre-trained models come with arbitrary, fixed representations:
       StyleGAN for image generation, BERT for sentence embeddings
     • We need domain (modality) translation that keeps their full capabilities!
  4. Contribution
     • Propose the conditional invertible neural network (cINN), a model
       that relates different existing representations without altering them
     • The cINN needs no gradients from the expert models
  5. Related Work: Invertible Neural Networks (INNs) as Generative Models
     [Figure: an INN maps a base distribution to a target distribution,
      with generation conditions as additional inputs (e.g. Image2StyleGAN).
      Source: https://openai.com/blog/generative-models/]
  6. Related Work: Invertible Neural Networks (INNs) as Generative Models
     [Figure: same as slide 5, annotated to show how this work extends the
      idea to network-to-network translation]
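
For context (not on the slide): an INN f with base density q(z) defines the target density in closed form via the change-of-variables formula,

    p(x) = q\!\left(f^{-1}(x)\right) \left| \det \frac{\partial f^{-1}(x)}{\partial x} \right|

which is what makes INNs usable as exact-likelihood generative models.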
  7. Proposed Method: Motivation
     • D_x, D_y: two target domains
     • f(x): desired output for x ∈ D_x
     • z_Φ = Φ(x): latent representation of x
     • Experts split into encoder and decoder: f(x) = Ψ(Φ(x)), g(y) = Λ(Θ(y))
     • To realize domain translation, it must be described probabilistically
       as sampling from p(z_Θ | z_Φ)
     • Denote z_Θ = τ(v | z_Φ), where τ is the translation function and v
       the residuals
     [Figure: a single input x ∈ D_x yields multiple outputs y_1, y_2 ∈ D_y,
      e.g. "The dog is cute" / "The dog is lovely", each decoded by Λ from a
      different residual v]
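
As a reading aid (not from the slides), a minimal PyTorch sketch of the encoder/decoder split f(x) = Ψ(Φ(x)). The toy layer sizes and the split index k are assumptions for illustration; in practice Φ and Ψ would be the lower and upper layers of a real expert such as ResNet or BigGAN.

    import torch.nn as nn

    # Toy stand-in for a pre-trained expert f(x) = Psi(Phi(x)).
    expert = nn.Sequential(
        nn.Linear(64, 128), nn.ReLU(),
        nn.Linear(128, 128), nn.ReLU(),  # z_phi is read off after this layer
        nn.Linear(128, 10),
    )
    k = 4                # hypothetical split point
    Phi = expert[:k]     # encoder: x -> z_phi, i.e. z_phi = Phi(x)
    Psi = expert[k:]     # decoder: z_phi -> f(x)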
  8. Proposed Method: Learning a Domain Translation τ
     • v must capture all information of z_Θ that is not represented in z_Φ,
       but no information that is already represented in z_Φ
     • v = τ⁻¹(z_Θ | z_Φ), computed by a cINN
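
To make the invertibility concrete, here is a minimal sketch of one conditional affine coupling block, the standard building block of cINNs. The dimensions, the ReLU subnetwork, and the tanh-bounded scales are illustrative assumptions, not the paper's exact design.

    import torch
    import torch.nn as nn

    class ConditionalCoupling(nn.Module):
        # The second half of the input is scaled/shifted by a subnetwork
        # that sees the first half together with the condition z_phi.
        def __init__(self, dim, cond_dim, hidden=512):
            super().__init__()
            self.half = dim // 2
            self.net = nn.Sequential(
                nn.Linear(self.half + cond_dim, hidden), nn.ReLU(),
                nn.Linear(hidden, 2 * (dim - self.half)),
            )

        def forward(self, v, z_phi):          # v -> z_theta direction
            v1, v2 = v[:, :self.half], v[:, self.half:]
            log_s, t = self.net(torch.cat([v1, z_phi], dim=1)).chunk(2, dim=1)
            log_s = torch.tanh(log_s)         # keep scales bounded for stability
            out = torch.cat([v1, v2 * torch.exp(log_s) + t], dim=1)
            return out, log_s.sum(dim=1)      # log|det J| of the forward map

        def inverse(self, y, z_phi):          # z_theta -> v direction
            y1, y2 = y[:, :self.half], y[:, self.half:]
            log_s, t = self.net(torch.cat([y1, z_phi], dim=1)).chunk(2, dim=1)
            log_s = torch.tanh(log_s)
            out = torch.cat([y1, (y2 - t) * torch.exp(-log_s)], dim=1)
            return out, -log_s.sum(dim=1)     # log|det J| of the inverse map

Stacking several such blocks, with permutations of the dimensions in between, yields an expressive yet exactly invertible τ(v | z_Φ).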
  9. Proposed Method: Learning a Domain Translation τ
     • v discards all information of z_Φ if v and z_Φ are independent
     • Minimize KL(p(v | z_Φ) || q(v)), with q(v) a standard normal distribution
     • This achieves the goal of sampling from p(z_Θ | z_Φ):
       z_Θ = τ(v | z_Φ) with v sampled from q(v)
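
One common way to implement this objective, sketched here under the coupling-block interface above: train τ by maximum likelihood, which for an invertible map coincides with the KL objective up to constants.

    import torch

    def cinn_nll(cinn, z_theta, z_phi):
        # Push v = tau^{-1}(z_theta | z_phi) toward the standard normal q(v):
        # -log q(v) is 0.5 * ||v||^2 up to a constant, and log_det_inv is the
        # log-determinant of the inverse map's Jacobian.
        v, log_det_inv = cinn.inverse(z_theta, z_phi)
        nll = 0.5 * (v ** 2).sum(dim=1) - log_det_inv
        return nll.mean()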
  10. Proposed Method: Domain Transfer Between Fixed Models
      • Algorithm (sketched in code below):
        1. Sample x from p(x)
        2. Encode x into z_Φ = Φ(x)
        3. Sample v from q(v)
        4. Transform z_Θ = τ(v | z_Φ)
        5. Decode z_Θ into y = Λ(z_Θ)
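
A direct transcription of the five steps into a PyTorch-style sketch. Phi, Lam (the decoder Λ), and cinn are assumed to be the given pre-trained encoder, decoder, and trained translator; v_dim is an assumed residual dimensionality.

    import torch

    @torch.no_grad()
    def translate(x, Phi, Lam, cinn, v_dim, n_samples=4):
        # Steps 2-5 of the slide; step 1 is drawing x from the data.
        z_phi = Phi(x)                                        # 2. z_phi = Phi(x)
        ys = []
        for _ in range(n_samples):
            v = torch.randn(z_phi.shape[0], v_dim,
                            device=z_phi.device)              # 3. v ~ q(v) = N(0, I)
            z_theta, _ = cinn.forward(v, z_phi)               # 4. z_theta = tau(v | z_phi)
            ys.append(Lam(z_theta))                           # 5. y = Lam(z_theta)
        return ys  # multiple diverse translations of the same x

Because v is resampled each time, one input x yields many plausible outputs y, which is exactly the one-to-many behavior of p(z_Θ | z_Φ).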
  11. Experiments 1: BERT-to-BigGAN Translation
      • Compare IS and FID against baselines (methods from ICCV17, CVPR18,
        and CVPR19) on the COCO-Stuff dataset
      [Table: IS/FID scores of the baselines and the proposed method]
  12. Experiments 2: Reusing a Single Target Generator
      • Encoders: (a, b) DeepLab, (c, d) ResNet50
      • Super-resolution with an autoencoder
      [Figure: translations from the different encoders into one fixed generator]
  13. Experiments 2: Reusing a Single Target Generator
      • Visualizing how invariances increase with increasing layer depth
  14. Conclusion
      • Proposed the cINN technique for reusing pre-trained models:
        NLP-to-image, image-to-image, and label-to-image translation
      • An eco-friendly method: existing expert models are reused rather
        than retrained from scratch