AI Latest Paper Reading Group: 2022 Summary (AI最新論文読み会2022年まとめ)
医療AI研究所@大阪公立大学 (Medical AI Research Institute, Osaka Metropolitan University)
December 07, 2022
Transcript
Osaka Metropolitan University, Daiju Ueda (植田大樹). AI Latest Paper Reading Group: 2022 Summary
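2022 Summary: AI Latest Paper Reading Group
・Main topics
ConvNeXt (presented in February): an easy-to-use, recent high-performance model
GLIDE (presented in January): text-to-image generation
Imagic (presented in November): fine-grained image edits
AudioLM (presented in October): audio generation model
Socratic Models (presented in May): general-purpose AI (AGI)
・Other topics
Wav2Vec 2 (presented in July): NeuroAI / brain-inspired AI (exploring the relationship between AI and the human brain)
Algorithmic Imprint (presented in July): ethics for AI creators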
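2022 Summary (overview map). Text and Diffusion: GLIDE, Imagic. CV: ConvNeXt. Speech: AudioLM. AGI: Socratic model. One of the challenges toward artificial general intelligence (AGI) built on natural language processing.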
Summary up to 2022 (timeline figure). Year labels: 2017, 2018, 2019, 2020, 2021. Entries: Self-Attention, BERT, GPT2, DETR, ViT, GPT3, DDPM, CLIP, wav2vec 2, w2v-BERT, BigSSL, SwinT, ADM.
2022 Summary (overview map, repeated). Next topic: ConvNeXt (CV).
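ConvNeXt: CNN x SwinTransformer. State of the art among image classification models.
ConvNeXt: CNN x SwinTransformer. Summary of the changes: the base model is ResNet.
ConvNeXt: First of all.
ConvNeXt: Bring the number of block repetitions in each stage closer to SwinT.
ConvNeXt: 4×4 non-overlapping convolution, i.e. patchifying with a convolution.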
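The 4×4 non-overlapping convolution on this slide can be pictured with a small sketch. It assumes a single-channel toy image and a hand-written averaging kernel (both hypothetical, not from the slides); a real stem has many learned kernels and channels.

```python
def patchify(image, patch=4):
    """Split an H x W image (list of rows) into non-overlapping patch x patch blocks."""
    height, width = len(image), len(image[0])
    assert height % patch == 0 and width % patch == 0, "sizes must be divisible by patch"
    patches = []
    for top in range(0, height, patch):          # stride equals patch size, so no overlap
        for left in range(0, width, patch):
            block = [image[r][left:left + patch] for r in range(top, top + patch)]
            patches.append(block)
    return patches


def patch_conv(image, kernel, patch=4):
    """Apply one patch x patch kernel with stride patch (single-channel patchify stem)."""
    out_w = len(image[0]) // patch
    values = [
        sum(block[r][c] * kernel[r][c] for r in range(patch) for c in range(patch))
        for block in patchify(image, patch)
    ]
    # Reshape the flat list into an (H/patch) x (W/patch) feature map.
    return [values[i:i + out_w] for i in range(0, len(values), out_w)]


image = [[r * 8 + c for c in range(8)] for r in range(8)]   # toy 8x8 input (assumption)
mean_kernel = [[1 / 16] * 4 for _ in range(4)]              # hypothetical averaging kernel
feature_map = patch_conv(image, mean_kernel)                # 2x2 output
```

Each output value depends on exactly one 4×4 block, which is the sense in which the convolution "patchifies" the input.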
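ConvNeXt: After introducing depthwise convolution, widen the network.
ConvNeXt: Introduce the inverted bottleneck (Narrow→Wide→Narrow) structure. Transformers use an expansion ratio of 4 (※ MobileNet uses 6). The overall computation decreases, although the computation in the conv layer increases. (The slide marks the corresponding part of SwinT.)
ConvNeXt: Move the depthwise convolution up (※ in order to use a large kernel size in the depthwise convolution). Performance temporarily degrades because the conv computation is reduced. In SwinT, this corresponds to the MSA block being placed before the FFN.
ConvNeXt: Imitate the kernel size (7) of SwinTransformer. Progressively enlarge the kernel size of the depthwise convolution. Performance [levels off] at 7 (the same as SwinT). ↓
ConvNeXt: Adopt the small design details of SwinT or ViT. ReLU→GELU; fewer normalization layers; BN→LN; separate downsampling layers.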
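As a rough illustration of two of the swaps listed on this slide (ReLU→GELU and BN→LN), here is a minimal sketch in plain Python. The layer normalization omits the learnable scale and shift that real implementations carry, and the `eps` default is an assumption.

```python
import math


def relu(x):
    """ReLU: clamps negative inputs to zero."""
    return max(0.0, x)


def gelu(x):
    """Exact GELU: x times the standard normal CDF of x."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def layer_norm(features, eps=1e-6):
    """Normalize one sample across its own features (no learnable scale/shift here)."""
    mean = sum(features) / len(features)
    var = sum((f - mean) ** 2 for f in features) / len(features)
    return [(f - mean) / math.sqrt(var + eps) for f in features]


normalized = layer_norm([1.0, 2.0, 3.0, 4.0])
```

Unlike batch normalization, `layer_norm` uses statistics from a single sample, so it does not depend on the batch size.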
ConvNeXt: Results
ConvNeXt: By "SwinTransformer-izing" ResNet, a CNN alone reached state of the art.
2022 Summary (overview map, repeated). Next topic: GLIDE, Imagic (Text, Diffusion).
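GLIDE: a foundational model for Stable Diffusion
Diffusion model: a generative model
History of diffusion models (diagram): DDPM, ADM, GLIDE, with CLIP ↓. Annotations: stabilization and higher resolution; handling language.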
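History of diffusion models (diagram repeated): DDPM, ADM, GLIDE, with CLIP ↓.
DDPM: the beginning of diffusion models. Diagram: Image + Noise and timestep information go into a DNN, which outputs the estimated noise; the squared error against the true noise is minimized.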
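The DDPM diagram above (noise an image, let a network estimate the noise, minimize the squared error) can be sketched as follows. This is a toy under stated assumptions: `alpha_bar` is a hypothetical cumulative signal level for the chosen timestep, and `dummy_denoiser` is a stand-in that always predicts zero, not a trained DNN.

```python
import math
import random


def add_noise(x0, alpha_bar_t, rng):
    """Forward process: x_t = sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * noise."""
    noise = [rng.gauss(0.0, 1.0) for _ in x0]
    x_t = [
        math.sqrt(alpha_bar_t) * x + math.sqrt(1.0 - alpha_bar_t) * n
        for x, n in zip(x0, noise)
    ]
    return x_t, noise


def squared_error(predicted_noise, true_noise):
    """Mean squared error between estimated and true noise (the training loss)."""
    return sum((p - n) ** 2 for p, n in zip(predicted_noise, true_noise)) / len(true_noise)


def dummy_denoiser(x_t, t):
    """Hypothetical stand-in for the DNN: always predicts zero noise."""
    return [0.0 for _ in x_t]


rng = random.Random(0)
x0 = [0.5, -0.2, 0.1, 0.9]      # toy "image" of four pixels (assumption)
alpha_bar = 0.7                 # hypothetical signal level at timestep t
t = 10                          # timestep information passed to the network
x_t, noise = add_noise(x0, alpha_bar, rng)
loss = squared_error(dummy_denoiser(x_t, t), noise)
```

Training would adjust the network so that `loss` shrinks across many images, noise draws, and timesteps.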
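History of diffusion models (diagram repeated): DDPM, ADM, GLIDE, with CLIP ↓.
ADM: splitting the model into two succeeded in achieving high resolution. Diagram labels: Base, Upsampler, classification, high resolution, classifier guidance (CNN).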
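GLIDE = CLIP x Diffusion model. History of diffusion models (diagram repeated): DDPM, ADM, GLIDE, with CLIP ↓.
CLIP: a bridge between images and text. A model that converts images and text into features that can be compared with each other. ViT encodes the image, a Transformer encodes the text, and the two are compared by cosine similarity.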
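The comparison step in CLIP, cosine similarity between an image feature and text features, can be sketched in a few lines. The feature vectors and captions below are invented for illustration; in CLIP they would come from the image and text encoders.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical features; real ones are produced by the ViT and the text Transformer.
image_feature = [0.9, 0.1, 0.3]
text_features = {
    "a photo of a dog": [0.8, 0.2, 0.4],
    "a photo of a car": [-0.5, 0.9, 0.0],
}

# Pick the caption whose feature points in the direction closest to the image feature.
best_caption = max(
    text_features,
    key=lambda caption: cosine_similarity(image_feature, text_features[caption]),
)
```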
ADM (diagram repeated). Labels: Base, Upsampler, classification, high resolution, classifier guidance (CNN).
GLIDE = ADM-based, with the CNN replaced by CLIP. Labels: Base, Upsampler, classification, high resolution, classifier guidance (CLIP).
Imagic: a technique for improving Stable Diffusion
Imagic: Overview
2022 Summary (overview map, repeated). Next topic: AudioLM (Speech).
AudioLM: a generative model for audio
AudioLM = w2v-BERT x SoundStream. Overview: ・Text and audio have a one-to-many relationship. ・Audio has a much larger data volume than text.
SoundStream: quantizes audio
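To make "quantizes audio" concrete, here is a minimal nearest-neighbor vector quantization sketch. It is not SoundStream itself, which is a learned neural codec; the hand-written `codebook` and the 2-dimensional `frames` are assumptions chosen only to show how continuous frames become discrete tokens.

```python
def nearest_code(frame, codebook):
    """Return the index of the codebook vector closest to the frame (squared distance)."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(codebook)), key=lambda i: sq_dist(frame, codebook[i]))


def quantize(frames, codebook):
    """Map each continuous frame to a discrete token id."""
    return [nearest_code(frame, codebook) for frame in frames]


def dequantize(tokens, codebook):
    """Map token ids back to their codebook vectors (a lossy reconstruction)."""
    return [codebook[token] for token in tokens]


# Hypothetical codebook and frames; a real codec learns its codebooks from data.
codebook = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
frames = [[0.1, -0.1], [0.9, 0.2], [0.2, 0.8], [1.1, 0.95]]
tokens = quantize(frames, codebook)
reconstruction = dequantize(tokens, codebook)
```

Once audio is a sequence of token ids, it can be modeled with the same language-modeling machinery used for text.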
w2v-BERT: a combination of contrastive learning and masked language modeling
2022 Summary (overview map, repeated). Next topic: Socratic model (AGI).
Socratic models: a (quasi-?) general-purpose AI model built by combining existing pretrained models
Socratic models Overview Language is an intermediate representation
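Socratic models: Overview. Existing VLMs (Visual Language Models), LMs (Large Language Models), and ALMs (Audio Language Models) hold a structured dialogue with one another. Video search, caption generation, video Q&A (unseen tasks), and future action prediction are then treated as new participants in this dialogue space.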
Socratic models: Example 1: basics
Socratic models: Example 2: applications
Socratic models: What is Socratic dialogue?
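Others: NeuroAI (1): exploring the correspondence between brain function and language models
Others: NeuroAI (1): Overview: train Wav2Vec 2, build a mapping W that predicts the fMRI BOLD signal from its outputs, and validate the results.
Others: NeuroAI (1): Representation of averaged brain activity.
Others: NeuroAI (1): The depth of the model's layers corresponded to brain regions.
Others: NeuroAI (2): generating language from the brain
Others: NeuroAI (2): Model training sessions: 50 sessions over 81 weeks, with an isolated-word task and a sentence task. The target word or sentence was shown to the participant visually as on-screen text, and the participant attempted to produce it. In the isolated-word task, individual words are produced from a set of 50 English words. In the sentence task, word sequences are produced from English sentences built from the 50-word set.
Others: NeuroAI (2): Model results: 75% accuracy on sentences, 93% accuracy on words.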
Others: AI Ethics
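Algorithmic Imprint
When an algorithm causes harm, a common and reasonable countermeasure is to stop using it so that its harmful effects do not spread further. Stopping it, however, does not make the problems of fairness, accountability, transparency, and ethics go away.
→ The effects of a harmful algorithm persist after the algorithm is removed (the algorithmic imprint).
Example: the controversy surrounding algorithmic grading of the GCE exams, the UK-based high school leaving certificate exams (2020)
■ What kind of exam is it?
・An internationally recognized exam administered in more than 160 countries (many of them former British colonies)
・A-level [results] are essential and play an indispensable role in university admission
■ Background
・Because of the COVID-19 pandemic, Ofqual, the UK-based standards body that oversees the GCE exams, cancelled in-person exams
・In place of the exams, grades were produced by an algorithm using students' past [performance] at school and teacher assessments
→ As a result, worldwide protests broke out and the algorithm was removed
　Teachers' side: past student assessments had not been recorded in the first place
　Students' side: they had not been taking [prior assessments] seriously (the exam is everything, so many students study in the final 30~60 [days] before it)
・The algorithm was removed, but students were not re-assessed. In other words, the grading method changed, yet the outcomes were still heavily influenced by the algorithm (the algorithmic imprint)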
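Algorithmic Imprint
■ Designing algorithms with the algorithmic imprint in mind
Thinking about design with the "algorithmic imprint" in mind can make the algorithm development process fairer and better grounded in sociotechnical information.
(1) The effects of the algorithm: an algorithm continues to affect stakeholders after it has been removed. Developers and operators must not only remove the algorithm but also remedy the harm it caused, and accountability continues to be required.
(2) Explaining the algorithm's design: developers should make it easier for people affected by the "algorithmic imprint" to recognize it.
(3) Reinforcing with AI ethics governance: technical interventions alone cannot mitigate the harm. Algorithm design that is aware of the "algorithmic imprint" should be complemented by appropriate AI ethics governance.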
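About 2023: Do "what only our lab can do." Continue holding the study sessions. Put more weight on medicine. Tutorials and hands-on sessions on models for medical imaging research.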