1. EfficientNetV2: Smaller Models and Faster Training ← PickUp!
2. An Empirical Study of Training Self-Supervised Vision Transformers
3. Cross-validation: what does it estimate and how well does it do it?
4. GANcraft: Unsupervised 3D Neural Rendering of Minecraft Worlds
5. StyleCLIP: Text-Driven Manipulation of StyleGAN Imagery
6. LocalViT: Bringing Locality to Vision Transformers
7. Keyword Transformer: A Self-Attention Model for Keyword Spotting
8. Multiscale Vision Transformers
9. SiT: Self-supervised vIsion Transformer
10. Self-supervised Video Object Segmentation by Motion Grouping
2. StyleCLIP: Text-Driven Manipulation of StyleGAN Imagery
3. RepVGG: Making VGG-style ConvNets Great Again
4. Representation Learning for Networks in Biology and Medicine: Advancements, Challenges, and Opportunities
5. Cross-validation: what does it estimate and how well does it do it?
6. Factors of Influence for Transfer Learning across Diverse Appearance Domains and Task Types
7. Why Do Local Methods Solve Nonconvex Problems?
8. Scaling Scaling Laws with Board Games
9. Vision Transformers for Dense Prediction
10. EfficientNetV2: Smaller Models and Faster Training
EfficientNetV2: Smaller Models and Faster Training

This paper introduces EfficientNetV2, a new family of convolutional networks with faster training and better parameter efficiency than previous models. To develop this family of models, the authors combine training-aware neural architecture search with scaling, jointly optimizing training speed and parameter efficiency. The models are searched from a search space enriched with new operations such as Fused-MBConv. Experiments show that EfficientNetV2 models train much faster than state-of-the-art models while being up to 6.8x smaller. Training can be sped up further by progressively increasing the image size during training, but this causes a drop in accuracy. To compensate for this accuracy drop, the authors propose to adaptively adjust regularization such as dropout and data augmentation as well, achieving both fast training and good accuracy. With this progressive learning, EfficientNetV2 significantly outperforms previous models on ImageNet and the CIFAR/Cars/Flowers datasets. By pretraining on the same ImageNet21k, EfficientNetV2 achieves 87.3% top-1 accuracy on ImageNet ILSVRC2012, outperforming the recent ViT by 2.0% while training 5x to 11x faster using the same computing resources. Code will be released at https://github.com/google/automl/efficientnetv2.

http://arxiv.org/abs/2104.00298v1 Google Research, Brain Team.
→ Announcement of EfficientNetV2. Makes EfficientNet lighter and faster so that it can be trained practically.
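The progressive learning recipe summarized above (small images with weak regularization early in training, larger images with stronger regularization later) can be illustrated with a short schedule sketch. This is a minimal sketch under assumed values, not the paper's actual schedule: the stage count, the image-size range, and the dropout/RandAugment/mixup ranges are illustrative, and train_one_stage is a hypothetical placeholder for the per-stage training loop.

```python
# Minimal sketch of an EfficientNetV2-style progressive learning schedule.
# All concrete numbers below are illustrative assumptions, not the paper's values.

def linear_ramp(start, end, stage, num_stages):
    """Linearly interpolate from `start` to `end` across training stages."""
    if num_stages <= 1:
        return end
    return start + (end - start) * stage / (num_stages - 1)

def progressive_schedule(num_stages=4):
    """Yield per-stage settings: image size and regularization grow together,
    so early stages are fast and later stages preserve accuracy."""
    for stage in range(num_stages):
        yield {
            "image_size": int(linear_ramp(128, 300, stage, num_stages)),        # assumed range
            "dropout": round(linear_ramp(0.1, 0.3, stage, num_stages), 3),      # assumed range
            "randaug_magnitude": round(linear_ramp(5, 15, stage, num_stages), 1),  # assumed range
            "mixup_alpha": round(linear_ramp(0.0, 0.2, stage, num_stages), 3),   # assumed range
        }

if __name__ == "__main__":
    for stage, cfg in enumerate(progressive_schedule()):
        print(f"stage {stage}: {cfg}")
        # train_one_stage(model, cfg)  # hypothetical per-stage training call
```

The only point the sketch makes is the coupling: cheaper inputs are paired with weaker regularization, so training speeds up without the accuracy drop mentioned in the summary.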
Factors of Influence for Transfer Learning across Diverse Appearance Domains and Task Types

Transfer learning re-uses knowledge learned on a source task for learning a target task. A simple form of transfer learning, pre-training an image classification model on the ILSVRC dataset and then fine-tuning it on an arbitrary target task, is common practice in current state-of-the-art computer vision models. However, systematic studies of transfer learning have so far been limited, and it is not fully understood in which situations it can be expected to work. This paper carries out an extensive experimental study of transfer learning across vastly different image domains (consumer photos, autonomous driving, aerial imagery, underwater, indoor scenes, synthetic, close-ups) and task types (semantic segmentation, object detection, depth estimation, keypoint detection). Importantly, these are all complex, structured-output tasks relevant to modern computer vision applications. In total, over 1200 transfer experiments were performed, including many in which the source and target differ in image domain, task type, or both. These experiments are analyzed systematically to understand the impact of image domain, task type, and dataset size on transfer learning performance. The study yields several insights and concrete recommendations for practitioners.

http://arxiv.org/abs/2103.13318v1 Google Research
→ A large-scale empirical study of transfer learning across image domains.
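As a concrete illustration of the simple baseline the paper starts from (ILSVRC pre-training followed by fine-tuning on a target task), here is a minimal sketch using torchvision. It assumes torchvision 0.13+ for the weights enum; the backbone choice, target dataset, class count, and hyperparameters are assumptions, and the paper itself studies structured-output tasks rather than this plain classification setup.

```python
# Minimal sketch of the ILSVRC-pretrain-then-fine-tune baseline described above.
# Backbone, target task, class count, and hyperparameters are illustrative assumptions.
import torch
import torch.nn as nn
from torchvision import models

def build_finetune_model(num_target_classes: int) -> nn.Module:
    # Backbone pre-trained on ImageNet (ILSVRC).
    model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V2)
    # Swap the classification head for the target task.
    model.fc = nn.Linear(model.fc.in_features, num_target_classes)
    return model

def finetune(model: nn.Module, loader, epochs: int = 10, lr: float = 1e-3):
    # Fine-tune all layers with a small learning rate.
    optimizer = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9)
    criterion = nn.CrossEntropyLoss()
    model.train()
    for _ in range(epochs):
        for images, labels in loader:
            optimizer.zero_grad()
            loss = criterion(model(images), labels)
            loss.backward()
            optimizer.step()
```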