Katsunori Ohnishi
• Twitter: @ohnishi_ka
• April 2014 to September 2017: B4 to M2.5, Computer Vision [...]
  • http://katsunoriohnishi.github.io/
• CVPR2016 (spotlight oral, acceptance rate=9.7%): egocentric vision (wrist-mounted camera)
• ACMMM2016 (poster, acceptance rate=30%): action recognition (state-of-the-art)
• AAAI2018 (oral, acceptance rate=10.9%): video generation (FTGAN)
• From October 2017: DeNA AI [...]
  • → https://www.wantedly.com/projects/209980
• Image classification [...] action recognition = [...] human action recognition
  • [...] fine-grained, egocentric [...]
[Figure labels: Fine-grained, egocentric, Dog-centric, Action recognition, RGBD]
Evaluation of video activity localizations integrating quality and quantity measurements [C. Wolf+, CVIU14]
Recognizing Activities of Daily Living with a Wrist-mounted Camera [K. Ohnishi+, CVPR16]
A Database for Fine Grained Activity Detection of Cooking Activities [M. Rohrbach+, CVPR12]
First-Person Animal Activity Recognition from Egocentric Videos [Y. Iwashita+, ICPR14]
Recognizing Human Actions: A Local SVM Approach [C. Schuldt+, ICPR04]
HMDB: A Large Video Database for Human Motion Recognition [H. Kuehne+, ICCV11]
UCF101: A dataset of 101 human actions classes from videos in the wild [K. Soomro+, arXiv2012]
3D convolution
• C3D, P3D [...] 3D conv [...]
• [...] 3D conv [...] [K. Hara+, CVPR18]
[Figure: timeline with labels 2011, 2012, 2015, 2017; 2017: Kinetics!]
Can Spatiotemporal 3D CNNs Retrace the History of 2D CNNs and ImageNet? [K. Hara+, CVPR18]
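As a reminder of what the 3D convolution named on these slides computes, here is a minimal sketch: a filter slides over time as well as height and width, so each output value summarizes a small spatiotemporal volume. The function name `conv3d_valid`, the clip size, and the averaging kernel are all illustrative assumptions, not taken from C3D, P3D, or any cited paper, and real networks use many channels and learned filters.

```python
import numpy as np

def conv3d_valid(clip, kernel):
    """Naive single-channel 3D cross-correlation (no padding, stride 1).

    clip:   array of shape (T, H, W), i.e. frames x height x width
    kernel: array of shape (kt, kh, kw)
    """
    T, H, W = clip.shape
    kt, kh, kw = kernel.shape
    out = np.zeros((T - kt + 1, H - kh + 1, W - kw + 1))
    for t in range(out.shape[0]):
        for y in range(out.shape[1]):
            for x in range(out.shape[2]):
                # Each output value mixes kt frames and a kh x kw spatial patch.
                patch = clip[t:t + kt, y:y + kh, x:x + kw]
                out[t, y, x] = np.sum(patch * kernel)
    return out

# Hypothetical 16-frame, 8x8 grayscale clip and a 3x3x3 averaging filter.
clip = np.random.default_rng(0).random((16, 8, 8))
kernel = np.ones((3, 3, 3)) / 27.0
features = conv3d_valid(clip, kernel)
print(features.shape)  # (14, 6, 6)
```

Unlike a 2D convolution applied frame by frame, the output keeps a time axis (14 steps here), which is what lets stacked 3D layers model motion.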
3D convolution
• Kinetics [...]
  • [...] human action dataset!
  • 3D conv [...]
    • Pre-train [...] UCF101 [...]
The Kinetics human action video dataset [W. Kay+, arXiv17]
  • Youtube8M [...]
  • [...]
3D convolution
• I3D [J. Carreira+, ICCV17]
  • Kinetics dataset [...] DeepMind [...]
  • 3D conv [...] Inception
  • 64 GPUs for training, 16 GPUs for prediction
  • [...] state-of-the-art
    • RGB [...]
    • Two-stream [...] optical flow [...] score [...]

                  UCF101   HMDB51
RGB-I3D           95.6%    74.8%
Flow-I3D          96.7%    77.1%
Two-stream I3D    98.0%    80.7%

Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset [J. Carreira+, ICCV17]
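The Two-stream I3D row above outperforms either single stream. One common way to combine an RGB network and an optical-flow network is late fusion, where per-class scores from both streams are averaged. The sketch below assumes that scheme; the function names, the equal weighting, and the logits are made up for illustration and are not taken from the I3D code or paper.

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over the last axis."""
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def fuse_two_stream(rgb_logits, flow_logits, flow_weight=0.5):
    """Weighted average of class probabilities from the two streams.

    flow_weight=0.5 (plain averaging) is an assumption for this sketch.
    """
    rgb_p = softmax(rgb_logits)
    flow_p = softmax(flow_logits)
    return (1.0 - flow_weight) * rgb_p + flow_weight * flow_p

# Hypothetical logits for one video over three action classes.
rgb_logits = np.array([[2.0, 1.0, 0.1]])   # RGB stream prefers class 0
flow_logits = np.array([[0.5, 2.5, 0.2]])  # flow stream prefers class 1
fused = fuse_two_stream(rgb_logits, flow_logits)
pred = fused.argmax(axis=-1)
print(pred)  # [1]
```

In this toy case the flow stream is more confident than the RGB stream, so the fused prediction follows it.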
3D convolution
• 3D convolution [...] [D.A. Huang+, CVPR18]
  • 3D CNN [...]
  • [...]
  • Two-stream I3D [...] Optical flow [...] 3D conv [...]
What Makes a Video a Video: Analyzing Temporal Information in Video Understanding Models and Datasets [D.A. Huang+, CVPR18]
3D convolution
• 3D conv [...]
  • CVPR18 [...]
  • [...] CVPR/ICCV/ECCV [...]
  • [...] 3D conv [...] 3D conv [...]
    • GPU [...]
Optical flow
• Optical flow [...] [L. Sevilla-Lara+, CVPR18]
  • Optical flow [...]
  • Optical flow accuracy (EPE) [...] action recognition accuracy [...]
  • [...] flow accuracy [...] action recognition accuracy [...]
  • [...] Optical flow [...] appearance [...]
    • Optical flow accuracy [...]
On the Integration of Optical Flow and Action Recognition [L. Sevilla-Lara+, CVPR18]
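EPE (end-point error), mentioned on the slide above, is the standard accuracy measure for optical flow: the Euclidean distance between the predicted and ground-truth flow vectors, averaged over pixels. A minimal sketch follows; the function name and the toy flow fields are assumptions for illustration.

```python
import numpy as np

def endpoint_error(flow_pred, flow_gt):
    """Mean end-point error between two flow fields of shape (H, W, 2).

    The last axis holds the (u, v) displacement of each pixel.
    """
    per_pixel = np.linalg.norm(flow_pred - flow_gt, axis=-1)
    return per_pixel.mean()

# Toy example: ground truth is zero motion, prediction is (3, 4) everywhere.
flow_gt = np.zeros((4, 4, 2))
flow_pred = np.zeros((4, 4, 2))
flow_pred[..., 0] = 3.0
flow_pred[..., 1] = 4.0
epe = endpoint_error(flow_pred, flow_gt)
print(epe)  # 5.0
```

Because EPE averages over all pixels equally, it says nothing about where in the image the errors occur.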
• CNN [...] TLE [...] coding
  • TLE, ActionVLAD [...]
• iDT [...]
  • CNN [...]
  • FisherVector [...] iDT [...]
• Tips: PCA [...] (dim=64). K=256. FV [...] power norm
  • CPU [...]
• [...]
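The tips above (PCA to dim=64, K=256, power norm) match the usual improved Fisher Vector recipe. The sketch below shows that encoding step under stated assumptions: the descriptors are already PCA-reduced to 64 dimensions, and a diagonal-covariance GMM with K=256 components is already fitted. Here the GMM parameters and descriptors are random placeholders, and the function name is hypothetical. In practice the GMM would be fitted on training descriptors such as iDT features.

```python
import numpy as np

def fisher_vector(X, weights, means, variances):
    """Improved Fisher Vector for a diagonal-covariance GMM.

    X: (N, D) descriptors; weights: (K,); means, variances: (K, D).
    Returns a vector of length 2*K*D (gradients w.r.t. means and variances),
    with power normalization followed by L2 normalization.
    """
    N, _ = X.shape
    diff = X[:, None, :] - means[None, :, :]                 # (N, K, D)
    # Log-density of each descriptor under each Gaussian component.
    log_prob = -0.5 * (np.sum(diff ** 2 / variances, axis=2)
                       + np.sum(np.log(2.0 * np.pi * variances), axis=1))
    log_post = np.log(weights) + log_prob                    # (N, K)
    log_post -= log_post.max(axis=1, keepdims=True)          # for stability
    gamma = np.exp(log_post)
    gamma /= gamma.sum(axis=1, keepdims=True)                # posteriors
    norm_diff = diff / np.sqrt(variances)                    # (N, K, D)
    g_mu = (gamma[:, :, None] * norm_diff).sum(axis=0)
    g_mu /= N * np.sqrt(weights)[:, None]
    g_sigma = (gamma[:, :, None] * (norm_diff ** 2 - 1.0)).sum(axis=0)
    g_sigma /= N * np.sqrt(2.0 * weights)[:, None]
    fv = np.concatenate([g_mu.ravel(), g_sigma.ravel()])
    fv = np.sign(fv) * np.sqrt(np.abs(fv))                   # power norm
    return fv / (np.linalg.norm(fv) + 1e-12)                 # L2 norm

# Placeholder data: D=64 (after PCA) and K=256 follow the slide's tips.
rng = np.random.default_rng(0)
D, K, N = 64, 256, 100
weights = rng.random(K)
weights /= weights.sum()
means = rng.normal(size=(K, D))
variances = rng.uniform(0.5, 1.5, size=(K, D))
X = rng.normal(size=(N, D))
fv = fisher_vector(X, weights, means, variances)
print(fv.shape)  # (32768,)
```

The resulting 2 x 256 x 64 = 32768-dimensional vector is typically fed to a linear classifier.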
Hierarchical Video Generation from Orthogonal Information: Optical Flow and Texture
K. Ohnishi+, AAAI 2018 (oral presentation)
https://arxiv.org/abs/1711.09618
[Figure label: Optical flow]
• [...] O(n^2) → O(n^3) [...]