Learning to Read NNEF

NNEF is a registered trademark of Khronos Group Inc.
Naomasa Matsubayashi (@fadis_)

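Neural networks

[Figure: a network that maps inputs x0, x1, x2 to outputs y0, y1, y2]

Formal neuron: each input (input 0, input 1, input 2, input 3, ...) is multiplied by its weight (w01, w02, w03, w04, ...), the products are summed (∑), and the sum is passed through an activation function to produce the output.

[Figure: inputs x0, x1, x2 enter an input layer followed by three fully connected layers]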
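The formal neuron described above can be sketched in a few lines of Python. This is only an illustration: the function names, the step activation, and the sample numbers are assumptions made for the example and do not come from the slides.

```python
def formal_neuron(inputs, weights, activation):
    """Multiply each input by its weight, sum the products, apply the activation."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    total = sum(x * w for x, w in zip(inputs, weights))
    return activation(total)


def step(value):
    # Hypothetical activation chosen for illustration: fires when the sum is positive.
    return 1.0 if value > 0.0 else 0.0


# Four inputs and four weights, as in the figure (the values are made up).
output = formal_neuron([1.0, 0.0, 1.0, 1.0], [0.5, -1.0, 0.25, -0.5], step)
```

Training, discussed next, amounts to choosing the weights so that the output matches the expected answer.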

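Some data looks correlated, but the exact relationship is unclear. Example: images of handwritten characters (2, 7, 4) and the digits written in them.

For now, express the unknown relationship as a neural network: an input layer followed by three fully connected layers, each with weights w.

Finding the weights w that produce the correct output is a mathematical optimization problem. This is training: adjust w so that the correct answer comes out.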

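Frameworks for training: TensorFlow, PyTorch, Caffe

During training (in TensorFlow, for example) the network is shown an image of a 7 and asked "What is this?". It answers "four". Because the answer is not 7, the weights w are corrected so that 7 comes out in this situation, and the question is asked again.

In the application the network is asked "What is this?" and answers "seven". Once the network produces correct outputs, we want to embed it in an application and use it.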

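There are as many save formats as there are frameworks:

TensorFlow - tf.saved_model format
PyTorch - pickle format
Caffe - caffemodel format

This makes trained weights painful to use from an application.

Image editors have the same situation: Adobe Photoshop saves PSD and GIMP saves XCF, but both can be converted to PNG or JPEG, which applications can read. Isn't there something like that for neural networks?

There is: a tf.saved_model from TensorFlow or a caffemodel from Caffe can be converted to NNEF, which the application then reads.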

What must be saved in order to reproduce a training result

[Figure: the VGG16 layer stack: an input layer, 13 convolution layers interleaved with 5 maxpool layers, then 3 fully connected layers]

Which layers are connected, and in what order.
The settings of each layer. Example: a 3x3 filter converting 3 channels to 64 channels, padding of 1 on each side, dilation and stride of 1 each.
The weights w of each layer. Example: a rank-4 tensor of 3x3x3x64.
The type in which the weights are written to the file. Example: each element recorded as a 32-bit floating-point number.

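[Figure: a description of the network's shape and settings references three weight tensors: a tensor of floating-point numbers, another tensor of floating-point numbers, and a tensor of integers. The integer tensor comes with the method for converting its integers to floating-point numbers: min=-2.0, max=2.0, bit=8.]

The shape and settings of the network - a text format that humans can read and write.
The weights - a binary format that can be parsed quickly.

Each weight file consists of a header, fixed at 128 bytes, followed by the array of numbers. Data transfer can start before the file is parsed.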
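To make the min=-2.0, max=2.0, bit=8 annotation concrete, here is a minimal sketch of how a stored integer could be mapped back to a floating-point value. It assumes a plain linear mapping with evenly spaced levels between min and max; the function name is hypothetical, and the exact rule used by a given file is whatever the NNEF specification defines for its quantization operation.

```python
def dequantize_linear(q, minimum, maximum, bits):
    """Map an integer level q in [0, 2**bits - 1] to a float in [minimum, maximum].

    Assumption: levels are evenly spaced, level 0 is `minimum`
    and the highest level is `maximum`.
    """
    levels = (1 << bits) - 1
    if not 0 <= q <= levels:
        raise ValueError("q is outside the range representable with this bit width")
    return minimum + (maximum - minimum) * q / levels


# With the values shown on the slide: 8 bits cover [-2.0, 2.0] in 255 steps.
lowest = dequantize_linear(0, -2.0, 2.0, 8)
highest = dequantize_linear(255, -2.0, 2.0, 8)
middle = dequantize_linear(128, -2.0, 2.0, 8)
```

Storing 8-bit integers plus the two bounds takes roughly a quarter of the space of 32-bit floats, at the cost of precision.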

graph.nnef

version 1.0;
graph VGG_ILSVRC_16_layers(data) -> (prob)
{
    ...
}

version 1.0 - this file is written according to the NNEF 1.0 specification.
VGG_ILSVRC_16_layers - the name of this network.
(data) - input values for the network are written into the buffer named data, which is defined inside the braces.
(prob) - the output of the network is read from the buffer named prob, which is also defined inside the braces.

VGG16

input
convolution, ReLU, convolution, ReLU, maxpool
convolution, ReLU, convolution, ReLU, maxpool
convolution, ReLU, convolution, ReLU, convolution, ReLU, maxpool
convolution, ReLU, convolution, ReLU, convolution, ReLU, maxpool
convolution, ReLU, convolution, ReLU, convolution, ReLU, maxpool
linear, ReLU
linear, ReLU
linear, softmax

Simonyan, Karen; Zisserman, Andrew. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556, 2014. https://arxiv.org/abs/1409.1556

The example used from here on is a VGG16 caffemodel trained for ILSVRC and converted to NNEF:
https://github.com/KhronosGroup/NNEF-Tools/tree/main/models#nnef-model-zoo

graph.nnef

version 1.0;
graph VGG_ILSVRC_16_layers(data) -> (prob)
{
    variable_15 = variable<scalar>(label = 'conv4_1_blob2', shape = [1, 512]);
    variable_14 = variable<scalar>(label = 'conv4_1_blob1', shape = [512, 256, 3, 3]);
    variable_13 = variable<scalar>(label = 'conv3_3_blob2', shape = [1, 256]);
    ...
    variable_3 = variable<scalar>(label = 'conv1_2_blob2', shape = [1, 64]);
    data = external<scalar>(shape = [10, 3, 224, 224]);
    conv = conv(data, variable, variable_1, border = 'constant', dilation = [1, 1], groups = 1, padding = [(1, 1), (1, 1)], stride = [1, 1]);
    ...
    linear_1 = linear(relu_13, variable_28, variable_29);
    relu_14 = relu(linear_1);
    linear_2 = linear(relu_14, variable_30, variable_31);
    prob = softmax(linear_2, axes = [1]);
}

data is the input; prob is the output.

external - defines data that comes in from outside

external<scalar>(shape = [10, 3, 224, 224]);

Ten 224x224 RGB (3-channel) images come in from outside as one batch.

linear - linear transformation

linear(relu_13, variable_28, variable_29);

Multiply the vector relu_13 by the matrix variable_28, then add the vector variable_29.

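Fully connected layer - every combination of input and output is connected

[Figure: inputs x0, x1, x2, x3 each connected to every output y0, y1, y2, y3, y4]

When x and y are related in this way, lining up the weights of each neuron gives a matrix:

\[
\phi\left(
\begin{pmatrix}
w_{00} & w_{01} & w_{02} & w_{03} \\
w_{10} & w_{11} & w_{12} & w_{13} \\
w_{20} & w_{21} & w_{22} & w_{23} \\
w_{30} & w_{31} & w_{32} & w_{33} \\
w_{40} & w_{41} & w_{42} & w_{43}
\end{pmatrix}
\begin{pmatrix} x_0 \\ x_1 \\ x_2 \\ x_3 \end{pmatrix}
\right)
=
\begin{pmatrix} y_0 \\ y_1 \\ y_2 \\ y_3 \\ y_4 \end{pmatrix}
\]

where \phi is the activation function.

Fully connected layer = take the product of a matrix and a vector, then pass it through the activation function.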
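The matrix form above translates almost directly into code. The sketch below is plain Python rather than NNEF; the helper names and the sample weights are made up for illustration, and the bias vector corresponds to the third argument of NNEF's linear described earlier.

```python
def linear(x, weights, bias):
    """Matrix-vector product plus bias.

    weights[j][i] connects input i to output j, so each row of `weights`
    holds the weights of one output neuron.
    """
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def relu(values):
    """ReLU activation: negative values become zero."""
    return [max(0.0, v) for v in values]


# 4 inputs and 5 outputs, matching the size of the figure (values are made up).
x = [1.0, 2.0, 3.0, 4.0]
W = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
    [-1.0, -1.0, -1.0, -1.0],
]
b = [0.0, 0.0, 0.0, 0.0, 5.0]

pre_activation = linear(x, W, b)
y = relu(pre_activation)
```

The last output neuron sums to a negative value before the activation, so ReLU clamps it to zero while the other four pass through unchanged.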

ReLU - a popular activation function

relu(linear_1);

linear_1 = linear(relu_13, variable_28, variable_29);
relu_14 = relu(linear_1);

The first line takes the product of a matrix and a vector; the second passes that result through the activation function. Together, these two lines are one fully connected layer.

softmax - an activation function whose outputs sum to 1

softmax(linear_2, axes = [1]);
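As a rough illustration of what softmax computes, the sketch below exponentiates each value and divides by the total, so the results are positive and sum to 1. The function names are hypothetical; applying the function to each row of a [batch, classes] list is meant to mirror axes = [1], which is my reading of that argument rather than something stated on the slide.

```python
import math


def softmax(values):
    """Exponentiate and normalise so that the outputs sum to 1."""
    peak = max(values)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_rows(batch):
    """Apply softmax independently to each row, i.e. along axis 1."""
    return [softmax(row) for row in batch]


probabilities = softmax([1.0, 2.0, 3.0])
batch_probabilities = softmax_rows([[0.0, 0.0], [1.0, 3.0]])
```

Because the outputs sum to 1, they can be read as the network's confidence for each class.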

linear_2 = linear(relu_14, variable_30, variable_31);
prob = softmax(linear_2, axes = [1]);

Again the first line takes the product of a matrix and a vector, and the second passes that result through the activation function. So the end of graph.nnef contains two fully connected layers in a row: linear_1 with relu_14, then linear_2 with softmax.

variable - reads values from a file

variable_31 = variable<scalar>(label = 'fc8_blob2', shape = [1, 1000]);
variable_30 = variable<scalar>(label = 'fc8_blob1', shape = [1000, 4096]);
variable_29 = variable<scalar>(label = 'fc7_blob2', shape = [1, 4096]);
variable_28 = variable<scalar>(label = 'fc7_blob1', shape = [4096, 4096]);

variable<scalar>(label = 'fc7_blob2', shape = [1, 4096]);
Read a 4096-element vector from the file named fc7_blob2.dat and call it variable_29.

variable<scalar>(label = 'fc7_blob1', shape = [4096, 4096]);
Read a 4096x4096 matrix from the file named fc7_blob1.dat and call it variable_28.

Putting the pieces together:

\[
\mathrm{ReLU}\left( W x + a \right) = y
\]

Here W is the weight matrix read from fc7_blob1.dat, a is the vector read from fc7_blob2.dat, x is relu_13, y is relu_14, and ReLU is the activation function.

VGG16: the part covered so far is the tail of the network (linear, ReLU, linear, ReLU, linear, softmax).

Convolution layers

VGG is a neural network built for image classification, so the layers toward the top (the input side) are ones suited to image processing.

[Figure: a small filter slides over a grid of input values and produces a grid of output values]

Apply a filter to adjacent pixels of the input image to obtain each value of the output image. This is a common form of image processing.

Convolution layer: treat the filter as the weights w. Training then yields a filter that extracts the information we want.
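The filtering step described above can be sketched for a single channel as follows. This is an illustration in plain Python, not NNEF: it uses stride 1, no padding, and no kernel flip (the usual convention in deep-learning frameworks), and the function name and sample numbers are assumptions for the example.

```python
def filter_image(image, kernel):
    """Slide `kernel` over `image` and sum the element-wise products at each position."""
    kernel_h, kernel_w = len(kernel), len(kernel[0])
    out_h = len(image) - kernel_h + 1
    out_w = len(image[0]) - kernel_w + 1
    output = []
    for y in range(out_h):
        row = []
        for x in range(out_w):
            acc = 0.0
            for ky in range(kernel_h):
                for kx in range(kernel_w):
                    acc += image[y + ky][x + kx] * kernel[ky][kx]
            row.append(acc)
        output.append(row)
    return output


image = [
    [1.0, 2.0, 3.0],
    [4.0, 5.0, 6.0],
    [7.0, 8.0, 9.0],
]
# A 2x2 filter that adds the top-left and bottom-right pixel of each window.
kernel = [
    [1.0, 0.0],
    [0.0, 1.0],
]
filtered = filter_image(image, kernel)
```

In a convolution layer the numbers in `kernel` are not chosen by hand; they are the weights found by training.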

graph.nnef

version 1.0;
graph VGG_ILSVRC_16_layers(data) -> (prob)
{
    ...
    variable_2 = variable<scalar>(label = 'conv1_2_blob1', shape = [64, 64, 3, 3]);
    variable_1 = variable<scalar>(label = 'conv1_1_blob2', shape = [1, 64]);
    ...
    variable = variable<scalar>(label = 'conv1_1_blob1', shape = [64, 3, 3, 3]);
    variable_3 = variable<scalar>(label = 'conv1_2_blob2', shape = [1, 64]);
    data = external<scalar>(shape = [10, 3, 224, 224]);
    conv = conv(data, variable, variable_1, border = 'constant', dilation = [1, 1], groups = 1, padding = [(1, 1), (1, 1)], stride = [1, 1]);
    relu = relu(conv);
    conv_1 = conv(relu, variable_2, variable_3, border = 'constant', dilation = [1, 1], groups = 1, padding = [(1, 1), (1, 1)], stride = [1, 1]);
    relu_1 = relu(conv_1);
    max_pool = max_pool(relu_1, border = 'ignore', padding = [(0, 0), (0, 0), (0, 0), (0, 0)], size = [1, 1, 2, 2], stride = [1, 1, 2, 2]);
    ...
}

conv - convolution

conv(data, variable, variable_1, border = 'constant', dilation = [1, 1], groups = 1, padding = [(1, 1), (1, 1)], stride = [1, 1]);

The arguments of conv(data, variable, variable_1, border = 'constant', dilation = [1, 1], groups = 1, padding = [(1, 1), (1, 1)], stride = [1, 1]):

data - the input image
variable - the filter
variable_1 - a constant added to each channel
dilation = [1, 1] - [Figure: dilation=1 compared with dilation=2; this call uses dilation=1]
padding = [(1, 1), (1, 1)] - [Figure: padding=0 compared with padding=1; this call uses padding=1]
stride = [1, 1] - [Figure: stride=1 compared with stride=2; this call uses stride=1]
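The effect of dilation, padding, and stride on the image size can be checked with the usual output-size formula. The sketch below works on one spatial dimension; the function name is hypothetical, and the formula is the commonly used one for convolutions, stated here as an assumption rather than quoted from the NNEF specification.

```python
def output_size(input_size, kernel_size, padding=(0, 0), stride=1, dilation=1):
    """Number of output positions along one spatial dimension.

    padding is (before, after); dilation spreads the kernel taps apart.
    """
    effective_kernel = dilation * (kernel_size - 1) + 1
    padded = input_size + padding[0] + padding[1]
    return (padded - effective_kernel) // stride + 1


# The settings used by conv in graph.nnef: 3x3 filter, padding 1, stride 1, dilation 1.
same_size = output_size(224, 3, padding=(1, 1), stride=1, dilation=1)

# Variations corresponding to the alternatives shown in the figures.
no_padding = output_size(224, 3, padding=(0, 0))
stride_two = output_size(224, 3, padding=(1, 1), stride=2)
dilation_two = output_size(224, 3, padding=(1, 1), dilation=2)
```

With a 3x3 filter, padding of 1 on each side is exactly what keeps a 224-pixel dimension at 224 after the convolution.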

With 3 input channels and 4 output channels, 3 × 4 filters of size 3 × 3 are needed.

In VGG16 the input has the 3 RGB channels. The first convolution turns 3 channels into 64 channels; the second turns 64 channels into 64 channels.

variable<scalar>(label = 'conv1_1_blob1', shape = [64, 3, 3, 3]);

The rank-4 tensor of 3 × 3 × 3 × 64 written in the file conv1_1_blob1.dat is used as the filter. The lines conv = conv(data, variable, variable_1, ...) and relu = relu(conv) then form the convolution layer from 3 channels to 64 channels.

variable<scalar>(label = 'conv1_2_blob1', shape = [64, 64, 3, 3]);

The rank-4 tensor of 3 × 3 × 64 × 64 written in the file conv1_2_blob1.dat is used as the filter. The lines conv_1 = conv(relu, variable_2, variable_3, ...) and relu_1 = relu(conv_1) form the convolution layer from 64 channels to 64 channels.

max_pool - outputs the maximum value within the filter range

max_pool(relu_1, border = 'ignore', padding = [(0, 0), (0, 0), (0, 0), (0, 0)], size = [1, 1, 2, 2], stride = [1, 1, 2, 2]);

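Max pooling layer

[Figure: a grid of input values is divided into windows; the largest value in each window is written to the output grid]

The largest value within the filter range is written to the corresponding position of the output image. This layer has no weights w.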
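A minimal single-channel sketch of max pooling, assuming a square window and no padding. The function name and the sample grid are illustrative and are not taken from the slide's figure.

```python
def max_pool_2d(image, size=2, stride=2):
    """Write the largest value of each size x size window to the output."""
    out_h = (len(image) - size) // stride + 1
    out_w = (len(image[0]) - size) // stride + 1
    output = []
    for y in range(out_h):
        row = []
        for x in range(out_w):
            window = [
                image[y * stride + dy][x * stride + dx]
                for dy in range(size)
                for dx in range(size)
            ]
            row.append(max(window))
        output.append(row)
    return output


image = [
    [4, 3, 6, 5],
    [3, 0, 1, 0],
    [8, 6, 1, 9],
    [1, 3, 7, 0],
]
pooled = max_pool_2d(image, size=2, stride=2)
```

With size 2 and stride 2 the windows do not overlap, so each dimension of the image is halved.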

The arguments of max_pool(relu_1, border = 'ignore', padding = [(0, 0), (0, 0), (0, 0), (0, 0)], size = [1, 1, 2, 2], stride = [1, 1, 2, 2]):

padding = [(0, 0), (0, 0), (0, 0), (0, 0)] - [Figure: padding=0 compared with padding=1; this call uses padding=0]
size = [1, 1, 2, 2] - [Figure: size=2, size=3, and size=4 compared; this call uses size=2]
stride = [1, 1, 2, 2] - [Figure: stride=1 compared with stride=2; this call uses stride=2]

Max pooling shrinks the image from 224x224 to 112x112.

VGG16: the part covered just now is the convolution, ReLU, and maxpool layers.

Between the last maxpool and the first linear there is a mismatch. The maxpool side says: "I output 10 images of 7 × 7 × 512." The linear side says: "I want 10 vectors of 25088 elements." The data is the same, but its shape has to be converted.
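The numbers 7 × 7 × 512 and 25088 follow from the layer settings seen so far. The sketch below tracks the tensor shape through the five blocks, assuming (as in graph.nnef) that every convolution keeps the image size and every max pooling halves it; the channel counts are read off the first dimension of the filter shapes in the variable declarations, and the function name is hypothetical.

```python
def vgg16_feature_shape(batch, height, width):
    """Shape of the tensor that leaves the last max pooling layer."""
    # Output channels of the five convolution blocks (conv1_x .. conv5_x).
    channels_per_block = [64, 128, 256, 512, 512]
    channels = None
    for channels in channels_per_block:
        # 3x3 convolutions with padding 1 and stride 1 keep height and width.
        # The 2x2 max pooling with stride 2 at the end of the block halves them.
        height //= 2
        width //= 2
    return [batch, channels, height, width]


shape = vgg16_feature_shape(10, 224, 224)
elements_per_image = shape[1] * shape[2] * shape[3]
```

The element count per image is exactly the second dimension of the fc6 weight matrix, shape = [4096, 25088].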

graph.nnef

version 1.0;
graph VGG_ILSVRC_16_layers(data) -> (prob)
{
    ...
    relu_12 = relu(conv_12);
    max_pool_4 = max_pool(relu_12, border = 'ignore', padding = [(0, 0), (0, 0), (0, 0), (0, 0)], size = [1, 1, 2, 2], stride = [1, 1, 2, 2]);
    reshape = reshape(max_pool_4, shape = [10, -1]);
    linear = linear(reshape, variable_26, variable_27);
    relu_13 = relu(linear);
    ...
}

reshape - changes how the data is interpreted

reshape(max_pool_4, shape = [10, -1]);

Split the contents of this buffer into 10 equal parts and treat them as 10 vectors.
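The -1 in shape = [10, -1] means "whatever size makes the element count match". A small sketch of that rule follows; the function name is hypothetical and the behaviour follows the common reshape convention, which I assume NNEF shares for this case.

```python
def resolve_shape(element_count, shape):
    """Replace a single -1 in `shape` by the extent that preserves the element count."""
    known = 1
    unknown_index = None
    for index, extent in enumerate(shape):
        if extent == -1:
            if unknown_index is not None:
                raise ValueError("only one extent may be -1")
            unknown_index = index
        else:
            known *= extent
    if unknown_index is None:
        if known != element_count:
            raise ValueError("shape does not match the element count")
        return list(shape)
    if known == 0 or element_count % known != 0:
        raise ValueError("element count is not divisible by the known extents")
    resolved = list(shape)
    resolved[unknown_index] = element_count // known
    return resolved


# max_pool_4 holds 10 images of 512 channels and 7 x 7 pixels.
flattened_shape = resolve_shape(10 * 512 * 7 * 7, [10, -1])
```

No values are moved or changed; only the way the same buffer is indexed differs.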

graph.nnef as a whole, first part: read values from files

version 1.0;
graph VGG_ILSVRC_16_layers(data) -> (prob)
{
    variable_15 = variable<scalar>(label = 'conv4_1_blob2', shape = [1, 512]);
    variable_14 = variable<scalar>(label = 'conv4_1_blob1', shape = [512, 256, 3, 3]);
    variable_13 = variable<scalar>(label = 'conv3_3_blob2', shape = [1, 256]);
    variable_31 = variable<scalar>(label = 'fc8_blob2', shape = [1, 1000]);
    variable_30 = variable<scalar>(label = 'fc8_blob1', shape = [1000, 4096]);
    variable_29 = variable<scalar>(label = 'fc7_blob2', shape = [1, 4096]);
    variable_28 = variable<scalar>(label = 'fc7_blob1', shape = [4096, 4096]);
    variable_27 = variable<scalar>(label = 'fc6_blob2', shape = [1, 4096]);
    variable_26 = variable<scalar>(label = 'fc6_blob1', shape = [4096, 25088]);
    variable_25 = variable<scalar>(label = 'conv5_3_blob2', shape = [1, 512]);
    variable_24 = variable<scalar>(label = 'conv5_3_blob1', shape = [512, 512, 3, 3]);
    variable_23 = variable<scalar>(label = 'conv5_2_blob2', shape = [1, 512]);
    variable_22 = variable<scalar>(label = 'conv5_2_blob1', shape = [512, 512, 3, 3]);
    variable_21 = variable<scalar>(label = 'conv5_1_blob2', shape = [1, 512]);
    variable_20 = variable<scalar>(label = 'conv5_1_blob1', shape = [512, 512, 3, 3]);
    variable_19 = variable<scalar>(label = 'conv4_3_blob2', shape = [1, 512]);
    variable_18 = variable<scalar>(label = 'conv4_3_blob1', shape = [512, 512, 3, 3]);
    variable_17 = variable<scalar>(label = 'conv4_2_blob2', shape = [1, 512]);
    variable_16 = variable<scalar>(label = 'conv4_2_blob1', shape = [512, 512, 3, 3]);
    variable_12 = variable<scalar>(label = 'conv3_3_blob1', shape = [256, 256, 3, 3]);
    variable_10 = variable<scalar>(label = 'conv3_2_blob1', shape = [256, 256, 3, 3]);
    variable_9 = variable<scalar>(label = 'conv3_1_blob2', shape = [1, 256]);
    variable_8 = variable<scalar>(label = 'conv3_1_blob1', shape = [256, 128, 3, 3]);
    variable_6 = variable<scalar>(label = 'conv2_2_blob1', shape = [128, 128, 3, 3]);
    variable_11 = variable<scalar>(label = 'conv3_2_blob2', shape = [1, 256]);
    variable_5 = variable<scalar>(label = 'conv2_1_blob2', shape = [1, 128]);
    variable_4 = variable<scalar>(label = 'conv2_1_blob1', shape = [128, 64, 3, 3]);
    variable_2 = variable<scalar>(label = 'conv1_2_blob1', shape = [64, 64, 3, 3]);
    variable_1 = variable<scalar>(label = 'conv1_1_blob2', shape = [1, 64]);
    variable_7 = variable<scalar>(label = 'conv2_2_blob2', shape = [1, 128]);
    variable = variable<scalar>(label = 'conv1_1_blob1', shape = [64, 3, 3, 3]);
    variable_3 = variable<scalar>(label = 'conv1_2_blob2', shape = [1, 64]);
    ...
}

graph.nnef as a whole, second part: the operations, which appear in the same order as the VGG16 layer list (input, convolution, ReLU, ..., maxpool, linear, ReLU, linear, ReLU, linear, softmax)

{
    ...
    data = external<scalar>(shape = [10, 3, 224, 224]);
    conv = conv(data, variable, variable_1, border = 'constant', dilation = [1, 1], groups = 1, padding = [(1, 1), (1, 1)], stride = [1, 1]);
    relu = relu(conv);
    conv_1 = conv(relu, variable_2, variable_3, border = 'constant', dilation = [1, 1], groups = 1, padding = [(1, 1), (1, 1)], stride = [1, 1]);
    relu_1 = relu(conv_1);
    max_pool = max_pool(relu_1, border = 'ignore', padding = [(0, 0), (0, 0), (0, 0), (0, 0)], size = [1, 1, 2, 2], stride = [1, 1, 2, 2]);
    conv_2 = conv(max_pool, variable_4, variable_5, border = 'constant', dilation = [1, 1], groups = 1, padding = [(1, 1), (1, 1)], stride = [1, 1]);
    relu_2 = relu(conv_2);
    conv_3 = conv(relu_2, variable_6, variable_7, border = 'constant', dilation = [1, 1], groups = 1, padding = [(1, 1), (1, 1)], stride = [1, 1]);
    relu_3 = relu(conv_3);
    max_pool_1 = max_pool(relu_3, border = 'ignore', padding = [(0, 0), (0, 0), (0, 0), (0, 0)], size = [1, 1, 2, 2], stride = [1, 1, 2, 2]);
    conv_4 = conv(max_pool_1, variable_8, variable_9, border = 'constant', dilation = [1, 1], groups = 1, padding = [(1, 1), (1, 1)], stride = [1, 1]);
    relu_4 = relu(conv_4);
    conv_5 = conv(relu_4, variable_10, variable_11, border = 'constant', dilation = [1, 1], groups = 1, padding = [(1, 1), (1, 1)], stride = [1, 1]);
    relu_5 = relu(conv_5);
    conv_6 = conv(relu_5, variable_12, variable_13, border = 'constant', dilation = [1, 1], groups = 1, padding = [(1, 1), (1, 1)], stride = [1, 1]);
    relu_6 = relu(conv_6);
    max_pool_2 = max_pool(relu_6, border = 'ignore', padding = [(0, 0), (0, 0), (0, 0), (0, 0)], size = [1, 1, 2, 2], stride = [1, 1, 2, 2]);
    conv_7 = conv(max_pool_2, variable_14, variable_15, border = 'constant', dilation = [1, 1], groups = 1, padding = [(1, 1), (1, 1)], stride = [1, 1]);
    relu_7 = relu(conv_7);
    conv_8 = conv(relu_7, variable_16, variable_17, border = 'constant', dilation = [1, 1], groups = 1, padding = [(1, 1), (1, 1)], stride = [1, 1]);
    relu_8 = relu(conv_8);
    conv_9 = conv(relu_8, variable_18, variable_19, border = 'constant', dilation = [1, 1], groups = 1, padding = [(1, 1), (1, 1)], stride = [1, 1]);
    relu_9 = relu(conv_9);
    max_pool_3 = max_pool(relu_9, border = 'ignore', padding = [(0, 0), (0, 0), (0, 0), (0, 0)], size = [1, 1, 2, 2], stride = [1, 1, 2, 2]);
    conv_10 = conv(max_pool_3, variable_20, variable_21, border = 'constant', dilation = [1, 1], groups = 1, padding = [(1, 1), (1, 1)], stride = [1, 1]);
    relu_10 = relu(conv_10);
    conv_11 = conv(relu_10, variable_22, variable_23, border = 'constant', dilation = [1, 1], groups = 1, padding = [(1, 1), (1, 1)], stride = [1, 1]);
    relu_11 = relu(conv_11);
    conv_12 = conv(relu_11, variable_24, variable_25, border = 'constant', dilation = [1, 1], groups = 1, padding = [(1, 1), (1, 1)], stride = [1, 1]);
    relu_12 = relu(conv_12);
    max_pool_4 = max_pool(relu_12, border = 'ignore', padding = [(0, 0), (0, 0), (0, 0), (0, 0)], size = [1, 1, 2, 2], stride = [1, 1, 2, 2]);
    reshape = reshape(max_pool_4, shape = [10, -1]);
    linear = linear(reshape, variable_26, variable_27);
    relu_13 = relu(linear);
    linear_1 = linear(relu_13, variable_28, variable_29);
    relu_14 = relu(linear_1);
    linear_2 = linear(relu_14, variable_30, variable_31);
    prob = softmax(linear_2, axes = [1]);
}

The next slide shows the same graph scrolled down, from the first max_pool to the closing brace, with the statements labelled by kind: convolution, ReLU, maxpool, linear, ReLU, linear, softmax.

constant - creates a buffer in which every element has the given value

constant<scalar>( shape = [ 3, 4 ], value = [42.0] );

[ 42.0 42.0 42.0 42.0
  42.0 42.0 42.0 42.0
  42.0 42.0 42.0 42.0 ]
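What constant describes can be sketched on the CPU as follows. This is a minimal illustration, not code from NNEF-Tools or gct: make_constant is a hypothetical helper, and it assumes a single value is broadcast to every element of the shape.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper (not part of NNEF-Tools or gct): builds the flat buffer
// that constant<scalar>(shape, value) describes.
std::vector<float> make_constant(const std::vector<std::size_t>& shape,
                                 const std::vector<float>& value) {
  std::size_t count = 1;
  for (std::size_t extent : shape) count *= extent;
  // Assumption: a single value is broadcast to every element.
  if (value.size() == 1) return std::vector<float>(count, value[0]);
  // Otherwise the values have to cover the whole tensor.
  if (value.size() != count) throw std::invalid_argument("value count does not match shape");
  return value;
}
```

Calling make_constant({3, 4}, {42.0f}) gives the 12-element buffer shown on the slide.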

deconv - deconvolution (transposed convolution)

deconv( data, variable, variable_1, border = 'constant', dilation = [2, 2], groups = 1, padding = [(0, 0), (0, 0)], stride = [1, 1] );

box - returns the sum of the values inside the filter window

box( data, size = [ 1, 3, 3 ], border = 'constant', dilation = [2, 2], groups = 1, padding = [(0, 0), (0, 0)], stride = [1, 1] );
debox( data, size = [ 1, 3, 3 ], border = 'constant', dilation = [2, 2], groups = 1, padding = [(0, 0), (0, 0)], stride = [1, 1] );

(Figure: grids filled with 1s, with the results 9 and 4.)
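A box filter is easy to check on the CPU. The sketch below is a hypothetical reference, not gct code, and it simplifies the call on the slide: it assumes a single-channel image, no dilation, a same-size output, and a border that contributes 0.

```cpp
#include <vector>

// Hypothetical CPU reference: sums a (2*radius+1) x (2*radius+1) window around
// every element of a row-major width x height image. Positions outside the
// image contribute 0 (assumed border handling).
std::vector<float> box_sum(const std::vector<float>& in, int width, int height, int radius) {
  std::vector<float> out(in.size(), 0.0f);
  for (int y = 0; y < height; ++y) {
    for (int x = 0; x < width; ++x) {
      float sum = 0.0f;
      for (int dy = -radius; dy <= radius; ++dy) {
        for (int dx = -radius; dx <= radius; ++dx) {
          const int sx = x + dx;
          const int sy = y + dy;
          if (sx >= 0 && sx < width && sy >= 0 && sy < height) {
            sum += in[sy * width + sx];
          }
        }
      }
      out[y * width + x] = sum;
    }
  }
  return out;
}
```

On a grid of 1s with radius 1, an interior element sums to 9 and a corner element to 4, because only 4 of the 9 window positions fall inside the image.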

argmax_pool - returns the index of the element that holds the maximum value
sample - uses indices to pick values out of another tensor

(Figure: example tensors of values and indices.)

reduce - aggregates all the values along a given axis into a single value

sum_reduce     [ 9 2 4 7 ] -> 22
max_reduce     [ 9 2 4 7 ] -> 9
min_reduce     [ 9 2 4 7 ] -> 2
argmax_reduce  [ 9 2 4 7 ] -> 0
argmin_reduce  [ 9 2 4 7 ] -> 1
all_reduce     [ 1 0 1 1 ] -> 0
any_reduce     [ 1 0 1 1 ] -> 1

split - splits a tensor along a given axis

split concat pad

gather - picks one of the elements along a given axis

(Figure: values 3 9 5 4 and indices 1 0 3 2.)

cast - changes the element type of a tensor

[ 9.0 2.0 7.0 7.0 3.0 6.0 0.0 8.0 8.0 5.0 7.0 6.0 2.0 0.0 7.0 9.0 ] -> [ 9 2 7 7 3 6 0 8 8 5 7 6 2 0 7 9 ]

matmul - matrix product

result = matmul( input1, input2 );

(Figure: result = input1 × input2, each drawn as a 4 × 4 matrix with elements w00 to w33.)

transpose - transposition

result = transpose( input, [ 1, 0 ] );

(Figure: result = input transposed, each drawn as a 4 × 4 matrix with elements w00 to w33.)
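For reference, both operations can be written in a few lines on the CPU. This is only a sketch for checking results; the function names and the row-major layout are assumptions, not the NNEF-Tools or gct API.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical CPU reference: result (rows x cols) = a (rows x inner) * b (inner x cols),
// all matrices stored row-major.
std::vector<float> matmul(const std::vector<float>& a, const std::vector<float>& b,
                          std::size_t rows, std::size_t inner, std::size_t cols) {
  std::vector<float> result(rows * cols, 0.0f);
  for (std::size_t i = 0; i < rows; ++i)
    for (std::size_t k = 0; k < inner; ++k)
      for (std::size_t j = 0; j < cols; ++j)
        result[i * cols + j] += a[i * inner + k] * b[k * cols + j];
  return result;
}

// Swaps the two axes of a row-major rows x cols matrix.
std::vector<float> transpose(const std::vector<float>& in, std::size_t rows, std::size_t cols) {
  std::vector<float> out(in.size());
  for (std::size_t i = 0; i < rows; ++i)
    for (std::size_t j = 0; j < cols; ++j)
      out[j * rows + i] = in[i * cols + j];
  return out;
}
```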

Activation functions: sigmoid relu prelu leaky_relu elu gelu silu softmax softplus
Pooling: max_pool_with_index max_pool avg_pool rms_pool
Normalization: local_response_normalization local_mean_normalization local_variance_normalization local_contrast_normalization l1_normalization l2_normalization batch_normalization
Quantization: min_max_linear_quantize zero_point_linear_quantize logarithmic_quantize

Region-of-interest pooling: avg_roi_pool max_roi_pool roi_resample avg_roi_align max_roi_align
Resizing: nearest_downsample area_downsample nearest_upsample multilinear_upsample
Unary operators: copy neg rcp exp log sin cos tan sinh cosh tanh asin acos atan asinh acosh atanh abs sign not floor ceil round
Binary operators: add sub mul div pow lt gt le ge eq ne and or
Ternary operator: select
Other functions: sqr sqrt rsqr rsqrt log2 min max clamp

https://github.com/Fadis/gct
GPU Computing Toolkit: uses Vulkan so that the operations a GPU application performs often can be written concisely.

Goal: hand it an NNEF file and have the neural network evaluated on the GPU in one quick step.

NNEF operators that have to be implemented in order to run VGG:

external - only allocates GPU memory
variable - only allocates GPU memory and transfers the data into it
conv
relu
max_pool
reshape - only changes the size of the tensor
linear
softmax

conv

#version 450
#extension GL_ARB_separate_shader_objects : enable
#extension GL_ARB_shading_language_420pack : enable
#extension GL_KHR_shader_subgroup_basic : enable
#extension GL_KHR_shader_subgroup_arithmetic : enable
#extension GL_KHR_memory_scope_semantics : enable

layout(local_size_x = 32, local_size_y = 32, local_size_z = 1 ) in;

layout(std430, binding = 0) buffer input_vector { float input_data[]; };   // input
layout(std430, binding = 1) buffer weight { float weight_data[]; };        // filter
layout(std430, binding = 2) buffer output_vector { float output_data[]; }; // output
layout(std430, binding = 3) buffer bias { float bias_data[]; };            // bias

// filter size
layout(constant_id = 1) const int filter_size_x = 0;
layout(constant_id = 2) const int filter_size_y = 0;
layout(constant_id = 3) const int filter_size_z = 0;
// padding
layout(constant_id = 4) const int lpadding = 0;
layout(constant_id = 5) const int rpadding = 0;
layout(constant_id = 6) const int tpadding = 0;
layout(constant_id = 7) const int bpadding = 0;
// stride
layout(constant_id = 8) const int stride_x = 0;
layout(constant_id = 9) const int stride_y = 0;
// dilation
layout(constant_id = 10) const int dilation_x = 0;
layout(constant_id = 11) const int dilation_y = 0;
// input size
layout(constant_id = 12) const int input_dim_x = 0;
layout(constant_id = 13) const int input_dim_y = 0;
layout(constant_id = 14) const int input_dim_z = 0;
// output size
layout(constant_id = 15) const int output_dim_x = 0;
layout(constant_id = 16) const int output_dim_y = 0;
layout(constant_id = 17) const int output_dim_z = 0;
// value used outside the input range
layout(constant_id = 18) const float border_value = 0.0;
// how the bias is applied
layout(constant_id = 19) const int bias_mode = 0;
layout(constant_id = 20) const float bias_value = 0;

int get_filter_length() {
  return filter_size_x * filter_size_y * filter_size_z;
}

One thread is responsible for one element of the output. Several threads use the same filter, so the filter is loaded into shared memory.

shared float[filter_size_x*filter_size_y*filter_size_z] filter_cache;

void load_filter() {
  const uint local_id =
    gl_LocalInvocationID.x +
    gl_LocalInvocationID.y * gl_WorkGroupSize.x +
    gl_LocalInvocationID.z * gl_WorkGroupSize.y * gl_WorkGroupSize.x;
  const uint local_size = gl_WorkGroupSize.x * gl_WorkGroupSize.y * gl_WorkGroupSize.z;
  const uint filter_size = uint( get_filter_length() );
  const uint cycles = filter_size / local_size + ( bool( filter_size % local_size ) ? 1 : 0 );
  for( uint c = 0u; c != cycles; c++ ) {
    uint i = c * local_size + local_id;
    const int filter_offset = get_filter_offset( int( i ) );
    if( i < filter_size ) {
      filter_cache[ i ] = weight_data[ filter_offset ];
    }
  }
}

float get_input( int i ) {
  const int input_offset = get_input_offset( i );
  return ( input_offset < 0 ) ?
(the excerpt on the slide ends here)

The input values inside the filter window are multiplied by the filter and summed, the bias is added, and the result is written to the output.

  const int output_channel = int( gl_GlobalInvocationID.z % output_dim_z );
  return ( bias_mode == 0 ) ? bias_value : bias_data[ output_channel ];
}

void set_output( float v ) {
  const int output_offset = get_output_offset();
  if( output_offset >= 0 ) {
    output_data[ output_offset ] = v;
  }
}

void main() {
  load_filter();
  barrier();
  const int filter_length = get_filter_length();
  float sum = 0.0;
  for( int i = 0; i != filter_length; i++ ) {
    sum += get_input( i ) * get_filter( i );
  }
  sum += get_bias();
  set_output( sum );
}
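A CPU reference is handy for checking a shader like this. The sketch below is an assumption-laden simplification, not the gct implementation: conv2d is a hypothetical name, it handles a single channel with symmetric padding and a scalar bias, and it returns border_value for reads outside the input, in the same spirit as get_input() in the shader.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical single-channel CPU reference for a 2D convolution.
// Reads outside the input return border_value.
std::vector<float> conv2d(const std::vector<float>& in, int in_w, int in_h,
                          const std::vector<float>& filter, int f_w, int f_h,
                          int pad, int stride, int dilation, float bias,
                          float border_value = 0.0f) {
  const int out_w = (in_w + 2 * pad - ((f_w - 1) * dilation + 1)) / stride + 1;
  const int out_h = (in_h + 2 * pad - ((f_h - 1) * dilation + 1)) / stride + 1;
  std::vector<float> out(static_cast<std::size_t>(out_w) * static_cast<std::size_t>(out_h));
  for (int oy = 0; oy < out_h; ++oy) {
    for (int ox = 0; ox < out_w; ++ox) {
      float sum = 0.0f;
      for (int fy = 0; fy < f_h; ++fy) {
        for (int fx = 0; fx < f_w; ++fx) {
          const int ix = ox * stride - pad + fx * dilation;
          const int iy = oy * stride - pad + fy * dilation;
          const bool inside = ix >= 0 && ix < in_w && iy >= 0 && iy < in_h;
          const float v = inside ? in[iy * in_w + ix] : border_value;
          sum += v * filter[fy * f_w + fx];
        }
      }
      out[oy * out_w + ox] = sum + bias;  // one output element per (ox, oy)
    }
  }
  return out;
}
```

With a 3 × 3 filter, padding 1, stride 1 and dilation 1, the output has the same size as the input, which is the configuration every conv in the VGG graph uses.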

linear

#extension GL_KHR_shader_subgroup_basic : enable
#extension GL_KHR_shader_subgroup_arithmetic : enable
#extension GL_KHR_memory_scope_semantics : enable

layout(local_size_x = 1024, local_size_y = 1, local_size_z = 1 ) in;

layout(std430, binding = 0) buffer input_vector { float input_data[]; };   // input
layout(std430, binding = 1) buffer weight { float weight_data[]; };        // filter
layout(std430, binding = 2) buffer output_vector { float output_data[]; }; // output
layout(std430, binding = 3) buffer bias { float bias_data[]; };            // bias

// input size
layout(constant_id = 1) const uint input_length = 32;
// how the bias is applied
layout(constant_id = 2) const uint bias_mode = 0;
layout(constant_id = 3) const float bias_value = 0.0;

// shared memory used for the horizontal add
shared float[32] temp;

float get_bias() {
  const uint output_index = gl_GlobalInvocationID.y;
  return ( bias_mode == 0 ) ? bias_value : bias_data[ output_index ];
}

// one thread per matrix element
void main() {
  const uint x = gl_GlobalInvocationID.x;
  const uint y = gl_GlobalInvocationID.y;
  const uint batch = gl_GlobalInvocationID.z;
  float sum = float( 0 );
  const uint x_blocks = input_length / gl_WorkGroupSize.x + ( bool( input_length % gl_WorkGroupSize.x ) ? 1 : 0 );
  for( uint x_index = 0; x_index != x_blocks; x_index++ ) {
    const uint x_global = x_index * gl_WorkGroupSize.x + x;
    const uint batch_offset = input_length * batch;
    bool mask = ( x_global < input_length );
    float result = float( 0 );
    result = mask ? ( weight_data[ x_global + y * input_length ] * input_data[ batch_offset + x_global ] ) : float( 0 );
    // the horizontal add instruction computes the dot product of the input and one column of the weights
    result = subgroupAdd( result );
    if( gl_SubgroupInvocationID == 0 ) {
      temp[ gl_SubgroupID ] = result;
    }
    barrier();
    mask = ( x < gl_NumSubgroups );
    result = subgroupAdd( mask ? temp[ x ] : float( 0 ) );
    sum += result;
  }
  // the thread in charge of the first row writes the result
  if( x == 0 ) {
    const uint output_length = gl_WorkGroupSize.y * gl_NumWorkGroups.y;
    const uint batch_offset = output_length * batch;
    output_data[ batch_offset + y ] = sum + get_bias();
  }
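The same computation for a single batch item can be sketched on the CPU to check the shader's output. The function name is hypothetical; the weight indexing x + y * input_length follows the shader, and a per-output bias array is assumed.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical CPU reference for one batch item of the linear layer.
// Weight layout follows the shader: weight[x + y * input_length].
std::vector<float> linear_reference(const std::vector<float>& input,
                                    const std::vector<float>& weight,
                                    const std::vector<float>& bias) {
  const std::size_t input_length = input.size();
  const std::size_t output_length = bias.size();
  std::vector<float> output(output_length, 0.0f);
  for (std::size_t y = 0; y < output_length; ++y) {
    float sum = 0.0f;  // dot product of the input with one set of weights
    for (std::size_t x = 0; x < input_length; ++x) {
      sum += weight[x + y * input_length] * input[x];
    }
    output[y] = sum + bias[y];
  }
  return output;
}
```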

max_pool

#extension GL_KHR_shader_subgroup_basic : enable
#extension GL_KHR_shader_subgroup_arithmetic : enable
#extension GL_KHR_memory_scope_semantics : enable

layout(local_size_x = 32, local_size_y = 32, local_size_z = 1 ) in;

layout(std430, binding = 0) buffer input_vector { float input_data[]; };   // input
layout(std430, binding = 1) buffer output_vector { float output_data[]; }; // output

// filter size
layout(constant_id = 1) const int filter_size_x = 0;
layout(constant_id = 2) const int filter_size_y = 0;
layout(constant_id = 3) const int filter_size_z = 0;
layout(constant_id = 4) const int filter_size_w = 0;
// padding
layout(constant_id = 5) const int lpadding = 0;
layout(constant_id = 6) const int rpadding = 0;
layout(constant_id = 7) const int tpadding = 0;
layout(constant_id = 8) const int bpadding = 0;
// stride
layout(constant_id = 9) const int stride_x = 0;
layout(constant_id = 10) const int stride_y = 0;
layout(constant_id = 11) const int stride_z = 0;
layout(constant_id = 12) const int stride_w = 0;
// input size
layout(constant_id = 13) const int input_dim_x = 0;
layout(constant_id = 14) const int input_dim_y = 0;
layout(constant_id = 15) const int input_dim_z = 0;
layout(constant_id = 16) const int input_dim_w = 0;
// output size
layout(constant_id = 17) const int output_dim_x = 0;
layout(constant_id = 18) const int output_dim_y = 0;
layout(constant_id = 19) const int output_dim_z = 0;
layout(constant_id = 20) const int output_dim_w = 0;
// value used outside the input range
layout(constant_id = 21) const float border_value = 0.0;

int get_filter_length() {
  return filter_size_x * filter_size_y * filter_size_z * filter_size_w;
}

int get_input_offset( int i ) {
(the excerpt on the slide ends here)

The largest value inside the filter window is written to the output.

}

float get_input( int i ) {
  const int input_offset = get_input_offset( i );
  return ( input_offset < 0 ) ? border_value : input_data[ input_offset ];
}

void set_output( float v ) {
  const int output_offset = get_output_offset();
  if( output_offset >= 0 ) {
    output_data[ output_offset ] = v;
  }
}

void main() {
  const int filter_length = get_filter_length();
  float v = -10000.0;
  for( int i = 0; i != filter_length; i++ ) {
    const int input_offset = get_input_offset( i );
    if( input_offset != -1 ) {
      v = max( input_data[ input_offset ], v );
    }
  }
  set_output( v );
}
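A small CPU version is enough to check the pooling result. This sketch is hypothetical and simplified to one channel without padding; unlike the shader it starts from the lowest representable float rather than -10000.0, which is a deliberate variation.

```cpp
#include <algorithm>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical single-channel CPU reference for max pooling without padding.
std::vector<float> max_pool2d(const std::vector<float>& in, int in_w, int in_h,
                              int size, int stride) {
  const int out_w = (in_w - size) / stride + 1;
  const int out_h = (in_h - size) / stride + 1;
  std::vector<float> out(static_cast<std::size_t>(out_w) * static_cast<std::size_t>(out_h));
  for (int oy = 0; oy < out_h; ++oy) {
    for (int ox = 0; ox < out_w; ++ox) {
      float v = std::numeric_limits<float>::lowest();
      for (int fy = 0; fy < size; ++fy) {
        for (int fx = 0; fx < size; ++fx) {
          v = std::max(v, in[(oy * stride + fy) * in_w + (ox * stride + fx)]);
        }
      }
      out[oy * out_w + ox] = v;
    }
  }
  return out;
}
```

With size 2 and stride 2, as in the VGG graph, each output element covers a distinct 2 × 2 block and the width and height are halved.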

relu: output = max( input, 0 )

#extension GL_KHR_shader_subgroup_basic : enable
#extension GL_KHR_shader_subgroup_arithmetic : enable
#extension GL_KHR_memory_scope_semantics : enable

layout(local_size_x = 1024, local_size_y = 1, local_size_z = 1 ) in;

layout(std430, binding = 0) buffer input_vector { float input_data[]; };   // input
layout(std430, binding = 1) buffer output_vector { float output_data[]; }; // output

layout(constant_id = 1) const uint input_length = 32;

void main() {
  const uint index =
    gl_GlobalInvocationID.x +
    gl_NumWorkGroups.x * gl_WorkGroupSize.x * gl_GlobalInvocationID.y +
    gl_NumWorkGroups.x * gl_WorkGroupSize.x * gl_NumWorkGroups.y * gl_WorkGroupSize.y * gl_GlobalInvocationID.z;
  if( index >= input_length ) return;
  float v = input_data[ index ];
  output_data[ index ] = ( v >= 0.0 ) ? v : 0.0;
}

softmax

softmax needs the sum of exp over all input values, so every value is handled inside the 1024 threads that shared memory can reach (a single workgroup).

#extension GL_KHR_shader_subgroup_basic : enable
#extension GL_KHR_shader_subgroup_arithmetic : enable
#extension GL_KHR_memory_scope_semantics : enable

layout(local_size_x = 1024, local_size_y = 1, local_size_z = 1 ) in;

layout(std430, binding = 0) buffer input_vector { float input_data[]; };   // input
layout(std430, binding = 1) buffer output_vector { float output_data[]; }; // output

layout(constant_id = 1) const uint input_length = 32;

shared float[32] temp;

void main() {
  const uint offset = gl_LocalInvocationID.x;
  const uint batch = gl_GlobalInvocationID.z;

  // find the maximum of the input
  float max = 0.0;
  for( uint i = offset; i < input_length; i += 1024 ) {
    float v = input_data[ i + input_length * batch ];
    if( v > max ) {
      max = v;
    }
  }
  const float smax = subgroupMax( max );
  if( gl_SubgroupInvocationID == 0 ) {
    temp[ gl_SubgroupID ] = smax;
  }
  barrier();
  const float gmax = subgroupMax( temp[ gl_SubgroupInvocationID ] );

  // compute the sum of exp of the input
  float sum = 0.0;
  for( uint i = offset; i < input_length; i += 1024 ) {
    sum += exp( input_data[ i + input_length * batch ] - gmax );
  }
  const float ssum = subgroupAdd( sum );
  if( gl_SubgroupInvocationID == 0 ) {
    temp[ gl_SubgroupID ] = ssum;
  }
  barrier();
  const float gsum = subgroupAdd( temp[ gl_SubgroupInvocationID ] );

  // divide exp of the input by the sum of exp of the input
  for( uint i = offset; i < input_length; i += 1024 ) {
    output_data[ i + input_length * batch ] = exp( input_data[ i + input_length * batch ] - gmax ) / gsum;
  }
}
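The three steps (maximum, sum of exp, division) look like this on the CPU. It is a sketch for one batch item with a hypothetical function name; subtracting the maximum before exp does not change the result mathematically and keeps exp from overflowing.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical CPU reference for softmax over one vector.
std::vector<float> softmax_reference(const std::vector<float>& input) {
  std::vector<float> out(input.size());
  if (input.empty()) return out;
  // Step 1: maximum of the input.
  const float max_value = *std::max_element(input.begin(), input.end());
  // Step 2: sum of exp(input - max).
  float sum = 0.0f;
  for (std::size_t i = 0; i < input.size(); ++i) {
    out[i] = std::exp(input[i] - max_value);
    sum += out[i];
  }
  // Step 3: divide each exp by the sum.
  for (float& v : out) v /= sum;
  return out;
}
```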

https://github.com/KhronosGroup/NNEF-Tools
NNEF-Tools is the official NNEF library. It includes an NNEF parser, provided as both a Python implementation and a C++ implementation.

gct/src/gct/dnn/graph.cpp: read the NNEF with the NNEF-Tools parser

graph::graph(
  const std::shared_ptr< device_t > &device,
  const std::shared_ptr< allocator_t > &allocator,
  const std::shared_ptr< descriptor_pool_t > &descriptor_pool,
  const std::shared_ptr< pipeline_cache_t > &pipeline_cache,
  const std::filesystem::path &dir,
  const std::filesystem::path &shader_dir,
  command_buffer_recorder_t &rec
) {
  nnef::Graph parsed;
  std::string error;
  if( !nnef::load_graph( dir.string(), parsed, error, "" ) ) {
    std::cerr << error << std::endl;
    throw -1;
  }
  if( !nnef::infer_shapes( parsed, error ) ) {
    std::cerr << error << std::endl;
    throw -1;
  }
  if( !nnef::allocate_buffers( parsed, error ) ) {
    std::cerr << error << std::endl;
    throw -1;
  }

gct/src/gct/dnn/graph.cpp: send the contents of the required files to GPU memory

  for( const auto &o: parsed.operations ) {
    if( o.name == "variable" ) {
      const std::string name = get_output_name( o );
      const auto label = std::find_if(
        o.attribs.begin(),
        o.attribs.end(),
        []( const auto &v ) { return v.first == "label"; }
      );
      if( label == o.attribs.end() ) {
        throw -1;
      }
      const auto data_name = label->second.string();
      const auto data_filename = dir / ( data_name + ".dat" );
      auto nnef_data = rec.load_nnef_data(
        allocator,
        std::filesystem::absolute( data_filename ),
        vk::BufferUsageFlagBits::eStorageBuffer|
        vk::BufferUsageFlagBits::eTransferDst
      );
      bufs.insert( std::make_pair( name, nnef_data ) );
    }

gct/src/gct/dnn/graph.cpp: create the pipeline for each layer and bind the buffers to its descriptor set

  for( const auto &o: parsed.operations ) {
    if( o.name == "conv" ) {
      const std::string name = get_output_name( o );
      auto op = std::make_shared< operation::convolution >(
        allocator, descriptor_pool, pipeline_cache, o, shaders, bufs
      );
      bufs.insert( std::make_pair( name, op->get_output() ) );
      ops.push_back( op );
    }
    else if( o.name == "linear" ) {
      const std::string name = get_output_name( o );
      auto op = std::make_shared< operation::linear >(
        allocator, descriptor_pool, pipeline_cache, o, shaders, bufs
(the excerpt on the slide ends here)

gct/src/gct/dnn/convolution.cpp: record the required memory barrier and the execution of the compute pipeline into the command buffer

void convolution::operator()( command_buffer_recorder_t &rec ) {
  rec.compute_barrier( { input.buffer }, {} );
  rec.bind_descriptor_set( vk::PipelineBindPoint::eCompute, pipeline_layout, descriptor_set );
  rec.bind_pipeline( pipeline );
  rec.dispatch_threads( exec_dim[ 0 ], exec_dim[ 1 ], exec_dim[ 2 ] );
}

gct/src/gct/dnn/load_image.cpp: AlexNet-compatible preprocessing of the input image

The channels are reordered to BGR, and the per-channel mean of the ImageNet training data is subtracted. Many image models, VGG included, use this preprocessing.

  std::vector< std::uint8_t > temp( dest.buffer->get_props().get_basic().size, 0u );
  std::unordered_map< std::string, int > channel_order{
    { "R", 2 },
    { "G", 1 },
    { "B", 0 }
  };
  for( int c = 0; c != spec.nchannels; ++c ) {
    const auto order = channel_order.find( spec.channelnames[ c ] );
    if( order != channel_order.end() ) {
      file->read_image(
        c, c + 1u, type,
        reinterpret_cast< std::uint8_t* >(
          std::next( temp.data(), spec.width * spec.height * dest.type.depth/8u * order->second )
        )
      );
    }
  }
  constexpr std::array< float, 3u > mean{ 123.68f, 116.779f, 103.939f };
  for( int c = 0; c != spec.nchannels; ++c ) {
    for( unsigned int y = 0; y != spec.height; ++y ) {
      for( unsigned int x = 0; x != spec.width; ++x ) {
        const auto index = x + y * spec.width + c * spec.width * spec.height;
        reinterpret_cast< float* >( temp.data() )[ index ] =
          reinterpret_cast< float* >( temp.data() )[ index ] * 255.0f - mean[ c ];
      }
    }
  }

The evaluation is meant to run on a single image, so the batch size is changed from 10 to 1.

--- /home/fadis/vgg16.orig/graph.nnef 2019-05-21 21:24:49.000000000 +0900
+++ /home/fadis/vgg16/graph.nnef 2023-07-17 20:32:06.938232809 +0900
@@ -34,7 +34,7 @@
     variable_7 = variable<scalar>(label = 'conv2_2_blob2', shape = [1, 128]);
     variable = variable<scalar>(label = 'conv1_1_blob1', shape = [64, 3, 3, 3]);
     variable_3 = variable<scalar>(label = 'conv1_2_blob2', shape = [1, 64]);
-    data = external<scalar>(shape = [10, 3, 224, 224]);
+    data = external<scalar>(shape = [1, 3, 224, 224]);
     conv = conv(data, variable, variable_1, border = 'constant', dilation = [1, 1], groups = 1, padding = [(1, 1), (1, 1)], stride = [1, 1]);
     relu = relu(conv);
     conv_1 = conv(relu, variable_2, variable_3, border = 'constant', dilation = [1, 1], groups = 1, padding = [(1, 1), (1, 1)], stride = [1, 1]);
@@ -66,7 +66,7 @@
     conv_12 = conv(relu_11, variable_24, variable_25, border = 'constant', dilation = [1, 1], groups = 1, padding = [(1, 1), (1, 1)], stride = [1, 1]);
     relu_12 = relu(conv_12);
     max_pool_4 = max_pool(relu_12, border = 'ignore', padding = [(0, 0), (0, 0), (0, 0), (0, 0)], size = [1, 1, 2, 2], stride = [1, 1, 2, 2]);
-    reshape = reshape(max_pool_4, shape = [10, -1]);
+    reshape = reshape(max_pool_4, shape = [1, -1]);
     linear = linear(reshape, variable_26, variable_27);
     relu_13 = relu(linear);
     linear_1 = linear(relu_13, variable_28, variable_29);
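The -1 in the reshape is the part that gets inferred from the remaining elements. A rough sketch of that arithmetic follows. Both function names are hypothetical, and the 512 channels after the last block come from the well-known VGG-16 layout, not from the excerpt shown here.

```cpp
// Output extent of a convolution or pooling window along one axis.
int output_extent(int input, int window, int pad_before, int pad_after,
                  int stride, int dilation) {
  const int effective = (window - 1) * dilation + 1;
  return (input + pad_before + pad_after - effective) / stride + 1;
}

// Assumption: the well-known VGG-16 layout, five blocks each ending in a
// 2x2 max_pool, with 512 channels after the last block.
int vgg16_flattened_length(int image_size) {
  int extent = image_size;
  for (int block = 0; block < 5; ++block) {
    extent = output_extent(extent, 3, 1, 1, 1, 1);  // 3x3 conv, padding 1: size unchanged
    extent = output_extent(extent, 2, 0, 0, 2, 1);  // 2x2 max_pool, stride 2: size halved
  }
  return 512 * extent * extent;
}
```

For a 224 × 224 input the extent goes 224, 112, 56, 28, 14, 7, so each image flattens to 512 × 7 × 7 = 25088 values. With the batch size set to 1, that is the whole second dimension of the reshaped tensor.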

https://github.com/KhronosGroup/NNEF-Tools/tree/main/models#nnef-model-zoo
This model has been trained to take an image as input and return an ID that says what the image shows.

https://www.kaggle.com/c/imagenet-object-localization-challenge/data?select=LOC_synset_mapping.txt
The table that says which ID stands for what is fetched from the ImageNet distribution site.

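Build procedure for carbonara

Put the egg yolks, grated cheese and black pepper in a bowl and mix until the egg is uniform.
Put water and salt in a pot, bring it to the boil, and boil the pasta for the time written on the packet.
Add bacon lightly fried in a frying pan to the bowl.
Drain the pot into a colander, then move the contents of the colander to the bowl.
Toss well until the grated cheese melts in the residual heat.

Result: a finished 224 × 224 pixel image of carbonara.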

$ dnn_eval -m ~/vgg16 -i ~/002.jpg -l ~/LOC_synset_mapping.txt
959 carbonara 0.997535
923 plate 0.00213662
940 spaghetti squash 0.000186059
937 broccoli 2.2913e-05
762 restaurant, eating house, eating place, eatery 1.6662e-05
809 soup bowl 1.57591e-05
935 mashed potato 1.37334e-05
962 meat loaf, meatloaf 1.27348e-05
934 hotdog, hot dog, red hot 1.23476e-05
925 consomme 6.84554e-06

Probability that this object is carbonara: 99.75%

$ dnn_eval -m ~/vgg16 -i ~/001.jpg -l ~/LOC_synset_mapping.txt
951 lemon 0.986053
950 orange 0.0096886
961 dough 0.001014
954 banana 0.00058848
928 ice cream, icecream 0.000531914
953 pineapple, ananas 0.000395049
949 strawberry 0.000154118
952 fig 0.000151808
940 spaghetti squash 0.000140983
948 Granny Smith 0.000132113

Probability that this object is a lemon: 98.61%

$ dnn_eval -m ~/vgg16 -i ~/003.jpg -l ~/LOC_synset_mapping.txt
784 screwdriver 0.937934
845 syringe 0.0574703
696 paintbrush 0.000626332
418 ballpoint, ballpoint pen, ballpen, Biro 0.000506295
840 swab, swob, mop 0.000496733
629 lipstick, lip rouge 0.000402968
749 quill, quill pen 0.000344308
731 plunger, plumber's helper 0.000300347
813 spatula 0.000254442
623 letter opener, paper knife, paperknife 0.000160844

Probability that this object is a screwdriver: 93.79%
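Listings like these come from sorting the softmax output and keeping the best few entries. A minimal sketch of that step, with a hypothetical function name and no claim about how dnn_eval does it:

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical helper: returns the class IDs of the k largest probabilities,
// most probable first.
std::vector<std::size_t> top_k(const std::vector<float>& probs, std::size_t k) {
  std::vector<std::size_t> ids(probs.size());
  std::iota(ids.begin(), ids.end(), std::size_t{0});  // 0, 1, 2, ...
  k = std::min(k, ids.size());
  std::partial_sort(ids.begin(), ids.begin() + static_cast<std::ptrdiff_t>(k), ids.end(),
                    [&probs](std::size_t a, std::size_t b) { return probs[a] > probs[b]; });
  ids.resize(k);
  return ids;
}
```

Printing each returned ID with its label and probability gives one line of the listing.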

Summary

NNEF is a file format for exporting trained neural networks.
The graph definition is written in a text format, so a person can read it directly.
