Learning to Read NNEF

Fadis
July 22, 2023

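This talk explains NNEF, a file format for exporting trained neural networks.
Along the way, it runs the VGG16 model that the official NNEF project provides as a sample.
These are the slides from a talk given on July 22, 2023 at Kernel/VM探検隊@東京 No16.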

About Kernel/VM探検隊: https://www.kernelvm.org/
About Kernel/VM探検隊 Tokyo No16: https://kernelvm.connpass.com/event/287261/
Video of this talk: https://www.youtube.com/watch?v=pYNu5e6vMVg
Source code for this talk: https://github.com/Fadis/gct

Transcript

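  1. Formal neuron

    Each input (input 0, input 1, input 2, input 3, ...) is multiplied by its weight (w01, w02, w03, w04), the products are summed (∑), and the sum is passed through an activation function to produce the output.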
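
    The formal neuron described on this slide can be sketched in a few lines of Python. The inputs, the weights, and the choice of ReLU as the activation function are illustrative assumptions, not values taken from the slides.

```python
def relu(v):
    """Activation function used for illustration: max(0, v)."""
    return max(0.0, v)


def formal_neuron(inputs, weights, activation=relu):
    """Multiply each input by its weight, sum the products, apply the activation."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    total = sum(x * w for x, w in zip(inputs, weights))
    return activation(total)


# Hypothetical values: four inputs and four weights.
output = formal_neuron([1.0, 2.0, 3.0, 4.0], [0.5, -1.0, 0.25, 0.0])
```

    With these made-up values the weighted sum is negative, so the ReLU clamps the output to zero; swapping in another activation only requires passing a different function.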
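  2. Training

    [Diagram: an input layer followed by three fully connected layers, each with weights w; the digits 2, 7 and 4 appear as examples.] Finding the weights w that make the correct output come out is a mathematical optimization problem. Training means adjusting the weights so that the desired output comes out.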
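  3. Frameworks for training

    [Diagram: layers with weights w and the digit 4; the weights are adjusted so that this output comes out.] Frameworks for performing training: TensorFlow, PyTorch, Caffe.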
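  4. From training to application

    Training (TensorFlow): asked "What is this?", the network answers "four" where 7 is expected (7 ≠ four). The weights w are corrected so that 7 comes out in this situation, and the attempt is repeated.

    Application: asked "What is this?", the network answers "seven". Once the network produces correct output, we want to embed it in an application and use it.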
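  5. What must be saved to reproduce a training result (1)

    Layers: input layer, convolution layer ×2, maxpool layer, convolution layer ×2, maxpool layer, convolution layer ×3, maxpool layer, convolution layer ×3, maxpool layer, convolution layer ×3, maxpool layer, fully connected layer ×3.

    What must be saved: which layers are connected in which order, and the settings of each layer. Example settings: a 3x3 filter converts 3 channels to 64 channels; padding is 1 on each side; dilation and stride are each 1.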
  6. What must be saved to reproduce a training result (2)

    (Same layer diagram as slide 5.) What must be saved: the weights w of each layer, and the type in which the weights are written to the file. Example: a rank-4 tensor of size 3x3x3x64, with each element recorded as a 32-bit floating-point number.
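  7. What is stored

    The network's shape and settings reference the weights w. A weight tensor may be a tensor of floating-point numbers or a tensor of integers. An integer tensor references a method for converting its integers to floating-point numbers (for example min=-2.0, max=2.0, bit=8).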
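
    How an integer tensor with min=-2.0, max=2.0, bit=8 could be mapped back to floating point can be sketched as follows. The sketch assumes a plain linear mapping of the unsigned integer range onto [min, max]; the function name and the mapping are assumptions for illustration, and the rule that applies to a real file is the one defined by the NNEF specification for the quantization operation that file names.

```python
def dequantize(q, minimum=-2.0, maximum=2.0, bits=8):
    """Map an unsigned integer q in [0, 2**bits - 1] linearly onto [minimum, maximum].

    Assumption: a simple linear scheme chosen for illustration.
    """
    levels = (1 << bits) - 1  # 255 for 8 bits
    if not 0 <= q <= levels:
        raise ValueError("q is outside the representable range")
    return minimum + (maximum - minimum) * q / levels


# The smallest, an intermediate, and the largest 8-bit value.
values = [dequantize(q) for q in (0, 51, 255)]
```

    Under this assumption the smallest integer maps to min and the largest to max, so 8 bits give 256 evenly spaced values between -2.0 and 2.0.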
  8. Text and binary

    (Same diagram as slide 7.) The network's shape and settings are stored in a text format that humans can read and write. The tensors are stored in a binary format that can be parsed quickly.
  9. Tensor file layout

    (Same diagram as slide 7.) A tensor file consists of a header with a fixed size of 128 bytes followed by a sequence of numeric values, so transfer of the data can begin before the file is parsed.
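
    Because the header has a fixed size, a reader can locate the numeric payload without interpreting the header first. The sketch below builds a stand-in file in memory (128 zero bytes followed by three values) and reads the values back. The little-endian 32-bit float payload is an assumption, the function name is hypothetical, and the header fields themselves are not decoded.

```python
import struct

HEADER_SIZE = 128  # fixed header size stated on the slide


def read_float_payload(blob):
    """Skip the fixed-size header and decode the rest as little-endian float32.

    Assumption: the payload holds 32-bit floats. A real file describes its
    element type in the header, which this sketch ignores.
    """
    payload = blob[HEADER_SIZE:]
    if len(payload) % 4 != 0:
        raise ValueError("payload is not a whole number of float32 values")
    count = len(payload) // 4
    return list(struct.unpack("<%df" % count, payload))


# Stand-in data, not a real NNEF tensor file.
fake_file = bytes(HEADER_SIZE) + struct.pack("<3f", 0.5, -1.0, 2.0)
weights = read_float_payload(fake_file)
```

    The payload offset is known in advance, which is what allows a loader to start moving the values (for example to a GPU buffer) while the header is still being examined.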
  10. graph.nnef

    (Same diagram as slide 7.) The network's shape and settings are written in graph.nnef:

    version 1.0;
    graph VGG_ILSVRC_16_layers(data) -> (prob) {
      ...
    }

    "version 1.0": the file is written according to the NNEF 1.0 specification. "VGG_ILSVRC_16_layers": the name of this network.
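  11. graph.nnef: input and output buffers

    version 1.0;
    graph VGG_ILSVRC_16_layers(data) -> (prob) {
      ...
    }

    Input values for the network are written to a buffer named data, which is defined inside the graph body. The network's output is read from a buffer named prob, which is also defined inside the graph body.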
  12. VGG16

    input → (convolution, ReLU) ×2 → maxpool → (convolution, ReLU) ×2 → maxpool → (convolution, ReLU) ×3 → maxpool → (convolution, ReLU) ×3 → maxpool → (convolution, ReLU) ×3 → maxpool → linear, ReLU → linear, ReLU → linear → softmax

    Simonyan, Karen; Zisserman, Andrew. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556, 2014. https://arxiv.org/abs/1409.1556
  13. graph.nnef: input and output

    version 1.0;
    graph VGG_ILSVRC_16_layers(data) -> (prob) {
      variable_15 = variable<scalar>(label = 'conv4_1_blob2', shape = [1, 512]);
      variable_14 = variable<scalar>(label = 'conv4_1_blob1', shape = [512, 256, 3, 3]);
      variable_13 = variable<scalar>(label = 'conv3_3_blob2', shape = [1, 256]);
      ...
      variable_3 = variable<scalar>(label = 'conv1_2_blob2', shape = [1, 64]);
      data = external<scalar>(shape = [10, 3, 224, 224]);
      conv = conv(data, variable, variable_1, border = 'constant', dilation = [1, 1], groups = 1, padding = [(1, 1), (1, 1)], stride = [1, 1]);
      ...
      linear_1 = linear(relu_13, variable_28, variable_29);
      relu_14 = relu(linear_1);
      linear_2 = linear(relu_14, variable_30, variable_31);
      prob = softmax(linear_2, axes = [1]);
    }

    Labeled on the slide: the input and the output.
  14. external: defines data that comes in from outside

    (graph.nnef excerpt as in slide 13.)

    external<scalar>(shape = [10, 3, 224, 224]);

    Ten 224x224 RGB (3-channel) images come in from outside as one batch.
  15. linear: linear transformation

    (graph.nnef excerpt as in slide 13.)

    linear(relu_13, variable_28, variable_29);

    The vector relu_13 is multiplied by the matrix variable_28, and the vector variable_29 is added.
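  16. Fully connected layer

    [Diagram: inputs x0, x1, x2, x3, each connected to every output y0, y1, y2, y3, y4.] Fully connected layer: every combination of input and output is connected. When x and y are related in this way ...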
  17. Weights as a matrix

    [Diagram: inputs x0 to x3 connected to outputs y0 to y4.] ... arranging the weights of each neuron into a matrix gives:

    w00 w01 w02 w03
    w10 w11 w12 w13
    w20 w21 w22 w23
    w30 w31 w32 w33
    w40 w41 w42 w43
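  18. Fully connected layer as a matrix-vector product

    \phi\left(
    \begin{pmatrix}
    w_{00} & w_{01} & w_{02} & w_{03} \\
    w_{10} & w_{11} & w_{12} & w_{13} \\
    w_{20} & w_{21} & w_{22} & w_{23} \\
    w_{30} & w_{31} & w_{32} & w_{33} \\
    w_{40} & w_{41} & w_{42} & w_{43}
    \end{pmatrix}
    \begin{pmatrix} x_0 \\ x_1 \\ x_2 \\ x_3 \end{pmatrix}
    \right)
    =
    \begin{pmatrix} y_0 \\ y_1 \\ y_2 \\ y_3 \\ y_4 \end{pmatrix}

    Here \phi is the activation function. Fully connected layer = compute the product of a matrix and a vector, then pass it through the activation function.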
  19. ReLU: a popular activation function

    (graph.nnef excerpt as in slide 13.)

    relu(linear_1);
  20. linear followed by relu

    (graph.nnef excerpt as in slide 13.)

    linear_1 = linear(relu_13, variable_28, variable_29);
    relu_14 = relu(linear_1);

    The product of the matrix and the vector is computed, and the result is passed through the activation function.
  21. Fully connected layer in graph.nnef

    (graph.nnef excerpt as in slide 13.) The highlighted part of the excerpt is labeled: fully connected layer.
  22. softmax: an activation function whose outputs sum to 1

    (graph.nnef excerpt as in slide 13.)

    softmax(linear_2, axes = [1]);
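
    The property named on this slide, outputs that sum to 1, can be checked with a small Python sketch of the usual softmax formula. The input scores are hypothetical, and subtracting the maximum before exponentiating is a common numerical-stability step added here, not something the slides mention.

```python
import math


def softmax(values):
    """Return exp(v) / sum(exp(v)), shifted by the maximum for numerical stability."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical scores for three classes.
probabilities = softmax([2.0, 1.0, 0.1])
```

    In the graph, linear_2 holds one row of class scores per input image, so axes = [1] would apply this normalization along each row separately.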
  23. linear followed by softmax

    (graph.nnef excerpt as in slide 13.)

    linear_2 = linear(relu_14, variable_30, variable_31);
    prob = softmax(linear_2, axes = [1]);

    The product of the matrix and the vector is computed, and the result is passed through the activation function.
  24. Two fully connected layers in graph.nnef

    (graph.nnef excerpt as in slide 13.) Two parts of the excerpt are each labeled: fully connected layer.
  25. variable: loads values from a file

    version 1.0;
    graph VGG_ILSVRC_16_layers(data) -> (prob) {
      variable_15 = variable<scalar>(label = 'conv4_1_blob2', shape = [1, 512]);
      variable_14 = variable<scalar>(label = 'conv4_1_blob1', shape = [512, 256, 3, 3]);
      variable_13 = variable<scalar>(label = 'conv3_3_blob2', shape = [1, 256]);
      variable_31 = variable<scalar>(label = 'fc8_blob2', shape = [1, 1000]);
      variable_30 = variable<scalar>(label = 'fc8_blob1', shape = [1000, 4096]);
      variable_29 = variable<scalar>(label = 'fc7_blob2', shape = [1, 4096]);
      variable_28 = variable<scalar>(label = 'fc7_blob1', shape = [4096, 4096]);
      ...
      variable_3 = variable<scalar>(label = 'conv1_2_blob2', shape = [1, 64]);
      data = external<scalar>(shape = [10, 3, 224, 224]);
      conv = conv(data, variable, variable_1, border = 'constant', dilation = [1, 1], groups = 1, padding = [(1, 1), (1, 1)], stride = [1, 1]);
      ...
      linear_1 = linear(relu_13, variable_28, variable_29);
      relu_14 = relu(linear_1);
      linear_2 = linear(relu_14, variable_30, variable_31);
      prob = softmax(linear_2, axes = [1]);
    }
  26. variable: loading a vector

    (graph.nnef excerpt as in slide 25.)

    variable<scalar>(label = 'fc7_blob2', shape = [1, 4096]);

    A 4096-element vector is read from the file named fc7_blob2.dat, and the result is called variable_29.
  27. variable: loading a matrix

    (graph.nnef excerpt as in slide 25.)

    variable<scalar>(label = 'fc7_blob1', shape = [4096, 4096]);

    A 4096x4096 matrix is read from the file named fc7_blob1.dat, and the result is called variable_28.
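  28. The fc7 layer as an equation

    \mathrm{ReLU}\left(
    \begin{pmatrix}
    w_{00} & w_{01} & w_{02} & w_{03} \\
    w_{10} & w_{11} & w_{12} & w_{13} \\
    w_{20} & w_{21} & w_{22} & w_{23} \\
    w_{30} & w_{31} & w_{32} & w_{33} \\
    w_{40} & w_{41} & w_{42} & w_{43}
    \end{pmatrix}
    \begin{pmatrix} x_0 \\ x_1 \\ x_2 \\ x_3 \end{pmatrix}
    +
    \begin{pmatrix} b_0 \\ b_1 \\ b_2 \\ b_3 \\ b_4 \end{pmatrix}
    \right)
    =
    \begin{pmatrix} y_0 \\ y_1 \\ y_2 \\ y_3 \\ y_4 \end{pmatrix}

    ReLU is the activation function. The matrix holds the weights w from fc7_blob1.dat, the added vector holds the weights w from fc7_blob2.dat, x is relu_13, and y is relu_14.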
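
    The pair of statements linear_1 = linear(relu_13, variable_28, variable_29) and relu_14 = relu(linear_1) can be mimicked with plain Python lists. This is a sketch with hypothetical numbers for a 4-input, 5-output layer; it assumes the weight matrix is stored as [outputs, inputs], which is consistent with shapes such as [1000, 4096] in the graph, and it is not the code used in the talk.

```python
def linear(x, weight, bias):
    """Compute y[j] = sum_i x[i] * weight[j][i] + bias[j]."""
    return [sum(xi * wi for xi, wi in zip(x, row)) + b
            for row, b in zip(weight, bias)]


def relu(values):
    """Replace every negative element with zero."""
    return [max(0.0, v) for v in values]


# Hypothetical 4-input, 5-output layer.
x = [1.0, 2.0, 3.0, 4.0]
W = [[1.0, 0.0, 0.0, 0.0],
     [0.0, 1.0, 0.0, 0.0],
     [0.0, 0.0, -1.0, 0.0],
     [0.5, 0.5, 0.5, 0.5],
     [-1.0, -1.0, -1.0, -1.0]]
b = [0.0, 1.0, 0.0, -1.0, 2.0]
y = relu(linear(x, W, b))
```

    Rows whose weighted sum plus bias is negative end up as 0 after the ReLU, which is the only nonlinearity in this layer.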
  29. VGG16: the part covered so far

    (VGG16 layer diagram and citation as in slide 12.) Highlighted: the part we have just looked at.
  30. Convolution layers

    (VGG16 layer diagram as in slide 12, with the convolution layers marked.) VGG is a neural network built for image classification, so layers suited to image processing are placed toward the top.
  31. A common form of image processing

    A filter is applied to contiguous pixels of the input image to obtain a value of the output image.

    Filter:
      0.2 0.1 0.1
      0.2 0.7 0.1
      0.1 0.2 0.1

    Input:
      0.9 0   0   0
      0   0   0.7 0.4
      0   0.3 0   0
      0   0.8 0   0

    Output: 0.31
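
    The numbers on this slide can be checked with a short Python sketch. It assumes the first nine values form a 3x3 filter and the next sixteen form a 4x4 input image; that grouping is an interpretation of the figure. Under this reading, multiplying the filter element by element with the top-left 3x3 window of the input and summing reproduces the output value 0.31.

```python
def correlate_window(image, kernel, top, left):
    """Multiply the kernel with the image window at (top, left) and sum the products."""
    return sum(kernel[r][c] * image[top + r][left + c]
               for r in range(len(kernel))
               for c in range(len(kernel[0])))


# Assumed grouping of the slide's numbers: 3x3 filter, 4x4 input.
kernel = [[0.2, 0.1, 0.1],
          [0.2, 0.7, 0.1],
          [0.1, 0.2, 0.1]]
image = [[0.9, 0.0, 0.0, 0.0],
         [0.0, 0.0, 0.7, 0.4],
         [0.0, 0.3, 0.0, 0.0],
         [0.0, 0.8, 0.0, 0.0]]
value = correlate_window(image, kernel, 0, 0)
```

    Only three products are nonzero here (0.2 × 0.9, 0.1 × 0.7 and 0.2 × 0.3), which add up to 0.31 within floating-point rounding.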
  32. Convolution layer

    (Same filter, input and output figure as slide 31.) Convolution layer: the filter is treated as the weights w. Training acquires a filter that can extract the desired information.
  33. conv: convolution

    version 1.0;
    graph VGG_ILSVRC_16_layers(data) -> (prob) {
      ...
      variable_2 = variable<scalar>(label = 'conv1_2_blob1', shape = [64, 64, 3, 3]);
      variable_1 = variable<scalar>(label = 'conv1_1_blob2', shape = [1, 64]);
      ...
      variable = variable<scalar>(label = 'conv1_1_blob1', shape = [64, 3, 3, 3]);
      variable_3 = variable<scalar>(label = 'conv1_2_blob2', shape = [1, 64]);
      data = external<scalar>(shape = [10, 3, 224, 224]);
      conv = conv(data, variable, variable_1, border = 'constant', dilation = [1, 1], groups = 1, padding = [(1, 1), (1, 1)], stride = [1, 1]);
      relu = relu(conv);
      conv_1 = conv(relu, variable_2, variable_3, border = 'constant', dilation = [1, 1], groups = 1, padding = [(1, 1), (1, 1)], stride = [1, 1]);
      relu_1 = relu(conv_1);
      max_pool = max_pool(relu_1, border = 'ignore', padding = [(0, 0), (0, 0), (0, 0), (0, 0)], size = [1, 1, 2, 2], stride = [1, 1, 2, 2]);
      ...
    }

    Highlighted call:

    conv(
      data, variable, variable_1,
      border = 'constant',
      dilation = [1, 1],
      groups = 1,
      padding = [(1, 1), (1, 1)],
      stride = [1, 1]
    );
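  34. conv: the arguments

    conv(
      data, variable, variable_1,
      border = 'constant',
      dilation = [1, 1],
      groups = 1,
      padding = [(1, 1), (1, 1)],
      stride = [1, 1]
    );

    conv: convolution. data is the input image, variable is the filter, and variable_1 is the constant added to each channel.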
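  35. conv: dilation

    conv(
      data, variable, variable_1,
      border = 'constant',
      dilation = [1, 1],
      groups = 1,
      padding = [(1, 1), (1, 1)],
      stride = [1, 1]
    );

    [Figure: dilation=1 compared with dilation=2, with the setting used here marked "this one".] The call specifies dilation = [1, 1].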
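  36. conv: padding

    conv(
      data, variable, variable_1,
      border = 'constant',
      dilation = [1, 1],
      groups = 1,
      padding = [(1, 1), (1, 1)],
      stride = [1, 1]
    );

    [Figure: padding=0 compared with padding=1, with the setting used here marked "this one".] The call specifies padding = [(1, 1), (1, 1)].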
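  37. conv: stride

    conv(
      data, variable, variable_1,
      border = 'constant',
      dilation = [1, 1],
      groups = 1,
      padding = [(1, 1), (1, 1)],
      stride = [1, 1]
    );

    [Figure: stride=1 compared with stride=2, with the setting used here marked "this one".] The call specifies stride = [1, 1].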
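
    How dilation, padding and stride together determine the size of the output can be sketched with the usual output-size formula for a sliding window. The function below is a hypothetical helper, not part of NNEF or of the talk's source code, and it assumes floor division when the window does not fit evenly.

```python
def conv_output_size(size, kernel, pad_before, pad_after, stride, dilation):
    """Output length along one axis for a sliding window (convolution or pooling)."""
    effective_kernel = dilation * (kernel - 1) + 1  # span covered by a dilated window
    padded = size + pad_before + pad_after
    return (padded - effective_kernel) // stride + 1


# Settings of the conv call above: 3x3 filter, padding 1 on each side, stride 1, dilation 1.
same = conv_output_size(224, kernel=3, pad_before=1, pad_after=1, stride=1, dilation=1)

# Settings of the max_pool call in graph.nnef: 2x2 window, no padding, stride 2.
halved = conv_output_size(224, kernel=2, pad_before=0, pad_after=0, stride=2, dilation=1)
```

    With the conv settings the 224-pixel axis stays at 224, and with the max_pool settings from the same graph it becomes 112, matching the 224x224 to 112x112 reduction described later in the deck.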
  38. Channels in the first convolution layers

    (VGG16 layer diagram as in slide 12.) The input has the 3 RGB channels. The first convolution turns 3 channels into 64 channels. The second convolution turns 64 channels into 64 channels.
  39. Filter of the first convolution

    (graph.nnef excerpt as in slide 33.)

    variable<scalar>(label = 'conv1_1_blob1', shape = [64, 3, 3, 3]);

    The rank-4 tensor of size 3 × 3 × 3 × 64 written in the file conv1_1_blob1.dat is used as the filter.
  40. First convolution layer

    (graph.nnef excerpt as in slide 33.) Highlighted: the convolution layer from 3 channels to 64 channels.
  41. Filter of the second convolution

    (graph.nnef excerpt as in slide 33.)

    variable<scalar>(label = 'conv1_2_blob1', shape = [64, 64, 3, 3]);

    The rank-4 tensor of size 3 × 3 × 64 × 64 written in the file conv1_2_blob1.dat is used as the filter.
  42. Second convolution layer

    (graph.nnef excerpt as in slide 33.) Highlighted: the convolution layer from 64 channels to 64 channels.
  43. max_pool: outputs the maximum value within the filter window

    (graph.nnef excerpt as in slide 33.)

    max_pool(
      relu_1,
      border = 'ignore',
      padding = [(0, 0), (0, 0), (0, 0), (0, 0)],
      size = [1, 1, 2, 2],
      stride = [1, 1, 2, 2]
    );
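  44. Max Pooling layer

    [Figure values, input followed by output: 4 3 6 5 3 0 1 0 8 6 1 9 1 3 7 0 8]

    The largest value within the filter window is written to the corresponding position of the output image ("the maximum within this window"). This layer has no weights w.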
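
    Max pooling as described on this slide can be sketched in plain Python. The example arranges the first sixteen figure values as a 4x4 grid, which is an assumption about the figure's layout, and applies a 2x2 window with stride 2, the settings used by the max_pool call in graph.nnef.

```python
def max_pool_2d(image, size, stride):
    """Slide a size x size window over the image and keep the maximum of each window."""
    rows = (len(image) - size) // stride + 1
    cols = (len(image[0]) - size) // stride + 1
    return [[max(image[r * stride + i][c * stride + j]
                 for i in range(size) for j in range(size))
             for c in range(cols)]
            for r in range(rows)]


# Assumed 4x4 arrangement of the slide's input values.
image = [[4, 3, 6, 5],
         [3, 0, 1, 0],
         [8, 6, 1, 9],
         [1, 3, 7, 0]]
pooled = max_pool_2d(image, size=2, stride=2)
```

    Each output element depends only on a fixed window of the input, so nothing has to be learned or stored for this layer.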
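  45. max_pool: padding

    max_pool(
      relu_1,
      border = 'ignore',
      padding = [(0, 0), (0, 0), (0, 0), (0, 0)],
      size = [1, 1, 2, 2],
      stride = [1, 1, 2, 2]
    );

    [Figure: padding=0 compared with padding=1, with the setting used here marked "this one".] The call specifies a padding of (0, 0) on every axis.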
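  46. max_pool: size

    max_pool(
      relu_1,
      border = 'ignore',
      padding = [(0, 0), (0, 0), (0, 0), (0, 0)],
      size = [1, 1, 2, 2],
      stride = [1, 1, 2, 2]
    );

    [Figure: size=2, size=3 and size=4 compared, with the setting used here marked "this one".] The call specifies size = [1, 1, 2, 2].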
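  47. max_pool: stride

    max_pool(
      relu_1,
      border = 'ignore',
      padding = [(0, 0), (0, 0), (0, 0), (0, 0)],
      size = [1, 1, 2, 2],
      stride = [1, 1, 2, 2]
    );

    [Figure: stride=1 compared with stride=2, with the setting used here marked "this one".] The call specifies stride = [1, 1, 2, 2].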
  48. Effect of the first max_pool

    (graph.nnef excerpt as in slide 33.) Max pooling shrinks the image from 224x224 to 112x112.
  49. VGG16: the part covered so far

    (VGG16 layer diagram and citation as in slide 12.) Highlighted: the part we have just looked at.
  50. Between the last maxpool and the first linear layer

    (VGG16 layer diagram as in slide 12.) The last maxpool outputs 10 images of size 7 × 7 × 512. The linear layer wants 10 vectors of 25088 elements. The data is the same, but a conversion of the type is needed.
  51. reshape: changes how the data is interpreted

    version 1.0;
    graph VGG_ILSVRC_16_layers(data) -> (prob) {
      ...
      relu_12 = relu(conv_12);
      max_pool_4 = max_pool(relu_12, border = 'ignore', padding = [(0, 0), (0, 0), (0, 0), (0, 0)], size = [1, 1, 2, 2], stride = [1, 1, 2, 2]);
      reshape = reshape(max_pool_4, shape = [10, -1]);
      linear = linear(reshape, variable_26, variable_27);
      relu_13 = relu(linear);
      ...
    }

    reshape(max_pool_4, shape = [10, -1]);

    The contents of this buffer are divided into 10 equal parts, giving 10 vectors.
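
    What shape = [10, -1] resolves to can be worked out with a small sketch. It assumes the output of max_pool_4 has the shape [10, 512, 7, 7] (10 images of 7 × 7 × 512, in the same batch-first, channel-second order as the data input) and that -1 means "whatever extent makes the element count match". The helper is hypothetical and not part of NNEF.

```python
def infer_shape(old_shape, new_shape):
    """Resolve a single -1 in new_shape so the element count matches old_shape."""
    total = 1
    for extent in old_shape:
        total *= extent
    known = 1
    for extent in new_shape:
        if extent != -1:
            known *= extent
    if new_shape.count(-1) > 1:
        raise ValueError("at most one extent may be -1")
    if -1 not in new_shape:
        if known != total:
            raise ValueError("element counts differ")
        return list(new_shape)
    if known == 0 or total % known != 0:
        raise ValueError("shape cannot be inferred")
    return [total // known if extent == -1 else extent for extent in new_shape]


# Assumed shape of max_pool_4: 10 images, 512 channels, 7x7 pixels.
resolved = infer_shape([10, 512, 7, 7], [10, -1])
```

    No element is moved or changed by such a reshape; only the way the same buffer is indexed differs, which is why the slide calls it a change of interpretation.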
  52. graph.nnef: reading values from files

    version 1.0;
    graph VGG_ILSVRC_16_layers(data) -> (prob) {
      variable_15 = variable<scalar>(label = 'conv4_1_blob2', shape = [1, 512]);
      variable_14 = variable<scalar>(label = 'conv4_1_blob1', shape = [512, 256, 3, 3]);
      variable_13 = variable<scalar>(label = 'conv3_3_blob2', shape = [1, 256]);
      variable_31 = variable<scalar>(label = 'fc8_blob2', shape = [1, 1000]);
      variable_30 = variable<scalar>(label = 'fc8_blob1', shape = [1000, 4096]);
      variable_29 = variable<scalar>(label = 'fc7_blob2', shape = [1, 4096]);
      variable_28 = variable<scalar>(label = 'fc7_blob1', shape = [4096, 4096]);
      variable_27 = variable<scalar>(label = 'fc6_blob2', shape = [1, 4096]);
      variable_26 = variable<scalar>(label = 'fc6_blob1', shape = [4096, 25088]);
      variable_25 = variable<scalar>(label = 'conv5_3_blob2', shape = [1, 512]);
      variable_24 = variable<scalar>(label = 'conv5_3_blob1', shape = [512, 512, 3, 3]);
      variable_23 = variable<scalar>(label = 'conv5_2_blob2', shape = [1, 512]);
      variable_22 = variable<scalar>(label = 'conv5_2_blob1', shape = [512, 512, 3, 3]);
      variable_21 = variable<scalar>(label = 'conv5_1_blob2', shape = [1, 512]);
      variable_20 = variable<scalar>(label = 'conv5_1_blob1', shape = [512, 512, 3, 3]);
      variable_19 = variable<scalar>(label = 'conv4_3_blob2', shape = [1, 512]);
      variable_18 = variable<scalar>(label = 'conv4_3_blob1', shape = [512, 512, 3, 3]);
      variable_17 = variable<scalar>(label = 'conv4_2_blob2', shape = [1, 512]);
      variable_16 = variable<scalar>(label = 'conv4_2_blob1', shape = [512, 512, 3, 3]);
      variable_12 = variable<scalar>(label = 'conv3_3_blob1', shape = [256, 256, 3, 3]);
      variable_10 = variable<scalar>(label = 'conv3_2_blob1', shape = [256, 256, 3, 3]);
      variable_9 = variable<scalar>(label = 'conv3_1_blob2', shape = [1, 256]);
      variable_8 = variable<scalar>(label = 'conv3_1_blob1', shape = [256, 128, 3, 3]);
      variable_6 = variable<scalar>(label = 'conv2_2_blob1', shape = [128, 128, 3, 3]);
      variable_11 = variable<scalar>(label = 'conv3_2_blob2', shape = [1, 256]);
      variable_5 = variable<scalar>(label = 'conv2_1_blob2', shape = [1, 128]);
      variable_4 = variable<scalar>(label = 'conv2_1_blob1', shape = [128, 64, 3, 3]);
      variable_2 = variable<scalar>(label = 'conv1_2_blob1', shape = [64, 64, 3, 3]);
      variable_1 = variable<scalar>(label = 'conv1_1_blob2', shape = [1, 64]);
      variable_7 = variable<scalar>(label = 'conv2_2_blob2', shape = [1, 128]);
      variable = variable<scalar>(label = 'conv1_1_blob1', shape = [64, 3, 3, 3]);
      variable_3 = variable<scalar>(label = 'conv1_2_blob2', shape = [1, 64]);
      data = external<scalar>(shape = [10, 3, 224, 224]);
      conv = conv(data, variable, variable_1, border = 'constant', dilation = [1, 1], groups = 1, padding = [(1, 1), (1, 1)], stride = [1, 1]);
      relu = relu(conv);
      conv_1 = conv(relu, variable_2, variable_3, border = 'constant', dilation = [1, 1], groups = 1, padding = [(1, 1), (1, 1)], stride = [1, 1]);
      relu_1 = relu(conv_1);
      max_pool = max_pool(relu_1, border = 'ignore', padding = [(0, 0), (0, 0), (0, 0), (0, 0)], size = [1, 1, 2, 2], stride = [1, 1, 2, 2]);
      conv_2 = conv(max_pool, variable_4, variable_5, border = 'constant', dilation = [1, 1], groups = 1, padding = [(1, 1), (1, 1)], stride = [1, 1]);

    Labeled on the slide: the variable statements read values from files.
  53. graph.nnef (continued)

      variable_6 = variable<scalar>(label = 'conv2_2_blob1', shape = [128, 128, 3, 3]);
      variable_11 = variable<scalar>(label = 'conv3_2_blob2', shape = [1, 256]);
      variable_5 = variable<scalar>(label = 'conv2_1_blob2', shape = [1, 128]);
      variable_4 = variable<scalar>(label = 'conv2_1_blob1', shape = [128, 64, 3, 3]);
      variable_2 = variable<scalar>(label = 'conv1_2_blob1', shape = [64, 64, 3, 3]);
      variable_1 = variable<scalar>(label = 'conv1_1_blob2', shape = [1, 64]);
      variable_7 = variable<scalar>(label = 'conv2_2_blob2', shape = [1, 128]);
      variable = variable<scalar>(label = 'conv1_1_blob1', shape = [64, 3, 3, 3]);
      variable_3 = variable<scalar>(label = 'conv1_2_blob2', shape = [1, 64]);
      data = external<scalar>(shape = [10, 3, 224, 224]);
      conv = conv(data, variable, variable_1, border = 'constant', dilation = [1, 1], groups = 1, padding = [(1, 1), (1, 1)], stride = [1, 1]);
      relu = relu(conv);
      conv_1 = conv(relu, variable_2, variable_3, border = 'constant', dilation = [1, 1], groups = 1, padding = [(1, 1), (1, 1)], stride = [1, 1]);
      relu_1 = relu(conv_1);
      max_pool = max_pool(relu_1, border = 'ignore', padding = [(0, 0), (0, 0), (0, 0), (0, 0)], size = [1, 1, 2, 2], stride = [1, 1, 2, 2]);
      conv_2 = conv(max_pool, variable_4, variable_5, border = 'constant', dilation = [1, 1], groups = 1, padding = [(1, 1), (1, 1)], stride = [1, 1]);
      relu_2 = relu(conv_2);
      conv_3 = conv(relu_2, variable_6, variable_7, border = 'constant', dilation = [1, 1], groups = 1, padding = [(1, 1), (1, 1)], stride = [1, 1]);
      relu_3 = relu(conv_3);
      max_pool_1 = max_pool(relu_3, border = 'ignore', padding = [(0, 0), (0, 0), (0, 0), (0, 0)], size = [1, 1, 2, 2], stride = [1, 1, 2, 2]);
      conv_4 = conv(max_pool_1, variable_8, variable_9, border = 'constant', dilation = [1, 1], groups = 1, padding = [(1, 1), (1, 1)], stride = [1, 1]);
      relu_4 = relu(conv_4);
      conv_5 = conv(relu_4, variable_10, variable_11, border = 'constant', dilation = [1, 1], groups = 1, padding = [(1, 1), (1, 1)], stride = [1, 1]);
      relu_5 = relu(conv_5);
      conv_6 = conv(relu_5, variable_12, variable_13, border = 'constant', dilation = [1, 1], groups = 1, padding = [(1, 1), (1, 1)], stride = [1, 1]);
      relu_6 = relu(conv_6);
      max_pool_2 = max_pool(relu_6, border = 'ignore', padding = [(0, 0), (0, 0), (0, 0), (0, 0)], size = [1, 1, 2, 2], stride = [1, 1, 2, 2]);
      conv_7 = conv(max_pool_2, variable_14, variable_15, border = 'constant', dilation = [1, 1], groups = 1, padding = [(1, 1), (1, 1)], stride = [1, 1]);
      relu_7 = relu(conv_7);
      conv_8 = conv(relu_7, variable_16, variable_17, border = 'constant', dilation = [1, 1], groups = 1, padding = [(1, 1), (1, 1)], stride = [1, 1]);
      relu_8 = relu(conv_8);
      conv_9 = conv(relu_8, variable_18, variable_19, border = 'constant', dilation = [1, 1], groups = 1, padding = [(1, 1), (1, 1)], stride = [1, 1]);
      relu_9 = relu(conv_9);
      max_pool_3 = max_pool(relu_9, border = 'ignore', padding = [(0, 0), (0, 0), (0, 0), (0, 0)], size = [1, 1, 2, 2], stride = [1, 1, 2, 2]);
      conv_10 = conv(max_pool_3, variable_20, variable_21, border = 'constant', dilation = [1, 1], groups = 1, padding = [(1, 1), (1, 1)], stride = [1, 1]);
      relu_10 = relu(conv_10);
      conv_11 = conv(relu_10, variable_22, variable_23, border = 'constant', dilation = [1, 1], groups = 1, padding = [(1, 1), (1, 1)], stride = [1, 1]);
      relu_11 = relu(conv_11);
      conv_12 = conv(relu_11, variable_24, variable_25, border = 'constant', dilation = [1, 1], groups = 1, padding = [(1, 1), (1, 1)], stride = [1, 1]);
      relu_12 = relu(conv_12);
      max_pool_4 = max_pool(relu_12, border = 'ignore', padding = [(0, 0), (0, 0), (0, 0), (0, 0)], size = [1, 1, 2, 2], stride = [1, 1, 2, 2]);
      reshape = reshape(max_pool_4, shape = [10, -1]);
      linear = linear(reshape, variable_26, variable_27);
      relu_13 = relu(linear);
      linear_1 = linear(relu_13, variable_28, variable_29);

    [Diagram labels next to the code: input, convolution, ReLU, maxpool, convolution, ReLU, convolution, ReLU, maxpool, convolution, ReLU, convolution, ReLU]
  54. graph.nnef (continued): the same listing scrolled to the end of the graph. The statements not already shown above are:

    relu_14 = relu(linear_1);
    linear_2 = linear(relu_14, variable_30, variable_31);
    prob = softmax(linear_2, axes = [1]);
    }
  55. graph.nnef (continued): the same listing again, annotated with the layer type of each statement. In graph order:

    input
    (convolution, ReLU) x2, maxpool
    (convolution, ReLU) x2, maxpool
    (convolution, ReLU) x3, maxpool
    (convolution, ReLU) x3, maxpool
    (convolution, ReLU) x3, maxpool
    reshape
    linear, ReLU
    linear, ReLU
    linear, softmax
  56. constant - creates a buffer in which every element has the same specified value

    constant<scalar>( shape = [ 3, 4 ], value = [42.0] );

    [ 42.0 42.0 42.0 42.0
      42.0 42.0 42.0 42.0
      42.0 42.0 42.0 42.0 ]
  57. conv and deconv

    conv(
      data, variable, variable_1,
      border = 'constant',
      dilation = [2, 2],
      groups = 1,
      padding = [(0, 0), (0, 0)],
      stride = [1, 1]
    );

    deconv - deconvolution (transposed convolution)

    deconv(
      data, variable, variable_1,
      border = 'constant',
      dilation = [2, 2],
      groups = 1,
      padding = [(0, 0), (0, 0)],
      stride = [1, 1]
    );
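
The attributes on this slide (padding, stride, dilation) determine how many output elements a convolution produces along each spatial axis. A minimal sketch of that arithmetic follows, assuming the usual convention that a dilated filter spans dilation * (filter - 1) + 1 input elements. The function name `conv_output_extent` is hypothetical and does not come from NNEF or gct.

```cpp
#include <cstddef>

// Output extent of a convolution along one spatial axis.
// A filter with `filter` taps and dilation `dilation` spans
// dilation * (filter - 1) + 1 input elements.
// Assumes filter >= 1, stride >= 1 and dilation >= 1.
std::size_t conv_output_extent(
  std::size_t input, std::size_t filter,
  std::size_t pad_before, std::size_t pad_after,
  std::size_t stride, std::size_t dilation
) {
  const std::size_t span = dilation * ( filter - 1u ) + 1u;
  const std::size_t padded = input + pad_before + pad_after;
  // No output position exists when the dilated filter does not fit.
  if( padded < span ) return 0u;
  return ( padded - span ) / stride + 1u;
}
```

For a 3-tap filter with the dilation 2, zero padding, and stride 1 shown above, a 224-wide input would give 220 outputs. The VGG16 convolutions (dilation 1, padding 1 on both sides) keep the width at 224.
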
  58. box - returns the sum of the values inside the filter window

    box(
      data,
      size = [ 1, 3, 3 ],
      border = 'constant',
      dilation = [2, 2],
      groups = 1,
      padding = [(0, 0), (0, 0)],
      stride = [1, 1]
    );

    debox(
      data,
      size = [ 1, 3, 3 ],
      border = 'constant',
      dilation = [2, 2],
      groups = 1,
      padding = [(0, 0), (0, 0)],
      stride = [1, 1]
    );

    (Diagram: grids of 1s with example window sums of 9 and 4.)
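
A box filter is a convolution whose filter is all ones, so each output is simply the sum of its window. The sketch below is a simplified CPU version: unlike the call on the slide it assumes stride 1, no dilation, an odd window size, and zero padding so that the output has the same size as the input. `box_sum` is a hypothetical name.

```cpp
#include <cstddef>
#include <vector>

// Sum of the values inside a size x size window centred on each element of a
// row-major height x width image. Elements outside the image count as 0,
// which mimics border = 'constant' with a zero border value.
std::vector< float > box_sum(
  const std::vector< float > &input,
  int width, int height, int size
) {
  const int r = size / 2;
  std::vector< float > output( input.size(), 0.0f );
  for( int y = 0; y != height; ++y ) {
    for( int x = 0; x != width; ++x ) {
      float sum = 0.0f;
      for( int dy = -r; dy <= r; ++dy ) {
        for( int dx = -r; dx <= r; ++dx ) {
          const int sx = x + dx;
          const int sy = y + dy;
          if( sx >= 0 && sx < width && sy >= 0 && sy < height ) {
            sum += input[ sx + sy * width ];
          }
        }
      }
      output[ x + y * width ] = sum;
    }
  }
  return output;
}
```

On a 3x3 image of ones with a 3x3 window, the centre element sums to 9 and a corner element, whose window is clipped to 2x2, sums to 4.
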
  59. argmax_pool and sample

    argmax_pool - returns the index of the element that holds the maximum value
    sample - uses indices to pick values out of another tensor

    (Diagram: grids of example values and indices.)
  60. reduce - collapses all values along a given axis into a single value

    sum_reduce     [ 9 2 4 7 ] -> 22
    max_reduce     [ 9 2 4 7 ] -> 9
    min_reduce     [ 9 2 4 7 ] -> 2
    argmax_reduce  [ 9 2 4 7 ] -> 0
    argmin_reduce  [ 9 2 4 7 ] -> 1
    all_reduce     [ 1 0 1 1 ] -> 0
    any_reduce     [ 1 0 1 1 ] -> 1
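
All of the reduce operators share one loop shape and differ only in how two values are combined. A minimal sketch for a 2D row-major tensor reduced along its last axis follows. The name `reduce_rows` and the callable parameter are assumptions made for illustration, not part of NNEF.

```cpp
#include <cstddef>
#include <vector>

// Reduce a row-major rows x cols matrix along axis 1 (across each row),
// producing one value per row. `op` combines the running result with the
// next element (a sum, a maximum, ...), starting from `init`.
template< typename Op >
std::vector< float > reduce_rows(
  const std::vector< float > &input,
  std::size_t rows, std::size_t cols,
  float init, Op op
) {
  std::vector< float > output( rows, init );
  for( std::size_t r = 0; r != rows; ++r ) {
    for( std::size_t c = 0; c != cols; ++c ) {
      output[ r ] = op( output[ r ], input[ c + r * cols ] );
    }
  }
  return output;
}
```

Passing an addition with `init = 0` gives the behaviour of sum_reduce, and passing a maximum with a very small `init` gives max_reduce. For the row 9 2 4 7 these produce 22 and 9.
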
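  61. gather and cast

    gather - selects one element along a given axis
    (Diagram: example values and the indices used to select them.)

    cast - changes the element type of a tensor
    9.0 2.0 7.0 7.0 3.0 6.0 0.0 8.0 8.0 5.0 7.0 6.0 2.0 0.0 7.0 9.0
    -> 9 2 7 7 3 6 0 8 8 5 7 6 2 0 7 9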
  62. matmul and transpose

    matmul - matrix product
    result = matmul( input1, input2 );
    (Diagram: result = input1 x input2, each drawn as a 4x4 matrix with elements w00 ... w33.)

    transpose - transposition
    result = transpose( input, [ 1, 0 ] );
    (Diagram: result = input^T, each drawn as a 4x4 matrix with elements w00 ... w33.)
  63. Operators by category (1)

    Activation functions: sigmoid relu prelu leaky_relu elu gelu silu softmax softplus
    Pooling: max_pool_with_index max_pool avg_pool rms_pool
    Normalization: local_response_normalization local_mean_normalization local_variance_normalization local_contrast_normalization l1_normalization l2_normalization batch_normalization
    Quantization: min_max_linear_quantize zero_point_linear_quantize logarithmic_quantize
  64. Operators by category (2)

    Region-of-interest pooling: avg_roi_pool max_roi_pool roi_resample avg_roi_align max_roi_align
    Resizing: nearest_downsample area_downsample nearest_upsample multilinear_upsample
    Unary operators: copy neg rcp exp log sin cos tan sinh cosh tanh asin acos atan asinh acosh atanh abs sign not floor ceil round
    Binary operators: add sub mul div pow lt gt le ge eq ne and or
    Ternary operator: select
    Other functions: sqr sqrt rsqr rsqrt log2 min max clamp
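  65. Of the NNEF operators, the ones that must be implemented to run VGG

    external - only allocates GPU memory
    variable - only allocates GPU memory and transfers the data into it
    conv
    relu
    max_pool
    reshape - only changes the size of the tensor
    linear
    softmax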
  66. conv: extensions, workgroup size, and the input, filter, output, and bias buffers

    #version 450
    #extension GL_ARB_separate_shader_objects : enable
    #extension GL_ARB_shading_language_420pack : enable
    #extension GL_KHR_shader_subgroup_basic : enable
    #extension GL_KHR_shader_subgroup_arithmetic : enable
    #extension GL_KHR_memory_scope_semantics : enable

    layout(local_size_x = 32, local_size_y = 32, local_size_z = 1 ) in;

    // input
    layout(std430, binding = 0) buffer input_vector {
      float input_data[];
    };
    // filter
    layout(std430, binding = 1) buffer weight {
      float weight_data[];
    };
    // output
    layout(std430, binding = 2) buffer output_vector {
      float output_data[];
    };
    // bias
    layout(std430, binding = 3) buffer bias {
      float bias_data[];
    };

    layout(constant_id = 1) const int filter_size_x = 0;
    layout(constant_id = 2) const int filter_size_y = 0;
  67. conv: specialization constants

    // filter size
    layout(constant_id = 1) const int filter_size_x = 0;
    layout(constant_id = 2) const int filter_size_y = 0;
    layout(constant_id = 3) const int filter_size_z = 0;
    // padding
    layout(constant_id = 4) const int lpadding = 0;
    layout(constant_id = 5) const int rpadding = 0;
    layout(constant_id = 6) const int tpadding = 0;
    layout(constant_id = 7) const int bpadding = 0;
    // stride
    layout(constant_id = 8) const int stride_x = 0;
    layout(constant_id = 9) const int stride_y = 0;
    // dilation
    layout(constant_id = 10) const int dilation_x = 0;
    layout(constant_id = 11) const int dilation_y = 0;
    // input size
    layout(constant_id = 12) const int input_dim_x = 0;
    layout(constant_id = 13) const int input_dim_y = 0;
    layout(constant_id = 14) const int input_dim_z = 0;
    // output size
    layout(constant_id = 15) const int output_dim_x = 0;
    layout(constant_id = 16) const int output_dim_y = 0;
    layout(constant_id = 17) const int output_dim_z = 0;
    // value read outside the input range
    layout(constant_id = 18) const float border_value = 0.0;
    // how the bias is applied
    layout(constant_id = 19) const int bias_mode = 0;
    layout(constant_id = 20) const float bias_value = 0;

    int get_filter_length() {
      return filter_size_x * filter_size_y * filter_size_z;
    }
  68. conv: one thread handles one element of the output. Because many threads use the same filter, the filter is first loaded into shared memory.

    shared float[filter_size_x*filter_size_y*filter_size_z] filter_cache;

    void load_filter() {
      const uint local_id =
        gl_LocalInvocationID.x +
        gl_LocalInvocationID.y * gl_WorkGroupSize.x +
        gl_LocalInvocationID.z * gl_WorkGroupSize.y * gl_WorkGroupSize.x;
      const uint local_size = gl_WorkGroupSize.x * gl_WorkGroupSize.y * gl_WorkGroupSize.z;
      const uint filter_size = uint( get_filter_length() );
      // number of passes the workgroup needs to copy the whole filter
      const uint cycles = filter_size / local_size + ( bool( filter_size % local_size ) ? 1 : 0 );
      for( uint c = 0u; c != cycles; c++ ) {
        uint i = c * local_size + local_id;
        const int filter_offset = get_filter_offset( int( i ) );
        if( i < filter_size ) {
          filter_cache[ i ] = weight_data[ filter_offset ];
        }
      }
    }

    float get_input( int i ) {
      const int input_offset = get_input_offset( i );
      return ( input_offset < 0 ) ?
      // (the slide is cut off here)
  69. conv: multiply the input values inside the filter window by the filter, sum them, add the bias, and write the result to the output.

      // (the slide begins partway through the function that returns the bias)
      const int output_channel = int( gl_GlobalInvocationID.z % output_dim_z );
      return ( bias_mode == 0 ) ? bias_value : bias_data[ output_channel ];
    }

    void set_output( float v ) {
      const int output_offset = get_output_offset();
      if( output_offset >= 0 ) {
        output_data[ output_offset ] = v;
      }
    }

    void main() {
      load_filter();
      barrier();
      const int filter_length = get_filter_length();
      float sum = 0.0;
      for( int i = 0; i != filter_length; i++ ) {
        sum += get_input( i ) * get_filter( i );
      }
      sum += get_bias();
      set_output( sum );
    }
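
A CPU reference is handy for checking what the shader's main() computes. The sketch below assumes stride 1, dilation 1, a single image, and the memory layouts given in the comments. The helper functions that compute offsets in the shader are not shown on the slides, so these layouts and the names `conv_shape` and `conv2d_reference` are assumptions of this sketch rather than gct's actual code.

```cpp
#include <cstddef>
#include <vector>

// Shape and padding of a stride-1, dilation-1 2D convolution on one image.
struct conv_shape {
  int in_w, in_h, in_c;     // input width, height, channels
  int k_w, k_h;             // filter width, height
  int out_w, out_h, out_c;  // output width, height, channels
  int pad_l, pad_t;         // padding on the left and top edges
};

// Each output element is the sum over the filter window of input * filter,
// plus the bias of its output channel. Reads outside the input return
// border_value, like border = 'constant'.
std::vector< float > conv2d_reference(
  const std::vector< float > &input,   // layout [in_c][in_h][in_w]
  const std::vector< float > &weight,  // layout [out_c][in_c][k_h][k_w]
  const std::vector< float > &bias,    // layout [out_c]
  const conv_shape &s,
  float border_value = 0.0f
) {
  std::vector< float > output( std::size_t( s.out_c ) * s.out_h * s.out_w, 0.0f );
  for( int oc = 0; oc != s.out_c; ++oc ) {
    for( int oy = 0; oy != s.out_h; ++oy ) {
      for( int ox = 0; ox != s.out_w; ++ox ) {
        float sum = 0.0f;
        for( int ic = 0; ic != s.in_c; ++ic ) {
          for( int ky = 0; ky != s.k_h; ++ky ) {
            for( int kx = 0; kx != s.k_w; ++kx ) {
              const int ix = ox - s.pad_l + kx;
              const int iy = oy - s.pad_t + ky;
              const bool inside = ix >= 0 && ix < s.in_w && iy >= 0 && iy < s.in_h;
              const float v = inside ? input[ ix + s.in_w * ( iy + s.in_h * ic ) ] : border_value;
              const float w = weight[ kx + s.k_w * ( ky + s.k_h * ( ic + s.in_c * oc ) ) ];
              sum += v * w;
            }
          }
        }
        output[ ox + s.out_w * ( oy + s.out_h * oc ) ] = sum + bias[ oc ];
      }
    }
  }
  return output;
}
```

The innermost three loops correspond to the shader's single loop over get_filter_length() elements. The three outer loops correspond to the GPU threads, one per output element.
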
  70. linear: extensions, workgroup size, and the input, filter, output, and bias buffers

    #version 450
    #extension GL_ARB_separate_shader_objects : enable
    #extension GL_ARB_shading_language_420pack : enable
    #extension GL_KHR_shader_subgroup_basic : enable
    #extension GL_KHR_shader_subgroup_arithmetic : enable
    #extension GL_KHR_memory_scope_semantics : enable

    layout(local_size_x = 1024, local_size_y = 1, local_size_z = 1 ) in;

    // input
    layout(std430, binding = 0) buffer input_vector {
      float input_data[];
    };
    // filter
    layout(std430, binding = 1) buffer weight {
      float weight_data[];
    };
    // output
    layout(std430, binding = 2) buffer output_vector {
      float output_data[];
    };
    // bias
    layout(std430, binding = 3) buffer bias {
      float bias_data[];
    };

    layout(constant_id = 1) const uint input_length = 32;
    layout(constant_id = 2) const uint bias_mode = 0;
    layout(constant_id = 3) const float bias_value = 0.0;
  71. linear: specialization constants and shared memory

    };

    // input size
    layout(constant_id = 1) const uint input_length = 32;
    // how the bias is applied
    layout(constant_id = 2) const uint bias_mode = 0;
    layout(constant_id = 3) const float bias_value = 0.0;

    // shared memory used for the horizontal addition
    shared float[32] temp;

    float get_bias() {
      const uint output_index = gl_GlobalInvocationID.y;
      return ( bias_mode == 0 ) ? bias_value : bias_data[ output_index ];
    }

    void main() {
      const uint x = gl_GlobalInvocationID.x;
      const uint y = gl_GlobalInvocationID.y;
      const uint batch = gl_GlobalInvocationID.z;
      float sum = float( 0 );
      const uint x_blocks = input_length / gl_WorkGroupSize.x + ( bool( input_length % gl_WorkGroupSize.x ) ? 1 : 0 );
      for( uint x_index = 0; x_index != x_blocks; x_index++ ) {
        const uint x_global = x_index * gl_WorkGroupSize.x + x;
  72. linear: one thread per matrix element. Horizontal (subgroup) addition instructions compute the dot product of the input and one column of the weights, and the thread responsible for the first row writes the result.

    const uint batch = gl_GlobalInvocationID.z;
    float sum = float( 0 );
    const uint x_blocks = input_length / gl_WorkGroupSize.x + ( bool( input_length % gl_WorkGroupSize.x ) ? 1 : 0 );
    for( uint x_index = 0; x_index != x_blocks; x_index++ ) {
      const uint x_global = x_index * gl_WorkGroupSize.x + x;
      const uint batch_offset = input_length * batch;
      bool mask = ( x_global < input_length );
      float result = float( 0 );
      result = mask ?
        ( weight_data[ x_global + y * input_length ] * input_data[ batch_offset + x_global ] ) :
        float( 0 );
      // first stage: sum within each subgroup
      result = subgroupAdd( result );
      if( gl_SubgroupInvocationID == 0 ) {
        temp[ gl_SubgroupID ] = result;
      }
      barrier();
      // second stage: sum the per-subgroup partial sums
      mask = ( x < gl_NumSubgroups );
      result = subgroupAdd( mask ? temp[ x ] : float( 0 ) );
      sum += result;
    }
    if( x == 0 ) {
      const uint output_length = gl_WorkGroupSize.y * gl_NumWorkGroups.y;
      const uint batch_offset = output_length * batch;
      output_data[ batch_offset + y ] = sum + get_bias();
    }
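
Stripped of the subgroup reduction, the linear layer is a matrix-vector product plus a bias. A minimal CPU sketch follows, using the same weight indexing as the shader (`weight_data[ x + y * input_length ]`). The function name `linear_reference` and the per-output bias vector are assumptions of this sketch.

```cpp
#include <cstddef>
#include <vector>

// output[ b ][ y ] = bias[ y ] + sum over x of weight[ y ][ x ] * input[ b ][ x ]
std::vector< float > linear_reference(
  const std::vector< float > &input,   // layout [batch][input_length]
  const std::vector< float > &weight,  // layout [output_length][input_length]
  const std::vector< float > &bias,    // layout [output_length]
  std::size_t input_length,
  std::size_t output_length
) {
  const std::size_t batches = input_length ? input.size() / input_length : 0u;
  std::vector< float > output( batches * output_length, 0.0f );
  for( std::size_t b = 0; b != batches; ++b ) {
    for( std::size_t y = 0; y != output_length; ++y ) {
      float sum = 0.0f;
      for( std::size_t x = 0; x != input_length; ++x ) {
        sum += weight[ x + y * input_length ] * input[ b * input_length + x ];
      }
      output[ b * output_length + y ] = sum + bias[ y ];
    }
  }
  return output;
}
```

On the GPU the innermost loop is what gets spread over up to 1024 threads and folded back together with subgroupAdd. Floating-point addition is not associative, so the GPU result can differ from this sequential sum in the last few bits.
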
  73. max_pool: extensions, workgroup size, and the input and output buffers

    #version 450
    #extension GL_ARB_separate_shader_objects : enable
    #extension GL_ARB_shading_language_420pack : enable
    #extension GL_KHR_shader_subgroup_basic : enable
    #extension GL_KHR_shader_subgroup_arithmetic : enable
    #extension GL_KHR_memory_scope_semantics : enable

    layout(local_size_x = 32, local_size_y = 32, local_size_z = 1 ) in;

    // input
    layout(std430, binding = 0) buffer input_vector {
      float input_data[];
    };
    // output
    layout(std430, binding = 1) buffer output_vector {
      float output_data[];
    };

    layout(constant_id = 1) const int filter_size_x = 0;
    layout(constant_id = 2) const int filter_size_y = 0;
    layout(constant_id = 3) const int filter_size_z = 0;
    layout(constant_id = 4) const int filter_size_w = 0;
    layout(constant_id = 5) const int lpadding = 0;
    layout(constant_id = 6) const int rpadding = 0;
    layout(constant_id = 7) const int tpadding = 0;
    layout(constant_id = 8) const int bpadding = 0;
    layout(constant_id = 9) const int stride_x = 0;
    layout(constant_id = 10) const int stride_y = 0;
    layout(constant_id = 11) const int stride_z = 0;
    layout(constant_id = 12) const int stride_w = 0;
  74. max_pool: specialization constants

    };

    // filter size
    layout(constant_id = 1) const int filter_size_x = 0;
    layout(constant_id = 2) const int filter_size_y = 0;
    layout(constant_id = 3) const int filter_size_z = 0;
    layout(constant_id = 4) const int filter_size_w = 0;
    // padding
    layout(constant_id = 5) const int lpadding = 0;
    layout(constant_id = 6) const int rpadding = 0;
    layout(constant_id = 7) const int tpadding = 0;
    layout(constant_id = 8) const int bpadding = 0;
    // stride
    layout(constant_id = 9) const int stride_x = 0;
    layout(constant_id = 10) const int stride_y = 0;
    layout(constant_id = 11) const int stride_z = 0;
    layout(constant_id = 12) const int stride_w = 0;
    // input size
    layout(constant_id = 13) const int input_dim_x = 0;
    layout(constant_id = 14) const int input_dim_y = 0;
    layout(constant_id = 15) const int input_dim_z = 0;
    layout(constant_id = 16) const int input_dim_w = 0;
    // output size
    layout(constant_id = 17) const int output_dim_x = 0;
    layout(constant_id = 18) const int output_dim_y = 0;
    layout(constant_id = 19) const int output_dim_z = 0;
    layout(constant_id = 20) const int output_dim_w = 0;
    // value read outside the input range
    layout(constant_id = 21) const float border_value = 0.0;

    int get_filter_length() {
      return filter_size_x * filter_size_y * filter_size_z * filter_size_w;
    }

    int get_input_offset( int i ) {
      // (the slide is cut off here)
  75. max_pool: write the largest value inside the filter window to the output.

    }

    float get_input( int i ) {
      const int input_offset = get_input_offset( i );
      return ( input_offset < 0 ) ? border_value : input_data[ input_offset ];
    }

    void set_output( float v ) {
      const int output_offset = get_output_offset();
      if( output_offset >= 0 ) {
        output_data[ output_offset ] = v;
      }
    }

    void main() {
      const int filter_length = get_filter_length();
      float v = -10000.0;
      for( int i = 0; i != filter_length; i++ ) {
        const int input_offset = get_input_offset( i );
        // positions outside the input are skipped (border = 'ignore')
        if( input_offset != -1 ) {
          v = max( input_data[ input_offset ], v );
        }
      }
      set_output( v );
    }
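
For reference, the 2x2, stride-2 pooling used by every max_pool in VGG16 can be sketched on the CPU as follows. This sketch is a simplification, not the shader's logic: it handles one 2D plane, and it starts the running maximum at negative infinity rather than the -10000.0 used on the slide so that inputs below -10000 are also handled. `max_pool_2x2` is a hypothetical name.

```cpp
#include <algorithm>
#include <cstddef>
#include <limits>
#include <vector>

// 2x2 max pooling with stride 2 over a row-major height x width plane.
// A trailing odd row or column is dropped.
std::vector< float > max_pool_2x2(
  const std::vector< float > &input,
  std::size_t width, std::size_t height
) {
  const std::size_t out_w = width / 2u;
  const std::size_t out_h = height / 2u;
  std::vector< float > output( out_w * out_h );
  for( std::size_t oy = 0; oy != out_h; ++oy ) {
    for( std::size_t ox = 0; ox != out_w; ++ox ) {
      float v = -std::numeric_limits< float >::infinity();
      for( std::size_t dy = 0; dy != 2u; ++dy ) {
        for( std::size_t dx = 0; dx != 2u; ++dx ) {
          v = std::max( v, input[ ( ox * 2u + dx ) + ( oy * 2u + dy ) * width ] );
        }
      }
      output[ ox + oy * out_w ] = v;
    }
  }
  return output;
}
```

Each pooling stage halves the width and height, so five stages take the 224x224 input down to 7x7.
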
  76. relu: output = max( input, 0 )

    #version 450
    #extension GL_ARB_separate_shader_objects : enable
    #extension GL_ARB_shading_language_420pack : enable
    #extension GL_KHR_shader_subgroup_basic : enable
    #extension GL_KHR_shader_subgroup_arithmetic : enable
    #extension GL_KHR_memory_scope_semantics : enable

    layout(local_size_x = 1024, local_size_y = 1, local_size_z = 1 ) in;

    // input
    layout(std430, binding = 0) buffer input_vector {
      float input_data[];
    };
    // output
    layout(std430, binding = 1) buffer output_vector {
      float output_data[];
    };

    layout(constant_id = 1) const uint input_length = 32;

    void main() {
      // flatten the 3D global invocation ID into a linear element index
      const uint index =
        gl_GlobalInvocationID.x +
        gl_NumWorkGroups.x * gl_WorkGroupSize.x * gl_GlobalInvocationID.y +
        gl_NumWorkGroups.x * gl_WorkGroupSize.x * gl_NumWorkGroups.y * gl_WorkGroupSize.y * gl_GlobalInvocationID.z;
      if( index >= input_length ) return;
      float v = input_data[ index ];
      output_data[ index ] = ( v >= 0.0 ) ? v : 0.0;
    }
  77. softmax: extensions, workgroup size, and the input and output buffers

    #version 450
    #extension GL_ARB_separate_shader_objects : enable
    #extension GL_ARB_shading_language_420pack : enable
    #extension GL_KHR_shader_subgroup_basic : enable
    #extension GL_KHR_shader_subgroup_arithmetic : enable
    #extension GL_KHR_memory_scope_semantics : enable

    layout(local_size_x = 1024, local_size_y = 1, local_size_z = 1 ) in;

    // input
    layout(std430, binding = 0) buffer input_vector {
      float input_data[];
    };
    // output
    layout(std430, binding = 1) buffer output_vector {
      float output_data[];
    };

    layout(constant_id = 1) const uint input_length = 32;

    shared float[32] temp;

    void main() {
      const uint offset = gl_LocalInvocationID.x;
      const uint batch = gl_GlobalInvocationID.z;
      float max = 0.0;
      for( uint i = offset; i < input_length; i += 1024 ) {
  78. softmax: softmax needs the sum of exp over all the input values, so all the values are handled within the 1024 threads that can reach the same shared memory. First, find the maximum of the input.

    const uint offset = gl_LocalInvocationID.x;
    const uint batch = gl_GlobalInvocationID.z;
    float max = 0.0;
    // each thread scans every 1024th element
    for( uint i = offset; i < input_length; i += 1024 ) {
      float v = input_data[ i + input_length * batch ];
      if( v > max ) {
        max = v;
      }
    }
    // reduce within each subgroup, then across subgroups via shared memory
    const float smax = subgroupMax( max );
    if( gl_SubgroupInvocationID == 0 ) {
      temp[ gl_SubgroupID ] = smax;
    }
    barrier();
    const float gmax = subgroupMax( temp[ gl_SubgroupInvocationID ] );
    float sum = 0.0;
    for( uint i = offset; i < input_length; i += 1024 ) {
      sum += exp( input_data[ i + input_length * batch ] - gmax );
    }
    const float ssum = subgroupAdd( sum );
  79. softmax: compute the sum of exp of the inputs, then divide the exp of each input by that sum.

    const float smax = subgroupMax( max );
    if( gl_SubgroupInvocationID == 0 ) {
      temp[ gl_SubgroupID ] = smax;
    }
    barrier();
    const float gmax = subgroupMax( temp[ gl_SubgroupInvocationID ] );
    // sum of exp of the inputs
    float sum = 0.0;
    for( uint i = offset; i < input_length; i += 1024 ) {
      sum += exp( input_data[ i + input_length * batch ] - gmax );
    }
    const float ssum = subgroupAdd( sum );
    if( gl_SubgroupInvocationID == 0 ) {
      temp[ gl_SubgroupID ] = ssum;
    }
    barrier();
    const float gsum = subgroupAdd( temp[ gl_SubgroupInvocationID ] );
    // divide the exp of each input by the sum
    for( uint i = offset; i < input_length; i += 1024 ) {
      output_data[ i + input_length * batch ] =
        exp( input_data[ i + input_length * batch ] - gmax ) / gsum;
    }
    }
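
The three passes of the shader (maximum, sum of exponentials, normalisation) map directly onto a sequential CPU version. Subtracting the maximum before calling exp does not change the result mathematically, but it keeps exp from overflowing for large inputs. The sketch below is a reference for a single vector. `softmax_reference` is a hypothetical name, and unlike the shader, which starts its running maximum at 0.0, it uses the true maximum of the input.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Numerically stable softmax of one vector:
// output[ i ] = exp( input[ i ] - max ) / sum over j of exp( input[ j ] - max )
std::vector< float > softmax_reference( const std::vector< float > &input ) {
  if( input.empty() ) return {};
  // pass 1: maximum of the input
  const float gmax = *std::max_element( input.begin(), input.end() );
  // pass 2: sum of the shifted exponentials
  float gsum = 0.0f;
  for( const float v : input ) {
    gsum += std::exp( v - gmax );
  }
  // pass 3: normalise
  std::vector< float > output( input.size() );
  for( std::size_t i = 0; i != input.size(); ++i ) {
    output[ i ] = std::exp( input[ i ] - gmax ) / gsum;
  }
  return output;
}
```

Without the shift, an input such as 1000.0 would make exp return infinity in 32-bit floats and the division would produce NaN.
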
  80. gct/src/gct/dnn/graph.cpp: load the NNEF with the NNEF-Tools parser

    graph::graph(
      const std::shared_ptr< device_t > &device,
      const std::shared_ptr< allocator_t > &allocator,
      const std::shared_ptr< descriptor_pool_t > &descriptor_pool,
      const std::shared_ptr< pipeline_cache_t > &pipeline_cache,
      const std::filesystem::path &dir,
      const std::filesystem::path &shader_dir,
      command_buffer_recorder_t &rec
    ) {
      nnef::Graph parsed;
      std::string error;
      // parse graph.nnef in the model directory
      if( !nnef::load_graph( dir.string(), parsed, error, "" ) ) {
        std::cerr << error << std::endl;
        throw -1;
      }
      // compute the shape of every tensor in the graph
      if( !nnef::infer_shapes( parsed, error ) ) {
        std::cerr << error << std::endl;
        throw -1;
      }
      if( !nnef::allocate_buffers( parsed, error ) ) {
        std::cerr << error << std::endl;
        throw -1;
      }
  81. gct/src/gct/dnn/graph.cpp: send the contents of the required files to GPU memory

    for( const auto &o: parsed.operations ) {
      if( o.name == "variable" ) {
        const std::string name = get_output_name( o );
        // the label attribute names the .dat file that holds the weights
        const auto label = std::find_if(
          o.attribs.begin(), o.attribs.end(),
          []( const auto &v ) { return v.first == "label"; }
        );
        if( label == o.attribs.end() ) {
          throw -1;
        }
        const auto data_name = label->second.string();
        const auto data_filename = dir / ( data_name + ".dat" );
        auto nnef_data = rec.load_nnef_data(
          allocator,
          std::filesystem::absolute( data_filename ),
          vk::BufferUsageFlagBits::eStorageBuffer|
          vk::BufferUsageFlagBits::eTransferDst
        );
        bufs.insert( std::make_pair( name, nnef_data ) );
      }
  82. gct/src/gct/dnn/graph.cpp: create the pipeline for each layer and bind the buffers to its descriptor set

    for( const auto &o: parsed.operations ) {
      if( o.name == "conv" ) {
        const std::string name = get_output_name( o );
        auto op = std::make_shared< operation::convolution >(
          allocator, descriptor_pool, pipeline_cache, o, shaders, bufs
        );
        // register the output so that later layers can use it as their input
        bufs.insert( std::make_pair( name, op->get_output() ) );
        ops.push_back( op );
      }
      else if( o.name == "linear" ) {
        const std::string name = get_output_name( o );
        auto op = std::make_shared< operation::linear >(
          allocator, descriptor_pool, pipeline_cache, o, shaders, bufs
          // (the slide is cut off here)
  83. gct/src/gct/dnn/convolution.cpp: record the required memory barrier and the execution of the compute pipeline into the command buffer

    void convolution::operator()( command_buffer_recorder_t &rec ) {
      // wait until the input buffer has been written
      rec.compute_barrier( { input.buffer }, {} );
      rec.bind_descriptor_set( vk::PipelineBindPoint::eCompute, pipeline_layout, descriptor_set );
      rec.bind_pipeline( pipeline );
      rec.dispatch_threads( exec_dim[ 0 ], exec_dim[ 1 ], exec_dim[ 2 ] );
    }
  84. gct/src/gct/dnn/load_image.cpp: AlexNet-compatible preprocessing of the input image. The channels are reordered to BGR and the per-channel mean of the ImageNet training data is subtracted. Many image models, including VGG, use this preprocessing.

    std::vector< std::uint8_t > temp( dest.buffer->get_props().get_basic().size, 0u );
    // destination plane for each channel name
    std::unordered_map< std::string, int > channel_order{
      { "R", 2 }, { "G", 1 }, { "B", 0 }
    };
    for( int c = 0; c != spec.nchannels; ++c ) {
      const auto order = channel_order.find( spec.channelnames[ c ] );
      if( order != channel_order.end() ) {
        file->read_image(
          c, c + 1u, type,
          reinterpret_cast< std::uint8_t* >(
            std::next( temp.data(), spec.width * spec.height * dest.type.depth/8u * order->second )
          )
        );
      }
    }
    constexpr std::array< float, 3u > mean{ 123.68f, 116.779f, 103.939f };
    for( int c = 0; c != spec.nchannels; ++c ) {
      for( unsigned int y = 0; y != spec.height; ++y ) {
        for( unsigned int x = 0; x != spec.width; ++x ) {
          const auto index = x + y * spec.width + c * spec.width * spec.height;
          reinterpret_cast< float* >( temp.data() )[ index ] =
            reinterpret_cast< float* >( temp.data() )[ index ] * 255.0f - mean[ c ];
        }
      }
    }
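
The same preprocessing can be written without any image library, which makes the data layout easier to see. The BGR plane order, the scaling by 255, and the three mean values are taken from the slide. The function name `preprocess_bgr`, the interleaved RGB input with values in [0, 1], and the pairing of each mean with the plane of the same colour are assumptions of this sketch.

```cpp
#include <array>
#include <cstddef>
#include <vector>

// Convert an interleaved RGB image with values in [0, 1] into three planes
// in B, G, R order, scaled to [0, 255] with a per-channel mean subtracted.
std::vector< float > preprocess_bgr(
  const std::vector< float > &rgb,  // layout [height][width][3], R first
  std::size_t width, std::size_t height
) {
  // Assumed pairing: means listed in the order of the output planes (B, G, R).
  constexpr std::array< float, 3u > mean_bgr{ 103.939f, 116.779f, 123.68f };
  const std::size_t plane = width * height;
  std::vector< float > output( plane * 3u );
  for( std::size_t p = 0; p != plane; ++p ) {
    for( std::size_t c = 0; c != 3u; ++c ) {
      // source channel c is R, G, B; it goes to plane 2, 1, 0
      const std::size_t dst_plane = 2u - c;
      output[ dst_plane * plane + p ] = rgb[ p * 3u + c ] * 255.0f - mean_bgr[ dst_plane ];
    }
  }
  return output;
}
```

For a single pure red pixel the result is roughly -103.9 in the B plane, -116.8 in the G plane, and 131.3 in the R plane.
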
  85. The evaluation is to be run on a single image, so the batch size is changed from 10 to 1

    --- /home/fadis/vgg16.orig/graph.nnef 2019-05-21 21:24:49.000000000 +0900
    +++ /home/fadis/vgg16/graph.nnef 2023-07-17 20:32:06.938232809 +0900
    @@ -34,7 +34,7 @@
     variable_7 = variable<scalar>(label = 'conv2_2_blob2', shape = [1, 128]);
     variable = variable<scalar>(label = 'conv1_1_blob1', shape = [64, 3, 3, 3]);
     variable_3 = variable<scalar>(label = 'conv1_2_blob2', shape = [1, 64]);
    -data = external<scalar>(shape = [10, 3, 224, 224]);
    +data = external<scalar>(shape = [1, 3, 224, 224]);
     conv = conv(data, variable, variable_1, border = 'constant', dilation = [1, 1], groups = 1, padding = [(1, 1), (1, 1)], stride = [1, 1]);
     relu = relu(conv);
     conv_1 = conv(relu, variable_2, variable_3, border = 'constant', dilation = [1, 1], groups = 1, padding = [(1, 1), (1, 1)], stride = [1, 1]);
    @@ -66,7 +66,7 @@
     conv_12 = conv(relu_11, variable_24, variable_25, border = 'constant', dilation = [1, 1], groups = 1, padding = [(1, 1), (1, 1)], stride = [1, 1]);
     relu_12 = relu(conv_12);
     max_pool_4 = max_pool(relu_12, border = 'ignore', padding = [(0, 0), (0, 0), (0, 0), (0, 0)], size = [1, 1, 2, 2], stride = [1, 1, 2, 2]);
    -reshape = reshape(max_pool_4, shape = [10, -1]);
    +reshape = reshape(max_pool_4, shape = [1, -1]);
     linear = linear(reshape, variable_26, variable_27);
     relu_13 = relu(linear);
     linear_1 = linear(relu_13, variable_28, variable_29);
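
The -1 in the reshape means "work this extent out from the number of elements", which is why the batch size has to be changed in both places for the shapes to stay consistent. A minimal sketch of that inference follows. It handles only positive extents and a single -1, and the name `resolve_reshape` is hypothetical.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Resolve a requested shape that may contain one -1 against the shape of
// the input, so that both describe the same number of elements.
std::vector< std::size_t > resolve_reshape(
  const std::vector< std::size_t > &input_shape,
  const std::vector< long long > &requested
) {
  std::size_t total = 1u;
  for( const auto d : input_shape ) total *= d;
  std::size_t known = 1u;
  std::ptrdiff_t inferred = -1;
  std::vector< std::size_t > result( requested.size(), 0u );
  for( std::size_t i = 0; i != requested.size(); ++i ) {
    if( requested[ i ] == -1 ) {
      if( inferred != -1 ) throw std::invalid_argument( "only one -1 is allowed" );
      inferred = std::ptrdiff_t( i );
    }
    else if( requested[ i ] > 0 ) {
      result[ i ] = std::size_t( requested[ i ] );
      known *= result[ i ];
    }
    else throw std::invalid_argument( "unsupported extent" );
  }
  if( inferred != -1 ) {
    if( total % known != 0u ) throw std::invalid_argument( "element count mismatch" );
    result[ std::size_t( inferred ) ] = total / known;
  }
  else if( known != total ) throw std::invalid_argument( "element count mismatch" );
  return result;
}
```

After the five pooling stages the tensor is [1, 512, 7, 7], so shape = [1, -1] resolves to [1, 25088], the input length of the first linear layer.
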
  86. Probability that this object is carbonara: 99.75%

    $ dnn_eval -m ~/vgg16 -i ~/002.jpg -l ~/LOC_synset_mapping.txt
    959 carbonara 0.997535
    923 plate 0.00213662
    940 spaghetti squash 0.000186059
    937 broccoli 2.2913e-05
    762 restaurant, eating house, eating place, eatery 1.6662e-05
    809 soup bowl 1.57591e-05
    935 mashed potato 1.37334e-05
    962 meat loaf, meatloaf 1.27348e-05
    934 hotdog, hot dog, red hot 1.23476e-05
    925 consomme 6.84554e-06
  87. Probability that this object is a lemon: 98.61%

    $ dnn_eval -m ~/vgg16 -i ~/001.jpg -l ~/LOC_synset_mapping.txt
    951 lemon 0.986053
    950 orange 0.0096886
    961 dough 0.001014
    954 banana 0.00058848
    928 ice cream, icecream 0.000531914
    953 pineapple, ananas 0.000395049
    949 strawberry 0.000154118
    952 fig 0.000151808
    940 spaghetti squash 0.000140983
    948 Granny Smith 0.000132113
  88. Probability that this object is a screwdriver: 93.79%

    $ dnn_eval -m ~/vgg16 -i ~/003.jpg -l ~/LOC_synset_mapping.txt
    784 screwdriver 0.937934
    845 syringe 0.0574703
    696 paintbrush 0.000626332
    418 ballpoint, ballpoint pen, ballpen, Biro 0.000506295
    840 swab, swob, mop 0.000496733
    629 lipstick, lip rouge 0.000402968
    749 quill, quill pen 0.000344308
    731 plunger, plumber's helper 0.000300347
    813 spatula 0.000254442
    623 letter opener, paper knife, paperknife 0.000160844
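
The listings above show the ten classes with the highest softmax output, in descending order. How dnn_eval produces that list is not shown on the slides. One way to do it, sketched here under that caveat, is a partial sort of the class indices by probability. `top_k` is a hypothetical name.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

// Indices of the k largest probabilities, highest first.
std::vector< std::size_t > top_k( const std::vector< float > &prob, std::size_t k ) {
  std::vector< std::size_t > index( prob.size() );
  std::iota( index.begin(), index.end(), std::size_t( 0 ) );
  k = std::min( k, index.size() );
  // only the first k positions need to end up sorted
  std::partial_sort(
    index.begin(), index.begin() + std::ptrdiff_t( k ), index.end(),
    [ &prob ]( std::size_t a, std::size_t b ) { return prob[ a ] > prob[ b ]; }
  );
  index.resize( k );
  return index;
}
```

Each returned index can then be used to look up a label, for example a line of the synset mapping file passed with -l, and printed next to `prob[ index ]`.
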