20170829_iOSLT_機械学習とVision.framework
shtnkgm
September 20, 2017
An explanation and demo of Vision.framework, added in iOS 11, with some machine learning basics along the way.
Transcript
Machine Learning and Vision.framework
Shota Nakagami / @shtnkgm, 2017/8/29
Agenda
- A basic explanation of Vision.framework
- An overview of machine learning
- A sample app that uses Vision to classify camera images
What is Vision.framework?
- A framework introduced in iOS 11 that provides image recognition APIs
- An abstraction over Core ML, the machine learning framework also introduced in iOS 11
Machine learning stack
What is a neural network?
- A kind of machine learning technique
- A mathematical model of the neural circuits in the human brain
- Abbreviated as NN (DNN [1], RNN [2], CNN [3], and so on)
[1] Deep Neural Network
[2] Recurrent Neural Network
[3] Convolutional Neural Network
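To make "a mathematical model of a neural circuit" concrete, here is a minimal sketch of a single artificial neuron: a weighted sum of the inputs plus a bias, passed through a sigmoid activation. The `Neuron` type and the hand-picked weights are assumptions for illustration only and do not appear in the slides. A real network stacks many such units and learns the weights from data.

```swift
import Foundation

// Logistic (sigmoid) activation: squashes any real number into the range 0...1.
func sigmoid(_ x: Double) -> Double {
    return 1.0 / (1.0 + exp(-x))
}

// One artificial neuron (hypothetical type, for illustration only).
struct Neuron {
    let weights: [Double]
    let bias: Double

    // Weighted sum of the inputs plus the bias, passed through the activation.
    func activate(_ inputs: [Double]) -> Double {
        precondition(inputs.count == weights.count, "one weight per input")
        var sum = bias
        for (weight, input) in zip(weights, inputs) {
            sum += weight * input
        }
        return sigmoid(sum)
    }
}

// Hand-picked weights; a real network would learn these during training.
let neuron = Neuron(weights: [0.5, -0.5], bias: 0.0)
print(neuron.activate([1.0, 1.0])) // weighted sum is 0, so the output is 0.5
print(neuron.activate([2.0, 0.0])) // weighted sum is 1, so the output is above 0.5
```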
What Vision can recognize
What Vision can recognize (1)
- Face Detection and Recognition
- Barcode Detection
- Image Alignment Analysis
- Text Detection
- Horizon Detection
What Vision can recognize (2): features that require you to supply a machine learning model
- Object Detection and Tracking
- Machine Learning Image Analysis
Building a sample app that classifies camera images
Sample app overview
- Uses Vision's "Machine Learning Image Analysis" feature
- Classifies the image seen by the camera and outputs the name of the object
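Flow of image recognition with machine learning
1. Collect image data for training (gather the teaching material)
2. Build a model from the training data with a machine learning algorithm
   Note: a model is the logic that produces an answer
   Classification: is this image a dog or a cat?
   Regression: predicting a number (what will tomorrow's stock price be?)
3. Use the trained model to classify unseen images (put it into practice)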
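The three steps above can be sketched in a few lines. This example uses a nearest-centroid classifier, a far simpler technique than the neural networks the talk relies on, swapped in here only to show the collect, train, and classify shape. The two-dimensional feature vectors, the `Sample` and `CentroidModel` types, and the `train` function are all made up for illustration.

```swift
// A labelled training example (hypothetical type; real input would be image data).
struct Sample {
    let features: [Double]
    let label: String
}

// The "model": one average feature vector (centroid) per label.
struct CentroidModel {
    let centroids: [String: [Double]]

    // Step 3: classify unseen data by picking the label with the closest centroid.
    func classify(_ features: [Double]) -> String? {
        return centroids.min { lhs, rhs in
            squaredDistance(lhs.value, features) < squaredDistance(rhs.value, features)
        }?.key
    }
}

func squaredDistance(_ a: [Double], _ b: [Double]) -> Double {
    return zip(a, b).map { pair in (pair.0 - pair.1) * (pair.0 - pair.1) }.reduce(0, +)
}

// Step 2: build the model from the training data by averaging each label's samples.
func train(_ samples: [Sample]) -> CentroidModel {
    var sums: [String: [Double]] = [:]
    var counts: [String: Double] = [:]
    for sample in samples {
        let current = sums[sample.label] ?? Array(repeating: 0.0, count: sample.features.count)
        sums[sample.label] = zip(current, sample.features).map { pair in pair.0 + pair.1 }
        counts[sample.label, default: 0] += 1
    }
    var centroids: [String: [Double]] = [:]
    for (label, sum) in sums {
        let count = counts[label] ?? 1
        centroids[label] = sum.map { $0 / count }
    }
    return CentroidModel(centroids: centroids)
}

// Step 1: collect labelled training data (made-up numbers).
let trainingData = [
    Sample(features: [1.0, 1.0], label: "dog"),
    Sample(features: [3.0, 1.0], label: "dog"),
    Sample(features: [7.0, 8.0], label: "cat"),
    Sample(features: [9.0, 8.0], label: "cat"),
]
let model = train(trainingData)
print(model.classify([2.5, 2.0]) ?? "unknown") // closest to the "dog" centroid
print(model.classify([7.0, 7.0]) ?? "unknown") // closest to the "cat" centroid
```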
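Model creation is skipped in this talk
- Collecting and cleaning training data is fairly hard work
- It needs a reasonably powerful machine and computation time
- It needs machine learning knowledge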
Preparing the model
For simplicity, use a pre-trained model.
Apple distributes them on its site (in .mlmodel format):
https://developer.apple.com/machine-learning/
Model list
The models differ in the kinds of images they handle well and in file size (5 MB to 553.5 MB)
- MobileNets
- SqueezeNet
- Places205-GoogLeNet
- ResNet50
- Inception v3
- VGG16
This talk uses ResNet50
- 1000 categories such as trees, animals, food, vehicles, and people
- Size: 102.6 MB
- MIT License
Adding the model to the project
Drag and drop it into Xcode
The model class is generated automatically
A model class is created automatically in a file named <model name>.swift
Example: Resnet50.swift (excerpt)
Capturing camera images
private func startCapture() {
    let captureSession = AVCaptureSession()
    captureSession.sessionPreset = AVCaptureSessionPresetPhoto

    // Configure the input
    let captureDevice = AVCaptureDevice.defaultDevice(withMediaType: AVMediaTypeVideo)
    guard let input = try? AVCaptureDeviceInput(device: captureDevice) else { return }
    guard captureSession.canAddInput(input) else { return }
    captureSession.addInput(input)

    // Configure the output
    let output: AVCaptureVideoDataOutput = AVCaptureVideoDataOutput()
    output.setSampleBufferDelegate(self, queue: DispatchQueue(label: "VideoQueue"))
    guard captureSession.canAddOutput(output) else { return }
    captureSession.addOutput(output)

    // Configure the preview
    guard let previewLayer = AVCaptureVideoPreviewLayer(session: captureSession) else { return }
    previewLayer.videoGravity = AVLayerVideoGravityResizeAspectFill
    previewLayer.frame = view.bounds
    view.layer.insertSublayer(previewLayer, at: 0)

    // Start capturing
    captureSession.startRunning()
}
Delegate called for every captured frame
extension ViewController: AVCaptureVideoDataOutputSampleBufferDelegate {
    func captureOutput(_ output: AVCaptureOutput!,
                       didOutputSampleBuffer sampleBuffer: CMSampleBuffer!,
                       from connection: AVCaptureConnection!) {
        // Convert the CMSampleBuffer to a CVPixelBuffer
        guard let pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) else { return }

        // The Vision.framework processing (the image recognition part) goes here
    }
}
The image recognition part
Main classes used with Vision
- VNCoreMLModel
- VNCoreMLRequest
- VNImageRequestHandler
- VNObservation
VNCoreMLModel
- A container class for handling a Core ML model in Vision
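VNCoreMLRequest
- A class for requesting image recognition from Core ML
- The type of result depends on the model's output format
  - Image → class (classification result)
  - Image → features
  - Image → image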
VNImageRequestHandler
- A class that performs one or more image recognition requests (VNCoreMLRequest) on a single image
- The type of the image to recognize is specified at initialization
  - CVPixelBuffer
  - CIImage
  - CGImage
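VNObservation
- The abstract class for image recognition results
- One of its subclasses is returned as the result
- Has a confidence property expressing how confident the recognition is (VNConfidence is a typealias of Float)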
VNObservation subclasses
- VNClassificationObservation: has an identifier property holding the class name
- VNCoreMLFeatureValueObservation: has a featureValue property holding feature data
- VNPixelBufferObservation: has a pixelBuffer property holding image data
To summarize
- VNCoreMLModel (the bundled model)
- VNCoreMLRequest (the image recognition request)
- VNImageRequestHandler (performs the request)
- VNObservation (the recognition result)
The actual implementation code
Initializing the model class
// Initialize the Core ML model class
guard let model = try? VNCoreMLModel(for: Resnet50().model) else { return }
Creating the image recognition request
// Create the image recognition request (the arguments are the model and a completion handler)
let request = VNCoreMLRequest(model: model) { [weak self] (request: VNRequest, error: Error?) in
    guard let results = request.results as? [VNClassificationObservation] else { return }

    // Display up to the top 3 results together with their confidence
    // An identifier may list several comma-separated names, so take only the first one
    let displayText = results.prefix(3)
        .map { "\(Int($0.confidence * 100))% \($0.identifier.components(separatedBy: ", ")[0])" }
        .joined(separator: "\n")

    DispatchQueue.main.async {
        self?.textView.text = displayText
    }
}
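The formatting step inside the completion handler can be tried on its own, without a camera or a model. The sketch below mirrors that logic with a hypothetical `Prediction` struct standing in for `VNClassificationObservation`, and a hypothetical `displayText(for:limit:)` helper. The sample identifiers and confidence values are invented for illustration. The explicit sort is an added safeguard and is not in the slides, which apply `prefix(3)` directly to the results.

```swift
import Foundation

// Hypothetical stand-in for VNClassificationObservation (identifier + confidence).
struct Prediction {
    let identifier: String
    let confidence: Float
}

// Formats the top `limit` predictions as "NN% name", one per line.
func displayText(for predictions: [Prediction], limit: Int = 3) -> String {
    return predictions
        .sorted { $0.confidence > $1.confidence } // highest confidence first
        .prefix(limit)
        .map { prediction -> String in
            // Identifiers may hold several comma-separated names; keep the first.
            let name = prediction.identifier.components(separatedBy: ", ")[0]
            return "\(Int(prediction.confidence * 100))% \(name)"
        }
        .joined(separator: "\n")
}

// Invented sample values, not real model output.
let sample = [
    Prediction(identifier: "Egyptian cat", confidence: 0.0625),
    Prediction(identifier: "tabby, tabby cat", confidence: 0.75),
    Prediction(identifier: "tiger cat", confidence: 0.125),
    Prediction(identifier: "lynx, catamount", confidence: 0.03125),
]
print(displayText(for: sample))
```

`Int(...)` truncates rather than rounds, so a confidence of 0.125 is shown as 12%.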
Performing the image recognition request
// Perform the image recognition request on the CVPixelBuffer
try? VNImageRequestHandler(cvPixelBuffer: pixelBuffer, options: [:]).perform([request])
The finished image recognition part
guard let pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) else { return }
guard let model = try? VNCoreMLModel(for: Resnet50().model) else { return }

let request = VNCoreMLRequest(model: model) { [weak self] (request: VNRequest, error: Error?) in
    guard let results = request.results as? [VNClassificationObservation] else { return }

    let displayText = results.prefix(3)
        .map { "\(Int($0.confidence * 100))% \($0.identifier.components(separatedBy: ", ")[0])" }
        .joined(separator: "\n")

    DispatchQueue.main.async {
        self?.textView.text = displayText
    }
}

try? VNImageRequestHandler(cvPixelBuffer: pixelBuffer, options: [:]).perform([request])
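The finished handler builds the `VNCoreMLModel` and `VNCoreMLRequest` again for every frame. One possible refinement, which is not part of the slides, is to create both once and only create a new `VNImageRequestHandler` per frame. The sketch below assumes a hypothetical `FrameClassifier` wrapper. It takes any `MLModel`, so it does not depend on the generated `Resnet50` class, and it is wrapped in `#if canImport(Vision)` so the file still compiles on platforms without Vision.

```swift
#if canImport(Vision)
import CoreML
import Vision

// Hypothetical wrapper (not from the slides) that builds the request once and reuses it.
@available(iOS 11.0, macOS 10.13, tvOS 11.0, *)
final class FrameClassifier {
    private let request: VNCoreMLRequest

    // `onResult` is called on whatever queue Vision invokes the completion handler on.
    init(model: MLModel, onResult: @escaping ([VNClassificationObservation]) -> Void) throws {
        let visionModel = try VNCoreMLModel(for: model)
        request = VNCoreMLRequest(model: visionModel) { finished, _ in
            guard let results = finished.results as? [VNClassificationObservation] else { return }
            onResult(results)
        }
    }

    // Only the per-image handler is created for each frame.
    func classify(_ pixelBuffer: CVPixelBuffer) {
        let handler = VNImageRequestHandler(cvPixelBuffer: pixelBuffer, options: [:])
        try? handler.perform([request])
    }
}
#endif
```

In the sample app this could be created once, for example with `try FrameClassifier(model: Resnet50().model) { ... }`, and `classify(_:)` called from `captureOutput`. Whether this helps noticeably depends on the device and model, so it is worth measuring.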
Demo video
What is a tabby?
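tabby = tiger-striped cat!
A tabby is a cat with a striped coat like a tiger's. Besides plain stripes, the pattern may break up partway into spots, swirls, or finely broken stripes, so tabbies are quite varied. (Quoted from Wikipedia)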
Summary
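- With a pre-trained model, the implementation itself is simple!
- It even tells you what kind of cat you are looking at
- Being able to build your own models would open up many more possibilities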
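What I originally wanted to build
- An app that automatically adds hashtags for Instagram
- The caption API for passing hashtags had already been discontinued \(^o^)/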
Sample code
The sample code shown in this talk is available here:
https://github.com/shtnkgm/VisionFrameworkSample
Note: mind the NDA when publishing screenshots
The end