20170829_iOSLT_Machine Learning and Vision.framework
shtnkgm
September 20, 2017
An explanation and demo of Vision.framework, added in iOS 11, with some machine learning basics along the way.
Transcript

Machine Learning and Vision.framework
Shota Nakagami / @shtnkgm
2017/8/29

Topics
- A basic explanation of Vision.framework
- An overview of machine learning
- A sample app that uses Vision to classify camera images

What is Vision.framework?
- A framework, added in iOS 11, that provides image recognition APIs
- It abstracts Core ML, the machine learning framework also added in iOS 11

The machine learning stack

What is a neural network?
- A type of machine learning technique
- A mathematical model of the neural circuits of the human brain
- Abbreviated as NN (DNN [1], RNN [2], CNN [3], and so on)

[1] Deep Neural Network
[2] Recurrent Neural Network
[3] Convolutional Neural Network
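
As a rough illustration of what "a mathematical model of a neuron" can mean, the sketch below computes the output of a single artificial neuron: a weighted sum of the inputs plus a bias, passed through a sigmoid activation. The type `Neuron`, its `weights` and `bias`, and the choice of sigmoid are assumptions made for this example and do not come from the talk.

```swift
import Foundation

/// One artificial neuron: output = sigmoid(w · x + b).
/// Hypothetical type, for illustration only.
struct Neuron {
    var weights: [Double]
    var bias: Double

    /// Squashes any real number into the range (0, 1).
    static func sigmoid(_ z: Double) -> Double {
        return 1.0 / (1.0 + exp(-z))
    }

    /// Weighted sum of the inputs plus the bias, then the activation.
    func activate(_ inputs: [Double]) -> Double {
        precondition(inputs.count == weights.count, "one weight per input")
        let weightedSum = zip(weights, inputs).reduce(bias) { sum, pair in
            sum + pair.0 * pair.1
        }
        return Neuron.sigmoid(weightedSum)
    }
}

let neuron = Neuron(weights: [0.5, -0.5], bias: 0.0)
print(neuron.activate([1.0, 1.0]))  // the weighted sum is 0, so this prints 0.5
```

A network such as ResNet50 stacks a very large number of units like this into layers; the single neuron is only meant to make the idea concrete.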

What Vision can recognize

What Vision can recognize (1)
- Face Detection and Recognition
- Barcode Detection
- Image Alignment Analysis
- Text Detection
- Horizon Detection
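
The detectors in this list need no machine learning model from the developer. A minimal sketch of running several of them on one image might look like the following. The function names `makeBuiltInRequests` and `countDetections` are hypothetical, the caller is assumed to supply the `CGImage`, and image alignment is left out because it needs a second image to compare against.

```swift
#if canImport(Vision)
import Vision
import CoreGraphics

/// Builds one request for each built-in detector used in this sketch.
func makeBuiltInRequests() -> [VNRequest] {
    return [
        VNDetectFaceRectanglesRequest(),   // face detection
        VNDetectBarcodesRequest(),         // barcode detection
        VNDetectTextRectanglesRequest(),   // text detection
        VNDetectHorizonRequest()           // horizon detection
    ]
}

/// Runs every request on a single image and reports how many
/// observations each request produced, keyed by the request's type name.
func countDetections(in cgImage: CGImage) throws -> [String: Int] {
    let requests = makeBuiltInRequests()
    let handler = VNImageRequestHandler(cgImage: cgImage, options: [:])
    try handler.perform(requests)   // synchronous, so avoid the main thread
    var counts: [String: Int] = [:]
    for request in requests {
        counts[String(describing: type(of: request))] = request.results?.count ?? 0
    }
    return counts
}
#endif
```

Passing all requests to one handler lets Vision analyze the image once for several detectors instead of once per detector.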

What Vision can recognize (2)
Features that require you to provide a machine learning model:
- Object Detection and Tracking
- Machine Learning Image Analysis
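
Building a sample app that classifies camera images

Sample app overview
- Uses Vision's "Machine Learning Image Analysis" feature
- Classifies the image shown by the camera and outputs the name of the object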
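
The flow of image recognition with machine learning
1. Collect image data for training (gather the teaching material)
2. Create a model from the training data using a machine learning algorithm
   Note: a model is the logic that produces an answer
   - Classification: is this image a dog or a cat?
   - Regression: numeric prediction (what will tomorrow's stock price be?)
3. Use the trained model to classify unseen images (putting it into practice)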
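
The three steps above can be sketched with a deliberately tiny classifier. This is not how ResNet50 or any image model in the talk is trained; it is a nearest-centroid classifier over made-up two-number feature vectors, chosen only because it fits in a few lines. `Example`, `CentroidModel`, and the training data are all hypothetical.

```swift
/// A labeled training example: a feature vector plus its class name.
struct Example {
    let features: [Double]
    let label: String
}

/// The "model": one average feature vector (centroid) per class.
struct CentroidModel {
    let centroids: [String: [Double]]

    /// Step 2: build the model by averaging the examples of each class.
    static func train(on examples: [Example]) -> CentroidModel {
        var sums: [String: [Double]] = [:]
        var counts: [String: Double] = [:]
        for example in examples {
            let zeros = Array(repeating: 0.0, count: example.features.count)
            let current = sums[example.label] ?? zeros
            sums[example.label] = zip(current, example.features).map { $0 + $1 }
            counts[example.label, default: 0] += 1
        }
        var centroids: [String: [Double]] = [:]
        for (label, sum) in sums {
            let count = counts[label] ?? 1
            centroids[label] = sum.map { $0 / count }
        }
        return CentroidModel(centroids: centroids)
    }

    /// Step 3: classify unseen input by choosing the closest centroid.
    func predict(_ features: [Double]) -> String? {
        func squaredDistance(_ a: [Double], _ b: [Double]) -> Double {
            return zip(a, b).reduce(0.0) { sum, pair in
                sum + (pair.0 - pair.1) * (pair.0 - pair.1)
            }
        }
        let closest = centroids.min(by: { lhs, rhs in
            squaredDistance(lhs.value, features) < squaredDistance(rhs.value, features)
        })
        return closest?.key
    }
}

// Step 1: collect labeled data (made-up numbers standing in for image features).
let trainingData = [
    Example(features: [0.9, 0.1], label: "dog"),
    Example(features: [0.8, 0.2], label: "dog"),
    Example(features: [0.1, 0.9], label: "cat"),
    Example(features: [0.2, 0.8], label: "cat")
]
let model = CentroidModel.train(on: trainingData)
print(model.predict([0.7, 0.3]) ?? "unknown")  // prints "dog"
```

The point is the shape of the workflow: training happens once and produces a reusable model, and prediction only reads that model.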
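
Model creation is skipped
- Collecting and cleaning training data is fairly laborious
- It takes a reasonably powerful machine and computation time
- It takes knowledge of machine learning

Preparing the model
- For simplicity, use a pre-trained model
- Models are available on Apple's site (in .mlmodel format)
  https://developer.apple.com/machine-learning/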
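
Model list
The kinds of images a model handles well, and its file size, differ from model to model (5 MB to 553.5 MB).
- MobileNets
- SqueezeNet
- Places205-GoogLeNet
- ResNet50
- Inception v3
- VGG16

This time: ResNet50
- 1000 categories such as trees, animals, food, vehicles, and people
- Size: 102.6 MB
- MIT License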
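
Adding the model to the project

Drag and drop it into Xcode

The model class is generated automatically
- A model class is created automatically in a file named <model name>.swift
- Example: Resnet50.swift (excerpt)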

Capturing camera images

    // Inside ViewController; requires `import AVFoundation`.
    private func startCapture() {
        let captureSession = AVCaptureSession()
        captureSession.sessionPreset = .photo

        // Input: the default video camera
        guard let captureDevice = AVCaptureDevice.default(for: .video),
              let input = try? AVCaptureDeviceInput(device: captureDevice),
              captureSession.canAddInput(input) else { return }
        captureSession.addInput(input)

        // Output: deliver each frame to the sample buffer delegate
        let output = AVCaptureVideoDataOutput()
        output.setSampleBufferDelegate(self, queue: DispatchQueue(label: "VideoQueue"))
        guard captureSession.canAddOutput(output) else { return }
        captureSession.addOutput(output)

        // Preview: show the camera feed behind the other views
        let previewLayer = AVCaptureVideoPreviewLayer(session: captureSession)
        previewLayer.videoGravity = .resizeAspectFill
        previewLayer.frame = view.bounds
        view.layer.insertSublayer(previewLayer, at: 0)

        // Start capturing
        captureSession.startRunning()
    }

Delegate called for every captured frame

    extension ViewController: AVCaptureVideoDataOutputSampleBufferDelegate {
        func captureOutput(_ output: AVCaptureOutput,
                           didOutput sampleBuffer: CMSampleBuffer,
                           from connection: AVCaptureConnection) {
            // Convert the CMSampleBuffer to a CVPixelBuffer
            guard let pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) else { return }

            // The Vision.framework processing (the image recognition step) goes here
        }
    }

The image recognition step

Main classes used with Vision
- VNCoreMLModel
- VNCoreMLRequest
- VNImageRequestHandler
- VNObservation

VNCoreMLModel
- A container class for using a Core ML model with Vision
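
VNCoreMLRequest
- A class for requesting image recognition from Core ML
- The type of result is determined by the model's output format:
  - image → class (classification result)
  - image → feature values
  - image → image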

VNImageRequestHandler
- A class that runs one or more image recognition requests (VNCoreMLRequest) on a single image
- The format of the image to analyze is specified at initialization:
  - CVPixelBuffer
  - CIImage
  - CGImage
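
VNObservation
- The abstract class for image recognition results
- One of its subclasses is returned as the result
- Has a confidence property expressing how confident the recognition is (VNConfidence, a type alias for Float)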

VNObservation subclasses
- VNClassificationObservation: has an identifier property holding the class name
- VNCoreMLFeatureValueObservation: has a featureValue property holding the feature data
- VNPixelBufferObservation: has a pixelBuffer property holding the image data
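
Because the concrete subclass depends on the model's output, code that handles results from more than one kind of model usually has to check which subclass it received. One way to do that is sketched below. The function name `describe` and the wording of the returned strings are made up for this example.

```swift
#if canImport(Vision)
import Vision
import CoreML
import CoreVideo

/// Returns a short description of an observation, based on its subclass.
func describe(_ observation: VNObservation) -> String {
    switch observation {
    case let classification as VNClassificationObservation:
        // Classifier models: a label plus a confidence
        return "class \(classification.identifier) (confidence \(classification.confidence))"
    case let feature as VNCoreMLFeatureValueObservation:
        // Models that output raw feature values
        return "feature value: \(feature.featureValue)"
    case let image as VNPixelBufferObservation:
        // Image-to-image models
        let width = CVPixelBufferGetWidth(image.pixelBuffer)
        let height = CVPixelBufferGetHeight(image.pixelBuffer)
        return "image \(width)x\(height)"
    default:
        return "other observation (confidence \(observation.confidence))"
    }
}
#endif
```

The sample app in this talk only uses a classifier, so it casts straight to `[VNClassificationObservation]` instead.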

To summarize
- VNCoreMLModel (the bundled model)
- VNCoreMLRequest (the image recognition request)
- VNImageRequestHandler (runs the request)
- VNObservation (the recognition result)

Concrete implementation code

Initializing the model class

    // Initialize the Core ML model class
    guard let model = try? VNCoreMLModel(for: Resnet50().model) else { return }

Creating the image recognition request

    // Create the image recognition request (arguments: the model and a completion handler)
    let request = VNCoreMLRequest(model: model) { [weak self] (request: VNRequest, error: Error?) in
        guard let results = request.results as? [VNClassificationObservation] else { return }

        // Show up to the top 3 results with their confidence.
        // An identifier can list several comma-separated names, so keep only the first.
        // `map` is used because each observation yields exactly one line of text.
        let displayText = results.prefix(3)
            .map { "\(Int($0.confidence * 100))% \($0.identifier.components(separatedBy: ", ")[0])" }
            .joined(separator: "\n")

        DispatchQueue.main.async {
            self?.textView.text = displayText
        }
    }
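
The formatting logic inside the completion handler can be tried without a camera or a model. The sketch below pulls it into a plain function. `Prediction` is a hypothetical stand-in for the two `VNClassificationObservation` properties the handler reads, the confidence values are made up, and the explicit sort is an addition so that the helper does not depend on the order of its input.

```swift
import Foundation

/// Hypothetical stand-in for the two VNClassificationObservation
/// properties that the handler reads.
struct Prediction {
    let identifier: String
    let confidence: Float
}

/// Formats the top `limit` predictions as "NN% label", one per line.
func displayText(for predictions: [Prediction], limit: Int = 3) -> String {
    return predictions
        .sorted { $0.confidence > $1.confidence }
        .prefix(limit)
        .map { prediction -> String in
            // Truncate the confidence to a whole percentage
            let percent = Int(prediction.confidence * 100)
            // Keep only the first of several comma-separated names
            let label = prediction.identifier.components(separatedBy: ", ")[0]
            return "\(percent)% \(label)"
        }
        .joined(separator: "\n")
}

// Made-up sample data for illustration
let sample = [
    Prediction(identifier: "tabby, tabby cat", confidence: 0.5),
    Prediction(identifier: "tiger cat", confidence: 0.25),
    Prediction(identifier: "Egyptian cat", confidence: 0.125),
    Prediction(identifier: "lynx, catamount", confidence: 0.0625)
]
print(displayText(for: sample))
// 50% tabby
// 25% tiger cat
// 12% Egyptian cat
```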

Running the image recognition request

    // Run the image recognition request on the CVPixelBuffer
    try? VNImageRequestHandler(cvPixelBuffer: pixelBuffer, options: [:]).perform([request])

The finished image recognition step

    // Convert the captured frame to a CVPixelBuffer
    guard let pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) else { return }

    // Wrap the Core ML model for use with Vision
    guard let model = try? VNCoreMLModel(for: Resnet50().model) else { return }

    // Build the request; the handler shows the top 3 labels with their confidence
    let request = VNCoreMLRequest(model: model) { [weak self] (request: VNRequest, error: Error?) in
        guard let results = request.results as? [VNClassificationObservation] else { return }

        let displayText = results.prefix(3)
            .map { "\(Int($0.confidence * 100))% \($0.identifier.components(separatedBy: ", ")[0])" }
            .joined(separator: "\n")

        DispatchQueue.main.async {
            self?.textView.text = displayText
        }
    }

    // Run the request on this frame
    try? VNImageRequestHandler(cvPixelBuffer: pixelBuffer, options: [:]).perform([request])
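
The code above builds the `VNCoreMLModel` and the request inside the per-frame callback. One possible refinement, sketched here under assumptions, is to build them once and reuse them for every frame. `FrameClassifier` and its `onResult` callback are hypothetical names, and the Core ML model is passed in so that the sketch does not depend on the generated `Resnet50` class.

```swift
#if canImport(Vision)
import Vision
import CoreML
import CoreVideo

/// Hypothetical helper that creates the Vision model and request once
/// and reuses them for every camera frame.
final class FrameClassifier {
    private let request: VNCoreMLRequest

    /// `mlModel` is the Core ML model to wrap, for example `Resnet50().model`.
    /// `onResult` receives the classifications for each processed frame.
    init(mlModel: MLModel,
         onResult: @escaping ([VNClassificationObservation]) -> Void) throws {
        let visionModel = try VNCoreMLModel(for: mlModel)
        request = VNCoreMLRequest(model: visionModel) { finishedRequest, _ in
            let results = finishedRequest.results as? [VNClassificationObservation] ?? []
            onResult(results)
        }
    }

    /// Runs the cached request against one frame.
    func classify(_ pixelBuffer: CVPixelBuffer) throws {
        let handler = VNImageRequestHandler(cvPixelBuffer: pixelBuffer, options: [:])
        try handler.perform([request])
    }
}
#endif
```

With this shape, the view controller would create one `FrameClassifier` when it loads and call `classify(_:)` from `captureOutput`, while the handler still needs to hop to the main queue before touching the UI.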

Demo video

What is a tabby?
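
tabby = tora-neko (tiger cat)!
A tora-neko is a cat with a tiger-like striped coat, also called a tabby. Besides plain stripes, the pattern varies widely: broken stripes, spotted patterns, blotched markings, and finely broken stripes. (Quoted from Wikipedia)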
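
Summary
- With a pre-trained model, the implementation itself is simple!
- It even tells you the kind of cat
- Being able to build models yourself would broaden the possibilities further

What I originally wanted to build
- An app that automatically adds hashtags for Instagram
- The caption API that accepts hashtags has already been discontinued \(^o^)/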

Sample code
The sample code shown in this talk is available here:
https://github.com/shtnkgm/VisionFrameworkSample
Note: be mindful of the NDA when publishing screenshots

The End