20170829_iOSLT_Machine Learning and Vision.framework
shtnkgm
September 20, 2017
An explanation and demo of Vision.framework, which was added in iOS 11, with some machine learning fundamentals along the way.
Transcript
Machine Learning and Vision.framework
Shota Nakagami / @shtnkgm
2017/8/29
Topics
- A basic explanation of Vision.framework
- An overview of machine learning
- A sample app that uses Vision to classify camera images
What is Vision.framework?
- A framework added in iOS 11 that provides image recognition APIs
- It abstracts Core ML, the machine learning framework that was also added in iOS 11
Machine learning stack
What is a neural network?
- A type of machine learning technique
- A mathematical model of the neural circuits of the human brain
- Abbreviated as NN (for example DNN [1], RNN [2], CNN [3])
[1] Deep Neural Network
[2] Recurrent Neural Network
[3] Convolutional Neural Network
What Vision can recognize
What Vision can recognize (1)
- Face Detection and Recognition
- Barcode Detection
- Image Alignment Analysis
- Text Detection
- Horizon Detection
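As a rough illustration of one of the built-in detectors listed above, the following sketch runs face detection on a still image without any Core ML model. The function names `detectFaceRects` and `pixelRect` are made up for this example and do not come from the talk. Vision reports bounding boxes normalized to 0...1 with the origin at the bottom-left, so the hypothetical helper converts them to top-left pixel coordinates.

```swift
import CoreGraphics
import Vision

/// Runs Vision's built-in face detector on a still image.
/// Returns bounding boxes normalized to 0...1 with the origin at the bottom-left.
func detectFaceRects(in image: CGImage) throws -> [CGRect] {
    let request = VNDetectFaceRectanglesRequest()
    let handler = VNImageRequestHandler(cgImage: image, options: [:])
    try handler.perform([request])
    let faces = (request.results as? [VNFaceObservation]) ?? []
    return faces.map { $0.boundingBox }
}

/// Hypothetical helper: converts a normalized, bottom-left-origin rect
/// into pixel coordinates with a top-left origin (UIKit style).
func pixelRect(fromNormalized rect: CGRect, imageWidth: CGFloat, imageHeight: CGFloat) -> CGRect {
    return CGRect(x: rect.minX * imageWidth,
                  y: (1 - rect.maxY) * imageHeight,
                  width: rect.width * imageWidth,
                  height: rect.height * imageHeight)
}
```

The other built-in detectors follow the same pattern: create a request, hand it to a `VNImageRequestHandler`, then read the request's results.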
What Vision can recognize (2): features that require you to provide a machine learning model
- Object Detection and Tracking
- Machine Learning Image Analysis
Building a sample app that classifies camera images
Sample app overview
- Uses Vision's "Machine Learning Image Analysis" feature
- Classifies the image seen by the camera and outputs the name of the object
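Image recognition workflow with machine learning
1. Collect image data for training (gather the teaching material)
2. Build a model from the training data with a machine learning algorithm
   Note: a model is the logic that produces an answer
   Classification: is this image a dog or a cat?
   Regression: numeric prediction (what will tomorrow's stock price be?)
3. Use the trained model to classify unseen images (putting it into practice)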
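The three steps above can be sketched with a deliberately tiny classifier. This is not the technique the talk's models use (those are neural networks). It is a nearest-centroid classifier, chosen only because it fits in a few lines, and the type name `NearestCentroidModel` and the training data are invented for illustration.

```swift
/// A toy "model": one mean feature vector (centroid) per label.
struct NearestCentroidModel {
    let centroids: [String: [Double]]

    /// Step 2: "training" averages the feature vectors of each label.
    init(trainingData: [(features: [Double], label: String)]) {
        var sums: [String: [Double]] = [:]
        var counts: [String: Double] = [:]
        for (features, label) in trainingData {
            let current = sums[label] ?? Array(repeating: 0, count: features.count)
            sums[label] = zip(current, features).map { $0 + $1 }
            counts[label, default: 0] += 1
        }
        centroids = sums.reduce(into: [:]) { result, entry in
            let n = counts[entry.key] ?? 1
            result[entry.key] = entry.value.map { $0 / n }
        }
    }

    /// Step 3: classify unseen input by the closest centroid.
    func predict(_ features: [Double]) -> String? {
        return centroids.min { a, b in
            distance(a.value, features) < distance(b.value, features)
        }?.key
    }

    /// Squared Euclidean distance between two feature vectors.
    private func distance(_ a: [Double], _ b: [Double]) -> Double {
        return zip(a, b).map { ($0 - $1) * ($0 - $1) }.reduce(0, +)
    }
}

// Step 1: made-up training data (two numeric features per sample).
let model = NearestCentroidModel(trainingData: [
    (features: [1.0, 1.0], label: "cat"),
    (features: [1.2, 0.8], label: "cat"),
    (features: [5.0, 5.0], label: "dog"),
    (features: [4.8, 5.2], label: "dog"),
])
let label = model.predict([1.1, 0.9])
```

A real image classifier replaces the two hand-written features with pixel data and the centroids with learned network weights, but the collect, train, predict split is the same.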
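Model creation is omitted
- Collecting and formatting training data is fairly laborious
- It needs reasonable machine specs and computation time
- It needs machine learning knowledge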
Preparing the model
For simplicity, use a pretrained model.
Models are distributed on Apple's site (.mlmodel format):
https://developer.apple.com/machine-learning/
Model list
The kinds of images a model handles well and its file size differ from model to model (5 MB to 553.5 MB).
- MobileNets
- SqueezeNet
- Places205-GoogLeNet
- ResNet50
- Inception v3
- VGG16
This time we use ResNet50
- 1000 categories such as trees, animals, food, vehicles, and people
- Size: 102.6 MB
- MIT license
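Adding the model to the project
Drag and drop it into Xcode
The model class is generated automatically
A model class named <model name>.swift is created automatically.
Example: Resnet50.swift (excerpt)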
Capturing camera images
    private func startCapture() {
        let captureSession = AVCaptureSession()
        captureSession.sessionPreset = .photo

        // Configure the input (the default video camera)
        guard let captureDevice = AVCaptureDevice.default(for: .video),
              let input = try? AVCaptureDeviceInput(device: captureDevice),
              captureSession.canAddInput(input) else { return }
        captureSession.addInput(input)

        // Configure the output (frames are delivered to the delegate on a background queue)
        let output = AVCaptureVideoDataOutput()
        output.setSampleBufferDelegate(self, queue: DispatchQueue(label: "VideoQueue"))
        guard captureSession.canAddOutput(output) else { return }
        captureSession.addOutput(output)

        // Configure the preview layer
        let previewLayer = AVCaptureVideoPreviewLayer(session: captureSession)
        previewLayer.videoGravity = .resizeAspectFill
        previewLayer.frame = view.bounds
        view.layer.insertSublayer(previewLayer, at: 0)

        // Start capturing
        captureSession.startRunning()
    }
Delegate called for every captured frame

    extension ViewController: AVCaptureVideoDataOutputSampleBufferDelegate {
        func captureOutput(_ output: AVCaptureOutput,
                           didOutput sampleBuffer: CMSampleBuffer,
                           from connection: AVCaptureConnection) {
            // Convert the CMSampleBuffer to a CVPixelBuffer
            guard let pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) else { return }
            // The Vision.framework processing (image recognition) goes here
        }
    }
Image recognition processing
Main classes used with Vision
- VNCoreMLModel
- VNCoreMLRequest
- VNImageRequestHandler
- VNObservation
VNCoreMLModel
- A container class for handling a Core ML model in Vision
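VNCoreMLRequest
- A class for requesting image recognition from Core ML
- The kind of result is determined by the model's output format
  - image → class (classification result)
  - image → feature values
  - image → image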
VNImageRequestHandler
- A class for running one or more image recognition requests (VNCoreMLRequest) against a single image
- The format of the image to recognize is specified at initialization
  - CVPixelBuffer
  - CIImage
  - CGImage
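VNObservation
- The abstract class for image recognition results
- One of its subclasses is returned as the result
- Has a confidence property expressing how confident the recognition is (VNConfidence is a typealias for Float)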
VNObservation subclasses
- VNClassificationObservation: has an identifier property holding the class name
- VNCoreMLFeatureValueObservation: has a featureValue property holding the feature data
- VNPixelBufferObservation: has a pixelBuffer property holding the image data
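One way to handle the three result types is to switch on the concrete subclass. The sketch below is only an illustration; the function name `describe` is hypothetical, and which case is hit depends on the output format of the model in use.

```swift
import CoreGraphics
import CoreVideo
import Vision

/// Summarizes a Vision result according to its concrete subclass.
func describe(_ observation: VNObservation) -> String {
    switch observation {
    case let classification as VNClassificationObservation:
        // image → class
        return "class: \(classification.identifier), confidence: \(classification.confidence)"
    case let feature as VNCoreMLFeatureValueObservation:
        // image → feature values
        return "feature value with MLFeatureType raw value \(feature.featureValue.type.rawValue)"
    case let image as VNPixelBufferObservation:
        // image → image
        let width = CVPixelBufferGetWidth(image.pixelBuffer)
        let height = CVPixelBufferGetHeight(image.pixelBuffer)
        return "image: \(width)x\(height)"
    default:
        return "other observation, confidence: \(observation.confidence)"
    }
}
```

For a classifier such as ResNet50 the results are expected to be `VNClassificationObservation` values, which is why the sample app casts directly to that type.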
To summarize
- VNCoreMLModel (the model added to the project)
- VNCoreMLRequest (the image recognition request)
- VNImageRequestHandler (runs the request)
- VNObservation (the recognition result)
Concrete implementation code
Initializing the model class

    // Wrap the Core ML model class so that Vision can use it
    guard let model = try? VNCoreMLModel(for: Resnet50().model) else { return }
Creating the image recognition request

    // Create the image recognition request (arguments: the model and a completion handler)
    let request = VNCoreMLRequest(model: model) { [weak self] (request: VNRequest, error: Error?) in
        guard let results = request.results as? [VNClassificationObservation] else { return }

        // Show up to the top 3 results with their confidence.
        // An identifier can list several comma-separated names, so take only the first one.
        let displayText = results.prefix(3)
            .map { "\(Int($0.confidence * 100))% \($0.identifier.components(separatedBy: ", ")[0])" }
            .joined(separator: "\n")

        DispatchQueue.main.async {
            self?.textView.text = displayText
        }
    }
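The formatting step inside the handler can be tried on its own, without a camera or a model. In the sketch below, `Classification` is a hypothetical stand-in for `VNClassificationObservation` that carries only the two properties the handler reads, and the sample identifiers and confidences are made up.

```swift
import Foundation

/// Hypothetical stand-in for VNClassificationObservation (identifier + confidence only).
struct Classification {
    let identifier: String
    let confidence: Float
}

/// Formats up to `limit` results as "NN% name", one per line.
/// Results are assumed to be sorted by descending confidence already.
func displayText(for results: [Classification], limit: Int = 3) -> String {
    return results.prefix(limit)
        .map { result -> String in
            // Keep only the first name of a comma-separated identifier.
            let name = result.identifier.components(separatedBy: ", ").first ?? result.identifier
            return "\(Int(result.confidence * 100))% \(name)"
        }
        .joined(separator: "\n")
}

// Made-up sample results, for illustration only.
let text = displayText(for: [
    Classification(identifier: "tabby, tabby cat", confidence: 0.75),
    Classification(identifier: "tiger cat", confidence: 0.125),
    Classification(identifier: "Egyptian cat", confidence: 0.0625),
    Classification(identifier: "lynx, catamount", confidence: 0.03125),
])
```

`Int(...)` truncates, so a confidence of 0.125 is shown as 12%, and the fourth entry is dropped by `prefix(3)`.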
Running the image recognition request

    // Run the image recognition request against the CVPixelBuffer
    try? VNImageRequestHandler(cvPixelBuffer: pixelBuffer, options: [:]).perform([request])
The completed image recognition code

    // Convert the CMSampleBuffer to a CVPixelBuffer
    guard let pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) else { return }

    // Wrap the Core ML model for Vision
    guard let model = try? VNCoreMLModel(for: Resnet50().model) else { return }

    // Build the request; the handler shows the top 3 results on the main thread
    let request = VNCoreMLRequest(model: model) { [weak self] (request: VNRequest, error: Error?) in
        guard let results = request.results as? [VNClassificationObservation] else { return }

        let displayText = results.prefix(3)
            .map { "\(Int($0.confidence * 100))% \($0.identifier.components(separatedBy: ", ")[0])" }
            .joined(separator: "\n")

        DispatchQueue.main.async {
            self?.textView.text = displayText
        }
    }

    // Run the request against the current frame
    try? VNImageRequestHandler(cvPixelBuffer: pixelBuffer, options: [:]).perform([request])
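The code above rebuilds the Vision model and the request for every frame. A possible variation, not taken from the talk, is to build them once and reuse them. The sketch below assumes a hypothetical wrapper class named `FrameClassifier`, and it accepts any `MLModel`, so it is not tied to ResNet50.

```swift
import CoreML
import CoreVideo
import Vision

/// Hypothetical wrapper that builds the Vision model and request once
/// and reuses them for every frame.
final class FrameClassifier {
    private let request: VNCoreMLRequest

    /// `onResults` is called from Vision's completion handler,
    /// so hop to the main queue before touching the UI.
    init(model: MLModel, onResults: @escaping ([VNClassificationObservation]) -> Void) throws {
        let visionModel = try VNCoreMLModel(for: model)
        request = VNCoreMLRequest(model: visionModel) { request, _ in
            guard let results = request.results as? [VNClassificationObservation] else { return }
            onResults(results)
        }
    }

    /// Runs the cached request against one camera frame.
    func classify(_ pixelBuffer: CVPixelBuffer) {
        let handler = VNImageRequestHandler(cvPixelBuffer: pixelBuffer, options: [:])
        try? handler.perform([request])
    }
}
```

Under these assumptions, the view controller would create one `FrameClassifier` with `Resnet50().model` and call `classify(_:)` from `captureOutput`. Only the per-frame `VNImageRequestHandler` is created each time, since a handler is bound to a single image.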
Demo video
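What is a tabby?
tabby = tiger cat (toraneko)!
A toraneko is a cat with tiger-like striped markings. It is also called a tabby. Besides plain stripes, the markings vary: the stripes may break up partway into spotted or mottled patterns, or into finely interrupted stripes. (Source: Wikipedia)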
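Summary
- With a pretrained model, the implementation itself is easy!
- It even tells you what kind of cat you are looking at
- Being able to build models yourself would open up even more possibilities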
What I originally wanted to do
- An automatic hashtag app for Instagram
- The caption API for passing hashtags has already been discontinued \(^o^)/
Sample code
The sample code shown in this talk is available here:
https://github.com/shtnkgm/VisionFrameworkSample
Note: be careful about the NDA when publishing screenshots.
The end