Real-Time Video Processing on macOS

Biacco42
November 02, 2024

macOS native Symposium #10

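macOS is a platform with powerful image processing capabilities. Drawing on real cases from production work, this talk presents an overall picture of video processing on macOS.

Following the flow from input through real-time processing to output, it explains how component technologies such as Core Image, Metal, and Video Toolbox work together to achieve efficient image processing. macOS has many frameworks that can be used for video processing; by clarifying their roles and relationships, the talk shares insights into the high-performance image processing that macOS makes possible.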

Transcript

  1. Self-introduction

    @Biacco42, a.k.a. びあっこ (Biacco). たのしい人生. Self-Made Keyboards in Japan Evangelist.
    YouTube ほぼ週刊キーボードニュース / ITmedia / HHKB Life
    Works: Ergo42 / meishi2
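  2. Qoncept's broadcast-related work

    • Paris 2024 Olympic and Paralympic Games
    • JGTO / JLPGA / US PGA Tour
    • Professional baseball 2024 (strike-zone crossing-point display)
    • Japan Series
    • WBC 2023 (batted-ball analysis, pitch trajectories)
    • World swimming championships
    • World Figure Skating Championships
    • All-Japan and All-Japan Junior Figure Skating Championships
    • Samurai Japan (baseball)
    • Professional baseball All-Star Game
    • Japan Championships in Athletics: 100 m / 200 m / long jump / triple jump / javelin ...
    • Grand sumo tachiai (initial charge) analysis, and more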
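  3. Why use macOS

    • Apple Silicon performance (and the pace at which it keeps improving)
    • AI acceleration with Core ML and the Apple Neural Engine
    • Unified Memory (CPU ↔ GPU data transfer)
    • GPU implementation with Metal (MTLTexture ↔ CVPixelBuffer conversion)
    • Hardware ProRes / H.264 encoders and decoders built in as standard
    • Development environment (Xcode / Swift) → code can be ported as-is to iOS / iPadOS
    For Qoncept, developing on macOS is overwhelmingly cost-effective.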
  4. macOS image processing frameworks

    [Diagram] Frameworks: AVFoundation, Core Media, Video Toolbox, Metal, Core Video, Core Image, Core Animation, Core Graphics, Vision, Core ML, Accelerate, AppKit, SceneKit, SpriteKit, MetalKit.
    Groupings shown: I/O, Video Data Model, Image Processing, Renderer / Presenter.
  5. macOS image processing frameworks (the same framework diagram as slide 4)
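  6. AVFoundation: rotating video frames

    Rotating the device rotates the UI, but the camera is fixed to the device and does not rotate. Camera input frames therefore end up in an unintended orientation, but they can be rotated easily, with hardware assistance, by using AVCaptureConnection.

        if let connection = videoOutput.connection(with: .video),
           connection.isVideoRotationAngleSupported(rotation) {
            connection.videoRotationAngle = rotation  // a rotation angle can be specified (iOS 17+)
            connection.isVideoMirrored = false        // mirroring for the front camera can also be set
        }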
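  7. Video Toolbox

    Video Toolbox gives lower-level, more direct control over the hardware encoders and decoders than AVFoundation. It is useful, for example, when sending and receiving H.264 / HEVC video as NALU packets. Here too, the hardware encoders and decoders are built in as standard, so it is easy to use.

    [Diagram: a byte stream of NAL units (SPS, PPS, then further NALUs), each preceded by the start code 00 00 00 01 and beginning with a NALU header]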
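The byte-stream layout on this slide can be made concrete with a small parser. The following is a minimal sketch, not the speaker's code: `splitAnnexB` and `h264NALUnitType` are hypothetical helper names, only the four-byte start code 00 00 00 01 shown on the slide is handled (real streams may also use the three-byte form 00 00 01), and the header layout assumed is the one-byte H.264 NALU header, where type 7 is SPS and type 8 is PPS.

```swift
/// Splits a byte stream into NAL units at each 4-byte start code (00 00 00 01).
/// Hypothetical helper for illustration; 3-byte start codes are not handled.
func splitAnnexB(_ stream: [UInt8]) -> [[UInt8]] {
    let startCode: [UInt8] = [0x00, 0x00, 0x00, 0x01]
    var units: [[UInt8]] = []
    var current: [UInt8]? = nil   // nil until the first start code is seen
    var i = 0
    while i < stream.count {
        if i + 4 <= stream.count && Array(stream[i..<i + 4]) == startCode {
            // A start code closes the unit in progress and opens a new one.
            if let unit = current, !unit.isEmpty { units.append(unit) }
            current = []
            i += 4
        } else {
            current?.append(stream[i])   // bytes before the first start code are ignored
            i += 1
        }
    }
    if let unit = current, !unit.isEmpty { units.append(unit) }
    return units
}

/// H.264 stores nal_unit_type in the low 5 bits of the one-byte NALU header.
func h264NALUnitType(_ unit: [UInt8]) -> UInt8? {
    guard let header = unit.first else { return nil }
    return header & 0x1F
}

// Usage with made-up payload bytes: SPS (0x67), PPS (0x68), IDR slice (0x65).
let stream: [UInt8] = [0, 0, 0, 1, 0x67, 0x42,
                       0, 0, 0, 1, 0x68, 0xCE,
                       0, 0, 0, 1, 0x65, 0x88]
let units = splitAnnexB(stream)
let types = units.compactMap { h264NALUnitType($0) }   // 7, 8, 5
```

HEVC uses a two-byte NALU header with a different bit layout, so `h264NALUnitType` would need to change for HEVC streams.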
  8. Video Toolbox: encoding example

        // compressionSession, prop, and OutputHandler are defined elsewhere (not shown on the slide).
        func encode(
            frame sampleBuffer: CMSampleBuffer,
            with handler: @escaping OutputHandler
        ) {
            guard let buffer = CMSampleBufferGetImageBuffer(sampleBuffer) else { return }
            let tm = CMSampleBufferGetPresentationTimeStamp(sampleBuffer)
            let dur = CMSampleBufferGetDuration(sampleBuffer)

            // Submit the frame; the trailing closure receives the encoded sample.
            VTCompressionSessionEncodeFrame(
                compressionSession,
                imageBuffer: buffer,
                presentationTimeStamp: tm,
                duration: dur,
                frameProperties: prop,
                infoFlagsOut: nil
            ) { (status, flag, sample) in
                handler(status, flag, sample)
            }
        }
  9. Core Media / Core Video

    Core Media provides data types for handling time-based media (video and audio), especially media that involves encoding and decoding. It can therefore also hold data in its encoded state. It is used for the input and output of AVFoundation and Video Toolbox. Core Video holds the actual video frame data, which can be handled by the GPU.

    [Diagram: a CMSampleBuffer wrapping a CVPixelBuffer, and a CMSampleBuffer wrapping a CMBlockBuffer, each together with a CMVideoFormatDescription and [CMSampleTimingInfo]]
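  10. Core Media / Core Video: CVPixelBufferPool and the camera

    In video processing you often want to buffer frame data. If you simply hold on to the CVPixelBuffer, however, frames may stop arriving from the camera. Because a CVPixelBuffer is expensive to create, the camera uses a CVPixelBufferPool that recycles CVPixelBuffers. If buffers are not returned, the pool is exhausted and the camera stops delivering frames, so handle this by copying the data or similar.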
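One way to apply the "copy the data" advice is to deep-copy each frame you want to keep, so the camera's buffer goes back to the pool as soon as the capture callback returns. This is a minimal sketch under stated assumptions, not the speaker's implementation: `copyPixelBuffer` is a hypothetical helper, the destination is allocated with `CVPixelBufferCreate` rather than from a pool of your own, and attachments such as color-space metadata are not copied.

```swift
import CoreVideo
import Foundation

/// Deep-copies a pixel buffer so the original can be released back to the camera's pool.
/// Hypothetical helper; returns nil if the destination cannot be allocated.
func copyPixelBuffer(_ source: CVPixelBuffer) -> CVPixelBuffer? {
    var copyOut: CVPixelBuffer?
    let status = CVPixelBufferCreate(
        kCFAllocatorDefault,
        CVPixelBufferGetWidth(source),
        CVPixelBufferGetHeight(source),
        CVPixelBufferGetPixelFormatType(source),
        nil,
        &copyOut
    )
    guard status == kCVReturnSuccess, let copy = copyOut else { return nil }

    CVPixelBufferLockBaseAddress(source, .readOnly)
    CVPixelBufferLockBaseAddress(copy, [])
    defer {
        CVPixelBufferUnlockBaseAddress(copy, [])
        CVPixelBufferUnlockBaseAddress(source, .readOnly)
    }

    // Copy row by row because the two buffers may use different row padding.
    func copyRows(from src: UnsafeMutableRawPointer?, srcStride: Int,
                  to dst: UnsafeMutableRawPointer?, dstStride: Int, rows: Int) {
        guard let src = src, let dst = dst else { return }
        let bytes = min(srcStride, dstStride)
        for row in 0..<rows {
            memcpy(dst + row * dstStride, src + row * srcStride, bytes)
        }
    }

    let planeCount = CVPixelBufferGetPlaneCount(source)
    if planeCount == 0 {
        // Non-planar formats such as 32BGRA.
        copyRows(from: CVPixelBufferGetBaseAddress(source),
                 srcStride: CVPixelBufferGetBytesPerRow(source),
                 to: CVPixelBufferGetBaseAddress(copy),
                 dstStride: CVPixelBufferGetBytesPerRow(copy),
                 rows: CVPixelBufferGetHeight(source))
    } else {
        // Planar formats such as bi-planar YCbCr: copy each plane separately.
        for plane in 0..<planeCount {
            copyRows(from: CVPixelBufferGetBaseAddressOfPlane(source, plane),
                     srcStride: CVPixelBufferGetBytesPerRowOfPlane(source, plane),
                     to: CVPixelBufferGetBaseAddressOfPlane(copy, plane),
                     dstStride: CVPixelBufferGetBytesPerRowOfPlane(copy, plane),
                     rows: CVPixelBufferGetHeightOfPlane(source, plane))
        }
    }
    return copy
}
```

Allocating a fresh buffer per frame is itself costly, so in a long-running pipeline it may be worth drawing the copies from a separate CVPixelBufferPool that your own code owns.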
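  11. Core Media / Core Video: CVPixelBuffer and CPU access

    As mentioned earlier, a CVPixelBuffer is kept in a state the GPU can handle. When accessing it from the CPU, always call CVPixelBufferLockBaseAddress(_:_:) so that the access does not conflict with the GPU. When the CPU access is finished, always release the lock by calling CVPixelBufferUnlockBaseAddress(_:_:).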
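The lock/unlock pairing is easy to get wrong on early returns, so one option is to wrap it in a helper that unlocks with `defer`. This is a minimal sketch: `withReadLock` is a hypothetical name, and it assumes a non-planar format such as 32BGRA (planar buffers need the per-plane accessors instead of `CVPixelBufferGetBaseAddress`).

```swift
import CoreVideo

/// Runs `body` with the buffer locked for CPU reads and always unlocks afterwards.
/// Hypothetical helper; returns nil if the lock fails or there is no base address.
func withReadLock<T>(_ buffer: CVPixelBuffer,
                     _ body: (UnsafeRawPointer, Int) -> T) -> T? {
    guard CVPixelBufferLockBaseAddress(buffer, .readOnly) == kCVReturnSuccess else {
        return nil
    }
    // The unlock flags must match the flags passed to the lock call.
    defer { CVPixelBufferUnlockBaseAddress(buffer, .readOnly) }

    guard let base = CVPixelBufferGetBaseAddress(buffer) else { return nil }
    // Pass the base pointer and the row stride in bytes to the caller's closure.
    return body(UnsafeRawPointer(base), CVPixelBufferGetBytesPerRow(buffer))
}

// Usage (pixelBuffer is assumed to come from elsewhere, e.g. a capture callback):
// let firstByte = withReadLock(pixelBuffer) { base, _ in base.load(as: UInt8.self) }
```

The pointer passed to the closure is only valid while the lock is held, so it should not be stored or returned from the closure.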
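  12. Metal: Unified Memory Architecture

    Apple M-series Macs use UMA, so memory transfers between the GPU and the CPU are unnecessary. This suits video processing, where you want to run work on both the GPU and the CPU and would otherwise need to transfer video frames frequently.

    [Diagram: Discrete: CPU and GPU each with their own memory. UMA: CPU and GPU sharing a single memory.]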
  13. Metal: Resource Storage Mode

    Under UMA, access to the same resource by the GPU and the CPU has to be controlled, and there are three modes: Shared / Private / Memoryless.
    https://developer.apple.com/documentation/metal/resource_fundamentals/choosing_a_resource_storage_mode_for_apple_gpus
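To show where the storage mode is actually chosen, here is a minimal sketch covering two of the three modes. The helper names `makeSharedBuffer` and `makePrivateTexture` are hypothetical, and the pixel format and usage flags are assumptions for illustration. Memoryless is omitted because it applies only to textures used as temporary render targets.

```swift
import Metal

/// Shared: one allocation visible to both CPU and GPU, no copy needed on unified memory.
func makeSharedBuffer(device: MTLDevice, floats: [Float]) -> MTLBuffer? {
    guard !floats.isEmpty else { return nil }
    return floats.withUnsafeBytes { raw in
        device.makeBuffer(bytes: raw.baseAddress!,
                          length: raw.count,
                          options: .storageModeShared)
    }
}

/// Private: GPU-only storage, suited to intermediate results the CPU never reads.
func makePrivateTexture(device: MTLDevice, width: Int, height: Int) -> MTLTexture? {
    let descriptor = MTLTextureDescriptor.texture2DDescriptor(
        pixelFormat: .bgra8Unorm, width: width, height: height, mipmapped: false)
    descriptor.storageMode = .private
    descriptor.usage = [.shaderRead, .shaderWrite]
    return device.makeTexture(descriptor: descriptor)
}

// Usage: the CPU can read a shared buffer directly through contents().
if let device = MTLCreateSystemDefaultDevice(),
   let buffer = makeSharedBuffer(device: device, floats: [1.5, 2.5]) {
    let first = buffer.contents().load(as: Float.self)
    print(first)
}
```

A private texture has no CPU-visible contents, so data reaches it only through GPU work such as a blit or a compute pass.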
  14. Metal / Core Video: linking CVPixelBuffer and Metal

    A CVPixelBuffer can be accessed from the GPU and converted to an MTLTexture. The path camera → CMSampleBuffer → CVPixelBuffer → MTLTexture can be used efficiently.

        private var textureCache: CVMetalTextureCache = ???  // initialization elided on the slide

        func makeTexture(from pixelBuffer: CVPixelBuffer) -> MTLTexture? {
            let w = CVPixelBufferGetWidth(pixelBuffer)
            let h = CVPixelBufferGetHeight(pixelBuffer)

            var cvMetalTexture: CVMetalTexture?
            let status = CVMetalTextureCacheCreateTextureFromImage(
                kCFAllocatorDefault, textureCache, pixelBuffer, nil,
                .bgra8Unorm, w, h, 0, &cvMetalTexture
            )
            // Return nil instead of crashing if texture creation failed.
            guard status == kCVReturnSuccess, let cvMetalTexture else { return nil }
            return CVMetalTextureGetTexture(cvMetalTexture)
        }
  15. Metal / Core Video: a caveat when using CVPixelBuffer with Metal

    The CVPixelBuffer attributes must include kCVPixelBufferMetalCompatibilityKey.

        // width, height, and pixelFormat are defined elsewhere (not shown on the slide).
        let attributes = NSMutableDictionary()
        attributes[kCVPixelBufferMetalCompatibilityKey] = true

        var copy: CVPixelBuffer?
        CVPixelBufferCreate(
            nil, width, height, pixelFormat, attributes, &copy
        )
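  16. Vision / Core ML

    Image processing with deep learning models can take advantage of the Apple Neural Engine. With the Vision framework, pretrained models make it easy to detect people, faces, text, barcodes, and more. The API changed drastically in iOS 18+.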
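  17. Core Image

    Many GPU-backed image processing filters can be applied simply by choosing them. In video processing, it is used for fast, low-cost scaling and rotation of images through a very concise API while still using the GPU. There is a type called CIImage, but it is a container like CMSampleBuffer; in practice it serves as the input and output of a rendering environment called CIContext.

    [Diagram: CVPixelBuffer → CIImage → CIContext (Scale / Rotate) → Render → CVPixelBuffer]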
  18. Core Image: rendering to a CVPixelBuffer with CIContext

        func rotate(pixelBuffer: CVPixelBuffer,
                    orientation: CGImagePropertyOrientation,
                    ciContext: CIContext) -> CVPixelBuffer? {
            let inputWidth = CVPixelBufferGetWidth(pixelBuffer)
            let inputHeight = CVPixelBufferGetHeight(pixelBuffer)
            let pixelFormat = CVPixelBufferGetPixelFormatType(pixelBuffer)
            // Body elided on the slide (presumably derives the output size from the input size and orientation).
            let (outputWidth, outputHeight): (Int, Int) = { _ }()

            var newPixelBuffer: CVPixelBuffer?
            let originalOptions: NSDictionary = CVBufferCopyAttachments(pixelBuffer, .shouldPropagate) ?? [:]
            let options: NSMutableDictionary = NSMutableDictionary(dictionary: originalOptions)
            options[kCVPixelBufferMetalCompatibilityKey] = true
            // Allocate the destination with the output size, not the input size.
            let status = CVPixelBufferCreate(
                kCFAllocatorDefault, outputWidth, outputHeight, pixelFormat, options, &newPixelBuffer
            )
            guard status == kCVReturnSuccess, let newPixelBuffer else { return nil }

            let ciImage = CIImage(cvPixelBuffer: pixelBuffer).oriented(orientation)
            ciContext.render(ciImage, to: newPixelBuffer)
            return newPixelBuffer
        }
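  19. Core Animation / Core Graphics

        // ciContext is assumed to be defined elsewhere (not shown on the slide).
        func drawOn(pixelBuffer: CVPixelBuffer, speed: String) -> CVPixelBuffer? {
            // Assumption: the overlay matches the frame size (width and height are not defined on the slide).
            let width = CVPixelBufferGetWidth(pixelBuffer)
            let height = CVPixelBufferGetHeight(pixelBuffer)

            let colorSpace = CGColorSpace(name: CGColorSpace.sRGB)!
            guard let cgContext = CGContext(
                data: nil, width: width, height: height,
                bitsPerComponent: 8, bytesPerRow: 0, space: colorSpace,
                bitmapInfo: CGImageAlphaInfo.premultipliedFirst.rawValue
            ) else { return nil }

            // The slide does not show how the layer is configured (for example, with `speed`).
            let overlayLayer = CALayer()
            overlayLayer.render(in: cgContext)

            // Composite the rendered overlay over the video frame.
            let ciImage = CIImage(cvPixelBuffer: pixelBuffer)
            guard let overlayImage = cgContext.makeImage() else { return nil }
            let overlayCI = CIImage(cgImage: overlayImage)
            let compositedImage = overlayCI.composited(over: ciImage)

            var outputPixelBuffer: CVPixelBuffer?
            CVPixelBufferCreate(
                kCFAllocatorDefault, width, height,
                CVPixelBufferGetPixelFormatType(pixelBuffer), nil, &outputPixelBuffer
            )
            guard let outputPixelBuffer else { return nil }
            ciContext.render(compositedImage, to: outputPixelBuffer)
            return outputPixelBuffer
        }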
  20. macOS image processing frameworks (the framework diagram from slide 4, shown again)
  21. Why use macOS (recap of slide 3)

    For Qoncept, developing on macOS is overwhelmingly cost-effective.