Optimizing your Swift code

Yuta Saito
January 21, 2023

Transcript

  1. About me
    • Yuta Saito / @kateinoigakukun
    • Waseda University B4
    • Working at
    • Maintainer of SwiftWasm
    • Committer to Swift / LLVM / CRuby
  2. Outline
    1. Motivation: Why is performance important for us?
    2. Background: Why is Swift “slow”?
    3. Techniques: How to write compiler-friendly code
      1. Profile! Profile!! Profile!!!
      2. Reduce dynamic dispatch
      3. Reveal hidden CoW cost
      4. Value operation cost
  3. Motivation
    When/Where does performance matter?
    • Apps sensitive to frame dropping
      • Most apps don’t need to care
      • Game apps, camera apps, etc.
      • One frame must be done within about 16 ms (60 fps) or about 8 ms (120 fps)
  4. Motivation
    When/Where does performance matter?
    Non-optimized WebAssembly is still slow
    • V8 has two AOT compilers*1:
      • Baseline (Liftoff)
      • Optimizing (TurboFan)
    • Baseline-compiled code is about 2x slower 😣
    *1: https://v8.dev/docs/wasm-compilation-pipeline
  5. Why is Swift “slow”?
    • Tends to be slower than C/C++
    • High-level language features
      • ARC (Automatic Reference Counting)
      • CoW (Copy-on-Write)
      • Protocol
    • Short code can have a large hidden cost
  6. Why is Swift “slow”?
    Automatic Reference Counting
        class Animal {
            func bark() {}
        }
        class Cat: Animal {
            override func bark() { print("meow") }
        }
        let cat = Cat()
        let animal = cat as Animal
        animal.bark()
    Q. Where will retain/release be placed?
  7. Why is Swift “slow”?
    Automatic Reference Counting
        class Animal {
            func bark() {}
        }
        class Cat: Animal {
            override func bark() { print("meow") }
        }
        let cat = Cat()
        // retain(cat)
        let animal = cat as Animal
        animal.bark()
        // release(animal)
        // release(cat)
    Q. Where will retain/release be placed?
    A. → Hidden cost!!
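One way to observe the extra reference behind this hidden cost is the standard library's `isKnownUniquelyReferenced`. The following is a minimal sketch, not part of the original slides: the helper names `isUniqueAlone` and `isUniqueWhileAliased` are hypothetical, and where exactly the compiler places retain/release calls depends on the optimization level.

```swift
class Animal {
    func bark() {}
}

final class Cat: Animal {
    override func bark() { print("meow") }
}

/// Hypothetical helper: with a single strong reference,
/// the object is uniquely referenced.
func isUniqueAlone() -> Bool {
    var cat = Cat()
    return isKnownUniquelyReferenced(&cat)
}

/// Hypothetical helper: binding the same object to a second variable
/// requires a retain so that both references stay valid.
func isUniqueWhileAliased() -> Bool {
    var cat = Cat()
    let animal: Animal = cat
    let unique = isKnownUniquelyReferenced(&cat)
    // Keep `animal` alive until after the uniqueness check.
    withExtendedLifetime(animal) {}
    return unique
}

print(isUniqueAlone(), isUniqueWhileAliased())
```

The second helper is expected to report a non-unique reference because `animal` still holds the object while the check runs.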
  8. What do you mean “compiler-friendly”?
    • Easy for compilers to optimize
    • Compilers can optimize only a limited set of program patterns
    • Hand-written annotations that restrict the program help the compiler optimize
  9. Profile! Profile!! Profile!!!
    Performance is often bound by non-CPU work
    • GPU
      • Alpha blending
    • Event latency
    • Blocking IO
      • Disk IO
      • Network IO
    Do we really need to optimize CPU work?
  10. Reduce dynamic dispatch
    Dynamic dispatch happens when:
    • A method is called through a class instance
    • A method is called through a protocol type
  11. Reduce dynamic dispatch
    Class instance methods
    • Avoid the open access modifier
        open class Animal {
            open func bark() {}
        }
        func useAnimal(_ x: Animal) {
            x.bark() // Animal.bark can be overridden outside the module
        }
    • Check whether the compiler can know the callee method at compile time 🧐
    • Use the most specific type possible (Cat < Animal < Any)
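As a rough illustration of "letting the compiler know the callee", the sketch below contrasts a call through the base class with a call through a `final` subclass. The function names `barkViaBase` and `barkViaFinal` are hypothetical, and whether a given call is actually devirtualized depends on the optimizer and build settings.

```swift
class Animal {
    func bark() -> String { "..." }
}

// `final` tells the compiler that no subclass can override `bark`.
final class Cat: Animal {
    override func bark() -> String { "meow" }
}

// Static type `Animal`: the callee depends on the runtime class,
// so this is a dynamic (vtable) call unless the optimizer can prove otherwise.
func barkViaBase(_ x: Animal) -> String {
    x.bark()
}

// Static type `Cat`, which is final: the callee is known at compile time.
func barkViaFinal(_ x: Cat) -> String {
    x.bark()
}

let cat = Cat()
print(barkViaBase(cat), barkViaFinal(cat))
```

Both calls return the same result; the difference is only in how much the compiler knows about the callee.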
  12. Reduce dynamic dispatch
    Protocol methods
    • Avoid existential containers so that the code is specialization-friendly
        func usePingable(_ x: Pingable) { x.ping() }
        // Better if possible 👍
        func usePingable(_ x: some Pingable) { x.ping() }
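The two signatures above can be tried side by side with a small self-contained sketch. The slides do not show `Pingable` itself, so the protocol requirement, the `Server` type, and the function names here are assumptions made for illustration. The `some` parameter syntax needs Swift 5.7 or later.

```swift
// Assumed shape of the protocol; the slides only show its name.
protocol Pingable {
    func ping() -> Int
}

// Hypothetical conforming type.
struct Server: Pingable {
    func ping() -> Int { 200 }
}

// Existential parameter: the value is stored in an existential container
// and `ping` is called through the protocol witness table.
func pingViaExistential(_ x: any Pingable) -> Int {
    x.ping()
}

// Opaque parameter: sugar for a generic parameter, so the optimizer
// can generate a specialized copy for `Server`.
func pingViaGeneric(_ x: some Pingable) -> Int {
    x.ping()
}

print(pingViaExistential(Server()), pingViaGeneric(Server()))
```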
  13. Reduce dynamic dispatch
    Protocol methods
    Module A:
        public func usePingable(_ x: some Pingable) { x.ping() }
    Module B:
        import ModuleA
        struct PingableImpl: Pingable { ... }
        usePingable(PingableImpl())
    Cannot remove dynamic dispatch 😣
  14. Reduce dynamic dispatch
    Protocol methods
    Module A:
        public struct PingableImpl: Pingable { ... }
        @_specialize(where X == PingableImpl)
        public func usePingable<X: Pingable>(_ x: X) { x.ping() }
    Module B:
        import ModuleA
        usePingable(PingableImpl())
    No dynamic dispatch! 💨
  15. Reveal hidden CoW cost
        func computeCurve(newEvents: [Event]) -> Curve {
            var allPoints = self.points
            for event in newEvents {
                allPoints.append(event.point)
            }
            let tails = Array(allPoints.suffix(3))
            return Curve(p0: tails[0], p1: tails[1], p2: tails[2])
        }
    ⚠ Large copy (the first append copies all of self.points)
    ⚠ Temporary Array allocation (Array(allPoints.suffix(3)))
  16. Reveal hidden CoW cost
        func computeCurve(newEvents: [Event]) -> Curve {
            let p0, p1, p2: Point
            switch newEvents.count {
            case 0:
                p0 = self.points[self.points.count - 3]
                p1 = self.points[self.points.count - 2]
                p2 = self.points[self.points.count - 1]
            case 1:
                p0 = self.points[self.points.count - 2]
                p1 = self.points[self.points.count - 1]
                p2 = newEvents[0].point
            case 2:
                p0 = self.points[self.points.count - 1]
                p1 = newEvents[0].point
                p2 = newEvents[1].point
            default:
                p0 = newEvents[newEvents.count - 3].point
                p1 = newEvents[newEvents.count - 2].point
                p2 = newEvents[newEvents.count - 1].point
            }
            return Curve(p0: p0, p1: p1, p2: p2)
        }
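The case analysis above amounts to indexing into the concatenation of the stored points and the new points without ever building it. A possible generalization is sketched below; it is not the author's code, the `Point` fields and the function names `point(at:stored:appended:)` and `lastThree(stored:appended:)` are hypothetical, and it takes the new points as a plain `[Point]` rather than `[Event]` to stay self-contained.

```swift
// Hypothetical stand-in for the Point type used in the slides.
struct Point: Equatable {
    var x: Double
    var y: Double
}

/// Reads element `index` of the virtual concatenation `stored + appended`
/// without allocating the combined array.
func point(at index: Int, stored: [Point], appended: [Point]) -> Point {
    index < stored.count ? stored[index] : appended[index - stored.count]
}

/// Returns the last three points of `stored + appended`.
func lastThree(stored: [Point], appended: [Point]) -> (Point, Point, Point) {
    let total = stored.count + appended.count
    precondition(total >= 3, "need at least three points in total")
    return (
        point(at: total - 3, stored: stored, appended: appended),
        point(at: total - 2, stored: stored, appended: appended),
        point(at: total - 1, stored: stored, appended: appended)
    )
}
```

This version adds a branch per lookup in exchange for avoiding both the large copy and the temporary array; whether it beats the hand-written switch would have to be measured.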
  17. Reveal hidden CoW cost
    https://forums.swift.org/t/in-place-mutation-of-an-enum-associated-value/11747
        enum JSON {
            case string(String)
            case array([JSON])
        }
        func appendString(json: inout JSON) {
            switch json {
            case .array(var array):
                array.append(.string("extra string"))
                json = .array(array)
            default:
                break
            }
        }
    ⚠ Sharing the same storage (array and json both reference the buffer)
    ⚠ CoW triggered! (by the append)
  18. Reveal hidden CoW cost
    https://forums.swift.org/t/in-place-mutation-of-an-enum-associated-value/11747
        enum JSON {
            case string(String)
            // case array([JSON])
            case array(Box<[JSON]>)
        }
        func appendString(json: inout JSON) {
            switch json {
            case .array(let array):
                array.value.append(.string("extra string"))
            default:
                break
            }
        }
    🤔 Wrapped with Box so that the array storage is uniquely referenced
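The slide does not show how `Box` is defined. For `array.value.append(...)` to compile on a `let` binding, it has to be a reference type with a mutable `value`; the sketch below assumes a minimal `final class` of that shape, which is a guess rather than the author's definition.

```swift
// Assumed definition: a reference-type wrapper with a mutable payload.
final class Box<T> {
    var value: T
    init(_ value: T) { self.value = value }
}

enum JSON {
    case string(String)
    case array(Box<[JSON]>)
}

func appendString(json: inout JSON) {
    switch json {
    case .array(let box):
        // `box` is a reference to the same object the enum holds, so the
        // array inside it has a single owner and is appended to in place.
        box.value.append(.string("extra string"))
    default:
        break
    }
}
```

The trade-off is that `JSON` now has reference semantics for arrays: two copies of the same `JSON.array` value share one `Box`, so a mutation through one is visible through the other.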
  19. Value operation cost
    • Struct copy is cheap only when the struct type is trivial
    • Trivial types (POD: Plain Old Data): no extra copy, move, or destruction semantics
      • Int, Bool, Double, …
      • A struct type that consists only of trivial types
    • Many container types in stdlib (Array, Set, …) have a fast path for trivial types
      • Optimized to be a memcpy
  20. Value operation cost
        class Owner { ... }
        struct Item { // non-trivial
            let id: Int // trivial
            let owner: Owner // non-trivial
        }
        var newItems = self.items
        // for item in newItems {
        //     retain(item.owner)
        // }
        // memcpy(self.items, newItems)
        newItems.append(Item(...))
    ⚠ Non-trivial copy operation

        struct Owner {}
        struct Item { // trivial
            let id: Int // trivial
            let owner: Owner // trivial
        }
        var newItems = self.items
        // memcpy(self.items, newItems)
        newItems.append(Item(...))
    ✅ Relatively trivial copy operation
  21. Value operation cost
        print(_isPOD(Int.self)) // true
        print(_isPOD(String.self)) // false
        print(_isPOD(Array<Int>.self)) // false
        struct Box<T> { let value: T }
        print(_isPOD(Box<Int>.self)) // true
        print(_isPOD(Box<String>.self)) // false
    Check whether a type is trivial with _isPOD
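The same check can be applied to the two `Item` shapes from the previous slide. In this sketch the type names (`OwnerObject`, `OwnerValue`, `ItemWithObject`, `ItemWithValue`) are hypothetical stand-ins, and `_isPOD` is an underscored standard library function, so it should be treated as a debugging aid rather than stable API.

```swift
// Reference type: copying a reference requires a retain.
final class OwnerObject {}

// Struct made only of trivial fields.
struct OwnerValue {
    var id: Int
}

struct ItemWithObject {
    let id: Int
    let owner: OwnerObject   // makes the whole struct non-trivial
}

struct ItemWithValue {
    let id: Int
    let owner: OwnerValue    // keeps the whole struct trivial
}

let objectItemIsTrivial = _isPOD(ItemWithObject.self)
let valueItemIsTrivial = _isPOD(ItemWithValue.self)
print(objectItemIsTrivial, valueItemIsTrivial)
```

A single class-typed field is enough to take the struct off the memcpy fast path.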
  22. Summary
    • Swift has some hidden costs even in short code
    • Understanding the underlying mechanism makes your code fast!
    • The GoodNotes’ Ink algorithm is now 7x faster!
    [Bar chart: Benchmark Time (ms), Before vs. After, axis from 0 to 600]
  23. Resources
    • Low-level Swift optimization tips by Kelvin Ma
      https://swiftinit.org/articles/low-level-swift-optimization
    • Writing High-Performance Swift Code
      https://github.com/apple/swift/blob/main/docs/OptimizationTips.rst
    • Understanding Swift Performance - WWDC 2016
      https://developer.apple.com/videos/play/wwdc2016/416/