Squeezing the most out of the Foundation Models on-device LLM

In the summer of 2025, Apple released OS updates that ship with on-device LLMs. Although these models are quite limited, you can still get a lot of mileage out of them. This talk goes through several patterns that mitigate many of their shortcomings:
1. Short context window → Recompact the chat history to create the illusion of an infinite chat.
2. Routing → Make your own multimodal model without waiting for Apple to ship one.
3. RAG → Ground the model in your private knowledge.
4. Majority voting → Improve the quality of answers by choosing the best one with a judge LLM.
5. Memory → Preserve user information across sessions by letting the LLM read and write memories.
6. Semantic caching → Save cycles on generating expensive content.
7. Agentic setup → Use Apple Foundation Models to build a Perplexity-like agent that searches the internet for you.
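
As an illustration of pattern 6, here is a minimal sketch of a semantic cache in plain Swift. It is not taken from the companion app: `SemanticCache`, `cachedGenerate`, and `toyEmbedding` are hypothetical names, the bag-of-words hashing embedding is only a stand-in for a real sentence-embedding model, and the `generate` closure stands in for whatever call produces the expensive content. The idea it shows is that a new prompt reuses a stored response when its embedding is close enough (by cosine similarity) to a previously seen prompt.

```swift
import Foundation

/// Cosine similarity between two equal-length vectors; 0 when either has zero norm.
func cosineSimilarity(_ a: [Double], _ b: [Double]) -> Double {
    precondition(a.count == b.count, "vectors must have the same dimension")
    var dot = 0.0, normA = 0.0, normB = 0.0
    for i in 0..<a.count {
        dot += a[i] * b[i]
        normA += a[i] * a[i]
        normB += b[i] * b[i]
    }
    guard normA > 0, normB > 0 else { return 0 }
    return dot / (normA.squareRoot() * normB.squareRoot())
}

/// Toy stand-in for a real embedding model (assumption): hashes lowercase
/// words into a fixed number of buckets, ignoring punctuation.
func toyEmbedding(_ text: String, dimensions: Int = 64) -> [Double] {
    var vector = [Double](repeating: 0, count: dimensions)
    let words = text.lowercased().split { !$0.isLetter && !$0.isNumber }
    for word in words {
        // Deterministic hash, because String.hashValue is seeded per process.
        var hash: UInt64 = 5381
        for byte in word.utf8 { hash = hash &* 33 &+ UInt64(byte) }
        vector[Int(hash % UInt64(dimensions))] += 1
    }
    return vector
}

/// Hypothetical cache keyed by prompt meaning rather than exact prompt text.
struct SemanticCache {
    private var entries: [(embedding: [Double], response: String)] = []
    let threshold: Double
    let embed: (String) -> [Double]

    init(threshold: Double = 0.8, embed: @escaping (String) -> [Double]) {
        self.threshold = threshold
        self.embed = embed
    }

    /// Returns the response of the most similar stored prompt, if it clears the threshold.
    func lookup(_ prompt: String) -> String? {
        let query = embed(prompt)
        var best: (score: Double, response: String)? = nil
        for entry in entries {
            let score = cosineSimilarity(query, entry.embedding)
            if score >= threshold, score > (best?.score ?? -1) {
                best = (score: score, response: entry.response)
            }
        }
        return best?.response
    }

    mutating func store(prompt: String, response: String) {
        entries.append((embedding: embed(prompt), response: response))
    }
}

/// Calls `generate` only on a cache miss and remembers the result.
func cachedGenerate(_ prompt: String,
                    cache: inout SemanticCache,
                    generate: (String) -> String) -> String {
    if let hit = cache.lookup(prompt) { return hit }
    let fresh = generate(prompt)
    cache.store(prompt: prompt, response: fresh)
    return fresh
}
```

The similarity threshold is the main design choice: set it too low and unrelated prompts share an answer, set it too high and the cache rarely hits. The value 0.8 here is an arbitrary starting point, not a recommendation from the talk.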

Bonus:
How to set up evals with Swift's unit testing framework to catch sudden quality degradation when Apple updates Foundation Models.
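
One way such an eval could be structured is sketched below, under stated assumptions. `EvalCase`, `passRate`, and `stubModel` are hypothetical names, not the companion app's code, and the stub closure replaces the on-device model so the sketch runs anywhere. In a real project the check would presumably live inside a test function (for example a Swift Testing `@Test` using `#expect`), with the closure calling the actual model and the pass rate compared against a threshold recorded before the OS update.

```swift
import Foundation

/// One eval case: a prompt plus keywords the answer is expected to mention.
struct EvalCase {
    let prompt: String
    let requiredKeywords: [String]
}

/// Fraction of cases whose answer contains every required keyword
/// (case-insensitive). `model` is any prompt-to-text function.
func passRate(of cases: [EvalCase], model: (String) -> String) -> Double {
    guard !cases.isEmpty else { return 0 }
    let passed = cases.filter { evalCase in
        let answer = model(evalCase.prompt).lowercased()
        return evalCase.requiredKeywords.allSatisfy { answer.contains($0.lowercased()) }
    }
    return Double(passed.count) / Double(cases.count)
}

// Stub standing in for the on-device model (assumption).
let stubModel: (String) -> String = { prompt in
    prompt.contains("capital of France") ? "The capital of France is Paris." : "I am not sure."
}
```

Keyword matching is a deliberately crude scoring rule. Because model output is not deterministic, asserting on a pass rate over many cases tends to be more stable than asserting on any single answer.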

Source code for the companion app: https://github.com/zats/LLMPatterns

Sash Zats

September 21, 2025

Transcript