


On-Device AI for Humanity: Building Smarter, Safer, and More Private Mobile Intelligence

This talk tries to answer a simple question with huge consequences: if billions of people are already carrying AI in their pockets, who is that intelligence really serving — the cloud platforms, or the people holding those phones?
We’ll explore why on-device AI isn’t just a performance optimization; it’s an ethical choice that affects privacy, trust, access, and cost at global scale.


Divya Jain

March 16, 2026


Transcript

  1. Two Mental Models: Cloud-Centric vs Human-Centric AI
     1. Cloud-centric
        ◦ Data → Cloud → Model → Result
        ◦ Pros: huge models, easy updates
        ◦ Cons: privacy risk, latency, connectivity, cost
     2. Human-centric (on-device first)
        ◦ Data stays on device by default
        ◦ Cloud used selectively, with consent and clear value
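The human-centric default above can be sketched as a tiny routing policy. This is an illustrative sketch, not code from the deck; the `Request` fields and `route` function are assumed names:

```python
from dataclasses import dataclass

@dataclass
class Request:
    needs_large_model: bool        # task exceeds the on-device model's ability
    user_consented_to_cloud: bool  # explicit, informed opt-in

def route(req: Request) -> str:
    """Human-centric default: keep data on device.

    The cloud is used only when it adds clear value (a task the
    local model cannot handle) AND the user has consented.
    """
    if req.needs_large_model and req.user_consented_to_cloud:
        return "cloud"
    return "on-device"
```

Note the asymmetry: consent alone is not enough to leave the device, and need alone is not enough either; both conditions must hold.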
  2. Why On-Device AI Is a Technical Win
     Not just an “ethics tax”: it’s a performance upgrade.
     • Latency: sub-50 ms inference for truly real-time UX
     • Reliability: works in airplane mode and dead zones
     • Cost: fewer GPU servers; 80%+ cloud savings possible
     • Scalability: no per-request cloud bottleneck
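The cost claim is simple arithmetic: every request served on-device is a request the cloud never bills. A hedged back-of-the-envelope sketch (the function name and the pricing numbers in the example are illustrative, not from the deck):

```python
def cloud_cost_savings(requests_per_day: int,
                       cost_per_1k_requests: float,
                       on_device_fraction: float) -> float:
    """Daily cloud spend avoided by serving a fraction of inference
    on-device. A simplified model: it ignores fixed infrastructure
    costs and the bandwidth for shipping model updates."""
    daily_cloud_cost = requests_per_day / 1000 * cost_per_1k_requests
    return daily_cloud_cost * on_device_fraction

# e.g. 10M requests/day at $0.50 per 1k requests, 85% served on-device:
# 10_000 * 0.50 * 0.85 = $4,250/day of avoided cloud spend
```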
  3. Model Optimization: Small, Fast, Accurate
     From research model to pocket-sized model.
     • Quantization
        ◦ FP32 → FP16 / INT8
        ◦ 3–4× smaller and faster, with minimal accuracy loss
     • Pruning
        ◦ Remove low-impact weights and neurons
        ◦ 50–75%+ compression; sparser models
     • Mobile-aware design
        ◦ Neural architecture search or manual design for fewer parameters, cache-friendly ops, and depthwise convolutions
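The two compression techniques above can be sketched in a few lines of NumPy. This is a minimal illustration of affine INT8 quantization and magnitude pruning, not the implementation any particular framework uses:

```python
import numpy as np

def quantize_int8(w: np.ndarray):
    """Affine FP32 -> INT8 quantization: q = round(w / scale) + zero_point,
    clipped to [-128, 127]. Storage drops 4x (32-bit -> 8-bit)."""
    w_min, w_max = float(w.min()), float(w.max())
    scale = (w_max - w_min) / 255.0 or 1.0  # avoid div-by-zero for constant w
    zero_point = int(round(-128 - w_min / scale))
    q = np.clip(np.round(w / scale) + zero_point, -128, 127).astype(np.int8)
    return q, scale, zero_point

def dequantize(q: np.ndarray, scale: float, zero_point: int) -> np.ndarray:
    """Inverse map; the round-trip error is at most about one scale step."""
    return (q.astype(np.float32) - zero_point) * scale

def prune_magnitude(w: np.ndarray, sparsity: float) -> np.ndarray:
    """Magnitude pruning: zero out the smallest-|w| fraction of weights,
    leaving a sparser tensor that compresses well."""
    k = int(w.size * sparsity)
    if k == 0:
        return w.copy()
    thresh = np.partition(np.abs(w).ravel(), k - 1)[k - 1]
    out = w.copy()
    out[np.abs(out) <= thresh] = 0.0
    return out
```

In practice you would let the deployment framework do this (e.g. post-training quantization in TensorFlow Lite), but the sketch shows where the 4× size reduction and the sparsity actually come from.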
  4. Frameworks & Deployment: TFLite, Core ML, ONNX
     Shipping models to real apps.
     • TensorFlow Lite
        ◦ Android & cross-platform; supports quantization and delegates
     • Core ML
        ◦ Deep Apple integration (iOS, macOS, watchOS, etc.)
     • ONNX Runtime Mobile
        ◦ Common model format across platforms
     Practical tips
     • Treat models as versioned artifacts
     • Roll out using feature flags
     • Support rollback and A/B testing
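One common way to implement the feature-flag rollout and rollback tips is deterministic per-device bucketing. A sketch under assumed names (`in_rollout`, `pick_model`); this is one possible scheme, not a specific library's API:

```python
import hashlib

def in_rollout(device_id: str, model_version: str, rollout_pct: int) -> bool:
    """Hash (version, device) into a stable bucket in [0, 100).
    A device joins the rollout once its bucket < rollout_pct, so
    raising the percentage only ever adds devices — and dropping
    it back to 0 is an instant rollback to the stable model."""
    h = hashlib.sha256(f"{model_version}:{device_id}".encode()).hexdigest()
    return int(h, 16) % 100 < rollout_pct

def pick_model(device_id: str, candidate: str, stable: str,
               rollout_pct: int) -> str:
    """Versioned artifacts: serve the candidate to the rollout
    cohort, the known-good stable version to everyone else."""
    return candidate if in_rollout(device_id, candidate, rollout_pct) else stable
```

Because buckets are keyed on the model version too, each rollout samples a fresh cohort, which also gives you clean A/B populations.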
  5. Personalization via On-Device Fine-Tuning
     • Examples
        ◦ Keyboard adapting to your slang
        ◦ Camera and filters learning your preferences
     • Techniques
        ◦ Fine-tune last layers or adapters on device
        ◦ Store learned weights locally, encrypted
     • Guardrails
        ◦ On-device differential-privacy noise (where needed)
        ◦ Clear settings to reset or opt out
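The differential-privacy guardrail is usually applied DP-SGD-style: clip each update's norm, then add Gaussian noise scaled to the clip bound. A minimal sketch with illustrative hyperparameters (the function name and defaults are assumptions, not from the deck):

```python
import numpy as np

def privatize_update(grad: np.ndarray, clip_norm: float,
                     noise_multiplier: float,
                     rng: np.random.Generator) -> np.ndarray:
    """Guardrail for on-device fine-tuning of last layers / adapters:
    1. clip the update's L2 norm to bound any one example's influence,
    2. add Gaussian noise proportional to that bound.
    The privacy guarantee depends on noise_multiplier and how many
    updates are released; values here are purely illustrative."""
    norm = float(np.linalg.norm(grad))
    clipped = grad * min(1.0, clip_norm / max(norm, 1e-12))
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=grad.shape)
    return clipped + noise
```

The clipping step matters as much as the noise: without a norm bound, the noise scale has nothing to calibrate against.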
  6. Real-World Constraints: Heat, Battery, Updates
     What breaks in production (and how to fix it).
     Thermals
     • Cap inference time and frequency
     • Use lower-power modes or smaller models for background work
     Battery
     • Schedule heavy work while charging and on Wi-Fi
     • Use OS hints (Doze, Background Tasks, etc.)
     Model lifecycle
     • OTA model updates with integrity checks
     • Per-segment rollouts; log on-device metrics only
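The thermal and battery rules above reduce to a gating policy. A sketch with assumed names (`DeviceState`, `allow_heavy_work`) and an illustrative `thermal_headroom` signal; a real app would express these constraints through the OS scheduler (e.g. WorkManager constraints on Android, `BGProcessingTask` on iOS) rather than poll them itself:

```python
from dataclasses import dataclass

@dataclass
class DeviceState:
    charging: bool
    on_unmetered_wifi: bool
    thermal_headroom: float  # 0.0 = throttled hot .. 1.0 = cool (illustrative)

def allow_heavy_work(state: DeviceState) -> bool:
    """Gate model downloads and on-device fine-tuning: only while
    charging, on unmetered Wi-Fi, and with thermal headroom."""
    return (state.charging
            and state.on_unmetered_wifi
            and state.thermal_headroom > 0.5)

def pick_inference_model(state: DeviceState) -> str:
    """Degrade gracefully: fall back to a smaller, lower-power
    model as the device heats up, instead of throttling the UX."""
    return "full" if state.thermal_headroom > 0.3 else "lite"
```

The thresholds (0.5, 0.3) are placeholders; in practice they would be tuned per device class from the on-device metrics the slide recommends logging.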