AI in the browser: Smarter Angular apps with WebGPU and WebNN

In this session, we will explore how to integrate Generative AI functions into Angular applications using the WebGPU API and the Web Neural Network (WebNN) API. These APIs enable Large Language Models (LLMs) and Stable Diffusion models to run on the user’s device. The primary benefits of local execution are offline availability and data security, provided that the user’s device is powerful enough to run the AI models. During the presentation, we will discuss different use cases and compare the advantages and disadvantages of each solution. Join us to learn how to make your Angular app smarter.
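
As a rough illustration of how an app might decide which of these APIs to use, the following TypeScript sketch checks for the WebNN entry point (`navigator.ml`), then the WebGPU entry point (`navigator.gpu`), and finally falls back to WebAssembly. The helper name `pickBackend`, the `NavigatorLike` type and the preference order are assumptions made for this example and are not taken from the talk.

```typescript
// Minimal structural type for the parts of `navigator` this sketch inspects.
// `ml` is the WebNN entry point and `gpu` the WebGPU entry point; both are
// optional because many browsers do not expose them.
interface NavigatorLike {
  gpu?: unknown;
  ml?: unknown;
}

type Backend = "webnn" | "webgpu" | "wasm" | "none";

// Hypothetical helper: returns the first available backend in an assumed
// preference order (WebNN, then WebGPU, then WebAssembly).
function pickBackend(nav: NavigatorLike, hasWasm: boolean): Backend {
  if (nav.ml !== undefined) return "webnn";
  if (nav.gpu !== undefined) return "webgpu";
  return hasWasm ? "wasm" : "none";
}
```

In a browser, this could be called as `pickBackend(navigator as NavigatorLike, typeof WebAssembly === "object")`. The presence of `navigator.gpu` alone does not guarantee a usable GPU, since `navigator.gpu.requestAdapter()` can still resolve to `null`, so a real app would probably need a second, asynchronous check.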

Christian Liebel

March 14, 2025
Transcript

  1. AI in the Browser: Smarter Angular apps with WebGPU and WebNN. Christian Liebel, @christianliebel, Consultant
  2. Hello, it’s me. Christian Liebel. W3C WebML WG & CG TAG Associate. X: @christianliebel. Bluesky: @christianliebel.com. Angular, PWA & Generative AI. Microsoft MVP & Google GDE (Angular, Web).
  3. Why should you care? Rule-based algorithms are limited in their capabilities.
  4. Why should you care? Use AI to implement use cases that are difficult or impossible to implement using rule-based algorithms.
  5. Schema: Data, Training, Trained Model, Inference/Prediction, Output
  6. Generative AI everywhere. Source: https://www.apple.com/chde/apple-intelligence/
  7. Generative AI: Overview
    – Text: OpenAI GPT, Mistral, …
    – Audio/Music: Musico, Soundraw, …
    – Images: DALL·E, Firefly, …
    – Video: Sora, Runway, …
    – Speech: Whisper, tortoise-tts, …
  10. Generative AI Cloud Providers: Examples
  11. Generative AI Cloud Providers: Drawbacks
    – Require a (stable) internet connection
    – Subject to network latency and server availability
    – Data is transferred to the cloud service
    – Require a subscription
  12. Can we run GenAI models locally?
  13. WebAssembly (Wasm)
    – Bytecode for the web
    – Compile target for arbitrary languages
    – Can be faster than JavaScript
    – WebLLM uses a model-specific Wasm library to accelerate model computations
  14. WebGPU
    – Grants low-level access to the Graphics Processing Unit (GPU)
    – Near-native performance for machine learning applications
    – Supported by Chromium-based browsers on Windows and macOS from version 113
  15. WebNN
    – Grants web apps access to the device’s CPU, GPU and Neural Processing Unit (NPU)
    – In specification by the WebML Working Group at W3C
    – Implementation in progress in Chromium (behind a flag)
    – Better performance for specific workloads
    Source: https://webmachinelearning.github.io/webnn-intro/
    DEMO
  16. WebNN. Source: https://github.com/webmachinelearning/webnn/issues/375#issuecomment-2720701672
  17. WebLLM: Storing model files locally
    Diagram labels: Internet, Website (HTML/JS), Cache with model files, Hugging Face
    Note: Due to the Same-Origin Policy, models cannot be shared across origins.
  18. WebLLM: Model Size Comparison
    Model:Parameters    Size
    phi3:3b             2.2 GB
    mistral:7b          4.1 GB
    llama3:8b           4.7 GB
    gemma2:9b           5.4 GB
    gemma2:27b          16 GB
    llama3:70b          40 GB
  19. WebLLM: Drawbacks
    – Models can’t be shared across origins
    – Inference is fast, but doesn’t reach full native speed
  20. Built-in AI
    – Initiative by Google Chrome
    – Exploratory APIs for local experiments and use case determination
    – Downloads AI models into Google Chrome
    – Models are shared across origins
    – Uses native APIs directly (full performance)
    https://developer.chrome.com/docs/ai/built-in
  21. Built-in AI APIs: Incubated by the WebML CG. https://webmachinelearning.github.io/incubations/
  22. Built-in AI APIs. Diagram labels: Operating System, Website (HTML/JS), Browser, Internet, Apple Intelligence, Gemini Nano
  23. Built-in AI APIs
    about://on-device-internals
    https://www.google.com/chrome/canary/
    about://flags
    – Enables optimization guide on device → EnabledBypassPerfRequirement
    – (API) for Gemini Nano → Enabled
  24. Prompt API
    Adjust the implementations of runPrompt()/fillForm():
        const session = await window.ai.languageModel.create({ systemPrompt });
        const reply = await session.prompt(value);
        // runPrompt():
        this.reply.set(reply);
        // fillForm():
        this.formGroup.setValue(JSON.parse(reply));
  25. On-device AI Models: Pros & Cons
    + Data does not leave the browser (privacy)
    + High availability (offline support)
    + Low latency
    + Stability (no external API changes)
    + Low cost
    – Lower response quality
    – Less capable
    – High system (RAM, GPU) and bandwidth requirements
    – Large model size, models cannot always be shared
    – Model initialization and inference are relatively slow
    – APIs are experimental
  26. Multimodal Realtime Models: Cloud-based. DEMO
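
To make the size table on slide 18 and the bandwidth requirement on slide 25 more concrete, the sketch below works out two rough figures per model: storage bits per parameter and download time. It assumes that the parameter count in each tag is nominal, that 1 GB means 10^9 bytes, and that the link runs at 100 Mbit/s; none of these assumptions come from the talk.

```typescript
interface ModelEntry {
  name: string;
  paramsBillions: number; // nominal count read off the tag (assumption)
  sizeGB: number;         // size from slide 18, read as 10^9 bytes (assumption)
}

const models: ModelEntry[] = [
  { name: "phi3:3b", paramsBillions: 3, sizeGB: 2.2 },
  { name: "mistral:7b", paramsBillions: 7, sizeGB: 4.1 },
  { name: "llama3:8b", paramsBillions: 8, sizeGB: 4.7 },
  { name: "gemma2:9b", paramsBillions: 9, sizeGB: 5.4 },
  { name: "gemma2:27b", paramsBillions: 27, sizeGB: 16 },
  { name: "llama3:70b", paramsBillions: 70, sizeGB: 40 },
];

// GB * 8 gives gigabits; dividing by billions of parameters gives bits/parameter.
function bitsPerParameter(m: ModelEntry): number {
  return (m.sizeGB * 8) / m.paramsBillions;
}

// GB * 8 * 1000 gives megabits; dividing by Mbit/s gives seconds.
function downloadSeconds(sizeGB: number, mbitPerSecond: number): number {
  return (sizeGB * 8 * 1000) / mbitPerSecond;
}

// One summary line per model, using the assumed 100 Mbit/s link.
const summary: string[] = models.map(
  (m) =>
    `${m.name}: ${bitsPerParameter(m).toFixed(1)} bits/param, ` +
    `${Math.round(downloadSeconds(m.sizeGB, 100))} s at 100 Mbit/s`
);
```

Under these assumptions every entry comes out at about 4.6 to 5.9 bits per parameter, which would be consistent with quantized weights, although the slide does not say how the models were quantized. The download estimate ignores protocol overhead and is only a lower bound for the first load.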
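
Slide 24 passes the model reply straight to `JSON.parse()` and `setValue()`. Since slide 25 lists lower response quality as a drawback of on-device models, a more defensive variant may be useful. The following is a minimal sketch under stated assumptions: the interfaces only mirror the calls shown on slide 24 (the Prompt API is experimental and its shape may differ), and `parseFormReply` and `fillFormWithModel` are hypothetical helpers, not part of the talk or of any library.

```typescript
// Shapes mirrored from the calls on slide 24; assumptions, not a stable contract.
interface LanguageModelSession {
  prompt(input: string): Promise<string>;
}
interface LanguageModelFactory {
  create(options: { systemPrompt: string }): Promise<LanguageModelSession>;
}

// Hypothetical helper: parses the text between the first "{" and the last "}"
// as JSON and keeps only the expected fields. Missing or non-string fields
// become empty strings. Returns null if no JSON object can be parsed.
function parseFormReply(
  reply: string,
  fields: readonly string[]
): Record<string, string> | null {
  const start = reply.indexOf("{");
  const end = reply.lastIndexOf("}");
  if (start === -1 || end <= start) return null;
  let parsed: unknown;
  try {
    parsed = JSON.parse(reply.slice(start, end + 1));
  } catch {
    return null;
  }
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
    return null;
  }
  const source = parsed as Record<string, unknown>;
  const result: Record<string, string> = {};
  for (const field of fields) {
    const value = source[field];
    result[field] = typeof value === "string" ? value : "";
  }
  return result;
}

// Hypothetical wrapper around the two calls from slide 24.
async function fillFormWithModel(
  factory: LanguageModelFactory,
  systemPrompt: string,
  value: string,
  fields: readonly string[]
): Promise<Record<string, string> | null> {
  const session = await factory.create({ systemPrompt });
  const reply = await session.prompt(value);
  return parseFormReply(reply, fields);
}
```

The result always contains exactly the requested keys, which matters because Angular's `FormGroup.setValue()` requires a value for every control. A component could call `fillFormWithModel(window.ai.languageModel, systemPrompt, value, Object.keys(this.formGroup.controls))` and only call `setValue()` when the result is not `null`, assuming all controls hold strings.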