
Smarter Angular mit Transformers.js & Prompt API


In this hands-on workshop, you will learn how to make your Angular-based apps smarter with modern generative AI libraries such as Transformers.js or the new Prompt API – offline-capable and free of charge.

By integrating a context-aware chatbot into a todo app, you will learn how to run AI models in the browser and execute them performantly on your own hardware using WebAssembly, WebGPU, or WebNN.

You will get an overview of AI architectures and learn best practices for prompt engineering.


Christian Liebel

March 18, 2026


Transcript

  1. Hello, it’s me. Smarter Angular mit Transformers.js & Prompt API – Christian Liebel – X: @christianliebel – Bluesky: @christianliebel.com – Email: christian.liebel@thinktecture.com – Angular, PWA & Generative AI – Slides: thinktecture.com/christian-liebel
  2. Timetable – 09:00–10:30 Block 1 · 10:30–11:00 Coffee Break · 11:00–12:30 Block 2
  3. Expectations – What to expect: focus on web app development; focus on Generative AI; up-to-date insights (the ML/AI field is evolving fast); live demos on real hardware; 12 hands-on labs. What not to expect: deep dives into AI specifics, RAG, model fine-tuning or training; stable libraries or specifications. Huge downloads! High requirements! Things may break!
  4. Setup – Setup complete? (Node.js, Google Chrome, editor, Git, macOS/Windows, 20 GB free disk space, 6 GB VRAM)
  5. Generative AI everywhere – Source: https://www.apple.com/chde/apple-intelligence/
  6. Single-Page Applications – Run locally on the user’s system. (Architecture diagram: the web browser runs the SPA – client logic and HTML/CSS views – delivered as HTML, JS, CSS, and assets by a webserver; server logic is reached via Web APIs over HTTPS and WebSockets, backed by databases and a push service.)
  7. Progressive Web Apps – Make SPAs offline-capable. (Diagram: a service worker sits between the website and the internet, answering fetch events from its cache of HTML/JS instead of the network.)
  8. Generative AI – Overview: Text (OpenAI GPT, Mistral, …) · Audio/Music (Musico, Soundraw, …) · Images (DALL·E, Firefly, …) · Video (Sora, Runway, …) · Speech (Whisper, tortoise-tts, …)
  10. Generative AI – Drawbacks of cloud providers: they require a (stable) internet connection; they are subject to network latency and server availability; data is transferred to the cloud service; they require a subscription.
  11. Transformers.js – @huggingface/transformers: a JavaScript library by Hugging Face 🤗 – functionally equivalent to Hugging Face’s transformers Python library – supports various ML/AI use cases (LLMs, computer vision, audio, …) – models are executed on-device (100% local, offline-capable) – uses ONNX Runtime (a model inference runtime) internally
  12. Transformers.js vs. WebLLM – Source: https://npmtrends.com/@huggingface/transformers-vs-@mlc-ai/web-llm (27.02.2026)
  13. Model Selection – Size Comparison (model:parameters → size): lfm2.5-thinking:1.2b → 0.7 GB · lfm2:2.6b → 1.6 GB · ministral-3:3b → 3.0 GB · gemma3:12b → 8.1 GB · gpt-oss:20b → 14 GB · devstral-2:123b → 75 GB
  14. Model Selection – Liquid Foundation Models (LFM): highly performant on-device LLMs by LiquidAI (https://www.liquid.ai/models). For this workshop, we are going to use the 2.6B model. Model sheet: https://huggingface.co/onnx-community/LFM2-2.6B-ONNX – Source: https://www.liquid.ai/blog/introducing-lfm2-2-6b-redefining-efficiency-in-language-models
  15. LAB #2 – Downloading a model (1/3): 1. Go to webgpureport.org. 2. Does your GPU support the feature “shader-f16”?
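The manual check in this lab can also be automated at runtime. The helper below is a sketch only (`pickDtype` and its return values are assumptions, not part of the lab code): it queries the WebGPU adapter for the “shader-f16” feature checked on webgpureport.org and falls back to the smaller q4 weights when the feature – or WebGPU itself – is unavailable.

```typescript
// Sketch: pickDtype is a hypothetical helper that automates the
// webgpureport.org "shader-f16" check from step 2.
async function pickDtype(): Promise<'fp16' | 'q4'> {
  const gpu = (globalThis as any).navigator?.gpu;
  if (!gpu) {
    return 'q4'; // no WebGPU at all: use the smaller q4 weights
  }
  const adapter = await gpu.requestAdapter();
  // 'shader-f16' is the WebGPU feature name checked on webgpureport.org.
  return adapter?.features.has('shader-f16') ? 'fp16' : 'q4';
}
```

A caller could then pass `'q4'` to `loadModel()` only when `pickDtype()` returns it, instead of hard-coding the parameter per machine.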
  16. LAB #2 – Downloading a model (2/3): 3. In src/app/todo/todo.ts (ngOnInit()), add the following line: await this.llmService.loadModel('2.6B'); 4. If your GPU does not support shader-f16, add this parameter: await this.llmService.loadModel('2.6B', 'q4');
  17. LAB #2 – Downloading a model (3/3): 5. In todo.html, change the following lines: @if (!llmService.isReady()) { <mat-progress-bar mode="determinate" [value]="llmService.progress()"></mat-progress-bar> } … <button mat-raised-button (click)="runPrompt(prompt.value, langModel.value)" [disabled]="!llmService.isReady()"> The progress bar should begin to move.
  18. Cache API – Storing model files locally: the website caches the model files downloaded from Hugging Face alongside its HTML/JS. Note: due to the Same-Origin Policy, models cannot be shared across origins.
  19. WebAssembly (Wasm) – bytecode for the web – compile target for arbitrary languages – can be faster than JavaScript – WebLLM uses a model-specific Wasm library to accelerate model computations
  20. WebGPU – grants low-level access to the Graphics Processing Unit (GPU) – near-native performance for machine learning applications – supported by Chromium-based browsers on Windows and macOS from version 113, Safari 26, and Firefox 141 on Windows
  21. WebNN – grants web apps access to the device’s CPU, GPU, and Neural Processing Unit (NPU) – in specification by the WebML Working Group at the W3C – Origin Trial in Chrome 146 – potentially even better performance compared to WebGPU – Source: https://webmachinelearning.github.io/webnn-intro/
  22. WebNN: near-native inference performance – Source: Intel. Browser: Chrome Canary 118.0.5943.0, DUT: Dell/Linux/i7-1260P, single p-core, Workloads: MediaPipe solution models (FP32, batch=1)
  23. LAB #3 – Model inference (1/4): 1. In todo.ts, add the following line at the top of the class: protected readonly reply = signal('');
  24. LAB #3 – Model inference (2/4): 2. In the inferTransformersJs() method, add the following code: this.llmService.clearPastKeyValues(); const messages = [{ role: "user", content: userPrompt }]; return this.llmService.generateResponse(messages, []);
  25. LAB #3 – Model inference (3/4): 3. In the runPrompt() method, add the following code: this.reply.set('…'); const chunks = inferenceEngine === 'transformers-js' ? this.inferTransformersJs(userPrompt) : this.inferPromptApi(userPrompt); let reply = ''; for await (const chunk of chunks) { reply += chunk; this.reply.set(reply); }
  26. LAB #3 – Model inference (4/4): 4. In todo.html, change the following line: <pre>{{ reply() }}</pre> You should now be able to send prompts to the model and see the responses in the template. ⚠ Note: browsers support better options for streaming LLM responses: https://developer.chrome.com/docs/ai/render-llm-responses
  27. LAB #4 – Todo management (1/2): In todo.ts, add the following signal at the top: protected readonly todos = signal<TodoDto[]>([]); Add the following lines to the addTodo() method: text ??= prompt() ?? ''; this.todos.update(todos => [...todos, { done: false, text }]);
  28. LAB #4 – Todo management (2/2): In todo.html, add the following lines to render the todos in the UI: @for (todo of todos(); track $index) { <mat-list-option>{{ todo.text }}</mat-list-option> }
  29. LAB #5 – Todo management (extended): @for (todo of todos(); track $index) { <mat-list-option [(selected)]="todo.done">{{ todo.text }}</mat-list-option> } ⚠ Boo! This pattern is not recommended. Instead, you should set the changed values on the signal – but that doesn’t play well with Angular Material…
  30. Chat with data – Concept and limitations: the todo data has to be converted into natural language. For the sake of simplicity, we will add all todos to the prompt. Remember: LLMs have a context window (LFM2-2.6B: 32K). If you need to chat with larger sets of text, refer to Retrieval-Augmented Generation (RAG). Example: “These are the todos: * Wash clothes * Pet the dog * Take out the trash”
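The conversion described above can be sketched as a small pure helper. The TodoDto shape follows the interface used in the labs, and the wording mirrors the slide’s example:

```typescript
// Builds the natural-language todo list described above.
// TodoDto matches the interface used in the labs.
interface TodoDto {
  done: boolean;
  text: string;
}

function todosToPrompt(todos: TodoDto[]): string {
  // One "* " bullet per todo, prefixed by the introductory sentence.
  return ['These are the todos:', ...todos.map((t) => `* ${t.text}`)].join('\n');
}
```

For the slide’s three todos, this yields exactly the bullet list shown above, ready to be embedded into a system prompt.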
  31. Chat with data – System prompt: a metaprompt that defines character, capabilities/limitations, output format, behavior, and grounding data. Hallucinations and prompt injections cannot be eliminated. Example: “You are a helpful assistant. Answer user questions on todos. Generate a valid JSON object. Avoid negative content. These are the user’s todos: …”
  32. Chat with data – Flow: System message (“The user has these todos: 1. … 2. … 3. …”) → User message (“How many todos do I have?”) → Assistant message (“You have three todos.”)
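In code, this flow is just an ordered array of messages with alternating roles. The Message type here is illustrative (the labs use plain object literals), but the role names follow the usual chat convention:

```typescript
// The three-step flow above, expressed as a chat message array.
type Role = 'system' | 'user' | 'assistant';

interface Message {
  role: Role;
  content: string;
}

const conversation: Message[] = [
  { role: 'system', content: 'The user has these todos: 1. … 2. … 3. …' },
  { role: 'user', content: 'How many todos do I have?' },
  { role: 'assistant', content: 'You have three todos.' },
];
```

Follow-up questions simply append further user/assistant pairs to this array.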
  33. LAB #6 – Chat with data: Using a system & user prompt – adjust the code in inferTransformersJs() to include the system prompt: const systemPrompt = `Here's the user's todo list: ${JSON.stringify(this.todos())}`; const messages = [{ role: "system", content: systemPrompt }, { role: "user", content: userPrompt }];
  34. Prompt Engineering – Techniques: providing examples (single-shot, few-shot, …) – priming outputs – specifying the output structure – repeating instructions – chain of thought – … Success also depends on the model. https://learn.microsoft.com/en-us/azure/ai-foundry/openai/concepts/prompt-engineering
  35. LAB #7 – Prompt Engineering: const systemPrompt = `You are a helpful assistant. The user will ask questions about their todo list. Briefly answer the questions. Don't try to make up an answer if you don't know it. Here's the user's todo list: ${JSON.stringify(this.todos())}`;
  36. Prompt Engineering – Alternatives (in order of increasing effort): Prompt Engineering → Retrieval-Augmented Generation → Fine-tuning → Custom model
  37. LAB #8 – Performance: Adjust todo.ts as follows: return this.llmService.generateResponse(messages, [], { measurePerformance: true }); Ask a new question and check your console for performance statistics.
  38. Performance – Comparison (tokens/sec): WebLLM (Llama3-8b, M4): 45 · Azure OpenAI (gpt-4o-mini): 33 · Groq (Llama3-8b): 1200. WebLLM/Groq: own tests (14.11.2024); OpenAI/Azure OpenAI: https://learn.microsoft.com/en-us/azure/ai-services/openai/concepts/provisioned-throughput (18.07.2024)
  39. Tool Calling – Tool Calling allows an LLM to execute “real-world” actions. A tool usually has a name, a natural-language description, and an interface definition, usually in JSON Schema (examples: add_todo, get_weather, search_web, read_file). The LLM only “calls” the tool; the developer has to take care of actually executing the code and feeding the result back into the conversation.
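As an illustration of these three parts, an add_todo tool definition might look like the following. The exact shape of the workshop’s TODO_TOOL is not shown on the slides, so every field name here is an assumption:

```typescript
// Hypothetical tool definition: a name, a natural-language description,
// and an interface definition in JSON Schema, as listed above.
const ADD_TODO_TOOL = {
  name: 'add_todo',
  description: "Adds a new entry to the user's todo list.",
  parameters: {
    type: 'object',
    properties: {
      text: { type: 'string', description: 'The text of the new todo.' },
    },
    required: ['text'],
  },
} as const;
```

When the model emits a call to add_todo, the application parses the arguments against this schema, runs its own addTodo() logic, and appends the result to the conversation.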
  40. LAB #9 – Tool Calling: 1. Add the TODO_TOOL to the tools array in inferTransformersJs(): return this.llmService.generateResponse(messages, [TODO_TOOL], { measurePerformance: true }); 2. Add this line to the end of the runPrompt() method: this.llmService.executeToolCalls(reply, { addTodo: (args: { text: string }) => this.addTodo(args.text) });
  41. Web AI Landscape – Bring Your Own AI (BYOAI): libraries (WebLLM), frameworks (Transformers.js, ONNX Runtime, TensorFlow.js), APIs (WebGPU, WebNN, Cross-Origin Storage NEW!). Built-in AI (BIAI): Writing Assistance APIs (Summarizer API, Writer API, Rewriter API, Proofreader API NEW!), Translator & Language Detector APIs, Prompt API NEW! (multimodal input & structured output), WebMCP NEW!
  42. LAB #10 – Prompt API: Open edge://flags (Edge) or chrome://flags (Chrome). Enable “Prompt API for Phi mini” (Edge) or “Prompt API for Gemini Nano” (Chrome), and set “Enables optimization guide on device” to EnabledBypassPerfRequirement. Then run await LanguageModel.create(); and watch the model download on about://components or about://on-device-internals.
  43. Prompt API – (Diagram: the website’s HTML/JS talks to the browser, which runs a model provided by the operating system, such as Apple Intelligence, or one downloaded from the internet, such as Gemini Nano.)
  44. Prompt API – Part of Chrome’s Built-in AI initiative – exploratory API for local experiments and use-case determination – downloads Gemini Nano into Google Chrome – the model can be shared across origins – uses native APIs directly – a fine-tuning API might follow in the future – https://developer.chrome.com/docs/ai/built-in
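Because the API is exploratory, it pays to feature-detect before calling it. The sketch below assumes the Prompt API’s LanguageModel.availability() method, which reports states such as 'unavailable', 'downloadable', 'downloading', and 'available':

```typescript
// Returns true if the Prompt API is exposed and the model is not
// reported as unavailable. Safe to call in any environment.
async function promptApiReady(): Promise<boolean> {
  const LM = (globalThis as any).LanguageModel;
  if (!LM) {
    return false; // Prompt API not exposed in this browser/runtime
  }
  return (await LM.availability()) !== 'unavailable';
}
```

An app could use such a check to fall back to Transformers.js when the Prompt API is missing, which is exactly the switch the runPrompt() method in the labs performs.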
  45. LAB #11 – Prompt API: npm i -D @types/dom-chromium-ai, then add "dom-chromium-ai" to the types array in tsconfig.app.json.
  46. LAB #12 – Local AI models: Add the following lines to inferPromptApi(): const systemPrompt = `The user will ask questions about their todo list. Here's the user's todo list: ${JSON.stringify(this.todos())}`; const languageModel = await LanguageModel.create({ initialPrompts: [{ role: "system", content: systemPrompt }] }); const chunks = languageModel.promptStreaming(userPrompt); for await (const chunk of chunks) { yield chunk; }
  47. WebMCP – allows websites to expose tools to the browser or external agents – a joint effort by Microsoft and Google – https://github.com/webmachinelearning/webmcp
  48. WebMCP – API: Imperative: navigator.modelContext.provideContext({ tools: [{ "name": "start_game", "description": "Start a new game.", "inputSchema": {}, "execute": () => {} }] }); Declarative: <form id="reservationForm" toolname="book_table_le_petit_bistro" tooldescription=...> <input name="name" toolparamdescription="Customer's full name (min 2 chars)" /> </form>
  49. Summary – Pros: data does not leave the browser (privacy); high availability (offline support); low latency; stability (no external API changes); low cost. Cons: lower quality; high system (RAM, GPU) and bandwidth requirements; large model sizes, and models cannot always be shared; model initialization and inference are relatively slow; the APIs are experimental.
  50. Summary – Cloud-based models remain the most powerful. Due to their size and high system requirements, local generative AI models are currently mainly interesting for special scenarios (e.g., high privacy demands, offline availability). Small, specialized models are an interesting alternative (if available). Large language models are becoming more compact and efficient, vendors are shipping AI models with their devices, and devices are becoming more powerful at running AI workloads. Experiment with the AI APIs and make your Angular app smarter!