Software Architecture Gathering 2025: Architecting with Semantic AI: A Language Model and an Embedding Model Walk Into a Bar...

Architecting with Semantic AI: A Language Model & an Embedding Model Walk Into a Bar...
Christian Weyer | Co-Founder & CTO | Thinktecture AG | [email protected]

Our Architectural Journey with AI Models
§ Model Foundation
§ Retrieval
§ Contracts
§ Flow Control
§ Semantic Observability

MODELS IN ARCHITECTURE LANDSCAPE
Architecture Building Block: Model Foundation Layer

🫱 🫲 Semantic AI, Generative AI

Language Models understand and generate semantically rich human language, transforming it into text or structured data for both humans and machines.
⚠ Non-deterministic: same input can lead to different outputs.

Embedding Models capture semantic meaning by encoding human language into numerical vector representations, facilitating understanding, comparison, and retrieval for both humans and machines.
✅ Deterministic: same input always results in the same embedding.
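To make "comparison" concrete: embedding vectors are commonly compared with cosine similarity. The following is a minimal sketch, assuming hand-written 4-dimensional vectors; the values and the names `workshop`, `training` and `invoice` are hypothetical and not produced by any real embedding model.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector has no direction; treat it as unrelated
    return dot / (norm_a * norm_b)


# Hypothetical embeddings; real models emit hundreds of dimensions.
workshop = [0.9, 0.1, 0.0, 0.2]
training = [0.8, 0.2, 0.1, 0.3]
invoice = [0.0, 0.9, 0.7, 0.1]

# Texts with related meaning should end up closer together than unrelated ones.
print(cosine_similarity(workshop, training) > cosine_similarity(workshop, invoice))
```

Because the same input yields the same vector, such scores are reproducible and can be stored and indexed ahead of time.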

Integration Architecture: Models as Services
§ Language & embedding models are part of end-to-end architectures
  § Accessed via an API
§ Embedding models can be run locally
  § Optimized for CPU
§ Language models are (still) hard to run locally
  § High GPU power
  § High VRAM
  § High memory bandwidth

Traditional Client Interfaces
§ API-based data
§ Document-based data
Architecture Building Block: UI Layer

Language-enabled “UIs”, e.g. Talk-to-TT

C4 System Context Diagram
§ Container-based distributed system
§ Various tech stacks

PATTERN: LIGHTWEIGHT RAG
Architecture Building Block: Retrieval Layer

Talking to Documents (Retrieval-Augmented Generation)
[Diagram, two flows sharing one Vector DB]
💡 Indexing / Embedding: documents (.md, .docx, .pdf etc.) -> Cleanup & Split -> Text Embedding (Embedding Model) -> Save to Vector DB
💡 Question Answering: Question (“Lorem ipsum…?”) -> Text Embedding (Embedding Model) -> Query Vector DB -> Relevant Results + Question -> LLM -> Answer w/ sources
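The two flows of this slide (indexing, then question answering) can be sketched in a few lines of Python. This is a stdlib-only illustration under loud assumptions: `embed` is a toy word-count stand-in for a real embedding model (it matches words, not meaning), `InMemoryVectorStore` stands in for a vector database, and all function and class names are hypothetical. The final LLM call is left out; the sketch stops at the prompt that would be sent.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a sparse word-count vector."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def split_document(document: str) -> list[str]:
    """Cleanup & split: one chunk per non-empty paragraph."""
    return [p.strip() for p in document.split("\n\n") if p.strip()]


class InMemoryVectorStore:
    """Hypothetical stand-in for a vector database."""

    def __init__(self) -> None:
        self.rows: list[tuple[str, str, Counter]] = []

    def save(self, source: str, chunk: str) -> None:
        # Indexing flow: embed each chunk once and keep it with its source.
        self.rows.append((source, chunk, embed(chunk)))

    def query(self, question: str, k: int = 2) -> list[tuple[str, str]]:
        # Question flow: embed the question, return the k most similar chunks.
        q = embed(question)
        ranked = sorted(self.rows, key=lambda row: cosine(q, row[2]), reverse=True)
        return [(source, chunk) for source, chunk, _ in ranked[:k]]


def build_prompt(question: str, hits: list[tuple[str, str]]) -> str:
    """Combine the relevant results and the question for the LLM."""
    context = "\n".join(f"[{source}] {chunk}" for source, chunk in hits)
    return (
        "Answer using only these sources and cite them.\n"
        f"{context}\n\nQuestion: {question}"
    )


store = InMemoryVectorStore()
handbook = (
    "Workshops last two days and are held on site.\n\n"
    "Invoices are sent at the end of each month."
)
for chunk in split_document(handbook):
    store.save("handbook.md", chunk)

question = "How long do workshops last?"
print(build_prompt(question, store.query(question, k=1)))
```

Keeping the source name next to each chunk is what makes an "answer w/ sources" possible: the model only sees text that carries its origin.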

PATTERN: STRUCTURED OUTPUT
Architecture Building Block: Contract Layer

Talking to Systems (Function / Tool calling)
[Diagram] “When is CW available for a two-days workshop?” -> System Prompt (+ employee data) + Schema (for structured output) -> Web API -> Availability business logic
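The contract idea here is that the model's reply has to fit a schema before it may reach the Web API. The following is a minimal stdlib-only sketch of that check. The deck's backup slides name Pydantic and Instructor for this job; the dataclass `AvailabilityRequest`, the field names, the schema dictionary and the hard-coded reply below are all hypothetical illustrations, not the speaker's implementation.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class AvailabilityRequest:
    """Hypothetical contract between the language model and the Web API."""

    employee: str
    days: int


# JSON Schema that could accompany the system prompt (assumed shape).
AVAILABILITY_SCHEMA = {
    "type": "object",
    "properties": {
        "employee": {"type": "string"},
        "days": {"type": "integer", "minimum": 1},
    },
    "required": ["employee", "days"],
    "additionalProperties": False,
}


def parse_reply(raw: str) -> AvailabilityRequest:
    """Validate the model's JSON reply; raise ValueError if it breaks the contract."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"reply is not valid JSON: {exc}") from exc
    if not isinstance(data, dict) or set(data) != {"employee", "days"}:
        raise ValueError("reply must contain exactly 'employee' and 'days'")
    employee, days = data["employee"], data["days"]
    if not isinstance(employee, str) or not employee:
        raise ValueError("'employee' must be a non-empty string")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(days, bool) or not isinstance(days, int) or days < 1:
        raise ValueError("'days' must be a positive integer")
    return AvailabilityRequest(employee=employee, days=days)


# Hypothetical model reply for: "When is CW available for a two-days workshop?"
request = parse_reply('{"employee": "CW", "days": 2}')
print(request)
```

Since language models are non-deterministic, a reply that fails validation is an expected case; rejecting it at this boundary keeps the availability business logic free of parsing concerns.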

PATTERN: SEMANTIC GUARDING & ROUTING
Architecture Building Block: Flow Control Layer

Semantic Decision-Making for Interaction Flows
[Diagram] “Lorem ipsum…?” -> Guarding -> Routing -> Target
§ Guarding (e.g. prompt injection): Fine-tuned NLP Model
§ Routing (selecting correct target): Embedding Model
§ Targets: RAG, API Call, … something else …
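Routing by embedding can be sketched as: embed a few example utterances per target, embed the incoming text, and pick the target with the most similar example, falling back when nothing is close enough. The sketch below covers routing only (guarding is assigned to a separate fine-tuned classifier on the slide). It assumes a toy word-count `embed` in place of a real embedding model; the route names, example utterances and the 0.3 threshold are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a sparse word-count vector."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


# Hypothetical targets, each described by a few example utterances.
ROUTES = {
    "rag": [
        "what does the handbook say about workshops",
        "find documents about architecture",
    ],
    "api_call": [
        "when is someone available for a workshop",
        "check availability next week",
    ],
}


def route(utterance: str, threshold: float = 0.3) -> str:
    """Return the best-matching target, or 'fallback' if nothing is similar enough."""
    query = embed(utterance)
    best_target, best_score = "fallback", 0.0
    for target, examples in ROUTES.items():
        score = max(cosine(query, embed(example)) for example in examples)
        if score > best_score:
            best_target, best_score = target, score
    return best_target if best_score >= threshold else "fallback"


print(route("When is CW available for a two-days workshop?"))
print(route("Tell me a joke"))
```

No language model is involved in this decision, so it is cheap and repeatable for the same input; the threshold is the main tuning knob and would need calibration against real traffic.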

PATTERN: TELEMETRY
Architecture Building Block: Semantic Observability Layer

Things can get… Overwhelming

Recap: Semantic AI Architecture
§ Semantic Observability Layer: Cross-cutting end-to-end telemetry insights
§ Flow Control Layer: Semantic guarding & routing decisions
§ Contract Layer: Structured output + schema enforcement
§ Retrieval Layer: Embedding-based document & data retrieval
§ Model Foundation Layer: LLM + Embedding models as core capabilities

AI-based solutions are ≅10% AI and 100% software engineering.

Thank you! Christian Weyer [email protected] https://thinktecture.com/christian-weyer

Backup: Technical Details

Technical implementation: Structured Output
§ Frameworks
  § Pydantic
  § Instructor
§ Methodology
  § Schema with JSON Mode (not Function Calling)
§ SLM/LLM
  § Llama 3.3 70B on Cerebras (very fast)

Technical implementation: Lightweight RAG
§ Frameworks
  § LangChain
  § FastEmbed: lightweight & efficient for generating text embeddings
§ Embedding model
  § jinaai/jina-embeddings-v2-base-de (local), 768 dims
§ Vector store
  § PostgreSQL (pgvector)
§ LLM/SLM
  § Llama 3.3 70B on Cerebras (very fast)

Technical implementation: Semantic Guarding & Routing
Guarding
§ Frameworks
  § llm-guard
  § HuggingFace Transformers
§ NLP model
  § deepset/deberta-v3-base-injection (local)
Routing
§ Frameworks
  § semantic-routing
  § FastEmbed
§ Embedding model
  § intfloat/multilingual-e5-large (local), 1024 dims
§ Vector store
  § PostgreSQL (pgvector)

Technical implementation: Observability
§ Methodology
  § OpenTelemetry (OTel)
§ Frameworks
  § OTel Python packages
§ Tools
  § Langfuse
  § Any OTel-enabled system
