Microsoft JDConf 2025 - RAG & Tools for developers with LangChain4j

@edeandrea Eric Deandrea, Red Hat
Java Champion | Senior Principal Developer Advocate
RAG & Tools with LangChain4j

@edeandrea About Me
• Java Champion
• 26+ years software development experience
• Contributor to Open Source projects: Quarkus, Spring Boot, Spring Framework, Spring Security, LangChain4j (& Quarkus LangChain4j), Wiremock, Microcks
• Boston Java Users ACM Chapter Board Member
• Published Author

@edeandrea
• Showcase & explain Quarkus, how it enables modern Java development & the Kubernetes-native experience
• Introduce familiar Spring concepts, constructs, & conventions and how they map to Quarkus
• Equivalent code examples between Quarkus and Spring as well as emphasis on testing patterns & practices
https://red.ht/quarkus-spring-devs

@edeandrea RAG

@edeandrea Retrieval Augmented Generation (RAG)
• Enhance LLM knowledge by providing relevant information in real-time from other sources
  - Dynamic data that changes frequently
• Fine-tuning is expensive!
• 2 stages
  - Indexing / Ingestion
  - Retrieval / Augmentation

@edeandrea Indexing / Ingestion

@edeandrea Indexing / Ingestion
• FileSystemDocumentLoader
• ClassPathDocumentLoader
• UrlDocumentLoader
• AmazonS3DocumentLoader
• AzureBlobStorageDocumentLoader
• GitHubDocumentLoader
• TencentCosDocumentLoader

@edeandrea Indexing / Ingestion
• TextDocumentParser
• ApachePdfBoxDocumentParser
• ApachePoiDocumentParser
• ApacheTikaDocumentParser

@edeandrea Indexing / Ingestion
What do I need to think about?
• What is the representation of the data?
• How do I want to split? Per document? Chapter? Sentence?
• How many tokens do I want to end up with?

@edeandrea Indexing / Ingestion
• DocumentByParagraphSplitter
• DocumentByLineSplitter
• DocumentBySentenceSplitter
• DocumentByWordSplitter
• DocumentByCharacterSplitter
• DocumentByRegexSplitter
• DocumentSplitters.recursive()
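
To make the "how do I want to split?" question concrete, here is a minimal sketch of fixed-size splitting with overlap. It is plain Java and is not how the LangChain4j splitters are implemented; the class name `OverlapSplitter` and its `split` method are hypothetical, and sizes are counted in characters rather than tokens.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: a fixed-size character splitter with overlap.
// This is NOT the LangChain4j implementation; names here are hypothetical.
final class OverlapSplitter {

    /** Splits text into chunks of at most maxChars; consecutive chunks share `overlap` characters. */
    static List<String> split(String text, int maxChars, int overlap) {
        if (maxChars <= 0 || overlap < 0 || overlap >= maxChars) {
            throw new IllegalArgumentException("require 0 <= overlap < maxChars");
        }
        List<String> chunks = new ArrayList<>();
        int step = maxChars - overlap; // how far the window advances each time
        for (int start = 0; start < text.length(); start += step) {
            int end = Math.min(start + maxChars, text.length());
            chunks.add(text.substring(start, end));
            if (end == text.length()) {
                break; // the last chunk reached the end of the text
            }
        }
        return chunks;
    }
}
```

A larger overlap keeps more shared context between neighbouring chunks at the cost of storing and embedding more text. Splitting on paragraph or sentence boundaries, as the splitters listed above do, avoids cutting a thought in half.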

@edeandrea Indexing / Ingestion
Compute an embedding (numerical vector) representing semantic meaning of each segment.
Requires an embedding model:
In-process/Onnx, Amazon Bedrock, Azure OpenAI, Cohere, DashScope, Google Vertex AI, Hugging Face, Jina, Jlama, LocalAI, Mistral, Nomic, Ollama, OpenAI, OVHcloud, Voyage AI, Cloudflare Workers AI, Zhipu AI

@edeandrea Indexing / Ingestion
Store embedding alone or together with segment.
Requires a vector store:
In-memory, Chroma, Elasticsearch, Milvus, Neo4j, OpenSearch, Pinecone, PGVector, Redis, Vespa, Weaviate, Qdrant

@edeandrea Indexing / Ingestion
var ingestor = EmbeddingStoreIngestor.builder()
    .embeddingModel(embeddingModel)
    .embeddingStore(embeddingStore)
    // Add userId metadata entry to each Document to be able to filter by it later
    .documentTransformer(document -> {
        document.metadata().put("userId", "12345");
        return document;
    })
    // Split each Document into TextSegments of up to 1000 characters each with a 200-character overlap
    // (sizes are measured in tokens only when a tokenizer is passed as a third argument)
    .documentSplitter(DocumentSplitters.recursive(1000, 200))
    // Add the name of the Document to each TextSegment to improve the quality of search
    .textSegmentTransformer(textSegment -> TextSegment.from(
        textSegment.metadata().getString("file_name") + "\n" + textSegment.text(),
        textSegment.metadata()
    ))
    .build();
// Get the path of where the documents are and load them recursively
Path path = Path.of(...);
List<Document> documents = FileSystemDocumentLoader.loadDocumentsRecursively(path);
// Ingest the documents into the embedding store
ingestor.ingest(documents);

@edeandrea Retrieval / Augmentation

@edeandrea Retrieval / Augmentation
Compute an embedding (numerical vector) representing semantic meaning of the query.
Requires an embedding model.

@edeandrea Retrieval / Augmentation
Retrieve & rank relevant content based on cosine similarity or other similarity/distance measures.
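
The ranking step can be illustrated with a small, library-free sketch: compute the cosine similarity between the query vector and each stored vector, drop anything below a minimum score, and keep the best few. The class `CosineRanking`, the record `Scored`, and the method names are hypothetical. The sketch filters on raw cosine similarity, whereas a real embedding store may rescale scores before applying a threshold.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Illustrative only: brute-force cosine ranking over in-memory vectors.
// Names are hypothetical; a real vector store would use an index instead.
final class CosineRanking {

    record Scored(String text, double score) {}

    /** Cosine similarity: dot(a, b) / (|a| * |b|); returns 0 for a zero vector. */
    static double cosine(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("dimension mismatch");
        }
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0 || normB == 0) {
            return 0.0;
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    /** Returns up to maxResults segments whose score is at least minScore, best first. */
    static List<Scored> topK(double[] query, List<String> texts, List<double[]> vectors,
                             int maxResults, double minScore) {
        List<Scored> scored = new ArrayList<>();
        for (int i = 0; i < texts.size(); i++) {
            double score = cosine(query, vectors.get(i));
            if (score >= minScore) {
                scored.add(new Scored(texts.get(i), score));
            }
        }
        scored.sort(Comparator.comparingDouble(Scored::score).reversed());
        return List.copyOf(scored.subList(0, Math.min(maxResults, scored.size())));
    }
}
```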

@edeandrea Retrieval / Augmentation
Augment input to the LLM with related content.
What do I need to think about?
• Will I exceed the max number of tokens?
• How much chat memory is available?
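
One way to reason about the token limit is to budget the prompt before sending it. The sketch below keeps retrieved segments in ranked order until the next one would exceed the budget. Everything in it is an assumption made for illustration: the class `PromptBudget`, its methods, and especially the rough "about 4 characters per token" estimate, which should be replaced by the model's real tokenizer in practice.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: a naive prompt budget check. Names are hypothetical.
final class PromptBudget {

    // Assumption for illustration: roughly 4 characters per token (rounded up).
    static int estimateTokens(String text) {
        return (text.length() + 3) / 4;
    }

    /** Keeps segments in ranked order until adding the next one would exceed maxTokens. */
    static List<String> fit(List<String> rankedSegments, String question, int maxTokens) {
        int used = estimateTokens(question); // the question itself consumes part of the budget
        List<String> kept = new ArrayList<>();
        for (String segment : rankedSegments) {
            int cost = estimateTokens(segment);
            if (used + cost > maxTokens) {
                break; // stop at the first segment that does not fit
            }
            kept.add(segment);
            used += cost;
        }
        return kept;
    }
}
```

The same budget also has to cover chat memory and the model's answer, so the value passed as `maxTokens` would normally be well below the model's context window.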

@edeandrea Retrieval / Augmentation
public class RagRetriever {
    @Produces
    @ApplicationScoped
    public RetrievalAugmentor create(EmbeddingStore store, EmbeddingModel model) {
        var contentRetriever = EmbeddingStoreContentRetriever.builder()
            .embeddingModel(model)
            .embeddingStore(store)
            .maxResults(3)
            .minScore(0.75)
            .filter(metadataKey("userId").isEqualTo("12345"))
            .build();
        return DefaultRetrievalAugmentor.builder()
            .contentRetriever(contentRetriever)
            .build();
    }
}

@edeandrea Advanced RAG

@edeandrea Advanced RAG
public class RagRetriever {
    @Produces
    @ApplicationScoped
    public RetrievalAugmentor create(EmbeddingStore store, EmbeddingModel model) {
        var embeddingStoreRetriever = EmbeddingStoreContentRetriever.builder()
            .embeddingModel(model)
            .embeddingStore(store)
            .maxResults(3)
            .minScore(0.75)
            .filter(metadataKey("userId").isEqualTo("12345"))
            .build();
        var googleSearchEngine = GoogleCustomWebSearchEngine.builder()
            .apiKey(System.getenv("GOOGLE_API_KEY"))
            .csi(System.getenv("GOOGLE_SEARCH_ENGINE_ID"))
            .build();
        var webSearchRetriever = WebSearchContentRetriever.builder()
            .webSearchEngine(googleSearchEngine)
            .maxResults(3)
            .build();
        return DefaultRetrievalAugmentor.builder()
            .queryRouter(new DefaultQueryRouter(embeddingStoreRetriever, webSearchRetriever))
            .build();
    }
}
https://github.com/cescoffier/langchain4j-deep-dive/blob/main/4-rag/src/main/java/dev/langchain4j/quarkus/deepdive/RagRetriever.java

@edeandrea Advanced RAG
public class RagRetriever {
    @Produces
    @ApplicationScoped
    public RetrievalAugmentor create(EmbeddingStore store, EmbeddingModel model, ChatLanguageModel chatModel) {
        var embeddingStoreRetriever = ...
        var webSearchRetriever = ...
        // Let the chat model pick a retriever based on these descriptions
        var queryRouter = LanguageModelQueryRouter.builder()
            .chatLanguageModel(chatModel)
            .fallbackStrategy(FallbackStrategy.ROUTE_TO_ALL)
            .retrieverToDescription(Map.of(
                embeddingStoreRetriever, "Local Documents",
                webSearchRetriever, "Web Search"
            ))
            .build();
        return DefaultRetrievalAugmentor.builder()
            .queryRouter(queryRouter)
            .build();
    }
}
https://github.com/cescoffier/langchain4j-deep-dive/blob/main/4-rag/src/main/java/dev/langchain4j/quarkus/deepdive/RagRetriever.java

@edeandrea Function Calling, Agents, and Tools

@edeandrea Agent and Tools
A tool is a function that the model can call:
- Tools are parts of CDI beans
- Tools are defined and described using @Tool
Prompt (Context)
- Extend the context with tool descriptions
- Invoke the model
- The model asks for a tool invocation (name + parameters)
- The tool is invoked (on the caller) and the result sent to the model
- The model computes the response using the tool result
Response
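
The caller-side half of that flow can be sketched without any AI library: collect the described methods, expose their descriptions for the prompt, and execute whichever tool the model names. The annotation `SketchTool` is a hypothetical stand-in and is not the LangChain4j `@Tool`; `ToolRegistry` and `TextTools` are likewise invented for illustration, and the model's request is simulated by passing a tool name and arguments directly.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for a tool annotation; NOT the LangChain4j @Tool.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface SketchTool {
    String value();
}

// Illustrative only: finds annotated methods and dispatches calls by name.
final class ToolRegistry {
    private final Object target;
    private final Map<String, Method> tools = new TreeMap<>();

    ToolRegistry(Object target) {
        this.target = target;
        for (Method method : target.getClass().getDeclaredMethods()) {
            if (method.isAnnotationPresent(SketchTool.class)) {
                method.setAccessible(true);
                tools.put(method.getName(), method);
            }
        }
    }

    /** Tool name to description: the text that would extend the prompt context. */
    Map<String, String> descriptions() {
        Map<String, String> result = new TreeMap<>();
        tools.forEach((name, method) -> result.put(name, method.getAnnotation(SketchTool.class).value()));
        return result;
    }

    /** Runs the requested tool and returns its result as text to send back to the model. */
    String invoke(String toolName, Object... args) {
        Method method = tools.get(toolName);
        if (method == null) {
            throw new IllegalArgumentException("unknown tool: " + toolName);
        }
        try {
            return String.valueOf(method.invoke(target, args));
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("tool failed: " + toolName, e);
        }
    }
}

// A sample object whose annotated method becomes a tool.
final class TextTools {
    @SketchTool("Calculates the length of a string")
    int stringLength(String s) {
        return s.length();
    }
}
```

In this sketch the model never runs code itself: it only names a tool and supplies parameters, and the application decides whether and how to execute the call.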

@edeandrea Tools
- A tool is just a method
- It can access databases, or invoke a remote service
- It can also use another LLM
Tools require memory

@edeandrea Using tools with LangChain4j
Assistant assistant = AiServices.builder(Assistant.class)
    .chatLanguageModel(model)
    .tools(new Calculator()) // Objects to use as tools
    .chatMemory(MessageWindowChatMemory.withMaxMessages(10))
    .build();
static class Calculator {
    // Declare a tool method (description optional)
    @Tool("Calculates the length of a string")
    int stringLength(String s) {
        return s.length();
    }
    @Tool("Calculates the square root of a number")
    double sqrt(int x) {
        System.out.println("Called sqrt() with x=" + x);
        return Math.sqrt(x);
    }
}
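
The idea behind a message-window memory with a maximum of 10 messages can be sketched as a bounded queue that evicts the oldest entry. This is a simplification under stated assumptions: `WindowMemory` is a hypothetical class, messages are plain strings, and it does not give system messages or tool request/result pairs the special handling a real chat memory may give them.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

// Illustrative only: a sliding window over the most recent messages.
// Hypothetical class; not the LangChain4j MessageWindowChatMemory.
final class WindowMemory {
    private final int maxMessages;
    private final Deque<String> messages = new ArrayDeque<>();

    WindowMemory(int maxMessages) {
        if (maxMessages <= 0) {
            throw new IllegalArgumentException("maxMessages must be positive");
        }
        this.maxMessages = maxMessages;
    }

    /** Appends a message and evicts the oldest ones once the window is full. */
    void add(String message) {
        messages.addLast(message);
        while (messages.size() > maxMessages) {
            messages.removeFirst();
        }
    }

    /** Returns the retained messages, oldest first. */
    List<String> messages() {
        return List.copyOf(messages);
    }
}
```

Memory matters for tools because the tool request and its result are themselves messages: if they fall out of the window before the model produces its answer, the model no longer sees the result it asked for.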

@edeandrea Giving access to database (Quarkus Panache)
@ApplicationScoped
public class BookingRepository implements PanacheRepository<Booking> {
    @Tool("Cancel a booking")
    @Transactional
    public void cancelBooking(long bookingId, String firstName, String lastName) {
        var booking = getBookingDetails(bookingId, firstName, lastName);
        delete(booking);
    }
    @Tool("List booking for a customer")
    public List<Booking> listBookingsForCustomer(String name, String surname) {
        return Customer.find("firstName = ?1 and lastName = ?2", name, surname)
            .singleResultOptional()
            .map(found -> list("customer", found))
            .orElseGet(List::of);
    }
}

@edeandrea Agentic AI
@RegisterAiService
public interface WeatherForecastAgent {
    @SystemMessage("You are a meteorologist ...")
    @ToolBox({ CityExtractorAgent.class, ForecastService.class, GeoCodingService.class })
    String forecast(String query);
}
@RegisterAiService
public interface CityExtractorAgent {
    @Tool("Extracts the city name from a given question")
    @UserMessage("Extract the city name from {question}")
    String extractCity(String question);
}
@RegisterRestClient
public interface ForecastService {
    @Tool("Forecasts the weather for the given coordinates")
    @ClientQueryParam(name = "forecast_days", value = "?")
    WeatherForecast forecast(@RestQuery double latitude, @RestQuery double longitude);
}

@edeandrea Function Calling - Tracing

@edeandrea Web Search Tools (Tavily)
@UserMessage("""
    Search for information about the user query: {query}, and answer the question.
    """)
@ToolBox(WebSearchTool.class)
String chat(String query);
Provided by quarkus-langchain4j-tavily
Can also be used with RAG

@edeandrea Risks
• Things can go wrong quickly
• Risk of prompt injection
• Audit is very important to check the parameters
• Distinction between read and write tools
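
One possible way to act on the last two points is to route every tool call through a guard that logs the requested parameters and refuses write tools unless the user has explicitly confirmed. This is only a sketch of the idea: `ToolGuard`, its method names, and the tool names used with it are hypothetical and do not correspond to any LangChain4j or Quarkus API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.function.Supplier;

// Illustrative only: audits tool calls and gates write tools behind confirmation.
final class ToolGuard {
    private final Set<String> writeTools;
    private final List<String> auditLog = new ArrayList<>();

    ToolGuard(Set<String> writeTools) {
        this.writeTools = Set.copyOf(writeTools);
    }

    /** Records the requested call; write tools run only when userConfirmed is true. */
    <T> T call(String toolName, String arguments, boolean userConfirmed, Supplier<T> action) {
        boolean isWrite = writeTools.contains(toolName);
        if (isWrite && !userConfirmed) {
            auditLog.add("DENIED " + toolName + " " + arguments);
            throw new SecurityException("write tool requires confirmation: " + toolName);
        }
        auditLog.add("ALLOWED " + toolName + " " + arguments);
        return action.get();
    }

    /** Returns the audit entries in the order the calls were requested. */
    List<String> auditLog() {
        return List.copyOf(auditLog);
    }
}
```

Because the arguments come from the model, and indirectly from whatever text reached the prompt, logging them before execution gives a record to review when a call looks like the result of prompt injection.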

@edeandrea Thank you!
