
Implementing a RAG System in Dart @FlutterNinjas


FlutterNinjas Tokyo 2024 - JaiChangPark 박제창
2024.06.14 (Fri)

https://flutterninjas.dev/

Transcript

  1. PARK JAI-CHANG (박제창) @jaichangpark — Dreamus Company

     Community: Flutter Seoul Organizer • GDG Golang Korea Organizer
  2. Agenda

     1. Overview
     2. What's RAG (Retrieval Augmented Generation)
     3. LangChain
     4. Build a RAG App with Dart
  3. Overview — What's a Large Language Model?

     • LLM stands for Large Language Model.
     • An LLM is a language model based on artificial neural networks, trained
       on vast amounts of text data.
     • LLMs are limited in their ability to reason about the latest data;
       various methods are being developed to overcome this.
     • LLMs differ slightly from search engines: they are preferred for tasks
       requiring nuanced understanding and language processing.
     • Examples of usage:
       ◦ Chatbots: answer questions and engage in conversations.
       ◦ Text summarization: condense long documents into shorter versions.
       ◦ Translation: translate from one language to another.
       ◦ Writing: creative writing and code generation.
  4. Overview — The evolutionary tree of modern LLMs

     The evolutionary tree of modern LLMs traces the development of language
     models in recent years and highlights some of the most well-known models.
     Models on the same branch have closer relationships. Transformer-based
     models are shown in non-grey colors: decoder-only models in the blue
     branch, encoder-only models in the pink branch, and encoder-decoder
     models in the green branch. The vertical position of the models on the
     timeline represents their release dates. Open-source models are shown as
     solid squares, closed-source models as hollow ones. The stacked bar plot
     in the bottom-right corner shows the number of models from various
     companies and institutions.
     @source: Harnessing the Power of LLMs in Practice: A Survey on ChatGPT and Beyond
  5. Overview — Causes of LLM Hallucinations

     • Data Limitations: the training data may not cover all knowledge areas
       comprehensively, leading the model to fill gaps with fabricated
       information.
     • Model Training: an overfitted model generates outputs highly specific
       to the training data that do not generalize to new data, which can
       lead to hallucinations or irrelevant outputs. Training-data bias or
       inaccuracy and high model complexity contribute as well.
     • Prompt Ambiguity: ambiguous or poorly phrased prompts can lead the
       model to generate responses that diverge from factual accuracy.
  6. Overview — Preventing LLM hallucinations

     Example query: "When is NewJeans' Japanese debut date?"
     1. Augment the user query with data gathered through web searches.
     2. This enables responses to questions about the latest data.
  7. What's RAG — Retrieval Augmented Generation

     Retrieval-augmented generation (RAG) is a software architecture and
     technique that integrates large language models with external
     information sources, enhancing the accuracy and reliability of
     generative AI models by incorporating specific business data such as
     documents, SQL databases, and internal applications.
     @source: Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks
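The retrieve-then-generate flow just described can be sketched in plain Dart. Everything here (`retrieve`, `buildPrompt`, the in-memory knowledge base) is an illustrative stub, not the langchain_dart API; `retrieve` is a naive keyword lookup standing in for a real vector-similarity search, and the LLM call itself is omitted:

```dart
// Minimal retrieve-then-generate sketch: retrieved context is injected
// into the prompt before it is handed to the LLM.
String retrieve(String question, Map<String, String> knowledgeBase) {
  // Naive keyword lookup in place of a real similarity search.
  for (final entry in knowledgeBase.entries) {
    if (question.toLowerCase().contains(entry.key)) return entry.value;
  }
  return '';
}

String buildPrompt(String context, String question) =>
    'Answer the question based only on the following context:\n'
    '$context\nQuestion: $question';

void main() {
  const kb = {
    'debut': '(retrieved document text about the debut would appear here)'
  };
  const question = 'When is the Japanese debut date?';
  final prompt = buildPrompt(retrieve(question, kb), question);
  print(prompt); // This augmented prompt is what gets sent to the LLM.
}
```

The point is only the shape of the flow: retrieval happens before generation, and the model answers from the supplied context rather than from its frozen training data.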
  8. RAG 8-Step Process — Load documents & save to the vector DB

     1. Load: read data from documents (PDF, Word, XLSX), web pages, Notion,
        Confluence, etc.
     2. Split: split the loaded document into chunks.
     3. Embedding: convert each chunk into a vector representation.
     4. Vector Store: save the converted vectors in a DB.
  9. RAG 8-Step Process — Search documents & get results

     5. Retrieval: similarity search (cosine, MMR).
     6. Prompt: a prompt that derives the desired result from the search
        results.
     7. LLM: select a model (GPT-4o, GPT-4, GPT-3.5, Gemini, Llama, Gemma,
        etc.).
     8. Output: text, JSON, Markdown, etc.
  10. How to Implement?

     1. Build an LLM-based application on the RAG architecture with LangChain.
     2. Build an LLM-based app using LlamaIndex.
     3. Implement every step of the process yourself.
  11. What's LangChain

     LangChain is a framework for developing applications powered by large
     language models (LLMs).
     a. Open-source libraries: build your applications using LangChain's
        modular building blocks and components.
     b. Productionization: inspect, monitor, and evaluate your apps with
        LangSmith so you can continuously optimize and deploy with confidence.
     c. Deployment: turn any chain into a REST API with LangServe.
  12. LangChain Dart

     LangChain.dart is an unofficial Dart port of the popular LangChain
     Python framework created by Harrison Chase.
     @source: https://pub.dev/packages/langchain
  13. LangChain Dart — Motivation

     • The adoption of LLMs is creating a new tech stack in its wake.
       However, emerging libraries and tools are predominantly being
       developed for the Python and JavaScript ecosystems. As a result, the
       number of applications leveraging LLMs in these ecosystems has grown
       exponentially.
     • In contrast, the Dart / Flutter ecosystem has not experienced similar
       growth, likely due to the scarcity of Dart and Flutter libraries that
       streamline the complexities of working with LLMs.
     • LangChain.dart aims to fill this gap by abstracting the intricacies of
       working with LLMs in Dart and Flutter, enabling developers to harness
       their combined potential effectively.
     @source: https://github.com/davidmigloz/langchain_dart
  14. LangChain Dart — Pros & Cons

     Pros:
     • Even without knowing Python or JS, you can create an LLM app (client)
       using only the Dart language.
     • The official documentation is well organized and easy to use:
       https://langchaindart.dev/#/
     Cons:
     • Currently, the number of supported third-party libraries is limited.
  15. 1 Data Loader — PDF loader

     dependencies:
       flutter:
         sdk: flutter
       flutter_pdf_text: 0.6.0

     Future<void> _pickPDFText() async {
       final filePickerResult = await FilePicker.platform.pickFiles();
       if (filePickerResult != null) {
         _pdfDoc = await PDFDoc.fromPath(filePickerResult.files.single.path!);
         final text = await _pdfDoc!.text;
         setState(() => _text = text);
       }
     }

     Future<void> _fromPDFURL() async {
       if (urlTextController.text.isNotEmpty) {
         _pdfDoc = await PDFDoc.fromURL(urlTextController.text.trim());
         final text = await _pdfDoc!.text;
         setState(() => _text = text);
       }
     }
  16. 1 Data Loader — PDF loader (cont.): the same loader code, applied to
      the sample data (K-pop idol groups & Wikipedia).

  17. 1 Data Loader — PDF loader (cont.): the same code, storing the
      extracted text in state via setState(() { _text = text; }).
  18. 2 Text Split

     dependencies:
       flutter:
         sdk: flutter
       langchain:

     List<Document> docs = [];
     const splitter = RecursiveCharacterTextSplitter(
       chunkSize: 2000,
       chunkOverlap: 400,
     );
     final splitLists = splitter.splitText(_text);
     final ids = splitLists.map((e) => const Uuid().v4()).toList();
     docs = splitter.createDocuments(splitLists, ids: ids);
  19–21. 2 Text Split (cont.): the same splitter code, repeated to
      illustrate how the original text is divided into overlapping chunks
      (Original Text → Chunk, Chunk, Chunk).
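The chunkSize / chunkOverlap parameters can be illustrated with a simplified fixed-size splitter in plain Dart. The real RecursiveCharacterTextSplitter also tries to split at separators such as newlines; this sketch ignores that and only shows the overlap mechanics:

```dart
// Simplified fixed-size splitter: each chunk is at most chunkSize
// characters, and consecutive chunks share chunkOverlap characters so
// that context is not lost at chunk boundaries.
List<String> splitWithOverlap(String text,
    {required int chunkSize, required int chunkOverlap}) {
  final chunks = <String>[];
  final step = chunkSize - chunkOverlap; // each chunk starts `step` chars later
  for (var start = 0; start < text.length; start += step) {
    final end =
        (start + chunkSize < text.length) ? start + chunkSize : text.length;
    chunks.add(text.substring(start, end));
    if (end == text.length) break;
  }
  return chunks;
}

void main() {
  final chunks = splitWithOverlap('abcdefghij', chunkSize: 4, chunkOverlap: 2);
  print(chunks); // [abcd, cdef, efgh, ghij]
}
```

With chunkSize 2000 and chunkOverlap 400 as on the slide, each chunk repeats the last 400 characters of the previous one, so a sentence cut at a boundary still appears whole in one of the chunks.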
  22. 3 Embedding

     dependencies:
       flutter:
         sdk: flutter
       langchain:

     // OpenAI embeddings:
     const openaiApiKey = "API KEY";
     final embeddings = OpenAIEmbeddings(apiKey: openaiApiKey);

     // Or local embeddings via Ollama:
     final embeddings = OllamaEmbeddings(model: "nomic-embed-text");

     // The Android emulator reaches the host machine via 10.0.2.2:
     String baseUrl = "localhost";
     if (defaultTargetPlatform == TargetPlatform.android) {
       baseUrl = "10.0.2.2";
     }
  23. 4 Vector Store

     dependencies:
       flutter:
         sdk: flutter
       langchain:

     final vectorStore = Chroma(
       baseUrl: "http://$baseUrl:8000",
       embeddings: embeddings,
     );
     await vectorStore.addDocuments(documents: docs);
  24. 4 Vector Store (cont.): the same code, shown with the text-embedding
      step feeding the store.
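What the vector store does with those documents can be shown with a toy in-memory store in plain Dart: keep (text, embedding) pairs and return the closest match by cosine similarity, which is the same idea Chroma applies at scale. `ToyVectorStore` is a hypothetical illustration, not part of langchain_dart:

```dart
import 'dart:math';

// Toy in-memory vector store: stores (text, embedding) pairs and returns
// the most similar document by cosine similarity.
class ToyVectorStore {
  final _docs = <(String, List<double>)>[];

  void add(String text, List<double> embedding) => _docs.add((text, embedding));

  String? mostSimilar(List<double> query) {
    String? best;
    var bestScore = double.negativeInfinity;
    for (final (text, emb) in _docs) {
      final score = _cosine(query, emb);
      if (score > bestScore) {
        bestScore = score;
        best = text;
      }
    }
    return best;
  }

  // cos(a, b) = (a · b) / (|a| * |b|)
  double _cosine(List<double> a, List<double> b) {
    var dot = 0.0, na = 0.0, nb = 0.0;
    for (var i = 0; i < a.length; i++) {
      dot += a[i] * b[i];
      na += a[i] * a[i];
      nb += b[i] * b[i];
    }
    return dot / (sqrt(na) * sqrt(nb));
  }
}

void main() {
  final store = ToyVectorStore()
    ..add('doc about cats', [1.0, 0.0])
    ..add('doc about dogs', [0.0, 1.0]);
  print(store.mostSimilar([0.9, 0.1])); // doc about cats
}
```

Real embeddings have hundreds or thousands of dimensions rather than two, but the similarity computation is the same.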
  25. 5 Retriever — similarity search

     final retriever = vectorStore.asRetriever();
     final setupAndRetrieval = Runnable.fromMap<String>({
       'context': retriever.pipe(
         Runnable.mapInput((docs) => docs.map((d) => d.pageContent).join('\n')),
       ),
       'question': Runnable.passthrough(),
     });
  26. 5 Retriever (cont.): the same code; the retriever performs a
      similarity search for the user's query.
  27. 6 Prompt — "Prompt is an art"

     final promptTemplate = PromptTemplate.fromTemplate(
         "Answer the question based on only the following "
         "context:\n{context}\n{question}");

     final chatPromptTemplate = ChatPromptTemplate.fromTemplates(
       const [
         (
           ChatMessageType.system,
           'Answer the question based on only the following context:\n{context}'
         ),
         (ChatMessageType.human, "\n{question}"),
       ],
     );
  28. 6 Prompt (cont.): the same templates; {context} is filled with the
      retriever's search result and {question} with the user query.
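At run time the template step is plain placeholder substitution. A minimal stand-in in plain Dart (`formatPrompt` is a hypothetical helper, not the langchain_dart API) shows what filling {context} and {question} amounts to:

```dart
// Replace each {key} placeholder in the template with its runtime value,
// mirroring what a prompt template does when the chain is invoked.
String formatPrompt(String template, Map<String, String> values) =>
    values.entries.fold(template, (t, e) => t.replaceAll('{${e.key}}', e.value));

void main() {
  const template = 'Answer the question based on only the following '
      'context:\n{context}\n{question}';
  final prompt = formatPrompt(template, {
    'context': 'NewJeans is a K-pop group.',
    'question': 'Who are NewJeans?',
  });
  print(prompt);
}
```

Because the context is spliced in before the question, the model is steered to answer from the retrieved documents rather than from its own memory.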
  29. 7 LLM

     final llm = ChatOllama(
       baseUrl: "http://10.0.2.2:11434/api",
       defaultOptions: ChatOllamaOptions(
         temperature: 0.1,
         model: modelValue, // aya, llama3, gemma, etc.
       ),
     );
  30. 8 Chain (LCEL) & Output

     const outputParser = StringOutputParser<ChatResult>();
     final chain = setupAndRetrieval
         .pipe(promptTemplate)
         .pipe(llm)
         .pipe(outputParser);

     final result = chain.stream(text);
     await for (final chunk in result) {
       setState(() {
         resultText += chunk;
       });
     }
  31. 8 Chain (LCEL) & Output (cont.): the same chain; the user query flows
      in and the streamed output is rendered incrementally.
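The streaming pattern on these slides — append each token to resultText as it arrives — works with any Dart Stream. In this sketch `fakeTokenStream` stands in for `chain.stream(query)`:

```dart
// A fake token stream standing in for the chain's streamed LLM output.
Stream<String> fakeTokenStream() async* {
  for (final token in ['Hello', ', ', 'Ninjas', '!']) {
    yield token;
  }
}

// Collect a token stream into one string, mirroring how the UI appends
// each streamed chunk to resultText.
Future<String> collect(Stream<String> tokens) async {
  var resultText = '';
  await for (final token in tokens) {
    resultText += token; // in Flutter, this append happens inside setState()
  }
  return resultText;
}

Future<void> main() async {
  print(await collect(fakeTokenStream())); // Hello, Ninjas!
}
```

Streaming matters for UX: the user sees the answer build up token by token instead of waiting for the whole completion.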
  32. Summary — Implementing a RAG System with the Dart language

     • The RAG architecture is used to mitigate hallucination in LLMs and to
       obtain accurate answers about personal data and up-to-date
       information.
     • Recent LLM services handle questions about the latest data by
       performing web searches in advance (Agent).
     • The LangChain and LlamaIndex frameworks are the main choices for
       developing LLM apps.
     • LangChain is implemented in Python and JavaScript.
     • With the LangChain Dart package, LLM app development can be done
       easily and quickly.
     • We learned how to implement it following the basic 8 steps of RAG
       with Dart.