LLM App with Momento

2023.8.3 MoCon 2023 @ T-Mobile Park (Seattle)

吉田真吾

August 03, 2023

Transcript

  1. Momento Confidential CYDAS PEOPLE = Talent Management SaaS on AWS

    • Employee profiles, 1on1, MBO, performance reviews, HR FAQ, etc.
    • Big Issue
      1. A company with tens of thousands of employees: 500 inquiries / 2 personnel / month → 90% are already listed in the FAQ → we were consulted about closing the inquiry function
  2. “PEOPLE Copilot Chat” = RAG [Grounding] App

    • RAG (Retrieval Augmented Generation) app
    • HR FAQ & chat history with HR → embedding
    • User question → retrieve and answer using ChatGPT (API)
    • 6 days to make a demo for an HR conference
    • 2 months to rebuild for PRODUCTION
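
The retrieve-then-answer flow on this slide can be sketched roughly as follows. This is a minimal illustration, not the PEOPLE Copilot Chat implementation: the FAQ entries are made up, `embed` is a toy bag-of-words stand-in for a real embedding model, and the prompt built at the end would be sent to a chat model in a real app.

```python
import math
import re
from collections import Counter

# Hypothetical FAQ entries; the real app embeds the HR FAQ and HR chat history.
FAQ = [
    ("How do I apply for paid leave?", "Submit a leave request from the attendance menu."),
    ("How do I reset my password?", "Use the 'Forgot password' link on the login page."),
    ("When is the performance review?", "Reviews are held at the end of each half-year."),
]

def embed(text: str) -> Counter:
    """Toy bag-of-words vector; an assumption standing in for an embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# "Embedding" step: index every FAQ entry once, up front.
INDEX = [(embed(q + " " + a), q, a) for q, a in FAQ]

def retrieve(question: str, k: int = 1):
    """Return the k FAQ entries most similar to the user question."""
    qv = embed(question)
    ranked = sorted(INDEX, key=lambda item: cosine(qv, item[0]), reverse=True)
    return [(q, a) for _, q, a in ranked[:k]]

def build_prompt(question: str) -> str:
    """Ground the model by putting the retrieved entries into the prompt."""
    context = "\n".join(f"Q: {q}\nA: {a}" for q, a in retrieve(question))
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
```

Calling `build_prompt("I forgot my password, how can I reset it?")` yields a prompt whose context is the password FAQ entry; the answer step is then a single chat-completion call with that prompt.
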
  3. What is LangChain? 🦜🔗

    • LangChain is a framework for developing applications with LLMs.
      ◦ There are two implementations: Python and JavaScript/TypeScript.
      ◦ The Python version is more actively developed.
    • LangChain is available as OSS and is updated daily.
    • My recommendation: TypeScript is fine for creating a demo, but if you want to build a production version or take full advantage of LangChain’s functionality in the long term, use the Python version.
  4. LangChain > Usage

    • Sentence summarization
    • Chatbot
    • Q&A for documents
    • chat2query
    • etc…

    LangChain > Module

    • Models
    • Prompts
    • Indexes
    • Chains
    • Memory
    • Agents
  5. Why do we use Momento?

    • Perfectly serverless
    • Easy to integrate
      ◦ Needs only a few lines of code.
    • Super fast and reliable
      ◦ Always responds within a few msec.
    • Higher security
      ◦ We have to meet many compliance requirements because CYDAS hosts a large amount of personal data.
  6. For Production

    “It’s easy to make something cool with LLMs, but very hard to make something production-ready with them.” - Chip Huyen

    • Security and Compliance:
      ◦ OpenAI → Azure OpenAI
      ◦ Pinecone → Azure Cognitive Search
    • Safety
      ◦ Accuracy, Hallucination, Fairness
  7. Lessons we learned for Production

    1. A RAG app is easy to implement -> Can a traditional search UI (without LLMs) solve this problem?
    2. A workflow that takes advantage of LLM capabilities is important
      1. Combine deterministic programming with non-deterministic LLMs
      2. Chain multiple tasks together 🦜🔗
    3. 🦜🔗 is a treasure trove of ideas + implementations
      1. ReAct → langchain.agents
      2. HyDE → the LLM fantasizes about the answer to a question, then searches for knowledge similar to that answer: from langchain.chains import HypotheticalDocumentEmbedder
    4. Enterprise search is usually better overall than Vector Similarity Search alone
    5. LLMOps≠MLOps
      1. Hard to notice changes in input / output
      2. Limited options once a change is noticed = replace the API or model, adjust prompts (+ version control)
      3. Response time should be captured, e.g. with LangSmith
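
The HyDE idea in lesson 3 can be illustrated without any LLM or vector store. In the sketch below everything is an assumption made for illustration: `fake_llm` returns a canned hypothetical answer in place of a real model call, `embed` is a toy bag-of-words vector, and `DOCS` is invented. LangChain's `HypotheticalDocumentEmbedder` packages the same idea; this sketch does not use it.

```python
import math
import re
from collections import Counter
from typing import Callable, List

# Hypothetical knowledge base.
DOCS = [
    "Paid leave requests are submitted from the attendance menu and approved by your manager.",
    "Passwords can be reset with the 'Forgot password' link on the login page.",
    "Expense claims must be filed within 30 days with a receipt attached.",
]

def embed(text: str) -> Counter:
    """Toy bag-of-words vector; stands in for a real embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def fake_llm(prompt: str) -> str:
    """Stand-in for an LLM call: a plausible, possibly wrong, answer."""
    return "You can take a day off by submitting a paid leave request that your manager approves."

def hyde_search(question: str, llm: Callable[[str], str], docs: List[str], k: int = 1) -> List[str]:
    # Step 1: let the model "fantasize" an answer to the question.
    hypothetical = llm(f"Write a short answer to: {question}")
    # Step 2: search with the embedding of that answer, not of the question.
    hv = embed(hypothetical)
    return sorted(docs, key=lambda d: cosine(hv, embed(d)), reverse=True)[:k]
```

The point of the detour is that the hypothetical answer tends to share vocabulary with the stored documents even when the question ("How can I take a vacation?") does not.
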
  8. OWASP Top10 for LLM

    1. Prompt Injection
    2. Insecure Output Handling
    3. Training Data Poisoning
    4. Model Denial of Service
    5. Supply Chain Vulnerabilities
    6. Sensitive Information Disclosure
    7. Insecure Plugin Design
    8. Excessive Agency
    9. Overreliance
    10. Model Theft

    OWASP Top 10 for Large Language Model Applications https://owasp.org/www-project-top-10-for-large-language-model-applications/
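
As one small illustration of item 2 (Insecure Output Handling), model output can be treated like any other untrusted input and escaped before it reaches a browser. The function name and the HTML wrapper below are hypothetical; the deck does not describe how the app renders answers.

```python
import html

def render_answer(llm_output: str) -> str:
    """Escape model output before embedding it in an HTML page."""
    # html.escape converts <, >, & and quotes, so markup in the output is shown as text.
    return f'<div class="answer">{html.escape(llm_output)}</div>'
```

With this in place, a response containing `<script>` tags is displayed literally instead of being executed.
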
  10. from langchain.experimental to langchain_experimental

    • Move experimental to experimental package
    • Easy to push the PRs
    • As a result, CVEs will be removed from the core package (langchain)
    • It’s just the beginning: more high-level thoughts, ex. Modularity → https://github.com/langchain-ai/langchain/discussions
  11. 🦜🔗 Move experimental to experimental package

    • Big News
      ◦ All features with CVEs (vulnerabilities) are now in a separate package (Experimental)
      ◦ Streamlining of the 🦜🔗 core
      ◦ Plans for a package called Community Chain were mentioned
    • Means…
      ◦ Cannot be used in production → Can be used
      ◦ Unlimited expansion over the past year or so meant the package would one day no longer fit in a Lambda Layer → Its growth can now be kept under control
      ◦ Implementations of papers and ambitious ideas will be more PR-friendly
    • for AWS Lambda
      ◦ Current size: 130MB unzipped, including dependent libraries
      ◦ Spin-up takes approximately 5 seconds → Multiple measures are needed, such as Lazy listeners and retry header checks when using from Slack
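
The retry header check mentioned for Slack can be sketched as below. Slack's Events API re-sends an event when it does not get a timely response and marks the re-delivery with an `X-Slack-Retry-Num` header, so a slow cold start can otherwise trigger duplicate answers. The handler shape assumes an API Gateway style event with `headers` and `body`; the response payload fields are hypothetical, and the actual LLM work is left as a comment.

```python
import json

def handler(event: dict, context: object = None) -> dict:
    """Lambda entry point that ignores Slack re-deliveries of the same event."""
    # Header names may arrive in any case, so normalize them first.
    headers = {k.lower(): v for k, v in (event.get("headers") or {}).items()}

    if "x-slack-retry-num" in headers:
        # Slack is retrying because the first attempt was too slow to respond.
        # Acknowledge immediately and do no work, since the first attempt is still running.
        return {"statusCode": 200, "body": json.dumps({"ok": True, "skipped": "retry"})}

    body = json.loads(event.get("body") or "{}")
    # Hypothetical: hand the event to the slow LLM workflow here,
    # ideally asynchronously so the acknowledgement is not delayed.
    return {"statusCode": 200, "body": json.dumps({"ok": True, "event_type": body.get("type")})}
```

Lazy listeners address the same problem from the other side: acknowledge first, then run the slow part separately.
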
  12. Amazon Kendra + 🦜🔗

    Building high-accuracy generative AI applications with Amazon Kendra, LangChain, and large language models
    https://aws.amazon.com/jp/blogs/news/quickly-build-high-accuracy-generative-ai-applications-on-enterprise-data-using-amazon-kendra-langchain-and-large-language-models/

    Salesforce, Slack, box, Mail…