
From blog posts to GenAI


How can you bring GenAI to your data stored in MySQL?

In this session, we will examine how to add value to existing data stored in MySQL by utilizing our data context for RAG.
We will discover three different methods: using the OCI GenAI Service, MySQL HeatWave, and even on-premises solutions.

This session was presented at the RoOUG in Bucharest.


lefred

June 04, 2025


Transcript

  1. Frédéric Descamps Community Manager Oracle MySQL RoOUG celebrates MySQL 30

    - June 2025 From blog posts to GenAI Query your content with AI
  2. • @lefred • @lefredbe.bsky.social • @[email protected] • @lefred14:matrix.org • MySQL

    Evangelist • using MySQL since version 3.20 • devops believer • living in • https://lefred.be Frédéric Descamps Copyright @ 2025 Oracle and/or its affiliates. 3
  3. GenAI GenAI is very trendy and powerful. It's a set

    of tools and techniques that allow machines to learn from data and mimic some things humans can do, like providing answers. For DBAs or developers using databases, it's an excellent opportunity to query their content differently and provide more value to the existing data. When we use a chat service, like ChatGPT, we can ask questions and get answers generated from data in all kinds of resources processed from the internet. But what if we could do the same with our existing data stored in MySQL? Copyright @ 2025 Oracle and/or its affiliates. 5
  4. Disclaimer Copyright @ 2025 Oracle and/or its affiliates. I'm not

    a GenAI expert, but I want to share the processes and steps I learned while working on this project. 6
  5. GenAI and Data - the process Copyright @ 2025 Oracle

    and/or its affiliates. embeddings This is the process of converting text into a numerical representation. This is the first step in the process of using GenAI to query data. For example: "content 1" --> [1.2, 3.0, -1.5] 7
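    (Not from the deck: a minimal Python sketch of what "text to vector" looks like, here with the open-source sentence-transformers library rather than the OCI or HeatWave models used later in the session.)
    # Minimal sketch (assumption: sentence-transformers installed; not the deck's method)
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer("all-MiniLM-L6-v2")   # 384-dimension embeddings
    vec = model.encode("content 1")
    print(len(vec), vec[:3])                          # e.g. 384 and the first three values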
  6. GenAI and Data - the process Copyright @ 2025 Oracle

    and/or its affiliates. vector search This is the process of querying the data using the embeddings generated in the previous step. This is the third step in the process of using GenAI to query data. To do this, we use the distance() function in MySQL HeatWave. On other databases, this is where "vector indexes" are used to speed up the search. 13
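    (Not from the deck: a minimal sketch of the shape of such a query; the full Python version appears later in the deck. It assumes the wp_embeddings table created on slide 13 and a query vector already serialized as a string.)
    # Minimal sketch (assumption): a COSINE similarity search with DISTANCE() in MySQL HeatWave
    import mysql.connector

    cnx = mysql.connector.connect(user="user", password="pass", host="10.0.0.1", database="wordpress")
    cur = cnx.cursor()
    query_vec = "[0.12, -0.03, 0.55]"   # hypothetical embedding of the question
    cur.execute(
        "SELECT wp_post_id, DISTANCE(string_to_vector(%s), vec, 'COSINE') AS d "
        "FROM wp_embeddings ORDER BY d LIMIT 5",
        (query_vec,),
    )
    print(cur.fetchall())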
  7. GenAI and Data - the process Copyright @ 2025 Oracle

    and/or its affiliates. reranking Reranking is the process where an initial set of results (the retrieved documents) is reordered based on more sophisticated criteria, such as relevance, user intent, or quality related to the query. Initial retrieval methods (like a search engine) might not always surface the most relevant results at the top. Reranking ensures that the most contextually appropriate or high-quality results are prioritized. This step is optional but recommended to improve the quality of the results. 15
  8. GenAI and Data - the process Copyright @ 2025 Oracle

    and/or its affiliates. RAG Retrieval-Augmented Generation (RAG) This is the final step in the process of using GenAI to query data. Inference The ability of a large language model (LLM) to generate a response based on instructions and context provided by the user in the prompt. An LLM can generate new data, make predictions, or draw conclusions based on its learned patterns and relationships in the training data, without having been explicitly programmed. Inference is a key feature of natural language processing (NLP) tasks such as question answering, summarizing text, and translating. You can use the foundational models in Generative AI for inference. 17
  9. WordPress - lefred.be Our goal is to use the content

    of my blog, lefred.be, and use it to generate answers with GenAI. We will create embeddings of the content using GenAI service in OCI and store the vectors in MySQL HeatWave. Then we will use the similarity function, distance() to query the content. We have 2 options to generate answers based on the content of the blog: • Use of GPU with OCI GenAI service • Use of CPU with HeatWave Copyright @ 2025 Oracle and/or its affiliates. 21
  10. Prerequisites • A WordPress database • An OCI account •

    A MySQL HeatWave instance (version 9.x) • Access to GenAI service in OCI • A compute instance to run our code (or similar) • Python or JavaScript Copyright @ 2025 Oracle and/or its affiliates. 22
  11. WordPress database on HeatWave We need to first check that

    we are running MySQL HeatWave 9.x: Copyright @ 2025 Oracle and/or its affiliates. 23
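    (The deck shows this check as a screenshot; a minimal sketch of such a check, with an assumed connection, could be:)
    # Minimal sketch (assumption, not from the deck): check the server version is 9.x
    import mysql.connector

    cnx = mysql.connector.connect(user="user", password="pass", host="10.0.0.1")
    cur = cnx.cursor()
    cur.execute("SELECT VERSION()")
    print(cur.fetchone()[0])   # expect a 9.x version string on MySQL HeatWave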
  12. WordPress database on HeatWave (2) If this is the production

    database of our WordPress site, we are fine to continue to the next step; if not, we have to import the data from our WordPress database into the MySQL HeatWave instance. Copyright @ 2025 Oracle and/or its affiliates. 24
  13. Prepare a table for embeddings We create a table to

    store the embeddings of the content of our blog:
    SQL > CREATE TABLE `wp_embeddings` (
            `id` bigint unsigned NOT NULL AUTO_INCREMENT,
            `vec` vector(1024) DEFAULT NULL,
            `wp_post_id` bigint unsigned DEFAULT NULL,
            PRIMARY KEY (`id`)
          ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci
    Copyright @ 2025 Oracle and/or its affiliates. 25
  14. If we want to use GenAI related actions with GPU,

    we need to use the OCI GenAI service: • we use an API to interact with the service for: ◦ embeddings ◦ reranking ◦ inference • we use an external application (in Python) OCI GenAI service Copyright @ 2025 Oracle and/or its affiliates. 27
  15. We use Python for GenAI We will use Python to

    interact with the GenAI service in OCI, so we need to provide our OCI credentials to the Python script, and install the required modules: • oci • mysql-connector-python • unstructured • beautifulsoup4 Copyright @ 2025 Oracle and/or its affiliates. 29
  16. Natural Language Toolkit The python unstructured module requires the Natural

    Language Toolkit (nltk) to be installed. unstructured is a Python library that provides a set of tools for working with unstructured data, such as text, images, and audio. nltk is a Python library that provides a set of tools for building programs to work with human language data. Some data files need to be downloaded before it can be used. Copyright @ 2025 Oracle and/or its affiliates. 30
  17. Natural Language Toolkit (2)

    $ python3.9
    Python 3.9.20 (main, Oct 24 2024, 07:04:44)
    >>>
    Copyright @ 2025 Oracle and/or its affiliates. 31
  18. Natural Language Toolkit (2)

    $ python3.9
    Python 3.9.20 (main, Oct 24 2024, 07:04:44)
    >>>
    >>> import nltk
    >>> nltk.download('punkt')
    >>> nltk.download('punkt_tab')
    >>> nltk.download('averaged_perceptron_tagger_eng')
    Copyright @ 2025 Oracle and/or its affiliates. 31
  19. Natural Language Toolkit (2)

    $ python3.9
    Python 3.9.20 (main, Oct 24 2024, 07:04:44)
    >>>
    >>> import nltk
    >>> nltk.download('punkt')
    >>> nltk.download('punkt_tab')
    >>> nltk.download('averaged_perceptron_tagger_eng')

    $ du -sh nltk_data/*
    7.0M  nltk_data/taggers
    64M   nltk_data/tokenizers
    Copyright @ 2025 Oracle and/or its affiliates. 31
  20. OCI & MySQL HeatWave Configuration

    We create a file to store all our configuration settings, wp_config.py:
    COMPARTMENT = "<ENTER HERE THE GENAI COMPARTMENT ID>"
    CONFIG_FILE = "~/.oci/config"
    CONFIG_PROFILE = "DEFAULT"
    # Service endpoint
    ENDPOINT = "https://inference.generativeai.us-chicago-1.oci.oraclecloud.com"
    LLM_MODEL_ID = "<LLM CHAT GENAI MODEL ID>"
    DB_USER = "<YOUR DB USER ACCOUNT>"
    DB_PASSWORD = "<YOUR DB USER PASSWORD>"
    DB_HOST = "<YOUR DB HOST IP>"
    DB_PORT = 3306
    DB_SCHEMA = "<WORDPRESS SCHEMA>"
    Copyright @ 2025 Oracle and/or its affiliates. 32
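    (The deck does not show the client setup itself; a minimal sketch, assuming the wp_config.py values above and the standard oci SDK, of how the client used in the following slides could be created:)
    # Minimal sketch (assumption): load OCI credentials and create the GenAI inference client
    import oci
    import wp_config

    config = oci.config.from_file(wp_config.CONFIG_FILE, wp_config.CONFIG_PROFILE)
    generative_ai_inference_client = oci.generative_ai_inference.GenerativeAiInferenceClient(
        config=config,
        service_endpoint=wp_config.ENDPOINT,
        retry_strategy=oci.retry.NoneRetryStrategy(),
        timeout=(10, 240),
    )
    compartment_id = wp_config.COMPARTMENT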
  21. GenAI Compartment The best region for GenAI in OCI is

    us-chicago-1, we need to find the Compartment ID where our GenAI service is running to add in the configuration file. Copyright @ 2025 Oracle and/or its affiliates. 33
  22. GenAI Compartment (3) We copy the compartment_id and paste it

    in our wp_config.py file. Copyright @ 2025 Oracle and/or its affiliates. 35
  23. Embeddings with OCI GenAI From Text to Vectors Copyright @

    2025 Oracle and/or its affiliates. 36
  24. Embeddings with OCI GenAI Now we need to create the

    embeddings of the content of our blog. We will generate vectors of 1024 dimensions from the blog content, converted to text, split into chunks, and sent to the embedding model in batches of up to 96 chunks. We will store these vectors in the wp_embeddings table we created earlier. Copyright @ 2025 Oracle and/or its affiliates. 37
  25. Embeddings - code

    cursor.execute("SELECT ID, post_content from wp_posts WHERE post_type='post' AND post_status='publish'")
    results = cursor.fetchall()
    for row in results:
        print(".", flush=True, end="")
        inputs = []
        content = row[1]
        content_text = BeautifulSoup(content, "html.parser")
        for pi in content_text.find_all(string=lambda text: isinstance(text, bs4.element.ProcessingInstruction)):
            pi.extract()
        clean_html = str(content_text)
        try:
            content_text = partition_html(text=clean_html)
            for content_text_el in content_text:
                content = clean(content_text_el.text, extra_whitespace=True)
                if len(content) > 1:
                    inputs.append(content)
        except Exception as e:
            print(f"Error with wp_post_id: {row[0]} !")
    Copyright @ 2025 Oracle and/or its affiliates. 38
  26. Embeddings - code (2)

    if len(inputs) > 1:
        for input_block in range(0, len(inputs), 96):
            block = inputs[input_block : input_block + 96]
            pdebug(block)
            embed_text_detail = (
                oci.generative_ai_inference.models.EmbedTextDetails()
            )
            embed_text_detail.inputs = block
            embed_text_detail.truncate = embed_text_detail.TRUNCATE_END
            embed_text_detail.serving_mode = (
                oci.generative_ai_inference.models.OnDemandServingMode(
                    model_id="cohere.embed-english-v3.0"
                )
            )
            embed_text_detail.compartment_id = compartment_id
            embed_text_detail.input_type = (
                embed_text_detail.INPUT_TYPE_SEARCH_DOCUMENT
            )
            try:
                embed_text_response = generative_ai_inference_client.embed_text(
                    embed_text_detail
                )
            except Exception as e:
                print("Error while creating embeddings ", e)
                embeddings = []
    Copyright @ 2025 Oracle and/or its affiliates. 39
  27. Embeddings - code (3)

    else:
        embeddings = embed_text_response.data.embeddings
        insert_stmt = (
            "INSERT INTO wp_embeddings(vec, wp_post_id) "
            "VALUES (string_to_vector(%s), %s)"
        )
        for emb in embeddings:
            myvectorStr = ",".join(str(item) for item in list(emb))
            myvectorStr = "[" + myvectorStr + "]"
            data = (myvectorStr, row[0])
            cursor.execute(insert_stmt, data)
            cnx.commit()
    Copyright @ 2025 Oracle and/or its affiliates. 40
  28. Embeddings Models in OCI (as of May 2025) • cohere.embed-english-light-v2.0 •

    cohere.embed-english-light-v3.0 • cohere.embed-english-v3.0 • cohere.embed-multilingual-light-v3.0 • cohere.embed-multilingual-v3.0 • cohere.embed-english-image-v3.0 • cohere.embed-multilingual-image-v3.0 • cohere.embed-english-light-image-v3.0 Copyright @ 2025 Oracle and/or its affiliates. 41
  29. Embeddings - Check

    SQL > SELECT COUNT(*) FROM wp_embeddings;
    +----------+
    | count(*) |
    +----------+
    |    10934 |
    +----------+
    1 row in set (0.0052 sec)

    SQL > SELECT wp_post_id, COUNT(*) FROM wp_embeddings
          GROUP BY wp_post_id ORDER BY 2 DESC LIMIT 5;
    +------------+----------+
    | wp_post_id | count(*) |
    +------------+----------+
    |       5395 |       91 |
    |         45 |       86 |
    |       6609 |       83 |
    |       3068 |       80 |
    |       6452 |       79 |
    +------------+----------+
    Copyright @ 2025 Oracle and/or its affiliates. 43
  30. We first need to convert the prompt to a vector

    using the same model we used to generate the embeddings. From prompt to relevant documents We need to reduce all the content of our blog to only the most relevant parts in relation to the prompt. We use the distance() function to query the content. We perform a COSINE similarity search to find the most relevant documents. Copyright @ 2025 Oracle and/or its affiliates. 45
  31. From prompt to relevant documents (2)

    def generate_embeddings_for_question(question_list):
        print("Performing Embeddings of the prompt...")
        embed_text_detail = oci.generative_ai_inference.models.EmbedTextDetails()
        embed_text_detail.inputs = question_list
        embed_text_detail.input_type = embed_text_detail.INPUT_TYPE_SEARCH_QUERY
        embed_text_detail.serving_mode = (
            oci.generative_ai_inference.models.OnDemandServingMode(
                model_id="cohere.embed-english-v3.0"
            )
        )
        embed_text_detail.compartment_id = compartment_id
        embed_text_response = generative_ai_inference_client.embed_text(embed_text_detail)
        return embed_text_response
    Copyright @ 2025 Oracle and/or its affiliates. 46
  32. From prompt to relevant documents (3)

    We need to find similar docs in the database:
    def search_data(cursor, query_vec, list_dict_docs):
        print("Performing Vector Search Similarity...")
        myvectorStr = ",".join(str(item) for item in query_vec)
        myvectorStr = "[" + myvectorStr + "]"
        relevant_docs = []
        mydata = myvectorStr
        cursor.execute(
            """
            select distinct wp_post_id from (
              select id, wp_post_id, distance from
                (select id, wp_post_id,
                        DISTANCE(string_to_vector(%s), vec, 'COSINE') distance
                 from {}.wp_embeddings
                 order by distance limit 100) a
              where distance < 1 order by distance) b limit 50
            """.format(
                wp_config.DB_SCHEMA
            ),
            [myvectorStr],
        )
    Copyright @ 2025 Oracle and/or its affiliates. 47
  33. From prompt to relevant documents (4) We clean the content

    of the documents by removing the HTML tags, and we will use the RAG model to generate the answer, passing the content of the documents and the prompt:
    prompt_template = """
    {question} \n
    Answer the question based on the text provided and also return the relevant document numbers
    where you found the answer. If the text doesn't contain the answer, reply that the
    answer is not available.
    """
    prompt = prompt_template.format(question=query)
    llm_response_result = query_llm_with_prompt(similar_docs, prompt)
    response = {}
    response["message"] = query
    response["text"] = llm_response_result
    response["documents"] = [
        {"id": doc.doc_id, "snippet": doc.doc_text}
        for doc in similar_docs_subset
    ]
    Copyright @ 2025 Oracle and/or its affiliates. 48
  34. LLM Model ID - On Demand Chat We need to

    add the model_id in our wp_config.py file: Copyright @ 2025 Oracle and/or its affiliates. 49
  35. LLM Models in OCI (as of Jan 31 2025) • cohere.command-a-03-2025

    • cohere.command-r-08-2024 • cohere.command-r-plus-08-2024 • meta.llama-3.1-405b-instruct • meta.llama-3.1-70b-instruct (will be retired 2025-06-05) • meta.llama-3.2-90b-vision-instruct • meta.llama-3.3-70b-instruct • meta.llama-4-maverick-17b-128e-instruct-fp8 • meta.llama-4-scout-17b-16e-instruct Copyright @ 2025 Oracle and/or its affiliates. 50
  36. LLM Response

    def query_llm_with_prompt(documents, prompt):
        print("Generating the Answer...")
        my_documents = []
        for docs in documents:
            my_documents.append({"id": f"{docs.doc_id}", "text": docs.doc_text})
        generative_ai_inference_client = (
            oci.generative_ai_inference.GenerativeAiInferenceClient(
                config=config,
                service_endpoint=endpoint,
                retry_strategy=oci.retry.NoneRetryStrategy(),
                timeout=(10, 240),
            )
        )
        chat_detail = oci.generative_ai_inference.models.ChatDetails()
        chat_request = oci.generative_ai_inference.models.CohereChatRequest()
        chat_request.documents = my_documents
        chat_request.message = prompt
        chat_request.max_tokens = 600
        chat_request.temperature = 1
        chat_request.frequency_penalty = 0
        chat_request.top_p = 0.75
        chat_request.top_k = 0
        chat_detail.serving_mode = oci.generative_ai_inference.models.OnDemandServingMode(
            model_id=wp_config.LLM_MODEL_ID
        )
        chat_detail.chat_request = chat_request
        chat_detail.compartment_id = compartment_id
        chat_response = generative_ai_inference_client.chat(chat_detail)
        return vars(chat_response)
    Copyright @ 2025 Oracle and/or its affiliates. 51
  37. LLM Response - limiting documents We saw that we send

    several documents to the RAG model; in the previous code we limited them to 50, but too many documents can be confusing for the model: Copyright @ 2025 Oracle and/or its affiliates. 52
  38. LLM Response - limiting documents (2) If we limit the

    documents returned by the vector search to 10, we can see that the answer is more relevant: Copyright @ 2025 Oracle and/or its affiliates. 53
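    (A minimal sketch of that change, an assumption since the deck shows it only as a screenshot: either lower the LIMIT in the vector-search query, or simply slice the result list before calling the LLM.)
    # Minimal sketch (assumption): keep only the 10 closest documents before calling the chat model
    similar_docs = similar_docs[:10]
    llm_response_result = query_llm_with_prompt(similar_docs, prompt)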
  39. Reranking The vector search does not always return the most

    relevant documents; we can improve the quality of the results by reranking them, using a model that reorders the documents based on more sophisticated criteria, such as relevance, user intent, or quality related to the query. Copyright @ 2025 Oracle and/or its affiliates. 55
  40. Reranking - OCI The documentation of the GenAI service in

    OCI provides information related to the API: Copyright @ 2025 Oracle and/or its affiliates. 56
  41. Reranking - OCI (2) But there are no examples yet

    as it seems no models support this feature yet in the OCI on-demand GenAI service: Copyright @ 2025 Oracle and/or its affiliates. 57
  42. Reranking - OCI (3)

    To confirm:
    $ python3.9 test.py
    Error: cohere.embed-multilingual-v3.0 does not support TextRerank
    Error: meta.llama-3.1-405b-instruct does not support TextRerank
    Error: meta.llama-3.2-90b-vision-instruct does not support TextRerank
    Error: meta.llama-3.1-70b-instruct does not support TextRerank
    Error: cohere.command-r-08-2024 does not support TextRerank
    Error: cohere.command-r-plus-08-2024 does not support TextRerank
    Error: cohere.embed-english-v3.0 does not support TextRerank
    Error: cohere.embed-english-light-v3.0 does not support TextRerank
    Error: cohere.embed-multilingual-light-v3.0 does not support TextRerank
    Copyright @ 2025 Oracle and/or its affiliates. 58
  43. Reranking - using cohere service We can use the cohere

    service to rerank the documents. We need to add our COHERE_API_KEY in our wp_config.py file. You can get a free one from https://dashboard.cohere.com/. Copyright @ 2025 Oracle and/or its affiliates. 59
  44. Reranking - using cohere service (2)

    import cohere

    cohere_client = cohere.Client(wp_config.COHERE_API_KEY)
    ...
    if len(rerank_docs) > 1:
        print("Performing Reranking...")
        rerank_results = cohere_client.rerank(
            query=query,
            documents=rerank_docs,
            top_n=5,
            model="rerank-english-v3.0",
            return_documents=True,
        )
    Copyright @ 2025 Oracle and/or its affiliates. 60
  45. Reranking - using cohere service (3) Now we can test,

    we still limit our vector search to 50 documents, but we rerank them and use only the top 5 for generating the answer: Copyright @ 2025 Oracle and/or its affiliates. 61
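    (A minimal sketch, an assumption based on the cohere Python SDK response shape, of picking up the reranked top 5 before calling the LLM:)
    # Minimal sketch (assumption): extract the reranked documents and their scores
    top_docs = [res.document.text for res in rerank_results.results]   # top_n=5 above
    for res in rerank_results.results:
        print(res.index, round(res.relevance_score, 3))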
  46. Reranking - using OCI GenAI service NEW OCI GenAI service

    now supports reranking with the cohere.rerank.3-5 model that was released on 2025-05-14 in OCI. But this model is only available on dedicated AI Clusters:
    cohere.rerank-v3.5
      State: ACTIVE - Creating Base Model
      capabilities: ['TEXT_RERANK']
      created: 2025-05-14 20:05:05.321000+00:00 - deprecated: 2025-05-01 00:00:00+00:00 - on demand retired: 2025-05-01 00:00:00+00:00
    cohere.rerank-multilingual-v3.1
      State: ACTIVE - Creating Base Model
      capabilities: ['TEXT_RERANK']
      created: 2024-11-26 20:25:33.745000+00:00 - deprecated: None - on demand retired: None
    cohere.rerank-english-v3.1
      State: ACTIVE - Creating Base Model
      capabilities: ['TEXT_RERANK']
      created: 2024-06-04 00:34:49.790000+00:00 - deprecated: None - on demand retired: None
    Copyright @ 2025 Oracle and/or its affiliates. 62
  47. Reranking - using OCI GenAI service NEW Dedicated Cluster -

    endpoint Copyright @ 2025 Oracle and/or its affiliates. 65
  48. Reranking - using OCI GenAI service NEW Dedicated Cluster -

    endpoint Copyright @ 2025 Oracle and/or its affiliates. 66
  49. Reranking - using OCI GenAI service NEW In action Copyright

    @ 2025 Oracle and/or its affiliates. 67
  50. Conclusion We saw how we can use GenAI to also

    query existing data stored in MySQL and generate answers based on that data. It doesn't require much coding skill and the result is stunning! Welcome to a new world of GenAI, LLMs, and RAG for your data, and this service is available in OCI! Copyright @ 2025 Oracle and/or its affiliates. 68
  51. With HeatWave GenAI, we have simplified the whole process and

    everything can be done directly in the database. • we use optimized models for CPU in HeatWave ◦ everything is done in HeatWave • we use in-database stored procedures (in JavaScript) HeatWave GenAI Copyright @ 2025 Oracle and/or its affiliates. 72
  52. Embeddings with HeatWave To be able to use the GenAI

    features in HeatWave, we need to enable the HeatWave cluster: Copyright @ 2025 Oracle and/or its affiliates. 74
  53. HeatWave GenAI - Supported Models If the HeatWave cluster is

    not enabled or inactive, the models are not available:
    MySQL > SELECT * FROM sys.ML_SUPPORTED_LLMS WHERE model_type = "embedding";
    ERROR: 1146 (42S02): Table 'sys.ML_SUPPORTED_LLMS' doesn't exist
    Copyright @ 2025 Oracle and/or its affiliates. 75
  54. HeatWave GenAI - Supported Models When active, we can see

    the supported models:
    MySQL > SELECT * FROM sys.ML_SUPPORTED_LLMS WHERE model_type = "embedding";
    +--------------------------------------+------------+
    | model_name                           | model_type |
    +--------------------------------------+------------+
    | minilm                               | embedding  |
    | all_minilm_l12_v2                    | embedding  |
    | multilingual-e5-small                | embedding  |
    | cohere.embed-english-light-v3.0      | embedding  |
    | cohere.embed-multilingual-v3.0       | embedding  |
    | cohere.embed-multilingual-light-v3.0 | embedding  |
    | cohere.embed-english-v3.0            | embedding  |
    +--------------------------------------+------------+
    7 rows in set (0.0017 sec)
    Copyright @ 2025 Oracle and/or its affiliates. 76
  55. Embeddings with HeatWave (2) post_title (simple text) To perform the

    embeddings, we need to use the sys.ML_EMBED_ROW() function. Let's do it for the titles of each post:
    MySQL > SET @embeddOptions = '{"model_id": "minilm"}';
    MySQL > INSERT INTO wp_embeddings
            SELECT 0, sys.ML_EMBED_ROW(post_title, @embeddOptions), ID
            FROM wp_posts
            WHERE post_type="post" AND post_status="publish";
    Query OK, 607 rows affected (6 min 20.0115 sec)
    Copyright @ 2025 Oracle and/or its affiliates. 77
  56. Embeddings with HeatWave (3) post_content (complex text) The content of

    an article is more complex as it is usually longer and can contain HTML tags. As we don't want to use any external tool or API, and want our data to stay inside MySQL HeatWave, we use a stored procedure written in JavaScript to clean the content and perform the embeddings. Copyright @ 2025 Oracle and/or its affiliates. 78
  57. Embeddings with HeatWave (4) wp_create_embb_js

    DROP PROCEDURE IF EXISTS wp_create_embb_js;
    CREATE PROCEDURE wp_create_embb_js()
    LANGUAGE JAVASCRIPT AS $$
      let s = session.sql("SELECT count(*) FROM wp_posts where post_type='post' and post_status='publish'")
      let res = s.execute()
      let row = res.fetchOne()
      let total = row[0]
      let batch = 10
      let processed = 0
      let last_id = 0
      while (processed < total) {
        s = session.sql("SELECT ID FROM wp_posts where post_type='post' and post_status='publish' and ID > " + last_id + " ORDER BY ID LIMIT " + batch)
        res = s.execute()
        row = res.fetchOne()
        while (row) {
          let id = row[0]
          last_id = id
          let s2 = session.sql("CALL wp_create_row_embb_js(" + id + ")")
          s2.execute()
          processed += 1
          row = res.fetchOne()
        }
      }
      let stmt_out = session.sql('Select "Embeddings done! ' + total + ' posts processed" as "wp_create_embb_js"', {passResultToClient: true})
      stmt_out.execute()
    $$;
    Copyright @ 2025 Oracle and/or its affiliates. 79
  58. Embeddings with HeatWave (4) wp_create_row_embb_js

    DROP PROCEDURE IF EXISTS wp_create_row_embb_js;
    CREATE PROCEDURE wp_create_row_embb_js(id INT)
    LANGUAGE JAVASCRIPT AS $$
      let s = session.sql("SELECT post_content FROM wp_posts WHERE ID=" + id)
      let res = s.execute()
      let row = res.fetchOne()
      while (row) {
        let content = row[0]
        let content_text = content.replace(/<\/?[^>]+(>|$)/g, "");
        content_text = content_text.replace(/\r\n/g, " ");
        content_text = content_text.replace(/\n/g, " ");
        // replace all double quotes with \" to escape them
        content_text = content_text.replace(/(?<!\\)"/g, '\\"');
        // replace all single quotes with \'
        content_text = content_text.replace(/(?<!\\)'/g, "\\'");
        // replace all lines starting with +- with empty string
        content_text = content_text.replace(/[+-].*[-+]/gm, "")
        // remove all empty lines
        content_text = content_text.replace(/^\s*[\r\n]/gm, "")
        // check if content_text is empty
        if (content_text.length == 0) {
          return "Content is empty"
        }
    Copyright @ 2025 Oracle and/or its affiliates. 80
  59.

    // split content_text into strings of 100 characters
    // but not breaking words
    let tokens = []
    let token = ""
    let words = content_text.split(" ")
    for (let i = 0; i < words.length; i++) {
      if (token.length + words[i].length < 100) {
        token += " " + words[i]
      } else {
        tokens.push(token)
        token = words[i]
      }
    }
    tokens.push(token)
    // insert into wp_embeddings for each token
    for (let i = 0; i < tokens.length; i++) {
      // check if tokens[i] is not empty
      if (tokens[i].length == 0) {
        row = res.fetchOne()
        continue
      }
      if (tokens[i].trim().length > 0) {
        let q = "insert into wp_embeddings values (0, sys.ML_EMBED_ROW(\"" + tokens[i] + "\", '{\"model_id\": \"minilm\", \"truncate\": false}'), " + id + ")"
        let s2 = session.sql(q)
        s2.execute()
      }
    }
    row = res.fetchOne()
    }
    return "Done"
    $$;
    Copyright @ 2025 Oracle and/or its affiliates. 81
  60. Running the Embeddings

    MySQL > call wp_create_embb_js;
    +--------------------------------------+
    | wp_create_embb_js                    |
    +--------------------------------------+
    | Embeddings done! 607 posts processed |
    +--------------------------------------+
    1 row in set (1 hour 12 min 52.9051 sec)
    Copyright @ 2025 Oracle and/or its affiliates. 82
  61. Running the Embeddings

    MySQL > call wp_create_embb_js;
    +--------------------------------------+
    | wp_create_embb_js                    |
    +--------------------------------------+
    | Embeddings done! 607 posts processed |
    +--------------------------------------+
    1 row in set (1 hour 12 min 52.9051 sec)

    MySQL > select count(*) from wp_embeddings;
    +----------+
    | count(*) |
    +----------+
    |     7347 |
    +----------+
    1 row in set (0.0199 sec)
    Copyright @ 2025 Oracle and/or its affiliates. 82
  62. Filtering out - distance

    To filter out some records, we need to use the distance() function between our request and the content of the database. We of course need to use the same model:
    MySQL > SET @embeddOptions = '{"model_id": "minilm"}';
    MySQL > select sys.ML_EMBED_ROW("What are roles in MySQL 8?", @embeddOptions) into @questionEmb;
    Query OK, 1 row affected (0.5705 sec)
    Copyright @ 2025 Oracle and/or its affiliates. 83
  63. Filtering out - distance (2) We can now use the

    distance() function to filter out the records that are not relevant and keep just the top ones with the smallest distance:
    MySQL > select wp_post_id, post_title, min(distance(@questionEmb, vec, 'COSINE')) as distance
            from wp_embeddings e join wp_posts p
            where p.ID = e.wp_post_id
            group by wp_post_id, post_title
            order by distance limit 3\G
    *************************** 1. row ***************************
    wp_post_id: 2494
    post_title: Some queries related to MySQL Roles
      distance: 0.10369491577148438
    *************************** 2. row ***************************
    wp_post_id: 1478
    post_title: MySQL 8.0: Listing Roles
      distance: 0.19010812044143677
    *************************** 3. row ***************************
    wp_post_id: 1402
    post_title: MySQL 8.0 Roles and Graphml
      distance: 0.19620484113693237
    3 rows in set (0.0124 sec)
    Copyright @ 2025 Oracle and/or its affiliates. 84
  64. RAG with HeatWave Now we will create a new JavaScript

    stored procedure to perform the RAG and generate the answer based on the content of the database. This procedure will include the distance search and the generation of the answer using the LLM model (using the sys.ML_GENERATE() function): wp_ask_js
    DROP PROCEDURE IF EXISTS wp_ask_js;
    CREATE PROCEDURE wp_ask_js(question TEXT)
    LANGUAGE JAVASCRIPT AS $$
      let s = session.sql("SELECT sys.ML_EMBED_ROW(\"" + question + "\", '{\"model_id\": \"minilm\", \"truncate\": false}') into @questionEmb")
      let res = s.execute()
      let q = "select wp_post_id from (select wp_post_id, post_title, min(distance(@questionEmb, vec, 'COSINE')) as distance from wp_embeddings e join wp_posts p where p.ID = e.wp_post_id group by wp_post_id, post_title order by distance limit 5) a where distance < 0.25"
      s = session.sql(q)
      res = s.execute()
      let row = res.fetchOne()
      let output = []
    Copyright @ 2025 Oracle and/or its affiliates. 85
  65.

    if (output.length == 0) {
      let smt_out = session.sql('Select "No results found" as "Asking WordPress with HeatWave GenAI"', {passResultToClient: true})
      smt_out.execute()
      return "No results found"
    }
    let post_contents = []
    for (let i = 0; i < output.length; i++) {
      let wp_post_id = output[i][0]
      let s3 = session.sql("SELECT post_content FROM wp_posts WHERE ID=" + wp_post_id)
      let res3 = s3.execute()
      let row3 = res3.fetchOne()
      let content = row3[0]
      let content_text = content.replace(/<\/?[^>]+(>|$)/g, "");
      content_text = content_text.replace(/\r\n/g, " ");
      content_text = content_text.replace(/\n/g, " ");
      // replace all double quotes with \" to escape them
      content_text = content_text.replace(/(?<!\\)"/g, '\\"');
      // replace all single quotes with \'
      content_text = content_text.replace(/(?<!\\)'/g, "\\'");
      // replace all lines starting with +- with empty string
      content_text = content_text.replace(/[+-].*[-+]/gm, "")
      // remove all empty lines
      content_text = content_text.replace(/^\s*[\r\n]/gm, "")
      post_contents.push(content_text)
    }
    Copyright @ 2025 Oracle and/or its affiliates. 86
  66.

    let context_to_send = post_contents.join(" ")
    s = session.sql("SET @inferenceSetup = JSON_OBJECT('task', 'generation', 'model_id', 'mistral-7b-instruct-v1', 'context', \"" + context_to_send + "\");")
    s.execute()
    q = "Select sys.ML_GENERATE(\"" + question + "\", @inferenceSetup)"
    s = session.sql(q)
    res = s.execute()
    row = res.fetchOne()
    let answer = row[0]["text"]
    let smt_out = session.sql('Select "' + answer + '" as "Asking WordPress with HeatWave GenAI"', {passResultToClient: true})
    smt_out.execute()
    return "Done"
    $$;
    Copyright @ 2025 Oracle and/or its affiliates. 87
  67. RAG with HeatWave (2) - models

    In the previous procedure, we used mistral-7b-instruct-v1, but we could have used any of the LLM models provided by HeatWave:
    MySQL > SELECT * FROM sys.ML_SUPPORTED_LLMS WHERE model_type = "generation";
    +------------------------------------+------------+
    | model_name                         | model_type |
    +------------------------------------+------------+
    | llama2-7b-v1                       | generation |
    | mistral-7b-instruct-v1             | generation |
    | llama3-8b-instruct-v1              | generation |
    | cohere.command-text-v14            | generation |
    | cohere.command-light-text-v14      | generation |
    | cohere.command-r-v1:0              | generation |
    | cohere.command-r-plus-v1:0         | generation |
    | meta.llama3-70b-instruct-v1:0      | generation |
    | meta.llama3-1-70b-instruct-v1:0    | generation |
    | meta.llama3-1-405b-instruct-v1:0   | generation |
    | meta.llama-3.1-405b-instruct       | generation |
    | meta.llama-3.2-90b-vision-instruct | generation |
    | meta.llama-3.3-70b-instruct        | generation |
    | cohere.command-r-08-2024           | generation |
    | cohere.command-r-plus-08-2024      | generation |
    +------------------------------------+------------+
    15 rows in set (0.0010 sec)
    Copyright @ 2025 Oracle and/or its affiliates. 88
  68. RAG with HeatWave (3) - run We can now run

    the RAG procedure with a simple question:
    SQL > call wp_ask_js("How can I split reads and writes with MySQL Router?")\G
    *************************** 1. row ***************************
    Asking WordPress with HeatWave GenAI: Answer: MySQL Router can be used to split reads and writes in a WordPress application by configuring it to use Transparent Read/Write Splitting. This involves setting up a MySQL Router instance that acts as a proxy between the WordPress application server and the MySQL HeatWave Database Service, with the WordPress application server configured to connect to the MySQL Router instance instead of the primary database instance directly. The MySQL Router instance can then automatically direct read and write queries to the appropriate instance based on the configuration settings. This allows for automatic load balancing and fault tolerance in the WordPress application, without requiring any modifications or configuration changes within WordPress itself.
    1 row in set (11.4867 sec)
    Copyright @ 2025 Oracle and/or its affiliates. 89
  69. RAG with HeatWave (4) - run To demonstrate that we

    are indeed using our content, let's pose a question about a topic not covered in my blog:
    SQL > call wp_ask_js("Who won the last World Cup ?")\G
    *************************** 1. row ***************************
    Asking WordPress with HeatWave GenAI: No results found
    1 row in set (0.6156 sec)
    Copyright @ 2025 Oracle and/or its affiliates. 90
  70. GenAI with MySQL HeatWave - Summary When using MySQL HeatWave,

    we can use the GenAI features directly in the database without the need to use any external service or API. We can directly use internal functions such as sys.ML_EMBED_ROW() to generate embeddings and sys.ML_GENERATE() to generate answers based on the content of the database. But we can also put everything together in a stored procedure written in JavaScript that will make it even easier to use. Copyright @ 2025 Oracle and/or its affiliates. 91
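    (For reference, a minimal sketch, not from the deck, of calling those two functions directly from a client instead of through the JavaScript procedure; the @context value is a placeholder that the retrieval step would normally fill:)
    # Minimal sketch (assumption): call sys.ML_EMBED_ROW() and sys.ML_GENERATE() directly
    import mysql.connector

    cnx = mysql.connector.connect(user="user", password="pass", host="10.0.0.1", database="wordpress")
    cur = cnx.cursor()
    cur.execute("""SELECT sys.ML_EMBED_ROW("What are roles in MySQL 8?",
                          '{"model_id": "minilm"}') INTO @questionEmb""")
    # ... the distance() search from slide 63 would run here to pick the relevant posts ...
    cur.execute("SET @context = 'text of the most relevant posts goes here'")   # placeholder
    cur.execute("""SELECT sys.ML_GENERATE("What are roles in MySQL 8?",
                          JSON_OBJECT('task', 'generation',
                                      'model_id', 'mistral-7b-instruct-v1',
                                      'context', @context))""")
    print(cur.fetchone()[0])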
  71. GenAI on my machine Will it be possible to run

    something similar on my machine? Copyright @ 2025 Oracle and/or its affiliates. 93
  72. GenAI on my machine Will it be possible to run

    something similar on my machine? Yes! But this will require some work and a lot of patience... Copyright @ 2025 Oracle and/or its affiliates. 93
  73. GenAI on my machine Will it be possible to run

    something similar on my machine? Yes! But this will require some work and a lot of patience... You need: • an LLM inference server • LLM models • a MySQL server supporting the VECTOR datatype (>= 9) • being able to perform a COSINE similarity search Copyright @ 2025 Oracle and/or its affiliates. 93
  74. LLM Inference Server I use llama.cpp to run the LLM

    models on my machine: https://github.com/ggerganov/llama.cpp Copyright @ 2025 Oracle and/or its affiliates. 94
  75. LLM Models The models are available on huggingface. I used

    the following models:
    $ ls -lh
    total 6.2G
    -rw-r--r-- 1 fred fred 419M Jan 22 09:35 bge-reranker-v2-m3-Q4_K_M.gguf
    -rw-r--r-- 1 fred fred 140M Jan 23 11:52 jina-embeddings-v2-base-en-q8_0.gguf
    -rw-r--r-- 1 fred fred 5.7G Jan 22 09:56 meta-llama-3.1-8b-instruct-q4_0.gguf
    Copyright @ 2025 Oracle and/or its affiliates. 95
  76. Models Model names represent the type of model and the

    version of the model. It provides: • model name or family (e.g., meta) • model architecture and version (e.g., llama-3.1) • model size (e.g., 8b or 8 billion parameters) • instruction-tuned variant (e.g., instruct) • quantization or compression format (e.g., q8_0, Q4_K_M) • file format (e.g., gguf) Copyright @ 2025 Oracle and/or its affiliates. 96
  77. Models (2) The model meta-llama-3.1-8b-instruct-q4_0.gguf is: • Created by Meta

    as part of the LLaMA 3.1 series. Copyright @ 2025 Oracle and/or its affiliates. 97
  78. Models (2) The model meta-llama-3.1-8b-instruct-q4_0.gguf is: • Created by Meta

    as part of the LLaMA 3.1 series. • A mid-sized model with 8 billion parameters. Copyright @ 2025 Oracle and/or its affiliates. 97
  79. Models (2) The model meta-llama-3.1-8b-instruct-q4_0.gguf is: • Created by Meta

    as part of the LLaMA 3.1 series. • A mid-sized model with 8 billion parameters. • Fine-tuned for instruction-following tasks, making it user-friendly for conversational or task-specific applications. Copyright @ 2025 Oracle and/or its affiliates. 97
  80. Models (2) The model meta-llama-3.1-8b-instruct-q4_0.gguf is: • Created by Meta

    as part of the LLaMA 3.1 series. • A mid-sized model with 8 billion parameters. • Fine-tuned for instruction-following tasks, making it user-friendly for conversational or task-specific applications. • Optimized using 4-bit quantization for more efficient deployment on lower-resource hardware. Copyright @ 2025 Oracle and/or its affiliates. 97
  81. Models (2) The model meta-llama-3.1-8b-instruct-q4_0.gguf is: • Created by Meta

    as part of the LLaMA 3.1 series. • A mid-sized model with 8 billion parameters. • Fine-tuned for instruction-following tasks, making it user-friendly for conversational or task-specific applications. • Optimized using 4-bit quantization for more efficient deployment on lower-resource hardware. • Packaged in the .gguf format for easy integration with specific tools or environments. Copyright @ 2025 Oracle and/or its affiliates. 97
  82. Embeddings Service To serve the embeddings, I use llama.cpp server

    with a text embeddings model by Jina AI like this:
    $ llama-server -m MODELS/jina-embeddings-v2-base-en-q8_0.gguf \
        --embedding --pooling cls -ub 8192 --no-webui
    This will serve the embeddings on http://localhost:8080.
    Copyright @ 2025 Oracle and/or its affiliates. 98
  83. Reranking Service For reranking, I use llama.cpp server with a

    reranker model from the Beijing Academy of Artificial Intelligence like this:
    $ llama-server -m MODELS/bge-reranker-v2-m3-Q4_K_M.gguf \
        -b 81920 -ub 81920 -c 16384 --reranking --port 8082 --no-webui
    This will serve the reranking on http://localhost:8082.
    Copyright @ 2025 Oracle and/or its affiliates. 99
  84. Inference Service

    And finally, the inference service with the LLM model from Meta-Llama is served like this:
    $ llama-server -m MODELS/meta-llama-3.1-8b-instruct-q4_0.gguf --port 8081
    This will serve the inference on http://localhost:8081 and can also be accessed via a browser.
    Copyright @ 2025 Oracle and/or its affiliates. 100
  85. MySQL Similarity Search In MySQL Community, there is not yet

    any support for vector indexes or similarity search functions. But a COSINE distance function is relatively easy to implement as a UDF: Copyright @ 2025 Oracle and/or its affiliates. 102
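    (The formula itself is shown as an image in the deck; for reference, cosine distance is 1 minus the cosine similarity of the two vectors, e.g. in Python:)
    # Reference only: the formula that the C++ UDF on the next slide implements
    # cosine_distance(a, b) = 1 - (a . b) / (|a| * |b|)
    import math

    def cosine_distance(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        na = math.sqrt(sum(x * x for x in a))
        nb = math.sqrt(sum(y * y for y in b))
        return 1.0 - dot / (na * nb)

    print(cosine_distance([1.0, 0.0], [0.0, 1.0]))   # 1.0 for orthogonal vectors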
  86. MySQL Similarity Search (2) This can be translated to something

    like this in C++:
    for (size_t i = 0; i < vector1.size(); ++i) {
        sum += vector1[i] * vector2[i];
        magnitude_vec1 += vector1[i] * vector1[i];
        magnitude_vec2 += vector2[i] * vector2[i];
    }
    if (magnitude_vec1 == 0.0 || magnitude_vec2 == 0.0) {
        // Error with zero vector
        printf("Vectors must not be zero vectors.\n");
        return 0;
    }
    double cosine_similarity = sum / (sqrt(magnitude_vec1) * sqrt(magnitude_vec2));
    return 1.0 - cosine_similarity;
    Copyright @ 2025 Oracle and/or its affiliates. 103
  87. MySQL Similarity Search (3)

    SQL > SELECT UDF_NAME FROM performance_schema.user_defined_functions
          WHERE UDF_NAME LIKE 'distance%';
    +-----------------+
    | UDF_NAME        |
    +-----------------+
    | distance_cosine |
    +-----------------+
    1 row in set (0.0104 sec)
    But of course such a function will perform a full table scan. It's also possible to create a JavaScript function to perform the same operation if you use MySQL Enterprise Edition.
    Copyright @ 2025 Oracle and/or its affiliates. 104
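    (A minimal sketch of using such a UDF; the signature here is an assumption, since the real one depends on how distance_cosine() was compiled:)
    # Minimal sketch (hypothetical signature for the distance_cosine() UDF)
    import mysql.connector

    cnx = mysql.connector.connect(user="user", password="pass", host="127.0.0.1", database="wordpress")
    cur = cnx.cursor()
    question_vec = "[0.12, -0.03, 0.55]"   # hypothetical embedding of the question
    cur.execute(
        "SELECT wp_post_id, distance_cosine(vec, string_to_vector(%s)) AS d "
        "FROM wp_embeddings ORDER BY d LIMIT 5",   # full table scan, as noted above
        (question_vec,),
    )
    print(cur.fetchall())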
  88. Testing Embeddings

    We changed the previous code a bit to use the local services' APIs:
    LLAMA_EMBEDDINGS_URL = "http://localhost:8080/v1/embeddings"
    LLAMA_COMPLETIONS_URL = "http://localhost:8081/v1/completions"
    LLAMA_RERANK_URL = "http://localhost:8082/v1/rerank"
    Copyright @ 2025 Oracle and/or its affiliates. 106
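    (The request code itself is not shown in the deck; a minimal sketch, an assumption based on llama.cpp's OpenAI-compatible HTTP API, of calling the local embeddings endpoint:)
    # Minimal sketch (assumption): POST to the local llama.cpp embeddings endpoint
    import requests

    LLAMA_EMBEDDINGS_URL = "http://localhost:8080/v1/embeddings"

    resp = requests.post(LLAMA_EMBEDDINGS_URL, json={"input": "What are roles in MySQL 8?"})
    resp.raise_for_status()
    embedding = resp.json()["data"][0]["embedding"]
    print(len(embedding))   # dimension of the jina embeddings model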
  89. And what about DeepSeek? llama.cpp is also compatible with DeepSeek:

    $ ls -lh *gguf
    -rw-r--r-- 1 fred fred 4.6G Jan 28 10:29 DeepSeek-R1-Distill-Llama-8B-Q4_K_M.gguf
    Copyright @ 2025 Oracle and/or its affiliates. 108
  90. And what about DeepSeek? llama.cpp is also compatible with DeepSeek:

    $ ls -lh *gguf
    -rw-r--r-- 1 fred fred 4.6G Jan 28 10:29 DeepSeek-R1-Distill-Llama-8B-Q4_K_M.gguf
    Disclaimer: Due to suspicions of potential issues with this model in certain countries, this test was conducted on a non-Oracle machine and within a VM for research purposes.
    Copyright @ 2025 Oracle and/or its affiliates. 108
  91. DeepSeek We use a similar approach to the one we

    used for the other models, with just some minor changes in the code generating the prompt. Running the server:
    $ llama-server -m MODELS/DeepSeek-R1-Distill-Llama-8B-Q4_K_M.gguf --port 8081 \
        --cache-type q8_0 --threads 16
    Copyright @ 2025 Oracle and/or its affiliates. 109
  92. Local GenAI RAG The code is available on my GitHub:

    https://github.com/lefred/local-genai-wp Copyright @ 2025 Oracle and/or its affiliates. 111
  93. Share your ❤️ to MySQL #mysql #MySQLCommunity Join our Slack channel!

    bit.ly/mysql-slack Copyright @ 2025 Oracle and/or its affiliates. 112