In this article, we’ll configure the Oracle AI Optimizer and Toolkit with an LLM, enable embedding and chat APIs, and integrate user context through vectorized data for RAG-powered responses. We’ll embed and store user data in Oracle AI Database with vector indexes, all without a single line of code!
What is the Oracle AI Optimizer and Toolkit? It’s a free and open-source tool designed to make it easier for developers and data engineers to build, benchmark, and optimize AI workflows running on Oracle Database.
Prerequisites: A running AI Optimizer and Toolkit server with a connected Oracle Database. If you don’t have an optimizer server, check out my previous article, Set Up A Local Sandbox.
Configure a Chat LLM
We’ll use OpenAI LLMs for this example, though the optimizer supports many different providers, both public (like OpenAI) and private, self-hosted LLMs. For example, you could run this with a local Llama 3.1 container on your laptop!
Create an OpenAI API key if you don’t have one already. You can follow OpenAI’s quickstart to generate an API key. Remember to fund your OpenAI account—this tutorial should only use $0.01-$0.02!
Adding a language model to the optimizer
Let’s add OpenAI’s gpt-4.1-mini model. From the optimizer configuration page (http://localhost:8501/config), select the “Models” tab:

From the models tab, click the “Add” button in the Language Models section:

Fill in the model configuration with “openai” as the provider and gpt-4.1-mini as the model, then paste in your OpenAI API key. Click “Add” to save the model in the optimizer’s configuration:

Now you can use the OpenAI gpt-4.1-mini language model in the optimizer!
Adding an embedding model to the optimizer
We’ll now configure an embedding model to create embedding vectors for Oracle AI Database. If you’re unfamiliar with vector databases, I suggest reading my prior article, What’s a Vector Database?
From the “Models” configuration tab, scroll down to the Embedding Models section, and click the “Edit” button for the openai/text-embedding-3-small model:

Paste your OpenAI API key and click “Save” to configure the embedding model:
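To build intuition for what the embedding model does: it maps text to a numeric vector (text-embedding-3-small produces 1536-dimensional vectors), and the database compares vectors with a distance metric such as cosine similarity. The sketch below uses tiny made-up 4-dimensional vectors purely for illustration; the values are not real embeddings.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity: ~1.0 means same direction, ~0.0 means unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy 4-dimensional "embeddings" (a real text-embedding-3-small
# vector has 1536 dimensions).
doc = [0.9, 0.1, 0.0, 0.2]
related_query = [0.8, 0.2, 0.1, 0.1]
unrelated_query = [0.0, 0.1, 0.9, 0.0]

print(cosine_similarity(doc, related_query))    # close to 1
print(cosine_similarity(doc, unrelated_query))  # close to 0
```

Semantically similar texts produce vectors with high cosine similarity, which is what makes vector search retrieval possible.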

Create and populate a vector store
To enable Retrieval-Augmented Generation (RAG), we need to vectorize external knowledge so the LLM can retrieve it during chat. This lets us ground AI responses in real documentation, not just the model’s training cutoff.
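The optimizer automates this whole flow, but conceptually RAG is just three steps: embed the question, retrieve the most similar stored chunks, and prepend them to the LLM prompt as context. Here is a minimal offline sketch of that flow; the `embed()` stub and its toy vectors are stand-ins for a real embedding model, not part of the optimizer’s API.

```python
import math

def embed(text):
    # Stand-in for a real embedding model: map a few known strings
    # to toy vectors so retrieval can be demonstrated offline.
    toy = {
        "How do I create a vector index?": [0.9, 0.1, 0.0],
        "CREATE VECTOR INDEX builds an IVF or HNSW index.": [0.8, 0.2, 0.1],
        "Oracle Database supports JSON duality views.": [0.1, 0.9, 0.3],
    }
    return toy[text]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

chunks = [
    "CREATE VECTOR INDEX builds an IVF or HNSW index.",
    "Oracle Database supports JSON duality views.",
]
question = "How do I create a vector index?"

# 1. Embed the question, 2. rank stored chunks by similarity,
# 3. prepend the best chunk to the prompt as grounding context.
q_vec = embed(question)
best = max(chunks, key=lambda c: cosine(embed(c), q_vec))
prompt = f"Context:\n{best}\n\nQuestion: {question}"
print(prompt)
```

In the real pipeline, the chunk vectors live in Oracle AI Database and the similarity ranking is a vector search query rather than an in-memory loop.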
From the “Tools” section, click on the “Split/Embed” tab to open the optimizer’s embedding configuration:

From this page, you can configure embedding parameters like the embedding model, chunk size, distance metric, vector index, and more. We’ll stick to the defaults for now:
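To see what the chunk-size parameter controls: the source document is split into fixed-size pieces, usually with a small overlap so sentences cut at a boundary still appear whole in at least one chunk. The sketch below is illustrative only; the exact chunk size and overlap values here are arbitrary and not the optimizer’s defaults.

```python
def chunk_text(text, chunk_size=200, overlap=20):
    """Split text into fixed-size chunks with a small overlap so
    content cut at a boundary still appears intact in one chunk."""
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

chunks = chunk_text("A" * 500, chunk_size=200, overlap=20)
print(len(chunks), [len(c) for c in chunks])
```

Smaller chunks give more precise retrieval hits; larger chunks preserve more surrounding context per hit. Each chunk is what gets embedded and stored as one row in the vector store.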

In the “Load Knowledge Base” section, we can select data to load as vector embeddings. Choose “Web” as the knowledge base source, and paste the URL of the Oracle AI Database Vector Search documentation (https://docs.oracle.com/en/database/oracle/oracle-database/26/vecse/ai-vector-search-users-guide.pdf).
We’ll name this vector store “OracleDocs”:

Click “Populate Vector Store” to begin processing the data. This should take around 10 seconds.

You will see a message like this, indicating your Oracle AI Database was successfully populated with vector data from the knowledge base:

Using the vector store (RAG query)
To use our new “OracleDocs” vector store, navigate to the optimizer’s “ChatBot” section and select “Vector Search” as the toolkit. This enables vector search (RAG) retrieval of data:

Because we have a single vector store, it is automatically selected in the “Vector Store” chatbot configuration:

Now we can chat using our vector store! Since we loaded a knowledge base about Oracle AI Database Vector Search, I’m going to ask a question about vector indexes:

The chatbot response is enhanced by our vector data, including instructions on how to create an IVF index:
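For context on what an IVF (Inverted File) index does, since the chatbot’s answer mentions it: vectors are grouped into partitions around centroids, and a query probes only the nearest partition(s) instead of scanning every vector. The toy sketch below uses hand-picked 2-D centroids in place of a real clustering step, purely to illustrate the idea behind Oracle’s neighbor-partition vector indexes.

```python
import math

def dist(a, b):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Pre-chosen centroids stand in for the clustering that index
# creation would normally perform.
centroids = {0: [0.0, 0.0], 1: [10.0, 10.0]}
partitions = {0: [], 1: []}

# Index build: assign each vector to its nearest centroid's partition.
vectors = [[0.1, 0.2], [0.3, 0.1], [9.8, 10.1], [10.2, 9.9]]
for v in vectors:
    nearest = min(centroids, key=lambda c: dist(centroids[c], v))
    partitions[nearest].append(v)

# Query time: probe only the single nearest partition, then scan
# just that partition for the closest vector.
query = [9.9, 10.0]
probe = min(centroids, key=lambda c: dist(centroids[c], query))
result = min(partitions[probe], key=lambda v: dist(v, query))
print(result)
```

This is why IVF trades a little accuracy (the true nearest neighbor could sit in an unprobed partition) for a large speedup over an exhaustive scan.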

You can click on each reference to see where the optimizer sourced its information from the knowledge base:

That’s it! You now have a working vector store for your knowledge base—enhancing an LLM beyond the knowledge cutoff.
In the next part, AI Optimizer & Toolkit: test and generate apps, we’ll walk through automated testing of optimizer configurations and generating app code from those tested configurations!
