Introduction to Knowledge Banks and RAG¶
Knowledge Banks support key Generative AI features in Dataiku, such as Retrieval-Augmented Generation (RAG) and semantic search.
RAG and Knowledge Banks primarily rely on embeddings: numerical vector representations of text or documents, generated by a specialized type of LLM called an Embedding LLM.
In a RAG workflow, an LLM is augmented with a Knowledge Bank: before generating a response, it retrieves the most relevant chunks, or other contextual elements such as a screenshot of the document from which the text was extracted.
This allows the model to use information retrieved from the Knowledge Bank when formulating its answer. See Knowledge and RAG for details about how Knowledge Banks are used in Retrieval-Augmented Generation workflows.
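The retrieve-then-generate loop described above can be sketched in plain Python. This is a toy illustration, not the Dataiku implementation: the bag-of-words `embed` function stands in for an Embedding LLM, the in-memory list stands in for a vector store, and `fake_llm` stands in for the generating model.

```python
import math

def embed(text):
    # Toy "embedding": bag-of-words counts over a tiny fixed vocabulary.
    # A real Knowledge Bank would call an Embedding LLM here.
    vocab = ["rag", "vector", "chunk", "llm", "search"]
    words = [w.strip(".,") for w in text.lower().split()]
    return [words.count(w) for w in vocab]

def cosine(a, b):
    # Cosine similarity, the usual distance measure over embeddings.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, chunks, k=2):
    # Rank chunks by similarity to the query and keep the top k.
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]

def answer(query, chunks, llm):
    # Augment the prompt with retrieved context before generating.
    context = retrieve(query, chunks)
    prompt = "Context:\n" + "\n".join(context) + f"\n\nQuestion: {query}"
    return llm(prompt)

chunks = [
    "RAG augments an LLM with retrieved chunks.",
    "A vector store indexes chunk embeddings for search.",
    "Unrelated note about scheduling.",
]
fake_llm = lambda prompt: f"grounded answer from: {prompt}"
print(answer("how does rag retrieve a chunk for the llm", chunks, fake_llm))
```

The key point is the ordering: retrieval happens first, and the generating LLM only ever sees the query plus the retrieved context.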
Capabilities enabled by Knowledge Banks¶
From the Knowledge Bank page, you can:
Search¶
Run semantic searches across your corpus. This lets you explore search results, check chunking and retrieval quality, and validate how well your content supports RAG queries.
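Checking retrieval quality usually means looking at scored, ranked results rather than just the final answer. A minimal sketch of that inspection loop, with Jaccard token overlap standing in for the embedding similarity a real Knowledge Bank computes:

```python
def score(query, chunk):
    # Jaccard overlap between token sets; a stand-in for embedding similarity.
    q, c = set(query.lower().split()), set(chunk.lower().split())
    return len(q & c) / len(q | c) if q | c else 0.0

def semantic_search(query, chunks):
    # Return (score, chunk) pairs, best first, so scores can be reviewed.
    return sorted(((score(query, c), c) for c in chunks), reverse=True)

chunks = ["reset your password in settings", "export a dataset to csv"]
for s, c in semantic_search("how do I reset a password", chunks):
    print(f"{s:.2f}  {c}")
```

Low scores on chunks you expected to match are a hint that chunking (size, boundaries) rather than the query is the problem.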
Create a Retrieval-Augmented LLM¶
Create an RA-LLM that combines your Knowledge Bank with an LLM. The model is augmented with relevant chunks or other contextual data from the Knowledge Bank before answering, providing source-based responses in Prompt Studio, Prompt Recipes, or through the LLM Mesh API.
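Querying an RA-LLM through the LLM Mesh Python API can be sketched as below. The method names (`get_project`, `get_llm`, `new_completion`, `with_message`, `execute`) follow Dataiku's Python client documentation, but the host, project key, and LLM id are placeholders; check the signatures against your DSS version. The client is passed in as a parameter so the function can be exercised without a live instance.

```python
def ask_rag_llm(client, project_key, llm_id, question):
    # Resolve the RA-LLM through the LLM Mesh and run one completion.
    llm = client.get_project(project_key).get_llm(llm_id)
    completion = llm.new_completion()
    completion.with_message(question)
    response = completion.execute()
    return response.text

# Real usage (requires a DSS instance; ids below are placeholders):
# import dataikuapi
# client = dataikuapi.DSSClient("https://dss.example.com", "API_KEY")
# print(ask_rag_llm(client, "MYPROJECT", "my-ra-llm-id", "What is RAG?"))
```

Because the RA-LLM is exposed like any other LLM Mesh model, callers do not need to know that retrieval from the Knowledge Bank happens behind the completion call.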
Create a Knowledge Bank Search Tool¶
Build a Knowledge Bank Search Tool to allow LLM agents or automation workflows to query the Knowledge Bank programmatically. See Knowledge Bank Search tool for details about the tool settings.
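The shape of such a tool can be sketched generically. The name/description/callable structure below is a common agent-tool convention, not a specific Dataiku API, and `search_fn` stands in for the actual Knowledge Bank query call:

```python
def make_kb_search_tool(search_fn, top_k=3):
    # Wrap a search function as a tool an agent can invoke by name.
    def run(query):
        hits = search_fn(query)[:top_k]
        return "\n".join(hits) if hits else "No results."
    return {
        "name": "knowledge_bank_search",
        "description": "Retrieve the most relevant chunks for a query.",
        "run": run,
    }

# An agent calls tool["run"](query) when it decides to search; here a
# dict stands in for the Knowledge Bank index.
fake_index = {"billing": ["Invoices are issued monthly."]}
tool = make_kb_search_tool(lambda q: fake_index.get(q, []))
print(tool["run"]("billing"))  # Invoices are issued monthly.
```

The `description` field matters in practice: it is what the agent's LLM reads when deciding whether to call the tool.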