Retrieval-Augmented Generation (RAG)

A pattern that grounds an LLM in your own documents by retrieving relevant passages at query time.

RAG pairs a retrieval step, most often vector (embedding) search over your knowledge base, with an LLM that drafts the final answer from the retrieved passages. It is a common way to make ChatGPT-style assistants answer questions over private data without retraining the model.
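
The retrieve-then-generate flow can be sketched in a few lines of Python. This is a minimal illustration, not a production design. The bag-of-words `embed` function stands in for a real embedding model, and `PASSAGES`, `retrieve`, and `build_prompt` are hypothetical names chosen for this example. The final LLM call is left out; the sketch stops at the prompt that would be sent to the model.

```python
import math
import re
from collections import Counter

def embed(text):
    # Toy embedding: term counts. A real system would call an embedding model here.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    # Cosine similarity between two sparse term-count vectors.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(query, passages, k=2):
    # Rank passages by similarity to the query and keep the top k.
    q = embed(query)
    ranked = sorted(passages, key=lambda p: cosine(q, embed(p)), reverse=True)
    return ranked[:k]

def build_prompt(query, passages):
    # Put the retrieved passages in the prompt so the model answers from them.
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )

# Hypothetical knowledge base (assumption: three short passages).
PASSAGES = [
    "Refunds are issued within 14 days of a return request.",
    "The office is closed on public holidays.",
    "Support is available by email from 9am to 5pm.",
]

query = "How many days do refunds take?"
top = retrieve(query, PASSAGES, k=1)
prompt = build_prompt(query, top)
print(prompt)
```

In a real deployment the passages would usually be embedded once and stored in a vector index, so only the query is embedded at request time.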

Related terms