RAG Documentation

The RAG documentation is divided into the following sections:

Getting Started

Getting Started guides: A series of quick-start steps that introduce the core concepts and help you bring up the pipeline quickly for the examples and use cases provided in this repository. These guides also include Jupyter notebooks that you can experiment with.

User Guides

The user guides cover the core details of the provided canonical developer RAG example and explain how to configure and use its features to build your own chains.

LLM Inference Server: Learn about the service that accelerates LLM inference using TensorRT-LLM (TRT-LLM).
Integration with NVIDIA AI Playground: Understand how to access NVIDIA AI Playground on NGC, which lets developers try state-of-the-art LLMs and embedding models accelerated on NVIDIA DGX Cloud with NVIDIA TensorRT and Triton Inference Server.
Configuration Guide: The complete guide to all the configuration options available in the config.yaml file.
Frontend: Learn more about the sample playground that is provided as part of the workflow and used by all the examples.
Chat Server Guide: Learn about the chat server, which exposes the core APIs for the end user. All the examples are deployed behind the standardized APIs exposed by this server.
Notebooks Guide: Learn about the available notebooks and the server that can be used to access them.

Architecture Guide

This guide describes the infrastructure and the execution flow of a query when the runtime is used with the default canonical RAG example:

Architecture: Understand the architecture of the sample RAG workflow.

Evaluation Tool

The sample RAG workflow provides a set of evaluation pipelines, delivered as notebooks, that developers can use to benchmark the default canonical RAG example. There are also detailed guides on how to reproduce results and create datasets for the evaluation.

RAG Evaluation: Understand the different notebooks available.

Observability Tool

Observability makes it possible to monitor and understand the internal state and behavior of a system or application.

Observability Tool: Understand the tool and its deployment steps.

Others

Support Matrix
OpenAPI schema references
