RAG Documentation

The RAG documentation is divided into the following sections:

Getting Started

  • Getting Started guides: Quick start steps that introduce the core concepts and help you bring up the pipeline for the examples and use cases provided in this repository. These guides also include Jupyter notebooks that you can experiment with.

User Guides

The user guides cover the core details of the sample canonical developer RAG example and explain how to configure and use its features to build your own chains.

  • LLM Inference Server: Learn about the service that accelerates LLM inference using TRT-LLM.
  • Integration with NVIDIA AI Playground: Understand how to access NVIDIA AI Playground on NGC, which lets developers experience state-of-the-art LLMs and embedding models accelerated on NVIDIA DGX Cloud with NVIDIA TensorRT and Triton Inference Server.
  • Configuration Guide: The complete guide to all the configuration options available in the config.yaml file.
  • Frontend: Learn more about the sample playground, which is part of the workflow and is used by all the examples.
  • Chat Server Guide: Learn about the chat server, which exposes the core APIs for the end user. All the examples are deployed behind these standardized APIs.
  • Notebooks Guide: Learn about the different notebooks available and the server which can be used to access them.

Architecture Guide

This guide explains the infrastructure details and the execution flow of a query when the runtime is used for the default canonical RAG example:

  • Architecture: Understand the architecture of the sample RAG workflow.

Evaluation Tool

The sample RAG workflow provides a set of evaluation pipelines, delivered as notebooks, that developers can use to benchmark the default canonical RAG example. There are also detailed guides on how to reproduce results and create datasets for the evaluation.

Observability Tool

Observability makes it possible to monitor and understand the internal state and behavior of a system or application.

  • Observability tool: Understand the observability tool and the steps to deploy it.

Others