This repository is a starting point for developers looking to integrate with the NVIDIA software ecosystem to speed up their generative AI systems. Whether you are building RAG pipelines, agentic workflows, or fine-tuning models, this repository will help you integrate NVIDIA, seamlessly and natively, with your development stack.
This example implements a GPU-accelerated pipeline for creating and querying knowledge graphs using RAG by leveraging NIM microservices and the RAPIDS ecosystem to process large-scale datasets efficiently.
- Build an Agentic RAG Pipeline with Llama 3.1 and NVIDIA NeMo Retriever NIM microservices [Blog, Notebook]
- NVIDIA Morpheus, NIM microservices, and RAG pipelines integrated to create LLM-based agent pipelines
- Tips for Building a RAG Pipeline with NVIDIA AI LangChain AI Endpoints by Amit Bleiweiss. [Blog, Notebook]
For more information, refer to the Generative AI Example releases.
A collection of Jupyter notebooks, sample code and reference applications built with Vision NIMs.
To pull the vision NIM workflows, clone this repository recursively:
git clone https://github.com/nvidia/GenerativeAIExamples --recurse-submodules
The workflows will then be located at GenerativeAIExamples/vision_workflows
Follow the links below to learn more:
- Learn how to use VLMs to automatically monitor a video stream for custom events.
- Learn how to search images with natural language using NV-CLIP.
- Learn how to combine VLMs, LLMs and CV models to build a robust text extraction pipeline.
- Learn how to use embeddings with NVDINOv2 and a Milvus VectorDB to build a few shot classification model.
Experience NVIDIA RAG Pipelines with just a few steps!
-
Get your NVIDIA API key.
- Go to the NVIDIA API Catalog.
- Select any model.
- Click Get API Key.
- Run:
export NVIDIA_API_KEY=nvapi-...
-
Clone the repository.
git clone https://github.com/nvidia/GenerativeAIExamples.git
-
Build and run the basic RAG pipeline.
cd GenerativeAIExamples/RAG/examples/basic_rag/langchain/ docker compose up -d --build
-
Go to https://localhost:8090/ and submit queries to the sample RAG Playground.
-
Stop containers when done.
docker compose down
NVIDIA has first-class support for popular generative AI developer frameworks like LangChain, LlamaIndex, and Haystack. These end-to-end notebooks show how to integrate NIM microservices using your preferred generative AI development framework.
Use these notebooks to learn about the LangChain and LlamaIndex connectors.
- RAG
- Agents
By default, these end-to-end examples use preview NIM endpoints on NVIDIA API Catalog. Alternatively, you can run any of the examples on premises.
Example tools and tutorials to enhance LLM development and productivity when using NVIDIA RAG pipelines.
- NVIDIA Tokkio LLM-RAG: Use Tokkio to add avatar animation for RAG responses.
- Hybrid RAG Project on AI Workbench: Run an NVIDIA AI Workbench example project for RAG.
- Changing the Inference or Embedded Model
- Customizing the Vector Database
- Customizing the Chain Server:
- Configuring LLM Parameters at Runtime
- Supporting Multi-Turn Conversations
- Speaking Queries and Listening to Responses with NVIDIA Riva
- Support Matrix
- Architecture
- Using the Sample Chat Web Application
- RAG Playground Web Application
- Software Component Configuration
We're posting these examples on GitHub to support the NVIDIA LLM community and facilitate feedback. We invite contributions! Open a GitHub issue or pull request! See contributing Check out the community examples and notebooks.