Welcome to the Production-Ready GenAI Apps with Tracing & Evaluations session of the DevFest AI Workshop!
LLM demos are easy; making them reliable enough for production is hard. Weave (from Weights & Biases) helps with every stage of development and deployment, so you can iterate rapidly and build confidence in your application.
Instructor: Sam Stowers
Duration: 30 minutes
Objectives:
- Understand complex queries with tracing (3 lines of code!!)
- Rapidly identify cost & latency bottlenecks
- Implement evaluations to make LLM responses consistently good
- Test which LLMs are best for your needs
By the end of this session, you will have a deeper understanding of how to trace and evaluate LLM applications with Weave.
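To make the tracing objective concrete: a trace records each call's inputs, output, and latency, including nested calls. With Weave, the "3 lines" are typically `import weave`, `weave.init("<your-project>")`, and an `@weave.op()` decorator on each function you want traced; the workshop instructions remain the authority on exact usage. The sketch below does not use Weave. It imitates the idea with the standard library so it runs without an account or API key, and `traced`, `TRACES`, `retrieve`, and `answer` are hypothetical names invented for illustration.

```python
import functools
import time

# Hypothetical in-memory trace store; a tracing tool would ship this to a UI.
TRACES = []


def traced(fn):
    """Record the inputs, output, and latency of every call to fn."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        output = fn(*args, **kwargs)
        TRACES.append({
            "op": fn.__name__,
            "inputs": {"args": args, "kwargs": kwargs},
            "output": output,
            "latency_s": time.perf_counter() - start,
        })
        return output
    return wrapper


@traced
def retrieve(question):
    # Stand-in for a retrieval step (hypothetical).
    return ["doc about " + question]


@traced
def answer(question):
    # Stand-in for an LLM call (hypothetical); a real app would call a model here.
    docs = retrieve(question)
    return f"Answer based on {len(docs)} document(s)"


result = answer("tracing")
print(result)
for trace in TRACES:
    # Inner calls finish first, so they appear before their callers.
    print(trace["op"], f"{trace['latency_s']:.6f}s")
```

Sorting `TRACES` by `latency_s` is the simplest way to spot the slowest step; a hosted tracing tool gives you the same view as a call tree.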
Prerequisites:
- Basic knowledge of Python
- Familiarity with Git, GitHub, and the command line. You'll need to clone a repo locally.
- A Weights & Biases account (free)
- A Groq API key (free)
Intro
Sam briefly talks about evals
- Under 5 minutes on why tracing and evals matter, how to use them, and what we'll cover
Challenge 1: Observability
Challenge 2: Evaluations
Challenge 3: The improvement flywheel
Q&A and Discussion
- Open floor for questions on implementing evals in production systems.
Full instructions at this link: https://wandbai.notion.site/GDG-Workshop-Instructions-Nov-9-24-138e2f5c7ef38078942beebe524ee171?pvs=4
If you haven't cloned the repository already, run:
git clone https://github.com/SamMakesThings/gdg-observability-workshop
cd gdg-observability-workshop
Follow the instructions on the Notion page for each part of the exercise: https://wandbai.notion.site/GDG-Workshop-Instructions-Nov-9-24-138e2f5c7ef38078942beebe524ee171?pvs=4
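As orientation for the evaluations challenge, here is a minimal sketch of what an evaluation loop does: run each candidate model over a fixed dataset, score every output, and compare the averages. It is not the workshop's solution and does not use Weave's evaluation tooling. The dataset, the `contains_expected` scorer, and the two stand-in "models" are all hypothetical; real models would call an LLM API such as Groq.

```python
# Hypothetical evaluation set: each example pairs a prompt with a phrase
# that a good answer is expected to contain.
DATASET = [
    {"prompt": "What is the capital of France?", "expected": "paris"},
    {"prompt": "What is 2 + 2?", "expected": "4"},
    {"prompt": "What color is a clear daytime sky?", "expected": "blue"},
]


def contains_expected(output, expected):
    """Scorer: 1.0 if the expected phrase appears in the output, else 0.0."""
    return 1.0 if expected.lower() in output.lower() else 0.0


def evaluate(model, dataset):
    """Return the mean score of model over every example in dataset."""
    scores = [
        contains_expected(model(example["prompt"]), example["expected"])
        for example in dataset
    ]
    return sum(scores) / len(scores)


# Two stand-in "models" (hypothetical) that return canned text.
def model_a(prompt):
    canned = {
        "What is the capital of France?": "The capital is Paris.",
        "What is 2 + 2?": "2 + 2 = 4",
        "What color is a clear daytime sky?": "It is blue.",
    }
    return canned.get(prompt, "I don't know.")


def model_b(prompt):
    return "Paris" if "France" in prompt else "I don't know."


results = {
    name: evaluate(fn, DATASET)
    for name, fn in [("model_a", model_a), ("model_b", model_b)]
}
best = max(results, key=results.get)
print(results)
print("best:", best)
```

Keeping the dataset and scorer fixed while swapping the model is what makes the comparison fair; the same loop also tells you whether a prompt change helped or hurt.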
If you need help with any part of the session, refer to the completed solution files in the repo, specifically evals_completed.py and main_completed.py.
If you have questions during the workshop, please reach out to Sam Stowers, or open an issue in the repository.
Happy coding!