The Prompt Evaluator is a test suite that helps evaluate prompt templates and AI models. It enables Product Managers and Developers to create prompt templates with custom variables, define test cases with specific variable values and expected responses, and match generated responses against the expected ones using exact or fuzzy matching. The suite also allows comparing GraphQL query responses and measuring the accuracy of prompt templates against different AI models. By leveraging these capabilities, Product Managers and Developers can make informed decisions, iterate on their prompt designs, and improve the overall quality and accuracy of their AI-powered applications.
- Experiments - The experiment feature in our product allows users to create collections of prompt templates. Users can define their own conversations with various roles and prompts, incorporating variables where necessary, and can evaluate the performance of prompts by executing them with different OpenAI models and associated test cases.
- Prompt Templates - Prompt templates are the building blocks of an Experiment which allow users to define their own prompts. They are highly customizable, allowing users the flexibility to modify the content, format, and variables according to their requirements.
- Test Cases - These are the cases on which the accuracy of a prompt is evaluated. Users can define their own test cases and associate them with prompts. Test cases can be defined as a list of inputs and expected outputs.
By running prompt templates with different models and test cases, users gain valuable insights into the performance and suitability of their prompts for different scenarios. For detailed information on the features, please refer to the product guide.
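As a rough illustration of these concepts, a prompt template with a variable and an associated test case could be modeled along the following lines. The field names below are illustrative assumptions and do not reflect the tool's actual data model:

```ts
// Illustrative shapes only -- the field names are assumptions, not the tool's actual schema.
interface PromptTemplate {
  name: string;
  // Conversation messages; {{variable}} placeholders are filled in per test case.
  messages: { role: "system" | "user" | "assistant"; content: string }[];
}

interface TestCase {
  variables: Record<string, string>; // values substituted into the template
  expectedOutput: string;            // compared exactly or fuzzily with the model's response
}

const template: PromptTemplate = {
  name: "sentiment-classifier",
  messages: [
    { role: "system", content: "Classify the sentiment of the user's message." },
    { role: "user", content: "{{review_text}}" },
  ],
};

const testCase: TestCase = {
  variables: { review_text: "The product arrived late and broken." },
  expectedOutput: "negative",
};
```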
Prompt Evaluator has two components: the frontend (this repository) and the backend API server.
This is the frontend component of the Prompt Evaluator tool, built using Next.js. The backend is built using Django and MongoDB. The frontend and backend communicate with each other over a GraphQL API, and backend API calls are protected with basic authentication (a minimal sketch of such a call follows the stack list below). The frontend is a standalone application that can be deployed separately.
- Next.js
- React
- Apollo Client
- GraphQL
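As a rough sketch of this setup, the snippet below configures an Apollo Client with the GraphQL endpoint and a basic-auth header. The `GetExperiments` query, the `NEXT_PUBLIC_AUTH_USERNAME`/`NEXT_PUBLIC_AUTH_PASSWORD` variable names, and the localhost fallback are assumptions for illustration; only `NEXT_PUBLIC_API_BASE_URL` appears in the setup steps below.

```ts
// Sketch only: the query and the auth variable names are assumptions, not the repository's actual code.
import { ApolloClient, HttpLink, InMemoryCache, gql } from "@apollo/client";

// Basic-auth header built from credentials kept in env vars (names hypothetical).
const credentials = `${process.env.NEXT_PUBLIC_AUTH_USERNAME}:${process.env.NEXT_PUBLIC_AUTH_PASSWORD}`;

const client = new ApolloClient({
  link: new HttpLink({
    // NEXT_PUBLIC_API_BASE_URL already includes the /graphql suffix; the fallback is a placeholder.
    uri: process.env.NEXT_PUBLIC_API_BASE_URL ?? "http://localhost:8000/graphql",
    headers: { Authorization: `Basic ${btoa(credentials)}` },
  }),
  cache: new InMemoryCache(),
});

// Hypothetical query; the real schema is defined by the backend.
const GET_EXPERIMENTS = gql`
  query GetExperiments {
    experiments {
      id
      name
    }
  }
`;

client.query({ query: GET_EXPERIMENTS }).then(({ data }) => console.log(data));
```

Keeping the `/graphql` suffix inside `NEXT_PUBLIC_API_BASE_URL` means the same client configuration works unchanged for local development and a deployed backend.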
Follow the instructions below for installation:
- Install all the dependencies required for the project by running the following command:

```bash
npm install
```
- Go to the project directory and copy the contents of the `.env.sample` file into a `.env` file, then add values for all the environment variables.

```bash
cd prompt-eval-fe

# For Linux/macOS
cp .env.sample .env

# For Windows
copy .env.sample .env
```
- The value of `NEXT_PUBLIC_API_BASE_URL` in the `.env` file should be the base URL of the Prompt Eval API server followed by `/graphql`.
- Run the server using the following command:

```bash
npm run dev
```
- Provide the username and password from your `.env` file when the alert window pops up.
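For reference, a filled-in `.env` might look roughly like the example below. Only `NEXT_PUBLIC_API_BASE_URL` is named in this guide; the basic-auth variable names and the localhost URL are placeholders, so check `.env.sample` for the exact keys and values.

```bash
# Example values only -- verify the exact variable names against .env.sample.
# Base URL of the Prompt Eval API server followed by /graphql.
NEXT_PUBLIC_API_BASE_URL=http://localhost:8000/graphql
# Hypothetical names for the basic-auth credentials used by the frontend.
NEXT_PUBLIC_AUTH_USERNAME=admin
NEXT_PUBLIC_AUTH_PASSWORD=change-me
```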
For deploying the frontend with Cloudflare Pages, you will need:
- A Cloudflare account
- The list of environment variables to be configured
- Create a fork of our GitHub `prompt-eval-fe` repository into your GitHub organization.
- Ensure that you have a Cloudflare account and a DNS record present.
Kindly follow this article to set up the frontend.
Note: You will need to add the following environment variables:
- `NEXT_PUBLIC_API_BASE_URL` should be set to `https://<backend-url-endpoint>/graphql`.
- `NODE_VERSION` should be set to `18.16.0`. This can be changed based on your version.
That’s it! You will be able to access the webpage using the URL provided by Cloudflare Pages.
We welcome more helping hands to make Prompt Evaluator better. Feel free to report issues, raise PRs for fixes & enhancements. We are constantly working towards addressing broader, more generic issues to provide a clear and user-centric solution that unleashes your full potential. Stay tuned for exciting updates as we continue to enhance our tool.
Built with ❤️ by True Sparrow