Get Started

  • Create a Test Set: Log in to the UI to create your first Test Set, or, if you already have a large CSV, use the API directly. We provide instructions for Bulk Test Upload.

  • Select Evaluators: Log in to the platform to view your Evaluation Test Set and select your Evaluators.

  • Run Evaluations: Select a Test Set version and click the Run button (or use the dropdown to run multiple iterations). This runs each Test Case, fetching the assistant response from the configured model and running each Evaluator on each response. A Test Set version can only be run once; to re-run a Test Set, create a duplicate version.

  • Analyze Results: Review the detailed results provided by the Evaluators to identify strengths and areas for improvement.

  • Create New Versions: When you have updated your model or identified new testing scenarios, you can create new versions through the API or SDKs.

  • Compare Across Versions: Leverage the comparison feature to evaluate how your models have progressed over different Test Set versions.
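The Bulk Test Upload step above can be sketched in a few lines: parse a CSV of test cases and build the JSON body for an upload request. This is a minimal illustration, not the platform's documented schema; the column names, payload fields, and Test Set name here are assumptions.

```python
import csv
import io
import json

# Hypothetical CSV of test cases; the column names are assumptions,
# not the platform's documented schema.
csv_text = """input,expected_output
What is 2+2?,4
Capital of France?,Paris
"""

# Parse the CSV into a list of dicts, the shape a JSON bulk-upload
# endpoint would typically accept.
test_cases = list(csv.DictReader(io.StringIO(csv_text)))

# Body for a hypothetical bulk-upload POST request.
payload = json.dumps({"name": "my-first-test-set", "test_cases": test_cases})
print(payload)
```

From here, the payload would be sent to the Bulk Test Upload endpoint described in the API instructions.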
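Conceptually, the run flow in "Run Evaluations" loops over Test Cases, fetches the assistant response for each, and applies every Evaluator to every response. The sketch below uses stand-in model and evaluator functions purely for illustration; the platform performs this loop for you against your configured model.

```python
def model(prompt):
    # Stand-in for the configured assistant model.
    return prompt.upper()

# Stand-in Evaluators: each scores one (prompt, response) pair.
evaluators = {
    "non_empty": lambda prompt, response: len(response) > 0,
    "echoes_input": lambda prompt, response: prompt.lower() in response.lower(),
}

test_cases = ["hello world", "evaluate me"]

results = []
for case in test_cases:
    response = model(case)  # fetch the assistant response
    for name, evaluate in evaluators.items():
        results.append(
            {"case": case, "evaluator": name, "passed": evaluate(case, response)}
        )

print(sum(r["passed"] for r in results), "of", len(results), "checks passed")
# → 4 of 4 checks passed
```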
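Because a Test Set version can only be run once, re-running or extending a Test Set means duplicating a version, optionally with new scenarios added. The snippet below sketches that duplication in plain Python data structures; the dictionary layout and `new_version` helper are illustrative assumptions, not the platform's actual SDK objects.

```python
import copy

# Hypothetical in-memory representation of a Test Set version.
v1 = {
    "version": 1,
    "test_cases": [
        {"input": "What is 2+2?", "expected_output": "4"},
    ],
}

def new_version(prev, extra_cases):
    """Duplicate a version and append newly identified scenarios.

    Illustrative helper only; the real API/SDK call will differ.
    """
    nxt = copy.deepcopy(prev)  # leave the original version untouched
    nxt["version"] = prev["version"] + 1
    nxt["test_cases"] = nxt["test_cases"] + list(extra_cases)
    return nxt

v2 = new_version(
    v1, [{"input": "Capital of France?", "expected_output": "Paris"}]
)
print(v2["version"], len(v2["test_cases"]))
# → 2 2
```

Keeping the old version intact is what makes the cross-version comparison feature useful: each version remains a fixed snapshot you can evaluate against.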
