Evaluating Custom Models

Learn how to use Context.ai to evaluate responses from custom models.

Overview

A common evaluation use case is assessing custom fine-tuned models that are developed internally and deployed behind a firewall or within a VPC. Context.ai supports evaluating generations from these models within the platform. Instead of running the generations within Context.ai, you pre-generate responses from your internal models and then upload them as test cases for evaluation and visualization.

Instructions

To run evaluations against a pre-generated response for a test case, upload test cases to Context.ai as normal, but include an additional pregenerated_response field in the test case request payload. When pregenerated_response is set, the model field becomes optional.

Example
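
As a rough sketch, the request below uploads a single test case that carries a pregenerated response. The endpoint path, authentication scheme, and all field names other than pregenerated_response are illustrative assumptions; consult the Context.ai API reference for the exact test case upload schema.

```python
import os

import requests

# Hypothetical endpoint and payload shape for illustration only; the exact
# schema is defined in the Context.ai API reference.
API_URL = "https://api.context.ai/api/v1/evaluations/test-cases"
API_KEY = os.environ["CONTEXT_API_KEY"]

test_case = {
    "name": "internal-model-refund-policy",
    "messages": [
        {"role": "user", "message": "What is your refund policy?"},
    ],
    # Response generated ahead of time by the internal model. Because this
    # field is set, the model parameter can be omitted from the payload.
    "pregenerated_response": "You can request a refund within 30 days of purchase.",
}

response = requests.post(
    API_URL,
    headers={"Authorization": f"Bearer {API_KEY}"},
    json=test_case,
)
response.raise_for_status()
print(response.json())
```

Once uploaded, the test case appears alongside any others in the evaluation run, so pre-generated responses from internal models can be scored and visualized with the same workflow used for models run by Context.ai.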
