Conversation Cost Estimation
Context.ai can estimate the LLM API costs for ingested conversations. Each model has its own pricing method, tokenization configuration, and unit price, which makes it difficult to compare costs across models. Conversation cost estimation lets you view conversation transcript costs for multiple model providers in a single interface.
To start using transcript cost estimation, add a model metadata key-value pair to ingested transcripts. The model metadata value must be a string from the list of supported models.
{
  "conversation": {
    "messages": [ ... ],
    "metadata": {
      "model": "gpt-4-32k"
    }
  }
}
Once conversations are ingested, an estimated cost will be visible under the "Transcripts" tab within the Context UI.
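As a minimal sketch of the step above, the following Python builds a payload with the model metadata key and rejects model names that are not in the supported list. The helper name build_conversation_payload is hypothetical and for illustration only. It assembles the JSON body and does not call any Context.ai API.

```python
import json

# Model names copied from the supported-models list in this document.
SUPPORTED_MODELS = {
    "gpt-3.5-turbo-4k", "gpt-3.5-turbo-16k", "gpt-4-8k", "gpt-4-32k",
    "gpt-4-1106-preview", "gpt-3.5-turbo-1106",
    "davinci-002", "babbage-002", "curie",
    "claude-1", "claude-2", "claude-instant",
    "chat-bison",
}


def build_conversation_payload(messages, model):
    """Return a conversation payload carrying the `model` metadata key.

    Hypothetical helper: it only assembles the JSON body and performs a
    local check against the supported-models list.
    """
    if model not in SUPPORTED_MODELS:
        raise ValueError(f"unsupported model for cost estimation: {model!r}")
    return {
        "conversation": {
            "messages": list(messages),
            "metadata": {"model": model},
        }
    }


payload = build_conversation_payload(
    [{"role": "user", "message": "When does the next train to Bristol depart?"}],
    "gpt-4-32k",
)
print(json.dumps(payload, indent=2))
```

Checking the model name before ingestion is a design choice in this sketch. It surfaces a misspelled model string early instead of leaving a transcript without a cost estimate.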
Conversation Cost Estimation uses the model key in the metadata attached to ingested conversations to select the pricing configuration. The model value determines the pricing method (token or character counts), the tokenization configuration, and the associated unit price.
When processing system or user messages, we assume the full previous context window (i.e. every previous message in that conversation) was also sent to the LLM. Each assistant message is priced individually.
As an example, consider the following conversation.
{
  "conversation": {
    "messages": [
      {
        "role": "user",
        "message": "When does the next train to Bristol depart?"
      },
      {
        "role": "assistant",
        "message": "The next train to Bristol departs from London Paddington at 15:03."
      },
      {
        "role": "user",
        "message": "What is the earliest train tomorrow morning?"
      },
      {
        "role": "assistant",
        "message": "The first morning train to Bristol departs 05:32."
      }
    ],
    "metadata": {
      "model": "gpt-3.5-turbo-16k"
    }
  }
}
The first user and assistant messages will each be priced individually. The follow-up user question will be priced using a concatenation with the previous two messages:
"When does the next train to Bristol depart? The next train to Bristol departs from London Paddington at 15:03. What is the earliest train tomorrow morning?"
The final assistant message will again be priced individually.
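The rule described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Context.ai's implementation: the function name priced_texts is hypothetical, and joining messages with a single space is an assumption taken from the example concatenation. The sketch returns only the text that would be priced for each message and does not apply any tokenization or unit price.

```python
def priced_texts(messages):
    """Return the text that would be priced for each message, in order.

    System and user messages are priced together with every previous
    message in the conversation; assistant messages are priced alone.
    Joining with a single space is an assumption.
    """
    history = []  # message texts seen so far, in order
    priced = []
    for msg in messages:
        if msg["role"] in ("system", "user"):
            priced.append(" ".join(history + [msg["message"]]))
        else:
            priced.append(msg["message"])
        history.append(msg["message"])
    return priced


conversation = [
    {"role": "user",
     "message": "When does the next train to Bristol depart?"},
    {"role": "assistant",
     "message": "The next train to Bristol departs from London Paddington at 15:03."},
    {"role": "user",
     "message": "What is the earliest train tomorrow morning?"},
    {"role": "assistant",
     "message": "The first morning train to Bristol departs 05:32."},
]

for msg, text in zip(conversation, priced_texts(conversation)):
    print(f"{msg['role']}: {text}")
```

For the third message, the sketch produces the same concatenated string shown in the example, while the first, second, and fourth entries are the individual messages unchanged.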
Supported models
gpt-3.5-turbo-4k
gpt-3.5-turbo-16k
gpt-4-8k
gpt-4-32k
gpt-4-1106-preview
gpt-3.5-turbo-1106
davinci-002
babbage-002
curie
claude-1
claude-2
claude-instant
chat-bison