List of Trackable Metrics

Global

  • # conversations over time. E.g. Last week there were 13,173 conversations

  • # messages over time. E.g. Last week there were 82,263 messages

  • Average global user input sentiment over time. E.g. Last week the average sentiment of user inputs trended down

  • Average global user rating over time. E.g. Last week the user feedback rating trended down

  • # messages received in each language. E.g. Last week there were 872 messages in French
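
As a rough illustration of how the count-based global metrics could be derived from raw transcripts, the sketch below groups messages by ISO week. The record layout (conversation ID, send date, language code) and the function names are assumptions made for this example, not Context.ai's data model or API.

```python
from collections import Counter
from datetime import date

# Hypothetical message records: (conversation_id, sent_on, language)
messages = [
    ("c1", date(2024, 1, 1), "en"),
    ("c1", date(2024, 1, 2), "en"),
    ("c2", date(2024, 1, 3), "fr"),
    ("c3", date(2024, 1, 9), "fr"),
]


def week_key(d):
    """Return (ISO year, ISO week number) for a date."""
    iso = d.isocalendar()
    return (iso[0], iso[1])


def weekly_counts(records):
    """Count messages, distinct conversations, and messages per language, by week."""
    msg_counts = Counter()
    lang_counts = Counter()
    conv_ids = {}
    for conv_id, sent_on, lang in records:
        wk = week_key(sent_on)
        msg_counts[wk] += 1
        lang_counts[(wk, lang)] += 1
        conv_ids.setdefault(wk, set()).add(conv_id)
    conv_counts = {wk: len(ids) for wk, ids in conv_ids.items()}
    return dict(msg_counts), conv_counts, dict(lang_counts)


msg_counts, conv_counts, lang_counts = weekly_counts(messages)
```

A conversation that spans two weeks is counted once in each week here; a real report might instead attribute it to the week it started.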

Per Custom Category

  • Classify messages and conversations into custom categories:

    • Pre-define a taxonomy of keywords, semantic meanings, user intents, or custom categories defined by LLM prompts

    • Context.ai will also suggest salient groups of messages

  • # conversations matching a specific semantic, keyword, or user intent category over time. E.g. Last week there were 2,920 conversations about hair dyes

  • Average user input sentiment for a specific semantic, keyword, or user intent category over time. E.g. Last week the average sentiment of user inputs for conversations about hair dyes trended lower
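
The keyword flavor of a custom taxonomy can be sketched in a few lines. The example below is only an illustration: the taxonomy contents, the numeric sentiment scores, and the function names are hypothetical, and it does not attempt the semantic, intent, or LLM-prompt categories mentioned above.

```python
# Hypothetical taxonomy: category name -> lowercase keywords that signal it
taxonomy = {
    "hair dyes": ["hair dye", "hair color"],
    "dandruff": ["dandruff", "flaky scalp"],
}


def categorize(text, taxonomy):
    """Return every category whose keywords appear in the text (case-insensitive)."""
    lowered = text.lower()
    return [
        category
        for category, keywords in taxonomy.items()
        if any(keyword in lowered for keyword in keywords)
    ]


def category_stats(messages, taxonomy):
    """Compute message count and average sentiment per category.

    `messages` is a list of (text, sentiment) pairs; the sentiment score
    is assumed to be a number supplied by some upstream model.
    """
    totals = {}
    for text, sentiment in messages:
        for category in categorize(text, taxonomy):
            count, total = totals.get(category, (0, 0.0))
            totals[category] = (count + 1, total + sentiment)
    return {
        category: {"count": count, "avg_sentiment": total / count}
        for category, (count, total) in totals.items()
    }


sample = [
    ("Which hair dye lasts longest?", 0.5),
    ("My hair color faded after a week", -0.5),
    ("Help with dandruff", 0.0),
]
stats = category_stats(sample, taxonomy)
```

Substring matching is deliberately naive; it would also match keywords embedded in longer words, which a production classifier would need to handle.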

Per User

If User IDs are provided, Context.ai can report:

  • User retention rates per week. E.g. For the cohort who signed up on date X, we retained Y% of users after Z weeks

  • Topics of conversation discussed most often by new users. E.g. Last week new users most frequently discussed: hair dyes, dandruff, shampoo

  • Topics of conversation discussed most often by veteran users. E.g. Last week veteran users most often discussed: product prices, product availability

  • Table of users reporting: user ID, # messages sent, # conversations, average sentiment, average feedback rating, first seen, last seen
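
To make the retention metric concrete, here is a minimal sketch of a weekly cohort calculation. It assumes a user's sign-up date is the first date they were seen, and that "retained after Z weeks" means the user was active at some point during week Z after that date; both definitions, and all names in the code, are assumptions for this example.

```python
from datetime import date


def retention_rate(activity, cohort_date, weeks_after):
    """Fraction of the cohort first seen on `cohort_date` that was active
    during week `weeks_after` (days 7*Z through 7*Z + 6 after first seen).

    `activity` maps a user ID to the list of dates on which that user
    sent messages. Returns None when the cohort is empty.
    """
    cohort = [user for user, days in activity.items() if min(days) == cohort_date]
    if not cohort:
        return None
    start, end = weeks_after * 7, (weeks_after + 1) * 7
    retained = sum(
        1
        for user in cohort
        if any(start <= (day - cohort_date).days < end for day in activity[user])
    )
    return retained / len(cohort)


# Hypothetical activity log
activity = {
    "u1": [date(2024, 1, 1), date(2024, 1, 16)],
    "u2": [date(2024, 1, 1)],
    "u3": [date(2024, 1, 2), date(2024, 1, 17)],
}
rate = retention_rate(activity, date(2024, 1, 1), weeks_after=2)
```

With this sample data the cohort for 1 January 2024 contains u1 and u2, and only u1 returns in week 2, so `rate` is 0.5.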

Per Conversation

List all the conversations matching any of the following filter criteria:

  • User rating: positive, negative, neutral

  • User input sentiment: positive, negative, neutral

  • User input sentiment trending: upwards, downwards

  • Freetext feedback: positive, negative, neutral

  • Custom events: occurred or did not occur. Examples: conversion events, clicks, or other interactions

  • Category: any semantic, keyword, or intent category

  • Labels: any user-applied label

  • Ratings: any user-applied rating, from 1 star to 5 stars

  • Custom metadata: any metadata provided with the transcripts, including: model, user ID, environment, experiment arm
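
Filtering of this kind can be pictured as matching field values on conversation records. The sketch below is a simplified illustration with hypothetical field names (`user_rating`, `sentiment`, `model`); it is not the product's query interface. The `require_all` flag switches between requiring every criterion and accepting any one of them.

```python
# Hypothetical conversation records with a few of the filterable fields
conversations = [
    {"id": "c1", "user_rating": "negative", "sentiment": "negative", "model": "model-a"},
    {"id": "c2", "user_rating": "positive", "sentiment": "neutral", "model": "model-a"},
    {"id": "c3", "user_rating": "negative", "sentiment": "positive", "model": "model-b"},
]


def filter_conversations(convs, criteria, require_all=True):
    """Return conversations whose fields equal the values in `criteria`.

    With require_all=True every criterion must match; with
    require_all=False a single matching criterion is enough.
    """
    combine = all if require_all else any
    return [
        conv
        for conv in convs
        if combine(conv.get(field) == wanted for field, wanted in criteria.items())
    ]


criteria = {"user_rating": "negative", "model": "model-a"}
both = filter_conversations(conversations, criteria)
either = filter_conversations(conversations, criteria, require_all=False)
```

Here `both` contains only c1, while `either` contains all three conversations, since each matches at least one criterion.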

Guardrails Validation

  • # non-compliant conversations

  • # conversations containing high risk keywords

  • % of all conversations that are non-compliant

  • % of all conversations that contain high risk keywords
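
The four guardrail figures reduce to two counts and two percentages. The following sketch assumes each conversation already carries a `non_compliant` flag (how compliance is decided is outside the scope of this example) and that high-risk keywords are matched as lowercase substrings; the field names and keyword list are hypothetical.

```python
def guardrail_summary(conversations, high_risk_keywords):
    """Count and percentage of non-compliant and high-risk conversations.

    Each conversation is a dict with a boolean "non_compliant" flag and a
    list of message strings under "messages". Keywords are expected in
    lowercase and are matched as substrings of the lowercased messages.
    """
    total = len(conversations)
    non_compliant = sum(1 for conv in conversations if conv["non_compliant"])
    high_risk = sum(
        1
        for conv in conversations
        if any(
            keyword in message.lower()
            for message in conv["messages"]
            for keyword in high_risk_keywords
        )
    )

    def pct(n):
        # Avoid dividing by zero when there are no conversations
        return 100.0 * n / total if total else 0.0

    return {
        "non_compliant": non_compliant,
        "high_risk": high_risk,
        "pct_non_compliant": pct(non_compliant),
        "pct_high_risk": pct(high_risk),
    }


# Hypothetical data
sample = [
    {"non_compliant": True, "messages": ["Can I mix bleach and ammonia?"]},
    {"non_compliant": False, "messages": ["Is BLEACH safe for coloured hair?"]},
    {"non_compliant": False, "messages": ["Which shampoo helps with dandruff?"]},
    {"non_compliant": False, "messages": ["Do you ship overseas?"]},
]
summary = guardrail_summary(sample, ["bleach"])
```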
