LlamaIndex Support Assistant

British Airways RAG support agent with evaluations

LlamaIndex is a data framework for LLM applications with a strong emphasis on context augmentation (i.e. Retrieval-Augmented Generation). In this cookbook we will build a simple RAG agent for British Airways using data scraped from their site.

To get started, create a new Python project with a main.py file, where we will keep all our code, and a data directory, where we will store ba-docs.csv. Nothing too complex!
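The layout can be scaffolded in a few lines (a sketch; the `scaffold` helper is purely illustrative):

```python
from pathlib import Path

# Illustrative helper: create the project layout described above,
# main.py for the code and data/ for the scraped ba-docs.csv.
def scaffold(root: str = ".") -> None:
    base = Path(root)
    (base / "data").mkdir(parents=True, exist_ok=True)  # will hold ba-docs.csv
    (base / "main.py").touch()                          # all our code lives here

scaffold()
```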

We will need to import some LlamaIndex classes for vector stores, directory readers, prompt templates and an OpenAI client.

from llama_index.core import VectorStoreIndex, SimpleDirectoryReader, ChatPromptTemplate
from llama_index.core.llms import ChatMessage, MessageRole
from llama_index.llms.openai import OpenAI
import os


The next step is to load the CSV from the data directory and build a vector index over it. We will also instantiate our LLM of choice.

documents = SimpleDirectoryReader("data").load_data()
index = VectorStoreIndex.from_documents(documents)

llm = OpenAI(model="gpt-3.5-turbo", temperature=0.1)

For our assistant we will use a custom prompt template. The user query and the relevant context will be injected into the user message.

chat_msgs = [
    ChatMessage(
        role=MessageRole.SYSTEM,
        content=(
            "You are a support agent for British Airways.\n\n"
            "Be courteous and professional. "
            "Only answer questions based on the context provided."
        ),
    ),
    ChatMessage(
        role=MessageRole.USER,
        content=(
            "Context information is below.\n"
            "---------------------\n{context_str}\n---------------------\n"
            "Given the context information and not prior knowledge, "
            "answer the question: {query_str}\n"
        ),
    ),
]
chat_template = ChatPromptTemplate(chat_msgs)

We can see the prompt in action using the format_messages method directly.

print(chat_template.format_messages(context_str="<boarding process>", query_str="What is the boarding process?"))

[ChatMessage(role=<MessageRole.SYSTEM: 'system'>, content='You are a support agent for British Airways.\n\nBe courteous and professional. Only answer questions based on the context provided.', additional_kwargs={}),
 ChatMessage(role=<MessageRole.USER: 'user'>, content='Context information is below.\n---------------------\n<boarding process>\n---------------------\nGiven the context information and not prior knowledge, answer the question: What is the boarding process?\n', additional_kwargs={})]

Next, we create a query_engine using the template and LLM we defined earlier.

query_engine = index.as_query_engine(llm=llm, text_qa_template=chat_template)

Great! Now let's do some sanity checks with sample inputs.

user_queries = [
    {'name': 'Boarding', 'query': 'What is the boarding process?'},
    {'name': 'Baggage', 'query': 'What is the baggage policy?'},
    {'name': 'Political Party', 'query': 'What political party should I vote for?'},
]
results = [{**user_query, 'response': query_engine.query(user_query['query'])} for user_query in user_queries]

Customers who need extra time to get settled in will still be able to board first, followed by group 1, 2 and so on, in order. Please take note of your boarding group number before you arrive at the gate, so you know when it’s your turn, and keep an ear out for the announcements.

The baggage policy states that baggage can only be through checked to the destination shown on the ticket. Baggage cannot be through checked to the final destination if holding separate tickets. Most countries require clearing baggage through customs at the first entry point if traveling onwards within the same country.

I'm here to provide information and assistance based on the context provided. If you have any questions related to travel, baggage, security procedures, or customs regulations, feel free to ask for guidance.
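Before wiring up an evaluation service, an ad-hoc check of these outputs might look like the sketch below. The `results` list stands in for the query engine responses above, and the keyword-based `refused` heuristic is purely illustrative:

```python
# Illustrative stand-in for the query engine responses above.
results = [
    {'name': 'Boarding', 'response': 'Customers who need extra time will board first, followed by group 1, 2 and so on.'},
    {'name': 'Baggage', 'response': 'Baggage can only be through checked to the destination shown on the ticket.'},
    {'name': 'Political Party', 'response': "I'm here to assist based on the context provided. Feel free to ask about travel."},
]

def refused(text: str) -> bool:
    # Naive heuristic: treat a response as a refusal if it deflects
    # back to the provided context instead of answering directly.
    return "context provided" in text.lower()

# Only the out-of-scope query should be refused.
for r in results:
    expect_refusal = r['name'] == 'Political Party'
    assert refused(r['response']) == expect_refusal, f"unexpected result for {r['name']}"

print("all sanity checks passed")
```

Keyword checks like this are brittle, which is exactly why the next section moves to a proper evaluation service.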

All looks good! This toy example has only three test cases; if we wanted more robust evaluations, or automatic testing whenever we change the prompt, we would need a more automated process. Let's set up an automatic evaluation using Context.ai!

from getcontext import ContextAPI
from getcontext.token import Credential
from getcontext.generated.models import TestSet, TestCase, TestCaseMessage, TestCaseMessageRole, TestCaseFrom, Evaluator

context_token = os.environ["GETCONTEXT_TOKEN"]
context = ContextAPI(credential=Credential(context_token))

We use the chat_template from before to generate the messages with the RAG inputs formatted, then convert them into Context messages. We attach the attempts_answer or refuse_answer evaluator and append each case to a list of test_cases.

test_cases = []
for result in results:
    llama_messages = chat_template.format_messages(
        context_str=result['response'].source_nodes[0].node.text,
        query_str=result['query'],
    )
    context_messages = [TestCaseMessage(message=message.content, role=message.role) for message in llama_messages]
    evaluators = [Evaluator(evaluator="attempts_answer")] if result['name'] != 'Political Party' else [Evaluator(evaluator="refuse_answer")]
    test_cases.append(TestCase(name=result['name'], messages=context_messages, evaluators=evaluators))

The final step is to upload the test cases to Context (the exact client call below may vary with your getcontext SDK version).

# Upload the generated test cases to Context.ai evaluations
context.log.test_sets(
    copy_test_cases_from=TestCaseFrom.NONE,  # ignore test cases from previous test set version
    body=TestSet(name='British Airways', test_cases=test_cases),
)

If we navigate to the Context.ai test sets page at with.context.ai/evaluations/sets and run the evaluation, we can see that all test cases pass. Success!

If you have any feedback, please let us know by emailing henry@context.ai
