Question and Answer with OpenAI and RedisVL#

This example shows how to use RedisVL to create a question and answer system using OpenAI’s API.

In this notebook we will:

  1. Download a dataset of Wikipedia articles (thanks to OpenAI’s CDN)

  2. Create embeddings for each article

  3. Create a RedisVL index and store the embeddings with metadata

  4. Construct a simple QnA system using the index and GPT-3.5

  5. Improve the QnA system with LLM caching

The image below shows the architecture of the system we will create in this notebook.

Diagram

Setup#

To run this example, you need Redis Stack running locally (or a free instance on Redis Cloud). To start it locally, run the following command in your terminal:

docker run --name redis -d -p 6379:6379 -p 8001:8001 redis/redis-stack:latest

This will also provide the RedisInsight GUI at http://localhost:8001

Next, we will install the dependencies for this notebook.

# first we need to install a few things

%pip install pandas wget tenacity tiktoken openai==0.28.1
import wget
import pandas as pd

embeddings_url = 'https://cdn.openai.com/API/examples/data/wikipedia_articles_2000.csv'

wget.download(embeddings_url)
df = pd.read_csv('wikipedia_articles_2000.csv')
df = df.drop(columns=['Unnamed: 0'])
df.head()
id url title text
0 3661 https://simple.wikipedia.org/wiki/Photon Photon Photons (from Greek φως, meaning light), in m...
1 7796 https://simple.wikipedia.org/wiki/Thomas%20Dolby Thomas Dolby Thomas Dolby (born Thomas Morgan Robertson; 14...
2 67912 https://simple.wikipedia.org/wiki/Embroidery Embroidery Embroidery is the art of decorating fabric or ...
3 44309 https://simple.wikipedia.org/wiki/Consecutive%... Consecutive integer Consecutive numbers are numbers that follow ea...
4 41741 https://simple.wikipedia.org/wiki/German%20Empire German Empire The German Empire ("Deutsches Reich" or "Deuts...

Data Preparation#

Text Chunking#

To create embeddings for the articles, we first need to chunk the text into smaller pieces, because the OpenAI API limits the length of text that can be sent in a single request. The code that follows pulls heavily from this notebook by OpenAI.

TEXT_EMBEDDING_CHUNK_SIZE = 1000
EMBEDDINGS_MODEL = "text-embedding-ada-002"


def chunks(text, n, tokenizer):
    """Yield successive chunks of roughly n tokens from text.

    Split a text into smaller chunks of about n tokens, preferably ending at the end of a sentence
    """
    tokens = tokenizer.encode(text)
    i = 0
    while i < len(tokens):
        # Find the nearest end of sentence within a range of 0.5 * n and 1.5 * n tokens
        j = min(i + int(1.5 * n), len(tokens))
        while j > i + int(0.5 * n):
            # Decode the tokens and check for full stop or newline
            chunk = tokenizer.decode(tokens[i:j])
            if chunk.endswith(".") or chunk.endswith("\n"):
                break
            j -= 1
        # If no end of sentence found, use n tokens as the chunk size
        if j == i + int(0.5 * n):
            j = min(i + n, len(tokens))
        yield tokens[i:j]
        i = j

def get_unique_id_for_file_chunk(title, chunk_index):
    return str(title+"-!"+str(chunk_index))

def chunk_text(record, tokenizer):
    """Return a list of chunk records (id, url, title, content, file_chunk_index) for an article."""
    chunked_records = []

    url = record['url']
    title = record['title']
    file_body_string = record['text']

    token_chunks = list(chunks(file_body_string, TEXT_EMBEDDING_CHUNK_SIZE, tokenizer))
    text_chunks = [f'Title: {title};\n'+ tokenizer.decode(chunk) for chunk in token_chunks]

    for i, text_chunk in enumerate(text_chunks):
        doc_id = get_unique_id_for_file_chunk(title, i)
        chunked_records.append(({"id": doc_id,
                                "url": url,
                                "title": title,
                                "content": text_chunk,
                                "file_chunk_index": i}))
    return chunked_records
# Initialise tokenizer
import tiktoken
oai_tokenizer = tiktoken.get_encoding("cl100k_base")

records = []
for _, record in df.iterrows():
    records.extend(chunk_text(record, oai_tokenizer))
chunked_data = pd.DataFrame(records)
chunked_data.head()
id url title content file_chunk_index
0 Photon-!0 https://simple.wikipedia.org/wiki/Photon Photon Title: Photon;\nPhotons (from Greek φως, mean... 0
1 Photon-!1 https://simple.wikipedia.org/wiki/Photon Photon Title: Photon;\nElementary particles 1
2 Thomas Dolby-!0 https://simple.wikipedia.org/wiki/Thomas%20Dolby Thomas Dolby Title: Thomas Dolby;\nThomas Dolby (born Thoma... 0
3 Embroidery-!0 https://simple.wikipedia.org/wiki/Embroidery Embroidery Title: Embroidery;\nEmbroidery is the art of d... 0
4 Consecutive integer-!0 https://simple.wikipedia.org/wiki/Consecutive%... Consecutive integer Title: Consecutive integer;\nConsecutive numbe... 0
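
To see how the sentence-boundary search behaves, here is a minimal sketch of the same chunking loop run on a tiny input. The WordTokenizer class and the sentence_chunks name are hypothetical stand-ins introduced for illustration (one token per whitespace-separated word instead of tiktoken), so the token counts will not match the real tokenizer.

```python
class WordTokenizer:
    """Hypothetical stand-in for tiktoken: one token per whitespace-separated word."""

    def encode(self, text):
        return text.split()

    def decode(self, tokens):
        return " ".join(tokens)


def sentence_chunks(text, n, tokenizer):
    """Yield decoded chunks of roughly n tokens, preferring sentence ends."""
    tokens = tokenizer.encode(text)
    i = 0
    while i < len(tokens):
        # Start at 1.5 * n tokens and walk back towards 0.5 * n
        j = min(i + int(1.5 * n), len(tokens))
        while j > i + int(0.5 * n):
            if tokenizer.decode(tokens[i:j]).endswith((".", "\n")):
                break
            j -= 1
        # No sentence end found in the window: fall back to n tokens
        if j == i + int(0.5 * n):
            j = min(i + n, len(tokens))
        yield tokenizer.decode(tokens[i:j])
        i = j


text = "Redis is fast. It stores data in memory. Vectors are searchable too."
demo_chunks = list(sentence_chunks(text, 4, WordTokenizer()))
for chunk in demo_chunks:
    print(chunk)
```

With n=4 the window spans 2 to 6 tokens, so each chunk closes at the nearest full stop inside that range rather than at exactly 4 tokens.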

Embedding Creation#

With the text broken up into chunks, we can create embeddings with the OpenAITextVectorizer, which calls the OpenAI API to embed each chunk. The code below embeds every chunk and stores the result as a float32 byte buffer (as_buffer=True), ready to be loaded into Redis.

import os
import getpass

from redisvl.utils.vectorize import OpenAITextVectorizer

api_key = os.getenv("OPENAI_API_KEY") or getpass.getpass("Enter your OpenAI API key: ")
oaip = OpenAITextVectorizer(EMBEDDINGS_MODEL, api_config={"api_key": api_key})

chunked_data["embedding"] = oaip.embed_many(chunked_data["content"].tolist(), as_buffer=True, dtype="float32")
chunked_data
id url title content file_chunk_index embedding
0 Photon-!0 https://simple.wikipedia.org/wiki/Photon Photon Title: Photon;\nPhotons (from Greek φως, mean... 0 b'\x9e\xbf\xc9;\xca\x8e\xfb;\x00\xf8P\xbc\xe5\...
1 Photon-!1 https://simple.wikipedia.org/wiki/Photon Photon Title: Photon;\nElementary particles 1 b'd\xda#\xbc\xb7\xf1\x8c<\xea\xd0m\xbc\x13\x8b...
2 Thomas Dolby-!0 https://simple.wikipedia.org/wiki/Thomas%20Dolby Thomas Dolby Title: Thomas Dolby;\nThomas Dolby (born Thoma... 0 b'NG\xce\xbck\xf0\xb2;\x81\xed\xd7\xbc\xb6\x94...
3 Embroidery-!0 https://simple.wikipedia.org/wiki/Embroidery Embroidery Title: Embroidery;\nEmbroidery is the art of d... 0 b'\xa4\xba\xf5\xbcS\xf3\x02\xbc\xa1\x15O\xbc\x...
4 Consecutive integer-!0 https://simple.wikipedia.org/wiki/Consecutive%... Consecutive integer Title: Consecutive integer;\nConsecutive numbe... 0 b'0(\xfa\xbb\x81\xd2\xd9;\xaf\x92\x9a;\xd3FL\x...
... ... ... ... ... ... ...
2688 Alanis Morissette-!1 https://simple.wikipedia.org/wiki/Alanis%20Mor... Alanis Morissette Title: Alanis Morissette;\nTwin people from Ca... 1 b'Ii4\xbc\x8e>\xe0\xbc\x18]\x07\xbb%\xa0\x92\x...
2689 Brontosaurus-!0 https://simple.wikipedia.org/wiki/Brontosaurus Brontosaurus Title: Brontosaurus;\nBrontosaurus is a genus... 0 b'\xad\xa5\xdb\xbc\xa5\xa5\xba:\xb4"\x81\xbc\x...
2690 Work (physics)-!0 https://simple.wikipedia.org/wiki/Work%20%28ph... Work (physics) Title: Work (physics);\nIn physics, a force do... 0 b'\x97\x82\xb9\xbbL\x90d\xbc\xb7G\x9c\xba\x94g...
2691 Syllable-!0 https://simple.wikipedia.org/wiki/Syllable Syllable Title: Syllable;\nA syllable is a unit of pron... 0 b'\xe4\xa3\x1c:\x83g\x90<\x99=s;*[E\xbb\x10 "\...
2692 Syllable-!1 https://simple.wikipedia.org/wiki/Syllable Syllable Title: Syllable;\nGrammar 1 b'T,-\xbbS\xe5\x87;\x1c\x0f\x9d:\xc4\xd4\xcd:\...

2693 rows × 6 columns
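
The embedding column holds raw bytes rather than lists of floats. As a rough sketch of what that buffer contains, assuming it is simply the vector packed as consecutive 4-byte float32 values, the standard library can reproduce the round trip. The helper names to_float32_buffer and from_float32_buffer are hypothetical and not part of RedisVL.

```python
from array import array


def to_float32_buffer(vector):
    """Pack a list of floats into consecutive 4-byte float32 values."""
    return array("f", vector).tobytes()


def from_float32_buffer(buffer):
    """Unpack a float32 byte buffer back into a list of floats."""
    values = array("f")
    values.frombytes(buffer)
    return values.tolist()


toy_vector = [0.25, -0.5, 1.0]  # toy values, exactly representable in float32
buffer = to_float32_buffer(toy_vector)
restored = from_float32_buffer(buffer)

# A 1536-dimensional float32 embedding therefore occupies 1536 * 4 bytes
ada_002_bytes = 1536 * 4
print(len(buffer), restored, ada_002_bytes)
```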

Construct the SearchIndex#

Now that we have the embeddings, we can create a SearchIndex to store them in Redis along with the metadata for each article chunk.

Define the wikipedia IndexSchema#

%%writefile wiki_schema.yaml

version: '0.1.0'

index:
    name: wikipedia
    prefix: chunk

fields:
    - name: content
      type: text
    - name: title
      type: text
    - name: id
      type: tag
    - name: embedding
      type: vector
      attrs:
          dims: 1536
          distance_metric: cosine
          algorithm: flat
Overwriting wiki_schema.yaml
import redis.asyncio as redis

from redisvl.index import AsyncSearchIndex
from redisvl.schema import IndexSchema


client = redis.Redis.from_url("redis://localhost:6379")
schema = IndexSchema.from_yaml("wiki_schema.yaml")

index = await AsyncSearchIndex(schema).set_client(client)

await index.create()
!rvl index listall
16:00:26 [RedisVL] INFO   Indices:
16:00:26 [RedisVL] INFO   1. wikipedia
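
For reference, the YAML schema can also be written as a plain Python dictionary. The sketch below mirrors the YAML field for field and adds a back-of-the-envelope estimate of the raw vector storage for the 2693 chunks created earlier. Passing the dictionary to IndexSchema.from_dict is left as a comment and should be treated as an assumption to check against the RedisVL version you have installed.

```python
wiki_schema = {
    "version": "0.1.0",
    "index": {"name": "wikipedia", "prefix": "chunk"},
    "fields": [
        {"name": "content", "type": "text"},
        {"name": "title", "type": "text"},
        {"name": "id", "type": "tag"},
        {
            "name": "embedding",
            "type": "vector",
            "attrs": {"dims": 1536, "distance_metric": "cosine", "algorithm": "flat"},
        },
    ],
}

# Assumption: with RedisVL installed, this dict could stand in for the YAML file:
# schema = IndexSchema.from_dict(wiki_schema)

# Raw float32 vector payload only (excludes text fields and index overhead)
num_chunks = 2693
dims = wiki_schema["fields"][-1]["attrs"]["dims"]
vector_bytes = num_chunks * dims * 4
print(f"{vector_bytes} bytes ~ {vector_bytes / 2**20:.1f} MiB")
```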

Load the wikipedia dataset#

keys = await index.load(chunked_data.to_dict(orient="records"))

Build a simple QnA System#

Now that we have the data and the embeddings, we can build the QnA system. The system will perform three actions:

  1. Embed the user question and search for the most similar content

  2. Make a prompt with the query and retrieved content

  3. Send the prompt to the OpenAI API and return the answer
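
The first step relies on the index settings chosen earlier: a flat algorithm with the cosine distance metric, which amounts to comparing the query vector against every stored vector. Here is a minimal pure-Python sketch of that idea, using made-up three-dimensional vectors and hypothetical helper names (cosine_distance, top_k). It illustrates the search, not how Redis implements it.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity; 0 means the vectors point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def top_k(query_vec, docs, k=1):
    """Brute-force search: score every document, keep the k closest."""
    ranked = sorted(docs, key=lambda d: cosine_distance(query_vec, d["embedding"]))
    return ranked[:k]


# Toy documents with made-up embeddings (real ones have 1536 dimensions)
toy_docs = [
    {"content": "doc A", "embedding": [0.9, 0.1, 0.0]},
    {"content": "doc B", "embedding": [0.0, 1.0, 0.0]},
    {"content": "doc C", "embedding": [0.0, 0.2, 0.9]},
]

best = top_k([1.0, 0.0, 0.0], toy_docs, k=1)
print(best[0]["content"])
```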

import openai

from redisvl.query import VectorQuery
CHAT_MODEL = "gpt-3.5-turbo"

def make_prompt(query, content):
    retrieval_prompt = f'''Use the content to answer the search query the customer has sent.
    If you can't answer the user's question, do not guess. If there is no content, respond with "I don't know".

    Search query:

    {query}

    Content:

    {content}

    Answer:
    '''
    return retrieval_prompt

async def retrieve_context(index: AsyncSearchIndex, query: str):
    # Embed the query
    query_embedding = await oaip.aembed(query)

    # Get the top result from the index
    vector_query = VectorQuery(
        vector=query_embedding,
        vector_field_name="embedding",
        return_fields=["content"],
        num_results=1
    )

    results = await index.query(vector_query)
    content = ""
    # num_results=1 returns at most one hit, so check for a non-empty result list
    if len(results) > 0:
        content = results[0]["content"]
    return content

async def answer_question(index: AsyncSearchIndex, query: str):
    # Retrieve the context
    content = await retrieve_context(index, query)

    prompt = make_prompt(query, content)
    retrieval = await openai.ChatCompletion.acreate(
        model=CHAT_MODEL,
        messages=[{'role':"user", 'content': prompt}],
        max_tokens=50
    )

    # Response provided by GPT-3.5
    return retrieval['choices'][0]['message']['content']
import textwrap

question = "What is a Brontosaurus?"
textwrap.wrap(await answer_question(index, question), width=80)
['A Brontosaurus, also known as Apatosaurus, is a type of large, long-necked',
 'dinosaur that lived during the Late Jurassic Period, about 150 million years',
 'ago. They were herbivores and belonged to the saurop']
# Question that makes no sense
question = "What is a trackiosamidon?"
await answer_question(index, question)
"I don't know."
question = "Tell me about the life of Alanis Morissette"
textwrap.wrap(await answer_question(index, question))
['Alanis Morissette is a Canadian-American singer-songwriter and',
 'actress. She gained international fame with her third studio album,',
 '"Jagged Little Pill," released in 1995. The album went on to become a',
 'massive success, selling over']

Improve the QnA System with LLM caching#

The QnA system we built above works, but every question costs a round trip to the OpenAI API. We can use the SemanticCache to improve throughput and stability: it stores the results of previous queries and returns them when a new query is similar enough to a previous one. This reduces the number of requests we need to send to the OpenAI API.

Note that this technique only helps when we expect users to ask similar questions repeatedly.
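
To make the idea concrete, here is a minimal in-memory sketch of a semantic cache. ToySemanticCache, its method bodies, and the hand-written vectors are hypothetical and only illustrate the check/store pattern with a cosine distance threshold. RedisVL's SemanticCache stores entries in Redis and embeds prompts with a real vectorizer.

```python
import math


class ToySemanticCache:
    """Hypothetical in-memory cache keyed by embedding similarity."""

    def __init__(self, embed, distance_threshold=0.2):
        self.embed = embed  # function mapping a prompt to a vector
        self.distance_threshold = distance_threshold
        self.entries = []  # list of (vector, response) pairs

    @staticmethod
    def _cosine_distance(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        norm_a = math.sqrt(sum(x * x for x in a))
        norm_b = math.sqrt(sum(y * y for y in b))
        return 1.0 - dot / (norm_a * norm_b)

    def store(self, prompt, response):
        self.entries.append((self.embed(prompt), response))

    def check(self, prompt):
        """Return the closest cached response within the threshold, else None."""
        vector = self.embed(prompt)
        best = None
        for cached_vector, response in self.entries:
            distance = self._cosine_distance(vector, cached_vector)
            if distance <= self.distance_threshold and (best is None or distance < best[0]):
                best = (distance, response)
        return best[1] if best else None


# Hand-written vectors stand in for real embeddings
fake_vectors = {"q1": [1.0, 0.0], "q2": [0.95, 0.1], "q3": [0.0, 1.0]}
toy_cache = ToySemanticCache(embed=fake_vectors.__getitem__, distance_threshold=0.2)

toy_cache.store("q1", "cached answer")
hit = toy_cache.check("q2")   # close to q1, so the cached answer is reused
miss = toy_cache.check("q3")  # far from q1, so the caller must ask the LLM
print(hit, miss)
```

A smaller distance_threshold makes the cache stricter: fewer hits, but less risk of returning an answer to a question that only looks similar.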

from redisvl.extensions.llmcache import SemanticCache

cache = SemanticCache(name="qna_cache", redis_url="redis://localhost:6379", distance_threshold=0.2)
async def answer_question(index: AsyncSearchIndex, query: str):

    # check the cache
    if result := cache.check(prompt=query):
        return result[0]['response']

    # Retrieve the context
    content = await retrieve_context(index, query)

    prompt = make_prompt(query, content)
    retrieval = await openai.ChatCompletion.acreate(
        model=CHAT_MODEL,
        messages=[{'role':"user", 'content': prompt}],
        max_tokens=500
    )

    # Response provided by GPT-3.5
    answer = retrieval['choices'][0]['message']['content']

    # cache the query and answer (the cache embeds the query internally)
    cache.store(query, answer)
    return answer
# ask a question to cache an answer
import time
start = time.time()
question = "Tell me about the life of Alanis Morissette"
answer = await answer_question(index, question)
print(f"Time taken: {time.time() - start}\n")
textwrap.wrap(answer, width=80)
Time taken: 6.253775119781494
['Alanis Morissette is a Canadian singer, songwriter, and actress. She was born on',
 'June 1, 1974, in Ottawa, Ontario, Canada. Morissette began her career in the',
 'music industry as a child, releasing her first album "Alanis" in 1991. However,',
 'it was her third studio album, "Jagged Little Pill," released in 1995, that',
 'brought her international fame and critical acclaim. The album sold over 33',
 'million copies worldwide and produced hit singles such as "You Oughta Know,"',
 '"Ironic," and "Hand in My Pocket."  Throughout her career, Morissette has',
 'continued to release successful albums and has received numerous awards,',
 'including Grammy Awards, Juno Awards, and Billboard Music Awards. Her music',
 'often explores themes of love, relationships, self-discovery, and spirituality.',
 'Some of her other notable albums include "Supposed Former Infatuation Junkie,"',
 '"Under Rug Swept," and "Flavors of Entanglement."  In addition to her music',
 'career, Alanis Morissette has also ventured into acting. She has appeared in',
 'films such as "Dogma" and "Radio Free Albemuth," as well as on television shows',
 'like "Weeds" and "Sex and the City."  Offstage, Morissette has been open about',
 'her struggles with mental health and has become an advocate for mental wellness.',
 'She has also expressed her views on feminism and spirituality in her music and',
 'interviews.  Overall, Alanis Morissette has had a successful and influential',
 'career in the music industry, with her powerful and emotional songs resonating',
 'with audiences around the world.']
# Same question, return cached answer, save time, save money :)
start = time.time()
answer = await answer_question(index, question)
print(f"Time taken with cache: {time.time() - start}\n")
textwrap.wrap(answer, width=80)
Time taken with cache: 0.3175082206726074
['Alanis Morissette is a Canadian-American singer, songwriter, and actress. She',
 'rose to fame in the 1990s with her breakthrough album "Jagged Little Pill,"',
 'which became one of the best-selling albums of all time. Born on June 1, 1974,',
 'in Ottawa, Ontario, Morissette began her career as a teen pop star in Canada',
 'before transitioning to alternative rock.  Throughout her career, Morissette has',
 'released several successful albums and has won numerous awards, including',
 'multiple Grammy Awards. Her music often explores themes of female empowerment,',
 'personal introspection, and social commentary. Some of her notable songs include',
 '"Ironic," "You Oughta Know," and "Hand in My Pocket."   In addition to her music',
 'career, Morissette has also acted in various films and television shows. She is',
 'known for her roles in movies such as "Dogma" and "Jay and Silent Bob Strike',
 'Back."  Morissette has been transparent about her personal struggles, including',
 'her experiences with eating disorders, depression, and postpartum depression.',
 'She has used her platform to advocate for mental health awareness and has been',
 'involved in various charitable causes.  Overall, Alanis Morissette has had a',
 'successful and influential career in the music industry while also making an',
 'impact beyond music.']
# Asking a semantically similar, but not identical, question returns the same
# answer from the cache. In this case, the semantic distance between the
# questions is within the distance_threshold of 0.2 the cache is set to.
start = time.time()
question = "Who is Alanis Morissette?"
answer = await answer_question(index, question)
print(f"Time taken with the cache: {time.time() - start}\n")
textwrap.wrap(answer, width=80)
Time taken with the cache: 0.26262593269348145
['Alanis Morissette is a Canadian-American singer, songwriter, and actress. She',
 'rose to fame in the 1990s with her breakthrough album "Jagged Little Pill,"',
 'which became one of the best-selling albums of all time. Born on June 1, 1974,',
 'in Ottawa, Ontario, Morissette began her career as a teen pop star in Canada',
 'before transitioning to alternative rock.  Throughout her career, Morissette has',
 'released several successful albums and has won numerous awards, including',
 'multiple Grammy Awards. Her music often explores themes of female empowerment,',
 'personal introspection, and social commentary. Some of her notable songs include',
 '"Ironic," "You Oughta Know," and "Hand in My Pocket."   In addition to her music',
 'career, Morissette has also acted in various films and television shows. She is',
 'known for her roles in movies such as "Dogma" and "Jay and Silent Bob Strike',
 'Back."  Morissette has been transparent about her personal struggles, including',
 'her experiences with eating disorders, depression, and postpartum depression.',
 'She has used her platform to advocate for mental health awareness and has been',
 'involved in various charitable causes.  Overall, Alanis Morissette has had a',
 'successful and influential career in the music industry while also making an',
 'impact beyond music.']
# Cleanup
await index.delete()