Superhero Help Desk App powered by Cohere and Chroma Database

Cohere Tutorial: Building a Simple Help Desk App for Superheroes

Introduction

Introducing the Cohere Platform

Cohere is a robust platform that provides access to state-of-the-art natural language processing models via a user-friendly API. This platform enables developers to seamlessly integrate a variety of natural language processing tasks into their applications, such as text classification, embeddings, and even text generation.

Beyond its standard offerings, Cohere also provides the ability to create custom models tailored to specific use cases. You can leverage your own training data and strategically dictate how this data should be utilized during the training process.

One of the stand-out features of Cohere is its playground - a space where you can explore and experiment with the various facets of the platform. Whether you're aiming to generate human-like text, classify text into predefined categories, or measure the semantic similarity between different pieces of text, the playground provides a conducive environment for experimentation and learning.

Cohere's capabilities make it an ideal tool for a wide array of applications. If you're building a chatbot, a content recommendation system, a text classification tool, or any application that requires understanding or generating text, Cohere can prove to be an invaluable asset.

Introduction to Chroma and Embeddings

Chroma is an open-source database specifically designed for the efficient storage and retrieval of embeddings, a crucial component in the development of AI-powered applications and services, particularly those utilizing Large Language Models (LLMs). Chroma's design is centered around simplicity and developer productivity, providing tools for storing and querying embeddings, as well as for embedding documents.

Developers can interact with Chroma through its Python client SDK, Javascript/Typescript client SDK, or a server application. The database can operate in-memory or in client/server mode, with additional support for notebook environments.

But what are embeddings? In the realm of AI, and more specifically within machine learning and natural language processing, an 'embedding' is a representation of data in a vector space. Word embeddings, for example, represent words as high-dimensional vectors, with similar words occupying close proximity in this vector space. Embeddings are highly favored in machine learning models because they allow these models to understand the semantic content of data. In natural language processing, embeddings empower models to comprehend the meaning of words based on their context within a sentence or a document.

These embeddings are usually generated by training a model on a vast amount of data. The model learns to associate each piece of data (like a word) with a specific point in a high-dimensional space. Once the model is trained, it can generate an embedding for any given piece of data. Chroma takes advantage of embeddings to represent documents or queries in a manner that encapsulates their semantic content. These embeddings can then be efficiently stored in the database and searched, providing a powerful tool for managing and leveraging high-dimensional data.
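To make the idea of "close proximity" concrete, here is a minimal sketch that compares vectors with cosine similarity, a common way to measure how close two embeddings are. The three-dimensional vectors below are hand-picked, hypothetical values for illustration only; real embeddings come from a trained model and have hundreds or thousands of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical toy embeddings, not produced by any real model.
embeddings = {
    "hero": [0.9, 0.8, 0.1],
    "champion": [0.85, 0.75, 0.2],
    "invoice": [0.1, 0.2, 0.95],
}

close = cosine_similarity(embeddings["hero"], embeddings["champion"])
far = cosine_similarity(embeddings["hero"], embeddings["invoice"])
print(f"hero vs champion: {close:.3f}")
print(f"hero vs invoice:  {far:.3f}")
```

The related pair scores higher than the unrelated pair. A similarity search in an embedding database rests on the same kind of comparison, applied to many stored vectors at once.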

Prerequisites

  • Basic knowledge of Python
  • Access to the Cohere API
  • A Chroma database set up

Outline

  1. Initializing the Project
  2. Setting Up the Required Libraries
  3. Writing the Project Files
  4. Setting Up Chroma Database
  5. Testing the Help Desk App
  6. Conclusion

Initializing the Project

Having covered the introductions, it's time to delve into the practical part - let's start coding! Our project will be named chroma-cohere. Open your preferred terminal, navigate to your development projects directory, and create a new directory for our project.

Next, we're going to create a new virtual environment specifically for this project. Creating and using virtual environments in Python development is considered a best practice. It isolates the dependencies of our current project from the global environment and from other Python projects, preventing any potential conflicts.

To create a virtual environment, use the following command:

python -m venv env

Once the virtual environment is created, we need to activate it. The process differs depending on your operating system.

If you're using Windows, enter the following command in your terminal:

.\env\Scripts\activate

If you're on Linux or macOS, use this command:

source env/bin/activate

After running the appropriate command, you should see the name of your environment (in this case, env) appear in parentheses at the start of your terminal prompt, indicating it's activated and ready for use!

Setting Up the Required Libraries

In this step, we will install all the libraries required by our project. Firstly, ensure that your virtual environment is activated. Then, here's a quick rundown of the libraries we'll be installing:

  • cohere library: We'll use the Cohere SDK to classify user input based on training examples.
  • chromadb library: We'll use ChromaDB to store a larger set of training examples and retrieve them based on semantic similarity with user input.
  • halo library: Requests to Cohere's API will take a moment, and this library provides an engaging loading indicator while users wait.
  • python-dotenv library: We'll use it to load the API key from a .env file.
  • colorama library: We'll use it to add color to the terminal output.

Let's proceed with installing these libraries:

pip install cohere chromadb halo python-dotenv colorama

After installing these libraries, we can start working on our project files.

Writing the Project Files

Time to get back to the code! Open your preferred IDE or code editor and create a new file named main.py. In this file, we'll implement several components to achieve our objectives.

Step 1. Import Necessary Libraries

We start by importing the necessary libraries such as cohere, halo, os, dotenv, colorama, and pprint. Then, we load the environment variables stored in a .env file, which will contain sensitive information like API keys.
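A minimal sketch of the environment loading described above, assuming the key is stored under the name COHERE_KEY in a .env file in the current working directory. The helper name load_api_key is hypothetical, and the python-dotenv import is guarded only so the snippet still runs where that package is not installed.

```python
import os


def load_api_key(env_var="COHERE_KEY"):
    """Return the API key from the environment (hypothetical helper)."""
    try:
        # python-dotenv copies entries from the .env file into os.environ
        # without overriding variables that are already set.
        from dotenv import load_dotenv

        load_dotenv(dotenv_path=".env")
    except ImportError:
        # Without python-dotenv, fall back to variables set in the shell.
        pass

    key = os.getenv(env_var)
    if not key:
        raise RuntimeError(f"{env_var} is not set; add it to your .env file")
    return key
```

Failing early with a clear message tends to be easier to debug than an authentication error raised later by the API client.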

Step 2. Define Response Generation Function

This function receives a user message as input, starts a loading animation, and initializes the Cohere API client. It then classifies the user's mood and the department responsible for the request, and stops the loading animation once both results are available.
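One possible shape for this function is sketched below. It is an assumption, not the tutorial's exact code: the two classifiers and the spinner are passed in as arguments so the sketch runs without network access, whereas the function described above creates the Cohere client itself. The labels returned by the stand-in classifiers are hypothetical.

```python
def generate_response(message, classify_mood, classify_department, spinner=None):
    """Classify one message and return both labels in a dictionary."""
    if spinner is not None:
        spinner.start()  # show the loading animation while we wait
    try:
        mood = classify_mood(message)
        department = classify_department(message)
    finally:
        # Stop the animation even if one of the classifiers raises.
        if spinner is not None:
            spinner.stop()
    return {"mood": mood, "department": department}


# Stand-in classifiers so the sketch runs offline; the real ones call Cohere.
result = generate_response(
    "My cape got stuck in a revolving door.",
    classify_mood=lambda text: "Negative",
    classify_department=lambda text: "Equipment",
)
print(result)
```

In the app, the spinner argument could be a Halo instance such as Halo(text="Loading...", spinner="dots"), which provides the start() and stop() methods used here.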

Step 3. Define the Classification Functions (Mood and Department in Charge)

The get_department_classification and get_mood_classification functions classify user messages into categories. Each one sends the message, together with a set of labeled example objects, to the Cohere model, which returns the predicted category.
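Both functions can share one helper, sketched below under a few assumptions. The helper name classify_message is hypothetical. The client is expected to be a Cohere client object, and examples must already be in the example type of the installed Cohere SDK, which differs between SDK versions, so check the SDK documentation for the class to use.

```python
def classify_message(client, message, examples):
    """Return the label the model predicts for a single message."""
    # The classify endpoint accepts a batch of inputs; we send just one.
    response = client.classify(inputs=[message], examples=examples)
    return response.classifications[0].prediction


def get_mood_classification(client, message, mood_examples):
    """Classify the mood of a message from labeled mood examples."""
    return classify_message(client, message, mood_examples)


def get_department_classification(client, message, department_examples):
    """Classify which department should handle a message."""
    return classify_message(client, message, department_examples)
```

The two public functions differ only in the examples they pass, which keeps the API call in a single place.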

Step 4. Define the Project's Entrypoint

The main function begins an infinite loop that asks for user input and generates a response using the generation function, only breaking on user input 'quit'.
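The loop could look roughly like the sketch below. The prompt text and the read and write parameters are assumptions added so the loop can be exercised without a keyboard; by default they are the built-in input and print. The sketch also ignores case and surrounding whitespace when checking for 'quit', which goes slightly beyond the behavior described above.

```python
def main(respond, read=input, write=print):
    """Ask for messages until the user types 'quit'."""
    while True:
        message = read("How can we help, hero? (type 'quit' to exit) ")
        # Accept 'quit' regardless of case or stray spaces (an assumption).
        if message.strip().lower() == "quit":
            break
        write(respond(message))
```

At the bottom of main.py, this would be called with the response generation function from Step 2, typically under an if __name__ == "__main__": guard.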

Setting Up Chroma Database

Hard-coded examples inside the classification functions become difficult to maintain as the training data grows. To solve this problem, we will use ChromaDB. Let's begin by importing chromadb and the libraries needed to manage embeddings efficiently.

We initialize the database along with its embedding function, then replace the hard-coded examples in the classification functions with results retrieved dynamically from Chroma.
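A rough sketch of this setup follows. The function names, the collection name, and the "label" metadata key are hypothetical, and the embedding function is left on its default Cohere model, which you may want to set explicitly. The chromadb imports sit inside build_collection so the other helpers can be tried without the package installed.

```python
def build_collection(api_key, name="help_desk_examples"):
    """Create an in-memory Chroma collection that embeds text with Cohere."""
    import chromadb
    from chromadb.utils import embedding_functions

    embed = embedding_functions.CohereEmbeddingFunction(api_key=api_key)
    client = chromadb.Client()
    return client.get_or_create_collection(name=name, embedding_function=embed)


def add_examples(collection, pairs):
    """Store (text, label) pairs; each label is kept in the metadata."""
    collection.add(
        documents=[text for text, _ in pairs],
        metadatas=[{"label": label} for _, label in pairs],
        ids=[f"example-{i}" for i in range(len(pairs))],
    )


def fetch_examples(collection, message, n_results=5):
    """Return the stored (text, label) pairs most similar to the message."""
    result = collection.query(query_texts=[message], n_results=n_results)
    # Chroma returns one inner list per query text; we sent a single query.
    documents = result["documents"][0]
    metadatas = result["metadatas"][0]
    return [(doc, meta["label"]) for doc, meta in zip(documents, metadatas)]
```

The pairs returned by fetch_examples can then be converted into the Cohere SDK's example objects and passed to the classification functions in place of the hard-coded lists.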

Testing the Help Desk App

Put the Help Desk App to the test! You will classify the mood associated with inquiries and allocate them to appropriate departments.

Assuming the COHERE_KEY environment variable is properly configured, run main.py, enter a few queries to see how their mood and department are classified, and check how the Chroma-powered examples affect the results.

Conclusion

We delved into the powerful capabilities of the Cohere platform paired with the Chroma database. This tutorial showed how to develop an intelligent app capable of evolving based on user interactions.
