---
title: Project Charles
emoji: 👀
colorFrom: gray
colorTo: green
sdk: streamlit
python_version: 3.11
sdk_version: 1.26.0
app_file: ui_app.py
pinned: true
license: mit
models: ["laion/CLIP-ViT-L-14-DataComp.XL-s13B-b90K"]
---

# Project Charles

A toy app for a voice-based agent.

Video demo: [Early Test](https://www.linkedin.com/posts/sohojoe_ray-vosk-chatgpt-activity-7100365711226671104-c2Nv?utm_source=share&utm_medium)

## Required Environment Variables/Keys

* OPENAI_API_KEY - required for ChatGPT
* ELEVENLABS_API_KEY - required for ElevenLabs TTS

## Optional Environment Variables/Keys

* TWILIO_ACCOUNT_SID - reduces time for WebRTC connection
* TWILIO_AUTH_TOKEN - reduces time for WebRTC connection
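
The variables listed above can be checked before launching the app. The following is a minimal sketch, not part of the project itself: the helper name `check_environment` and the two tuples are hypothetical and introduced only for illustration.

```python
import os

# Names taken from the lists above; the tuple and function names are
# hypothetical helpers for this sketch, not part of the project.
REQUIRED_KEYS = ("OPENAI_API_KEY", "ELEVENLABS_API_KEY")
OPTIONAL_KEYS = ("TWILIO_ACCOUNT_SID", "TWILIO_AUTH_TOKEN")


def check_environment(env=None):
    """Return (missing_required, missing_optional) lists of variable names."""
    env = os.environ if env is None else env
    # A variable counts as missing when it is unset or empty.
    missing_required = [key for key in REQUIRED_KEYS if not env.get(key)]
    missing_optional = [key for key in OPTIONAL_KEYS if not env.get(key)]
    return missing_required, missing_optional


if __name__ == "__main__":
    required, optional = check_environment()
    if required:
        print("Missing required variables:", ", ".join(required))
    if optional:
        print("Missing optional variables:", ", ".join(optional))
```

Running the script before `streamlit run ui_app.py` shows which keys still need to be set.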

## How to install

Install the Python dependencies:

`pip install -r requirements.txt`

Then install the system packages listed in packages.txt, using the command for your platform:

* macOS (Homebrew): `xargs brew install < packages.txt`
* Linux (Ubuntu, apt): `sudo xargs -a packages.txt apt-get install -y`
* Linux (Fedora, dnf): `sudo xargs -a packages.txt dnf install -y`
* Windows (Chocolatey, PowerShell): `Get-Content packages.txt | ForEach-Object { choco install $_ -y }`

## How to run

`streamlit run ui_app.py`

## Known Issues

* The first run may be slow because the model has to be downloaded. You may want to refresh the page after the first run.
* Audio errors may occur because of the way the app converts the ElevenLabs stream to WebRTC audio.
* Audio errors may also occur if the server is running slowly.
* The app may hang, in which case the server needs a hard reset.

## Architecture

![Image of the architecture](./images/ProjectCharlesCommunicationArchitecture.jpg)

Key Technologies:

* Ray Actors & Queues - backbone of interprocess communication
* Streamlit - UI & WebRTC connection
* Vosk - speech to text
* ChatGPT - text to text
* ElevenLabs TTS - text to speech
* Twilio - optional faster WebRTC connection
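
The flow between these technologies can be pictured as a chain of stages connected by queues. The sketch below is an illustration only: it uses standard-library threads and `queue.Queue` as a stand-in for Ray actors and queues, the stage order (speech to text, then text to text, then text to speech) is assumed from the list above, and the three transform functions are hypothetical placeholders rather than calls to Vosk, ChatGPT, or ElevenLabs.

```python
import queue
import threading

STOP = object()  # sentinel that tells a stage to shut down


def stage(transform, inbox, outbox):
    """Read items from inbox, apply transform, and write results to outbox."""
    while True:
        item = inbox.get()
        if item is STOP:
            outbox.put(STOP)  # pass the shutdown signal downstream
            break
        outbox.put(transform(item))


# Hypothetical placeholder transforms; the real app calls external services.
def speech_to_text(audio):
    return f"text({audio})"


def chat(text):
    return f"reply({text})"


def text_to_speech(text):
    return f"audio({text})"


def run_pipeline(audio_chunks):
    """Push chunks through the three stages and collect the final outputs."""
    transforms = [speech_to_text, chat, text_to_speech]
    queues = [queue.Queue() for _ in range(len(transforms) + 1)]
    threads = [
        threading.Thread(target=stage, args=(fn, queues[i], queues[i + 1]))
        for i, fn in enumerate(transforms)
    ]
    for thread in threads:
        thread.start()
    for chunk in audio_chunks:
        queues[0].put(chunk)
    queues[0].put(STOP)

    results = []
    while True:
        item = queues[-1].get()
        if item is STOP:
            break
        results.append(item)
    for thread in threads:
        thread.join()
    return results


if __name__ == "__main__":
    print(run_pipeline(["hello", "world"]))
```

Each stage only knows about its input and output queue, so a slow stage delays the stages after it without blocking the ones before it. This is the same decoupling that queues between actors would provide.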