
This is a submission for the AssemblyAI Voice Agents Challenge (Domain Expert Voice Agent).

🧠 What I Built

As a Philosophy graduate, I’ve always enjoyed discussing ideas that help make life more meaningful. So, for this challenge, I built a Philosophy Voice AI Agent using Flask, AssemblyAI, and the Gemini API.

This voice-based web app allows users to ask philosophical questions and receive thoughtful spoken responses, making it feel like you're having a conversation with Socrates himself.

Tech Stack Used:

Flask: Core backend framework
Gemini API: To generate thoughtful philosophical replies
AssemblyAI: For transcribing voice to text asynchronously
JavaScript: To handle voice recording and speech output
AWS EC2 & Nginx: For secure deployment and hosting

🔁 Application Workflow

User clicks Start Recording and speaks a question
The recorded audio is sent to AssemblyAI for transcription
The text is passed to Gemini API, which generates a philosophical reply
The response is rendered on the screen and also spoken aloud using JavaScript’s Speech Synthesis API
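
The four steps above can be sketched as a single function. This is a minimal illustration rather than the app's actual code: `answer_question`, `transcribe`, and `generate_reply` are hypothetical names, and the two callables stand in for the AssemblyAI and Gemini calls so the flow can be read and run without API keys.

```python
from typing import Callable, Dict


def answer_question(
    audio_path: str,
    transcribe: Callable[[str], str],
    generate_reply: Callable[[str], str],
) -> Dict[str, str]:
    """Run one question through the pipeline: audio -> text -> reply.

    `transcribe` and `generate_reply` are placeholders (assumptions) for the
    AssemblyAI and Gemini service calls.
    """
    question = transcribe(audio_path).strip()
    if not question:
        # Nothing was recognized, so skip the LLM call entirely.
        return {"question": "", "reply": "I could not hear a question. Please try again."}
    reply = generate_reply(question)
    # The frontend would render `reply` and hand it to the Speech Synthesis API.
    return {"question": question, "reply": reply}


if __name__ == "__main__":
    # Stub services so the sketch runs offline.
    result = answer_question(
        "question.wav",
        transcribe=lambda path: "What is virtue?",
        generate_reply=lambda q: f"Let us examine the question: {q}",
    )
    print(result["reply"])
```

Passing the services in as arguments keeps the flow easy to test; in a Flask route the same function could be called with the real transcription and Gemini helpers.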

💻 Demo

The application is live at:
👉 https://philosophy.praveshsudha.com
It’s hosted on an AWS EC2 instance with Nginx as a reverse proxy.

Watch the full video walkthrough here 👇

Note: the video doesn't cover AssemblyAI's Universal-Streaming integration, because it was shot earlier 😅

📁 GitHub Repository

Pravesh-Sudha / dev-to-challenges

Registry to Store all my code related to Dev.TO Challenges

🏗️ Dev.to Challenges – by Pravesh Sudha

This repository contains my submissions for various Dev.to Challenges. Each folder in this repo includes a hands-on project built around specific tools, APIs, or themes — from infrastructure to frontend and AI voice agents.

📁 Projects

⚙️ `pulumi-challenge/`

An infrastructure-as-code project built using Pulumi.
It automates cloud infrastructure setup using Python and TypeScript across AWS services.

🎨 `frontend-challenge/`

A UI/UX-focused project that demonstrates creative frontend solutions using HTML, CSS, and JavaScript — optimized for responsiveness and accessibility.

📩 `postmark-challenge/`

A transactional email solution built with the Postmark API, showcasing email templates, delivery tracking, and webhook handling.

🧠 `philo-agent/`

A voice-based AI Philosopher built with AssemblyAI + Gemini — part of the World’s Largest Hackathon.

🗂️ Project Structure

dev-to-challenges/
│
├── pulumi-challenge/
├── frontend-challenge/
├── postmark-challenge/
├── philo-agent/
└── README.md


Navigate to the philo-agent directory for all project files.

🔍 Folder & File Structure

app.py: Flask app entry point
services/transcription.py: Uses AssemblyAI's Universal-Streaming with a domain-specific vocabulary for accurate recognition of philosophical terms
services/gemini.py: Fetches philosophical responses
static/: Contains frontend assets (JS, favicon, background image)
templates/index.html: HTML template with embedded CSS
venv/: Virtual environment
requirements.txt: All Python dependencies
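
As a rough idea of what a module like services/gemini.py has to do, here is a sketch that builds a request for the Gemini REST API's generateContent endpoint and pulls the text out of the response. It is not the author's code: the function names, the prompt wording, and the model name `gemini-1.5-flash` are assumptions chosen for illustration.

```python
import json
import urllib.request

# Assumed values for illustration; swap in the model your key has access to.
GEMINI_MODEL = "gemini-1.5-flash"
BASE_URL = "https://generativelanguage.googleapis.com/v1beta/models"


def build_request(question: str, api_key: str, model: str = GEMINI_MODEL) -> urllib.request.Request:
    """Build a POST request for the generateContent endpoint."""
    prompt = (
        "You are a thoughtful philosopher. Answer briefly and clearly.\n\n"
        f"Question: {question}"
    )
    payload = {"contents": [{"parts": [{"text": prompt}]}]}
    return urllib.request.Request(
        url=f"{BASE_URL}/{model}:generateContent",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "x-goog-api-key": api_key},
        method="POST",
    )


def extract_text(response_body: dict) -> str:
    """Return the first candidate's text, or an empty string if none came back."""
    candidates = response_body.get("candidates") or []
    if not candidates:
        return ""
    parts = candidates[0].get("content", {}).get("parts", [])
    return "".join(part.get("text", "") for part in parts).strip()


def ask_gemini(question: str, api_key: str) -> str:
    """Send the question and return the reply text (needs network and a valid key)."""
    with urllib.request.urlopen(build_request(question, api_key), timeout=30) as resp:
        return extract_text(json.load(resp))
```

Keeping request building and response parsing separate from the network call means both can be checked offline with a canned response.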

🚀 Deployment with EC2 & Nginx

To make deployment easier, I wrote a simple bash script that:

Installs required packages
Sets up a Python virtual environment
Configures Gunicorn and Systemd
Creates an Nginx config
Secures the site using Let’s Encrypt SSL

Here's the full script:

#!/bin/bash

# Update system

sudo apt update -y
sudo apt upgrade -y

# Install Python, pip, venv, nginx, git

sudo apt install -y python3 python3-pip python3-venv nginx git

# Clone your GitHub project (REPLACE with your repo)

cd /home/ubuntu
git clone https://github.com/Pravesh-Sudha/dev-to-challenges.git
cd dev-to-challenges/philo-agent

# Set up Python virtual environment

python3 -m venv venv
source venv/bin/activate

# Install requirements

pip install -r requirements.txt
pip install gunicorn

# Optional smoke test: Gunicorn runs in the foreground here, so the script
# pauses until you press Ctrl+C after checking that the app starts

gunicorn -w 4 app:app --bind 0.0.0.0:8000

# Set up systemd service for gunicorn

sudo tee /etc/systemd/system/voiceapp.service > /dev/null <<EOF
[Unit]
Description=Gunicorn instance to serve Philosophy Voice App
After=network.target

[Service]
User=ubuntu
Group=www-data
WorkingDirectory=/home/ubuntu/dev-to-challenges/philo-agent
Environment="PATH=/home/ubuntu/dev-to-challenges/philo-agent/venv/bin"
ExecStart=/home/ubuntu/dev-to-challenges/philo-agent/venv/bin/gunicorn --workers 4 --bind 127.0.0.1:8000 app:app

[Install]
WantedBy=multi-user.target
EOF

# Enable and start the Gunicorn service

sudo systemctl daemon-reexec
sudo systemctl daemon-reload
sudo systemctl start voiceapp
sudo systemctl enable voiceapp

# Configure Nginx (the quoted 'EOF' stops the shell from expanding Nginx variables such as $host)

sudo tee /etc/nginx/sites-available/voiceapp > /dev/null <<'EOF'
server {
    server_name philosophy.praveshsudha.com;

    location / {
        proxy_pass http://127.0.0.1:8000;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection 'upgrade';
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_cache_bypass $http_upgrade;
    }

    location /static/ {
        alias /home/ubuntu/dev-to-challenges/philo-agent/static/;
    }

    client_max_body_size 20M;

    access_log /var/log/nginx/voiceapp_access.log;
    error_log /var/log/nginx/voiceapp_error.log;


    listen 443 ssl; # managed by Certbot
    ssl_certificate /etc/letsencrypt/live/philosophy.praveshsudha.com/fullchain.pem; # managed by Certbot
    ssl_certificate_key /etc/letsencrypt/live/philosophy.praveshsudha.com/privkey.pem; # managed by Certbot
    include /etc/letsencrypt/options-ssl-nginx.conf; # managed by Certbot
    ssl_dhparam /etc/letsencrypt/ssl-dhparams.pem; # managed by Certbot

}
server {
    if ($host = philosophy.praveshsudha.com) {
        return 301 https://$host$request_uri;
    } # managed by Certbot


    listen 80;
    server_name philosophy.praveshsudha.com;
    return 404; # managed by Certbot


}
EOF

# Set correct permissions for all files
sudo chmod -R 755 /home/ubuntu/dev-to-challenges/philo-agent/static

# Make sure all files are owned by the same user running the app (usually ubuntu)
sudo chown -R ubuntu:ubuntu /home/ubuntu/dev-to-challenges/philo-agent/static
sudo chmod +x /home/ubuntu
sudo chmod +x /home/ubuntu/dev-to-challenges
sudo chmod +x /home/ubuntu/dev-to-challenges/philo-agent


# Enable Nginx config

sudo ln -sf /etc/nginx/sites-available/voiceapp /etc/nginx/sites-enabled/
sudo rm -f /etc/nginx/sites-enabled/default
sudo nginx -t && sudo systemctl restart nginx

echo "✅ Deployment complete. Access your app at https://philosophy.praveshsudha.com"

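This setup runs the Flask app under Gunicorn, with Nginx in front as a reverse proxy serving HTTPS. The Nginx config shown above already includes the lines Certbot adds, so the Let’s Encrypt certificate has to be issued (for example with certbot --nginx) before nginx -t will pass on a fresh instance.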
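
After the script finishes, it can help to confirm that Gunicorn and Nginx are actually answering. The following is a small smoke-test sketch, assuming the app returns a response on `/`; the `check_url` helper is illustrative and not part of the original script.

```python
import urllib.error
import urllib.request
from typing import Optional


def check_url(url: str, timeout: float = 5.0) -> Optional[int]:
    """Return the HTTP status code for `url`, or None if nothing answered."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # The server answered, just not with a success status.
        return err.code
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout, TLS error, and so on.
        return None
```

On the EC2 host, `check_url("http://127.0.0.1:8000/")` tests Gunicorn directly, while `check_url("https://philosophy.praveshsudha.com/")` goes through Nginx and TLS. A `None` from the first usually points at the systemd service, and a `None` from the second alone points at Nginx, DNS, or the certificate.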

🧠 AssemblyAI Integration

The transcription.py file streams audio from a WAV file and transcribes it in real time using AssemblyAI’s Universal-Streaming model. It is optimized for philosophical conversations by including a custom vocabulary of domain-specific terms (e.g., "Nietzsche", "epistemology").

Here’s a short snippet:

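import asyncio
import wave

import assemblyai as aai  # pip install assemblyai

# Example terms only (the two named above); the project's full list is not shown here
PHILOSOPHY_PHRASES = ["Nietzsche", "epistemology"]

async def simulate_audio_stream(file_path, chunk_size=3200):
    # Yield the WAV file in small chunks to mimic a live audio stream
    with wave.open(file_path, 'rb') as wf:
        while True:
            data = wf.readframes(chunk_size)  # chunk_size is in frames, not bytes
            if not data:
                break
            yield data
            await asyncio.sleep(0.08)  # brief pause between chunks
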
async def transcribe_audio_stream(file_path):
    config = aai.RealtimeConfig(
        language_code="en_us",
        custom_vocabulary=PHILOSOPHY_PHRASES,
        speech_model="universal-v2",
        disfluencies=False,
        punctuate=True
    )

    transcriber = aai.RealtimeTranscriber(config=config)
    transcript_text = ""

    async def on_data(transcript: aai.RealtimeTranscript):
        nonlocal transcript_text
        if isinstance(transcript, aai.RealtimeFinalTranscript):
            transcript_text += transcript.text + " "

    # Register the handler before connecting so no early transcript is missed
    transcriber.on("transcript", on_data)
    await transcriber.connect()

    async for chunk in simulate_audio_stream(file_path):
        await transcriber.send(chunk)

    await transcriber.close()
    return transcript_text.strip()
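
One detail worth checking in a loop like `simulate_audio_stream` is pacing: the `wave` module's `readframes(n)` counts frames, not bytes, so the pause between chunks should match the audio duration of each chunk if the goal is to imitate a live microphone. The sketch below is an illustration under assumptions (a hypothetical `iter_wav_chunks` helper, synchronous for brevity) rather than the project's code.

```python
import time
import wave
from typing import Iterator


def frames_per_chunk(sample_rate: int, chunk_ms: int) -> int:
    """Number of audio frames that cover `chunk_ms` milliseconds."""
    return sample_rate * chunk_ms // 1000


def iter_wav_chunks(path: str, chunk_ms: int = 100, realtime: bool = True) -> Iterator[bytes]:
    """Yield raw PCM chunks of roughly `chunk_ms` each from a WAV file.

    With realtime=True the generator sleeps for the duration of each chunk,
    which approximates how a microphone would deliver audio.
    """
    with wave.open(path, "rb") as wf:
        n_frames = frames_per_chunk(wf.getframerate(), chunk_ms)
        while True:
            data = wf.readframes(n_frames)
            if not data:
                break
            yield data
            if realtime:
                time.sleep(chunk_ms / 1000)
```

If the recording is sampled at 16 kHz, for instance, a 3200-frame chunk holds 200 ms of audio, so a 0.08 s pause per chunk would send it about 2.5 times faster than real time. Deriving the chunk size from the file's own sample rate avoids that mismatch.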

I was genuinely impressed with how smoothly AssemblyAI worked—everything just clicked on the first try.

🧘🏻‍♂️ Conclusion

Thanks to Dev.to and AssemblyAI for hosting this challenge. It gave me the perfect reason to build a project that aligns with both my technical and philosophical interests.

With this project, I now have a digital buddy to discuss life, existence, and purpose.

If you found this useful, leave a reaction, share your thoughts in the comments, and follow me!

🔗 Connect with Me

🌐 Website: https://praveshsudha.com
🐦 Twitter/X: @praveshstwt
💼 LinkedIn: Pravesh Sudha
📺 YouTube: @pravesh-sudha