Open Source · Apple Silicon · Zero Cloud

Give your code
a voice

Clone any voice from a 15-second YouTube clip. Run it locally on your Mac. Hear Claude Code speak every response — or use the API from anything.

16 voices included · ~6 GB peak memory · ~20s per sentence · 0 cloud calls
Five minutes to your first voice

The setup script checks your hardware, installs dependencies, walks you through cloning a voice from YouTube, and starts the server.

git clone https://github.com/adrianwedd/afterwords.git
cd afterwords
bash setup.sh

If Claude Code is installed, setup wires a Stop hook so every response is spoken aloud. Without it, you get a standalone TTS API at localhost:7860.
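For context, Claude Code hooks live in its settings.json. A rough sketch of what the wired Stop hook entry could look like — the script path and filename here are placeholders, and the actual entry setup.sh writes may differ:

```json
{
  "hooks": {
    "Stop": [
      {
        "hooks": [
          { "type": "command", "command": "bash ~/afterwords/speak.sh" }
        ]
      }
    ]
  }
}
```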

16 voices, each cloned from a 15-second clip

Each says “You are absolutely right. Your Claude Code session could sound like me.” — generated locally on an 8 GB M1.

Add your own:

bash clone-voice.sh "https://youtube.com/watch?v=..." myvoice 30
Input meets output
You speak → /voice → Claude responds → Stop hook → TTS server → Speaker

/voice handles input. This project handles output. Together: voice conversations.

The server uses Qwen3-TTS (0.6B, 8-bit) on MLX. Zero-shot voice cloning — no training. A 15-second reference + transcript = cloned voice.

Why local? Nothing leaves your machine. No API key, no rate limits, no bill. The voice is yours.
A plain HTTP interface

The server runs on localhost:7860. No authentication. Use it from curl, scripts, other editors, web apps — anything that speaks HTTP.

GET /health

Server status, loaded voices, and readiness.

curl localhost:7860/health | jq .
{
  "status": "ok",
  "model": "mlx-community/Qwen3-TTS-12Hz-0.6B-Base-8bit",
  "backend": "mlx",
  "model_loaded": true,
  "ready": true,
  "voices": ["audrey", "aurora", "avasarala", "bardem", ...],
  "default_voice": "galadriel"
}
200 OK
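Because /health reports readiness, a script can poll it before sending work instead of eating 503s during warm-up. A minimal sketch — the wait_ready helper is ours, not part of the repo, and grep-matching the raw JSON is a shortcut (jq .ready would be sturdier):

```shell
#!/bin/sh
# wait_ready: poll /health until the server reports ready, or time out.
# Assumed helper, not shipped with afterwords. Usage: wait_ready [max_seconds]
wait_ready() {
  max=${1:-30}
  i=0
  while [ "$i" -lt "$max" ]; do
    # -f makes curl fail silently on HTTP errors; grep checks the ready flag
    if curl -sf localhost:7860/health | grep -q '"ready": true'; then
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  return 1
}
```

With the server starting up, `wait_ready && curl "localhost:7860/synthesize?text=Hi"` avoids racing the model load.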
GET /synthesize

Generate speech from text. Returns 16-bit PCM WAV audio.

text required — string, max 5000 chars
voice optional — defaults to galadriel. Any name from /health
# Synthesize and play
curl "localhost:7860/synthesize?text=Hello+world&voice=snape" -o out.wav
afplay out.wav

# Pipe directly to speaker (macOS)
curl -s "localhost:7860/synthesize?text=Testing" | afplay -

Response includes timing headers: X-Synthesis-Time, X-Duration, X-Sample-Rate.

200 audio/wav · 400 unknown voice · 400 text empty / too long · 503 warming up
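The examples above pass URL-safe text; arbitrary text needs URL encoding, and a script should check the status code before playing anything. A sketch — the speak helper is a name we made up, not something the repo ships:

```shell
#!/bin/sh
# speak: URL-encode text, synthesize it, and play the result on macOS.
# Assumed helper; expects the afterwords server on localhost:7860.
speak() {
  text=$1
  voice=${2:-galadriel}
  # Percent-encode the text so spaces, ampersands, etc. survive the query string
  encoded=$(python3 -c 'import sys, urllib.parse; print(urllib.parse.quote(sys.argv[1]))' "$text")
  status=$(curl -s -o /tmp/afterwords-out.wav -w '%{http_code}' \
    "localhost:7860/synthesize?text=${encoded}&voice=${voice}")
  if [ "$status" = "200" ]; then
    afplay /tmp/afterwords-out.wav
  else
    echo "synthesis failed: HTTP $status" >&2
    return 1
  fi
}
```

Then `speak "Hello, world & good morning" snape` handles punctuation the raw query-string examples would choke on.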
CLI clone-voice.sh

Clone a new voice from a YouTube clip. The server auto-discovers new voices on restart.

# Interactive
bash clone-voice.sh

# Non-interactive (URL, name, start-second)
bash clone-voice.sh "https://youtube.com/watch?v=..." mycustomvoice 30

# Fully automated (skip transcript confirmation)
bash clone-voice.sh "https://youtube.com/watch?v=..." mycustomvoice 30 --yes

Each voice is a 700 KB WAV + JSON profile in voices/. Adding voices costs zero extra memory.

Different voice, different repo

Drop a .afterwords file in any project root. The hook reads it before each synthesis — no server restart.

echo "galadriel" > ~/work/frontend/.afterwords
echo "snape"     > ~/work/backend/.afterwords
echo "loki"      > ~/fun/side-project/.afterwordsclick to copy
On an 8 GB M1:
Model load: ~5s (cached)
Per sentence: ~20s
Peak memory: ~6 GB
Adding a voice: 0 extra RAM
What you need:
Hardware: Apple Silicon (M1 or later)
Memory: 8 GB+ RAM
Python: 3.11+
Disk: ~2 GB

The setup script installs everything else. Claude Code is optional — use --server-only for the API without hooks.

Qwen3-TTS (Alibaba, Apache 2.0) · mlx-audio · MLX (Apple) · Claude Code (Anthropic)

Originally built for SPARK, a robot with an inner life. Full tutorial: Voice Cloning with Qwen3-TTS.