Deepgram

Deepgram provides real-time speech recognition with low latency and native streaming support.

Setup

1. Get API Key

Go to Deepgram
Create an account (includes $200 free credit)
Navigate to API Keys
Create a new key

2. Configure Environment

STT_PROVIDER=deepgram
DEEPGRAM_API_KEY=...your-api-key

Configuration Options

# Required
STT_PROVIDER=deepgram
DEEPGRAM_API_KEY=...

# Model selection
DEEPGRAM_MODEL=nova-2              # Highest accuracy (see Available Models)

# Features
DEEPGRAM_DEFAULT_LANGUAGE=en-US
DEEPGRAM_SMART_FORMAT=true
DEEPGRAM_PUNCTUATE=true
DEEPGRAM_DIARIZE=false             # Speaker detection
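
As a rough illustration of how these settings might be consumed, the sketch below reads the variables above and turns them into query parameters for Deepgram's `/v1/listen` endpoint. The helper name `build_listen_query` and the fallback defaults are assumptions made for this example, not part of this project's code.

```python
import os
from urllib.parse import urlencode


def build_listen_query(env=None):
    """Map the DEEPGRAM_* settings to /v1/listen query parameters (hypothetical helper)."""
    env = os.environ if env is None else env
    if not env.get("DEEPGRAM_API_KEY"):
        raise ValueError("DEEPGRAM_API_KEY is not set")

    def flag(name, default):
        # Treat "true", "1" or "yes" (any case) as enabled; the defaults are assumed.
        enabled = env.get(name, default).strip().lower() in ("true", "1", "yes")
        return "true" if enabled else "false"

    params = {
        "model": env.get("DEEPGRAM_MODEL", "nova-2"),
        "language": env.get("DEEPGRAM_DEFAULT_LANGUAGE", "en-US"),
        "smart_format": flag("DEEPGRAM_SMART_FORMAT", "true"),
        "punctuate": flag("DEEPGRAM_PUNCTUATE", "true"),
        "diarize": flag("DEEPGRAM_DIARIZE", "false"),
    }
    return urlencode(params)
```

Failing early on a missing key keeps a misconfigured environment from surfacing later as an authentication error at request time.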

Available Models

Model	Best For	Accuracy
`nova-2`	General use	Highest
`nova`	Previous gen	High
`enhanced`	Noisy audio	Good
`base`	Cost savings	Moderate

Pricing

Metric	Pay-as-you-go	Growth Plan
Per minute	$0.0077	$0.0065
Per hour	$0.46	$0.39
3-min session	~$0.023	~$0.020
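
The per-hour and per-session figures follow directly from the per-minute rate (cost = minutes × rate). A minimal sketch of that arithmetic, using the per-minute rates from the table; the plan keys and the function name `estimate_cost` are illustrative:

```python
# Per-minute rates copied from the pricing table (USD).
RATES_PER_MINUTE = {"pay_as_you_go": 0.0077, "growth": 0.0065}


def estimate_cost(minutes, plan="pay_as_you_go"):
    """Return the estimated transcription cost in USD for the given audio length."""
    if minutes < 0:
        raise ValueError("minutes must be non-negative")
    return minutes * RATES_PER_MINUTE[plan]
```

For example, `estimate_cost(60)` reproduces the pay-as-you-go per-hour figure, and `estimate_cost(60, "growth")` the Growth Plan one.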

Features

Real-time Streaming

Deepgram can transcribe audio in real time, as it is spoken:

// Audio streams in → Text streams out
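
To make the streaming flow a little more concrete, here is a hedged sketch of the client-side pieces: building the WebSocket URL, the authorization header, and splitting raw audio into small chunks to send. It assumes mono 16-bit PCM input at 16 kHz; the function names and the 100 ms chunk size are choices made for this example, and the actual WebSocket connection is left to whichever client library you use.

```python
from urllib.parse import urlencode

DEEPGRAM_WS_URL = "wss://api.deepgram.com/v1/listen"


def streaming_url(model="nova-2", sample_rate=16000):
    """Build the WebSocket URL for raw 16-bit PCM audio (parameter values are assumed)."""
    query = urlencode({
        "model": model,
        "encoding": "linear16",
        "sample_rate": sample_rate,
        "interim_results": "true",
    })
    return f"{DEEPGRAM_WS_URL}?{query}"


def auth_headers(api_key):
    """Return the Authorization header, which uses the 'Token' scheme."""
    return {"Authorization": f"Token {api_key}"}


def pcm_chunks(audio, sample_rate=16000, chunk_ms=100):
    """Split mono 16-bit PCM bytes into chunks of about chunk_ms milliseconds each."""
    bytes_per_chunk = sample_rate * 2 * chunk_ms // 1000  # 2 bytes per sample
    for start in range(0, len(audio), bytes_per_chunk):
        yield audio[start:start + bytes_per_chunk]
```

A client would open a connection to `streaming_url()` with `auth_headers(...)`, send each chunk as a binary message, and read the JSON transcript messages that come back on the same socket.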

Smart Formatting

Automatically formats:

Numbers ("one hundred twenty-three" → 123)
Dates and times
Currency

Speaker Diarization

Identify different speakers in the conversation:

DEEPGRAM_DIARIZE=true
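
With diarization enabled, each word in the transcript carries a numeric `speaker` label. The sketch below groups consecutive words by speaker into turns. It assumes the word list has already been extracted from the response (in Deepgram's JSON this is typically `results.channels[0].alternatives[0].words`); the function name `speaker_turns` is hypothetical.

```python
def speaker_turns(words):
    """Group consecutive words by speaker into (speaker, text) turns."""
    turns = []
    for w in words:
        # Prefer the punctuated form when it is present.
        text = w.get("punctuated_word", w["word"])
        speaker = w.get("speaker", 0)
        if turns and turns[-1][0] == speaker:
            turns[-1][1].append(text)
        else:
            turns.append((speaker, [text]))
    return [(speaker, " ".join(parts)) for speaker, parts in turns]
```

Each returned tuple pairs a speaker index with the text of one uninterrupted turn, which is usually enough to render a "Speaker 0: ... / Speaker 1: ..." style transcript.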
