
Deepgram

Deepgram provides real-time speech recognition with low latency and streaming support.

Setup

1. Get API Key

  1. Go to Deepgram
  2. Create an account (includes $200 free credit)
  3. Navigate to API Keys
  4. Create a new key

2. Configure Environment

STT_PROVIDER=deepgram
DEEPGRAM_API_KEY=...your-api-key
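
A minimal sketch of how an application might read and validate these two variables at startup, assuming they are exposed through the process environment. The helper name `load_stt_config` and its error messages are hypothetical and not part of this project.

```python
import os


def load_stt_config(env=None):
    """Read the STT provider settings from an environment mapping.

    `load_stt_config` is a hypothetical helper used for illustration only.
    """
    env = os.environ if env is None else env
    provider = env.get("STT_PROVIDER", "").strip().lower()
    api_key = env.get("DEEPGRAM_API_KEY", "").strip()

    # Fail early with a clear message instead of at the first transcription.
    if provider != "deepgram":
        raise ValueError(f"STT_PROVIDER is {provider!r}, expected 'deepgram'")
    if not api_key:
        raise ValueError("DEEPGRAM_API_KEY is missing or empty")
    return {"provider": provider, "api_key": api_key}
```

Passing a plain dict instead of `os.environ` keeps the check easy to test.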

Configuration Options

# Required
STT_PROVIDER=deepgram
DEEPGRAM_API_KEY=...

# Model selection
DEEPGRAM_MODEL=nova-2 # Highest accuracy (see Available Models)

# Features
DEEPGRAM_DEFAULT_LANGUAGE=en-US
DEEPGRAM_SMART_FORMAT=true
DEEPGRAM_PUNCTUATE=true
DEEPGRAM_DIARIZE=false # Speaker detection
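
The options above plausibly correspond to query parameters on Deepgram's `/v1/listen` endpoint (`model`, `language`, `smart_format`, `punctuate`, `diarize`). The sketch below shows one way such a mapping could work. The mapping table and the `build_listen_url` helper are assumptions for illustration; how this project actually forwards the settings is not shown here.

```python
from urllib.parse import urlencode

# Assumed mapping from this project's variables to Deepgram query parameters.
ENV_TO_PARAM = {
    "DEEPGRAM_MODEL": "model",
    "DEEPGRAM_DEFAULT_LANGUAGE": "language",
    "DEEPGRAM_SMART_FORMAT": "smart_format",
    "DEEPGRAM_PUNCTUATE": "punctuate",
    "DEEPGRAM_DIARIZE": "diarize",
}


def build_listen_url(env, base="wss://api.deepgram.com/v1/listen"):
    """Build a streaming URL from whichever optional settings are present."""
    params = {
        param: env[name].strip()
        for name, param in ENV_TO_PARAM.items()
        if env.get(name, "").strip()
    }
    return f"{base}?{urlencode(params)}" if params else base
```

Unset or empty variables are simply left out of the URL, so the service's own defaults apply.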

Available Models

Model      Best For        Accuracy
nova-2     General use     Highest
nova       Previous gen    High
enhanced   Noisy audio     Good
base       Cost savings    Moderate

Pricing

Metric          Pay-as-you-go   Growth Plan
Per minute      $0.0077         $0.0065
Per hour        $0.46           $0.39
3-min session   ~$0.023         ~$0.020
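
Session cost can be estimated from the per-minute rates in the table. This is a minimal sketch that assumes cost scales linearly with audio duration, with no rounding up to whole minutes; `estimate_cost` is a hypothetical helper.

```python
def estimate_cost(duration_seconds, rate_per_minute):
    """Return the estimated transcription cost in dollars for one session.

    Assumes linear billing by duration; check your plan for rounding rules.
    """
    return duration_seconds / 60 * rate_per_minute


PAY_AS_YOU_GO = 0.0077  # dollars per minute, from the pricing table
GROWTH = 0.0065         # dollars per minute, from the pricing table
```

For example, `estimate_cost(180, PAY_AS_YOU_GO)` estimates a 3-minute session on the pay-as-you-go rate, and `estimate_cost(3600, GROWTH)` estimates one hour on the Growth Plan.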

Features

Real-time Streaming

Deepgram can transcribe audio in real time, as it is spoken:

// Audio streams in → Text streams out
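
On the "text streams out" side, a client typically receives a sequence of JSON messages, some interim and some final. The sketch below keeps only finalized segments. It assumes each result message carries an `is_final` flag and a `channel.alternatives[0].transcript` field, which matches the usual shape of Deepgram streaming results but should be verified against the current API reference. `collect_final_transcript` is a hypothetical helper.

```python
import json


def collect_final_transcript(messages):
    """Join the finalized transcript segments from a stream of JSON messages.

    Assumed message shape (verify against the API reference):
    {"is_final": bool, "channel": {"alternatives": [{"transcript": str}]}}
    """
    segments = []
    for raw in messages:
        msg = json.loads(raw)
        channel = msg.get("channel")
        # Skip messages that are not transcript results (e.g. metadata).
        if not isinstance(channel, dict):
            continue
        alternatives = channel.get("alternatives", [])
        # Skip interim results; they are revised by a later final message.
        if not msg.get("is_final") or not alternatives:
            continue
        text = alternatives[0].get("transcript", "").strip()
        if text:
            segments.append(text)
    return " ".join(segments)
```

Interim results are useful for live captions, so a real client would likely display them and replace them once the final segment arrives.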

Smart Formatting

Automatically formats:

  • Numbers ("one hundred twenty-three" → 123)
  • Dates and times
  • Currency

Speaker Diarization

Identify different speakers in the conversation:

DEEPGRAM_DIARIZE=true
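
With diarization enabled, each transcribed word is expected to carry a speaker index. A minimal sketch of turning such a word list into speaker turns follows. It assumes each word is a dict with `word` and `speaker` keys; the field names and the `group_by_speaker` helper are assumptions to check against the responses you actually receive.

```python
from itertools import groupby


def group_by_speaker(words):
    """Merge consecutive words from the same speaker into (speaker, text) turns.

    Assumes each word is a dict with "word" and "speaker" keys.
    """
    return [
        (speaker, " ".join(w["word"] for w in run))
        for speaker, run in groupby(words, key=lambda w: w["speaker"])
    ]
```

Because only consecutive words are merged, a speaker who talks twice produces two separate turns, which preserves the order of the conversation.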

Troubleshooting

"Invalid credentials"

  • Verify your API key in the Deepgram console
  • Check for expired keys
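
One way to check a key outside the application is to call an authenticated Deepgram endpoint directly. The sketch below assumes that listing projects at `https://api.deepgram.com/v1/projects` with an `Authorization: Token <key>` header returns HTTP 401 for a rejected key. Both helper names are hypothetical, and keys with restricted scopes may fail in other ways.

```python
import urllib.error
import urllib.request


def build_key_check_request(api_key):
    """Build a GET request that lists projects, authenticated with the key."""
    return urllib.request.Request(
        "https://api.deepgram.com/v1/projects",
        headers={"Authorization": f"Token {api_key}"},
    )


def key_is_accepted(api_key, timeout=10):
    """Return False on HTTP 401, True on success; other errors propagate."""
    try:
        request = build_key_check_request(api_key)
        with urllib.request.urlopen(request, timeout=timeout):
            return True
    except urllib.error.HTTPError as err:
        if err.code == 401:
            return False
        raise
```

Network failures and other HTTP errors are deliberately not swallowed, so they are not mistaken for an invalid key.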

Poor accuracy

  • Use the nova-2 model for best results
  • Check the input audio quality; background noise and low recording levels reduce accuracy