Deepgram
Deepgram provides real-time speech recognition with low latency and streaming support.
Setup
1. Get API Key
- Go to Deepgram
- Create an account (includes $200 free credit)
- Navigate to API Keys
- Create a new key
2. Configure Environment
STT_PROVIDER=deepgram
DEEPGRAM_API_KEY=...your-api-key
Configuration Options
# Required
STT_PROVIDER=deepgram
DEEPGRAM_API_KEY=...
# Model selection
DEEPGRAM_MODEL=nova-2 # Best model
# Features
DEEPGRAM_DEFAULT_LANGUAGE=en-US
DEEPGRAM_SMART_FORMAT=true
DEEPGRAM_PUNCTUATE=true
DEEPGRAM_DIARIZE=false # Speaker detection
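As a rough illustration of how an application might consume these variables, the sketch below reads them from an env-like record and applies defaults. The `DeepgramConfig` shape and the `loadDeepgramConfig` and `parseBool` helpers are hypothetical names introduced here, and the defaults are an assumption that simply mirrors the values shown above.

```typescript
// Hypothetical config shape; fields mirror the environment variables above.
interface DeepgramConfig {
  apiKey: string;
  model: string;
  language: string;
  smartFormat: boolean;
  punctuate: boolean;
  diarize: boolean;
}

// Parse "true"/"false" strings, falling back to a default when unset or empty.
function parseBool(value: string | undefined, fallback: boolean): boolean {
  if (value === undefined || value.trim() === "") return fallback;
  return value.trim().toLowerCase() === "true";
}

// Build a config object from an env-like record (for example process.env).
// Defaults are assumed to match the example values in this section.
function loadDeepgramConfig(env: Record<string, string | undefined>): DeepgramConfig {
  if (env["STT_PROVIDER"] !== "deepgram") {
    throw new Error("STT_PROVIDER must be set to 'deepgram'");
  }
  const apiKey = (env["DEEPGRAM_API_KEY"] ?? "").trim();
  if (apiKey === "") {
    throw new Error("DEEPGRAM_API_KEY is required");
  }
  return {
    apiKey,
    model: env["DEEPGRAM_MODEL"] ?? "nova-2",
    language: env["DEEPGRAM_DEFAULT_LANGUAGE"] ?? "en-US",
    smartFormat: parseBool(env["DEEPGRAM_SMART_FORMAT"], true),
    punctuate: parseBool(env["DEEPGRAM_PUNCTUATE"], true),
    diarize: parseBool(env["DEEPGRAM_DIARIZE"], false),
  };
}
```

Failing fast on a missing key at startup tends to be easier to debug than an authentication error on the first transcription request.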
Available Models
| Model | Best For | Accuracy |
|---|---|---|
| nova-2 | General use | Highest |
| nova | Previous gen | High |
| enhanced | Noisy audio | Good |
| base | Cost savings | Moderate |
Pricing
| Metric | Pay-as-you-go | Growth Plan |
|---|---|---|
| Per minute | $0.0077 | $0.0065 |
| Per hour | $0.46 | $0.39 |
| 3-min session | ~$0.023 | ~$0.020 |
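To estimate the cost of a session of any length, the per-minute rates can be multiplied out directly. The sketch below is only an illustration: `RATE_PER_MINUTE`, `Plan`, and `estimateCost` are hypothetical names, and the rates are copied from the "Per minute" row, so they should be checked against current Deepgram pricing.

```typescript
// Per-minute rates in USD, copied from the "Per minute" row of the table.
const RATE_PER_MINUTE = {
  payAsYouGo: 0.0077,
  growth: 0.0065,
} as const;

type Plan = keyof typeof RATE_PER_MINUTE;

// Estimate the transcription cost in USD for a given audio duration.
function estimateCost(minutes: number, plan: Plan): number {
  if (!Number.isFinite(minutes) || minutes < 0) {
    throw new RangeError("minutes must be a non-negative finite number");
  }
  return minutes * RATE_PER_MINUTE[plan];
}
```

For example, `estimateCost(60, "payAsYouGo")` comes to roughly $0.46, which matches the "Per hour" row.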
Features
Real-time Streaming
Deepgram can transcribe audio in real time, as it is spoken:
// Audio streams in → Text streams out
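A streaming session is typically opened as a WebSocket to Deepgram's live transcription endpoint, with the features passed as query parameters. The sketch below only builds that URL. It assumes the `wss://api.deepgram.com/v1/listen` endpoint and the `model`, `language`, `smart_format`, `punctuate`, and `diarize` parameter names; `ListenOptions` and `buildListenUrl` are hypothetical names used for illustration.

```typescript
// Assumed live-transcription endpoint; verify against the Deepgram API reference.
const LISTEN_ENDPOINT = "wss://api.deepgram.com/v1/listen";

// Hypothetical options type mirroring the DEEPGRAM_* settings in this guide.
interface ListenOptions {
  model: string;
  language: string;
  smartFormat: boolean;
  punctuate: boolean;
  diarize: boolean;
}

// Build the WebSocket URL for a streaming session.
// The API key is not part of the URL; it is expected in an
// "Authorization: Token <key>" header when the socket is opened.
function buildListenUrl(opts: ListenOptions): string {
  const params: Array<[string, string]> = [
    ["model", opts.model],
    ["language", opts.language],
    ["smart_format", String(opts.smartFormat)],
    ["punctuate", String(opts.punctuate)],
    ["diarize", String(opts.diarize)],
  ];
  const query = params
    .map(([key, value]) => `${key}=${encodeURIComponent(value)}`)
    .join("&");
  return `${LISTEN_ENDPOINT}?${query}`;
}
```

Audio chunks would then be sent as binary messages on the socket, with transcripts arriving as JSON messages in return.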
Smart Formatting
Automatically formats:
- Numbers ("one hundred twenty-three" → 123)
- Dates and times
- Currency
Speaker Diarization
Identify different speakers in the conversation:
DEEPGRAM_DIARIZE=true
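With diarization enabled, each transcribed word is expected to carry a numeric speaker label. The sketch below shows one way to turn such words into per-speaker turns. It assumes a minimal word shape with `word` and `speaker` fields; `DiarizedWord`, `SpeakerTurn`, and `groupBySpeaker` are hypothetical names, and the real response contains more fields than are modeled here.

```typescript
// Assumed minimal shape of one word in a diarized result (hypothetical subset).
interface DiarizedWord {
  word: string;
  speaker: number;
}

// One contiguous stretch of speech from a single speaker.
interface SpeakerTurn {
  speaker: number;
  text: string;
}

// Merge consecutive words from the same speaker into turns.
function groupBySpeaker(words: DiarizedWord[]): SpeakerTurn[] {
  const turns: SpeakerTurn[] = [];
  for (const w of words) {
    const last = turns[turns.length - 1];
    if (last !== undefined && last.speaker === w.speaker) {
      // Same speaker is still talking: extend the current turn.
      last.text += " " + w.word;
    } else {
      // Speaker changed (or this is the first word): start a new turn.
      turns.push({ speaker: w.speaker, text: w.word });
    }
  }
  return turns;
}
```

A turn list like this is convenient for rendering a transcript as "Speaker 0: ..." and "Speaker 1: ..." lines.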
Troubleshooting
"Invalid credentials"
- Verify your API key in the Deepgram console
- Check for expired keys
Poor accuracy
- Use the nova-2 model for best results
- Ensure audio quality is good