Detailed API Costs
This page provides detailed pricing for all API services used by SmarterAvatar.
Avatar Streaming - HeyGen LiveAvatar
HeyGen is the primary cost driver for most deployments.
Credit Pricing
| Tier | Monthly Cost | Credits | Per Credit |
|---|---|---|---|
| Pro | $99 | 100 | $0.99 |
| Scale | $330 | 660 | $0.50 |
| Enterprise | Custom | Custom | Negotiated |
Streaming Rates
API streaming: 1 credit = 5 minutes
| Duration | Pro Tier | Scale Tier |
|---|---|---|
| 30 sec (minimum) | $0.099 | $0.05 |
| 1 minute | $0.198 | $0.10 |
| 2 minutes | $0.396 | $0.20 |
| 3 minutes | $0.594 | $0.30 |
| 5 minutes | $0.99 | $0.50 |
| 10 minutes | $1.98 | $1.00 |
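The per-minute figures in the table follow from the credit pricing and the 1 credit = 5 minutes rate. A minimal sketch of that arithmetic is below; the dictionary layout and function names are illustrative and not part of any HeyGen API.

```python
# Figures copied from the Credit Pricing table above.
# The names (TIERS, included_minutes, per_minute_rate) are hypothetical helpers.
MINUTES_PER_CREDIT = 5

TIERS = {
    "Pro": {"monthly_cost": 99.0, "credits": 100},
    "Scale": {"monthly_cost": 330.0, "credits": 660},
}


def included_minutes(tier: str) -> int:
    """Streaming minutes covered by one month of credits on the given tier."""
    return TIERS[tier]["credits"] * MINUTES_PER_CREDIT


def per_minute_rate(tier: str) -> float:
    """Effective cost of one streaming minute on the given tier."""
    return TIERS[tier]["monthly_cost"] / included_minutes(tier)


if __name__ == "__main__":
    for name in TIERS:
        print(name, included_minutes(name), round(per_minute_rate(name), 3))
```

Under these figures, Pro covers 500 streaming minutes per month and Scale covers 3,300.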
Billing Details
- Minimum charge: 30 seconds per session
- After minimum: billed per second
- Credits expire 30 days after issue
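The billing rules above can be sketched as a small cost function. This is an estimate under stated assumptions, not HeyGen's billing logic: it assumes partial seconds round up to the next whole second, and the function name is hypothetical.

```python
import math

MIN_BILLED_SECONDS = 30      # minimum charge per session
SECONDS_PER_CREDIT = 5 * 60  # 1 credit = 5 minutes of API streaming


def heygen_session_cost(duration_seconds: float, price_per_credit: float) -> float:
    """Estimate the streaming cost of one session in dollars.

    Assumption: fractional seconds are rounded up before billing.
    """
    billed_seconds = max(math.ceil(duration_seconds), MIN_BILLED_SECONDS)
    return billed_seconds / SECONDS_PER_CREDIT * price_per_credit


if __name__ == "__main__":
    # A 10-second session is still billed as 30 seconds.
    print(round(heygen_session_cost(10, 0.99), 3))
    # A 3-minute session on the Scale tier ($0.50 per credit).
    print(round(heygen_session_cost(180, 0.50), 3))
```

Because of the 30-second minimum, many very short sessions cost more per streamed second than fewer, longer ones.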
Speech-to-Text
OpenAI Whisper
| Metric | Cost |
|---|---|
| Per minute | $0.006 |
| Per hour | $0.36 |
| Billing increment | Per second |
| Minimum charge | None |
Deepgram Nova-2
| Plan | Per Minute | Per Hour |
|---|---|---|
| Pay-as-you-go | $0.0077 | $0.46 |
| Growth | $0.0065 | $0.39 |
Free tier: $200 credit (never expires)
AWS Transcribe
If you use the hybrid architecture with AWS:
| Volume/Month | Per Minute |
|---|---|
| First 250K min | $0.024 |
| 250K - 1M min | $0.015 |
| 1M - 5M min | $0.0102 |
| 5M+ min | $0.0078 |
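A monthly estimate from the volume table can be sketched as follows. This assumes the tiers are graduated, meaning each rate applies only to the minutes that fall inside its band; that is an assumption about how the table is applied, and the function name is hypothetical.

```python
# (upper bound of band in minutes, rate per minute) from the table above.
TRANSCRIBE_TIERS = [
    (250_000, 0.024),
    (1_000_000, 0.015),
    (5_000_000, 0.0102),
    (float("inf"), 0.0078),
]


def transcribe_monthly_cost(minutes: float) -> float:
    """Estimate monthly cost, assuming graduated (marginal) tier pricing."""
    cost = 0.0
    lower = 0.0
    for upper, rate in TRANSCRIBE_TIERS:
        if minutes <= lower:
            break
        # Charge only the minutes that fall inside this band.
        cost += (min(minutes, upper) - lower) * rate
        lower = upper
    return cost


if __name__ == "__main__":
    print(round(transcribe_monthly_cost(100_000), 2))
    print(round(transcribe_monthly_cost(300_000), 2))
```

For example, 300,000 minutes would be estimated as 250,000 minutes at $0.024 plus 50,000 minutes at $0.015.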
LLM Providers
Google Gemini
| Model | Input/1M tokens | Output/1M tokens |
|---|---|---|
| gemini-3.0-flash | $0.50 | $3.00 |
| gemini-2.5-flash | $0.15 | $0.60 |
| gemini-2.5-pro | $1.25 | $5.00 |
RAG (File Search):
- Storage: FREE
- Indexing: $0.15 per 1M tokens (one-time)
- Query embeddings: FREE
OpenAI
| Model | Input/1M tokens | Output/1M tokens |
|---|---|---|
| gpt-4o-mini | $0.15 | $0.60 |
| gpt-4o | $2.50 | $10.00 |
| gpt-4-turbo | $10.00 | $30.00 |
Anthropic
| Model | Input/1M tokens | Output/1M tokens |
|---|---|---|
| claude-3-5-haiku | $0.80 | $4.00 |
| claude-sonnet-4 | $3.00 | $15.00 |
| claude-opus-4 | $15.00 | $75.00 |
AWS Bedrock (Claude)
Per-model rates are the same as Anthropic direct, accessed through AWS. The models listed here differ from those in the Anthropic table above:
| Model | Input/1M tokens | Output/1M tokens |
|---|---|---|
| Claude 3.5 Sonnet | $3.00 | $15.00 |
| Claude 3 Haiku | $0.25 | $1.25 |
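All of the LLM tables above use the same per-million-token pricing model, so the cost of a single call can be estimated from its input and output token counts. The sketch below copies a few rows from the tables; the function name and the token counts in the usage example are hypothetical.

```python
# (input price, output price) in dollars per 1M tokens, copied from the tables above.
LLM_PRICES = {
    "gemini-2.5-flash": (0.15, 0.60),
    "gpt-4o-mini": (0.15, 0.60),
    "gpt-4o": (2.50, 10.00),
    "claude-sonnet-4": (3.00, 15.00),
}


def llm_call_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost in dollars of one LLM call."""
    input_price, output_price = LLM_PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


if __name__ == "__main__":
    # Hypothetical call: 3,000 input tokens and 1,000 output tokens.
    for model in LLM_PRICES:
        print(model, round(llm_call_cost(model, 3_000, 1_000), 5))
```

Output tokens cost several times more than input tokens on every model listed, so response length tends to matter more than prompt length.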
Session Cost Examples
Example 1: Budget Configuration
3-minute session, 4 interactions
| Component | Calculation | Cost |
|---|---|---|
| HeyGen (Scale) | 3 min × $0.10 | $0.30 |
| Whisper | 1 min speech × $0.006 | $0.006 |
| Gemini Flash | ~4K tokens | $0.01 |
| Total | | $0.316 |
Example 2: Premium Configuration
3-minute session, 4 interactions
| Component | Calculation | Cost |
|---|---|---|
| HeyGen (Scale) | 3 min × $0.10 | $0.30 |
| Deepgram | 1 min speech × $0.0077 | $0.008 |
| Claude Sonnet | ~4K tokens | $0.08 |
| Total | | $0.388 |
Example 3: AWS Hybrid Configuration
3-minute session, using your own AWS Bedrock account
| Component | Calculation | Cost |
|---|---|---|
| HeyGen (Scale) | 3 min × $0.10 | $0.30 |
| AWS Transcribe | 1 min × $0.024 | $0.024 |
| Bedrock (Claude) | ~4K tokens | $0.08 |
| Total | | $0.404 |
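The three session totals above are simple sums of their component costs. A short sketch that reproduces them is below; the dictionary names and the `session_total` helper are illustrative only.

```python
def session_total(components: dict[str, float]) -> float:
    """Sum per-component costs for one session, rounded to a tenth of a cent."""
    return round(sum(components.values()), 3)


# Component costs copied from the three example tables above.
BUDGET = {"HeyGen (Scale)": 0.30, "Whisper": 0.006, "Gemini Flash": 0.01}
PREMIUM = {"HeyGen (Scale)": 0.30, "Deepgram": 0.008, "Claude Sonnet": 0.08}
AWS_HYBRID = {"HeyGen (Scale)": 0.30, "AWS Transcribe": 0.024, "Bedrock (Claude)": 0.08}

if __name__ == "__main__":
    for name, config in [("Budget", BUDGET), ("Premium", PREMIUM), ("AWS Hybrid", AWS_HYBRID)]:
        total = session_total(config)
        heygen_share = config["HeyGen (Scale)"] / total * 100
        print(f"{name}: ${total:.3f} (HeyGen share: {heygen_share:.0f}%)")
```

In all three configurations the HeyGen streaming charge makes up roughly three quarters or more of the session total.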
Cost Optimization Tips
1. Use Scale Tier for HeyGen
Pro: $0.198/min → Scale: $0.10/min = 49% savings
2. Choose Efficient LLM
Gemini Flash: $0.01/session vs Claude Sonnet: $0.08/session = 87.5% savings on the LLM component
3. Optimize Session Length
Set reasonable timeouts to avoid idle sessions burning credits.
4. Cache Common Responses
Use response overrides for FAQs to skip LLM calls entirely.
5. Monitor Usage
Use the admin dashboard to track costs and identify optimization opportunities.
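The savings percentages quoted in tips 1 and 2 can be reproduced with a one-line formula. The helper name below is hypothetical; the inputs are the per-minute and per-session figures from this page.

```python
def savings_percent(baseline: float, alternative: float) -> float:
    """Percentage saved by switching from the baseline cost to the alternative."""
    return (baseline - alternative) / baseline * 100


if __name__ == "__main__":
    # Tip 1: HeyGen Pro vs Scale, cost per streaming minute.
    print(f"{savings_percent(0.198, 0.10):.1f}%")
    # Tip 2: Claude Sonnet vs Gemini Flash, LLM cost per session.
    print(f"{savings_percent(0.08, 0.01):.1f}%")
```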