# LLM Providers
The LLM (Large Language Model) provider powers the conversational intelligence of your avatar.
## Supported Providers
| Provider | Models | Context Window | Best For |
|---|---|---|---|
| Google Gemini | gemini-3.0-flash, gemini-2.5-pro | 1M tokens | RAG, long documents |
| OpenAI | gpt-4o, gpt-4o-mini | 128K tokens | General purpose |
| Anthropic | claude-sonnet-4, claude-opus-4 | 200K tokens | Complex reasoning |
## Feature Comparison
| Feature | Gemini | OpenAI | Anthropic |
|---|---|---|---|
| Streaming | ✅ | ✅ | ✅ |
| Function calling | ✅ | ✅ | ✅ |
| Vision/Images | ✅ | ✅ | ✅ |
| Native RAG | ✅ File Search | ❌ | ❌ |
| Max context | 1M tokens | 128K tokens | 200K tokens |
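All three providers stream tokens, and the consuming loop looks much the same regardless of SDK: iterate over deltas, render each one immediately, and accumulate the full reply. A minimal sketch with a stub iterator standing in for an SDK's streaming response (real SDKs yield chunk objects rather than plain strings, so you would extract the text field first):

```python
def consume_stream(chunks):
    """Accumulate streamed text deltas into the full reply string."""
    reply = []
    for delta in chunks:
        reply.append(delta)
        # Render each fragment as it arrives so the avatar can speak early
        print(delta, end="", flush=True)
    print()
    return "".join(reply)

# Stub stream standing in for a provider SDK's streaming iterator
fake_stream = iter(["Hel", "lo, ", "avatar!"])
full_reply = consume_stream(fake_stream)
```

Streaming matters for avatars in particular: text-to-speech can begin on the first sentence instead of waiting for the entire completion.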
## Pricing Comparison
Cost per 1 million tokens (approximate):
| Provider | Model | Input | Output |
|---|---|---|---|
| Gemini | gemini-3.0-flash | $0.50 | $3.00 |
| Gemini | gemini-2.5-pro | $1.25 | $5.00 |
| OpenAI | gpt-4o | $2.50 | $10.00 |
| OpenAI | gpt-4o-mini | $0.15 | $0.60 |
| Anthropic | claude-sonnet-4 | $3.00 | $15.00 |
| Anthropic | claude-haiku-3.5 | $0.80 | $4.00 |
### Per-Session Cost Estimate
Based on roughly 500 input tokens and 300 output tokens per interaction, with 4 interactions per 3-minute session:
| Provider/Model | Cost per Session |
|---|---|
| Gemini Flash | ~$0.01 |
| GPT-4o Mini | ~$0.01 |
| GPT-4o | ~$0.05 |
| Claude Sonnet | ~$0.08 |
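The arithmetic behind these estimates can be sketched as a small helper. Note this is a naive lower bound: it sums per-interaction tokens and ignores conversation history, which most chat APIs re-send on every turn and which inflates input tokens as a session progresses, so real sessions cost more (the table above rounds up accordingly). The `PRICES` dict and `session_cost` function are illustrative names, not part of any SDK:

```python
# USD per 1M tokens: (input, output), from the pricing table above
PRICES = {
    "gemini-3.0-flash": (0.50, 3.00),
    "gpt-4o-mini": (0.15, 0.60),
    "gpt-4o": (2.50, 10.00),
    "claude-sonnet-4": (3.00, 15.00),
}

def session_cost(model, interactions=4, in_tok=500, out_tok=300):
    """Naive lower-bound cost of one session in USD.

    Ignores conversation-history growth: each turn's re-sent context
    adds input tokens on top of the per-interaction estimate.
    """
    price_in, price_out = PRICES[model]
    total_in = interactions * in_tok    # 2,000 input tokens at defaults
    total_out = interactions * out_tok  # 1,200 output tokens at defaults
    return (total_in * price_in + total_out * price_out) / 1_000_000

for model in PRICES:
    print(f"{model}: ${session_cost(model):.4f}")
```

Adjusting `interactions`, `in_tok`, and `out_tok` to match your observed traffic gives a quick back-of-envelope budget before committing to a provider.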
## Choosing a Provider
**Choose Gemini if:**
- You want the best RAG support (native File Search)
- You need to process long documents (1M context)
- Cost efficiency is important
- You're starting fresh without existing infrastructure
**Choose OpenAI if:**
- You already have OpenAI infrastructure
- You need the GPT-4o ecosystem
- You want broad model selection
**Choose Anthropic if:**
- You need Claude's reasoning capabilities
- You have complex, nuanced conversations
- You prefer Anthropic's approach to safety
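The decision guidance above can be condensed into a small selection function. This is purely illustrative (the function name, parameters, and priority order are assumptions made for this sketch, mirroring the bullets rather than prescribing a policy):

```python
def pick_provider(needs_native_rag=False, max_context_tokens=0,
                  complex_reasoning=False, has_openai_stack=False):
    """Map the selection criteria above to a provider name.

    Priority order is an assumption: hard capability limits first
    (native RAG, context window), then reasoning needs, then
    existing infrastructure, then the cost-efficient default.
    """
    if needs_native_rag or max_context_tokens > 200_000:
        return "gemini"      # native File Search, 1M-token context
    if complex_reasoning:
        return "anthropic"   # Claude's reasoning strengths
    if has_openai_stack:
        return "openai"      # reuse existing infrastructure
    return "gemini"          # cost-efficient default for greenfield projects
```

For example, `pick_provider(max_context_tokens=500_000)` returns `"gemini"` because only its 1M-token window fits the document.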
## Configuration
See the individual provider pages for detailed setup: