
Data Flow

Understanding how data flows through SmarterAvatar helps with debugging, optimization, and customization.

User Interaction Flow

┌──────────┐     ┌──────────┐     ┌──────────┐     ┌──────────┐
│   User   │────▶│  Client  │────▶│   API    │────▶│Providers │
│  Input   │     │   App    │     │  Layer   │     │          │
└──────────┘     └──────────┘     └──────────┘     └──────────┘
                      │                │                │
                      │                │                │
                      ▼                ▼                ▼
                 ┌──────────┐     ┌──────────┐     ┌──────────┐
                 │  Local   │     │ Database │     │ External │
                 │  State   │     │(Postgres)│     │   APIs   │
                 └──────────┘     └──────────┘     └──────────┘

Voice Input Flow

When a user speaks:

  1. Browser captures audio via Web Audio API
  2. Client sends audio to /api/transcribe endpoint
  3. STT Provider (Whisper/Deepgram) transcribes audio to text
  4. Text is processed the same way as typed input
  5. Transcript is logged to database (if enabled)

[Microphone] → [Audio Buffer] → [STT API] → [Text] → [Chat Processing]
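
The steps above can be sketched on the server side roughly as follows. This is a minimal sketch, not SmarterAvatar's actual handler: the `SttProvider` interface, `VoiceResult`, and the injected `process_chat` and `log_transcript` callables are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Protocol


class SttProvider(Protocol):
    """Hypothetical interface; a real provider would wrap Whisper or Deepgram."""

    def transcribe(self, audio: bytes) -> str: ...


@dataclass
class VoiceResult:
    transcript: str
    reply: str


def handle_voice_input(
    audio: bytes,
    stt: SttProvider,
    process_chat: Callable[[str], str],
    log_transcript: Optional[Callable[[str], None]] = None,
) -> VoiceResult:
    """Transcribe audio, then route the text through the normal chat path."""
    if not audio:
        raise ValueError("empty audio buffer")
    transcript = stt.transcribe(audio).strip()
    # Transcribed text takes the same path as typed input.
    reply = process_chat(transcript)
    # Logging is optional, mirroring the "(if enabled)" step.
    if log_transcript is not None:
        log_transcript(transcript)
    return VoiceResult(transcript=transcript, reply=reply)
```

Passing the chat handler and the logger in as callables keeps the transcription step independent of which STT provider is configured and of whether transcript logging is turned on.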

Chat Processing Flow

For each user message:

  1. Message received at /api/chat endpoint
  2. RAG retrieval searches knowledge base for context
  3. Prompt construction combines system prompt + context + user message
  4. LLM inference generates response
  5. Response logged to database
  6. Response sent to avatar for synthesis

[User Message]
      │
      ▼
[RAG Retrieval] ──▶ [Relevant Documents]
      │                      │
      ▼                      ▼
[System Prompt] + [Context] + [User Message]
      │
      ▼
[LLM Provider] ──▶ [Generated Response]
      │
      ▼
[Avatar Provider] ──▶ [Video Stream]
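
As a rough illustration of the chat pipeline, the sketch below wires the stages together with injected callables. The `ChatDeps` container, the function names, and the prompt layout are assumptions made for this example; the actual prompt template and provider calls are not documented here.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class ChatDeps:
    """Hypothetical injected dependencies, one per pipeline stage."""

    retrieve: Callable[[str], List[str]]      # RAG retrieval
    generate: Callable[[str], str]            # LLM inference
    log_response: Callable[[str, str], None]  # database logging
    speak: Callable[[str], None]              # avatar synthesis


def build_prompt(system_prompt: str, context: List[str], user_message: str) -> str:
    """Combine system prompt + retrieved context + user message."""
    context_block = "\n".join(f"- {doc}" for doc in context) or "(no context found)"
    return (
        f"{system_prompt}\n\n"
        f"Context:\n{context_block}\n\n"
        f"User: {user_message}"
    )


def handle_chat(message: str, system_prompt: str, deps: ChatDeps) -> str:
    """Run one user message through steps 2 to 6 of the flow."""
    context = deps.retrieve(message)                        # step 2
    prompt = build_prompt(system_prompt, context, message)  # step 3
    response = deps.generate(prompt)                        # step 4
    deps.log_response(message, response)                    # step 5
    deps.speak(response)                                    # step 6
    return response
```

Because each stage is a plain callable, a single stage can be swapped for a stub when debugging, for example replacing `generate` with a canned reply to isolate retrieval problems.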

Avatar Synthesis Flow

Avatar rendering uses a hybrid approach: the session is created server-side, while the client connects to HeyGen directly for streaming:

  1. Session created via server-side API (secure)
  2. Token issued to client
  3. Client SDK connects directly to HeyGen
  4. Text sent for synthesis
  5. Video streams back to client

[Server]                          [Client]                        [HeyGen]
│                                 │                               │
│──── Create Session ────────────▶│                               │
│◀─── Session Token ──────────────│                               │
│                                 │                               │
│                                 │──── Connect with Token ──────▶│
│                                 │◀─── WebSocket Established ────│
│                                 │                               │
│──── Send Response Text ────────▶│──── Synthesize ──────────────▶│
│                                 │◀─── Video Stream ─────────────│
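
The server-side half of this handshake might look something like the sketch below. It assumes a hypothetical `request_token` callable standing in for the avatar provider's token endpoint (the real HeyGen call is not shown), and the 300-second lifetime is an illustrative default, not a documented value.

```python
import time
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class ClientSession:
    """What the browser receives: a token and its expiry, never the API key."""

    token: str
    expires_at: float


def create_client_session(
    api_key: str,
    request_token: Callable[[str], str],
    ttl_seconds: int = 300,  # assumed lifetime for illustration only
    now: Callable[[], float] = time.time,
) -> ClientSession:
    """Exchange the server-held API key for a short-lived session token.

    `request_token` is a hypothetical stand-in for the provider's token
    endpoint; in practice the provider decides the token's real lifetime.
    """
    if not api_key:
        raise ValueError("API key must be configured on the server")
    token = request_token(api_key)
    # Only the token and expiry leave this function; the key stays server-side.
    return ClientSession(token=token, expires_at=now() + ttl_seconds)


def is_expired(session: ClientSession, now: Callable[[], float] = time.time) -> bool:
    """True once the session token should no longer be used."""
    return now() >= session.expires_at
```

The point of the split is that the API key is only ever read inside `create_client_session`; the object returned to the client carries nothing that outlives the session.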

Knowledge Base Flow

Documents are processed asynchronously:

  1. Upload via Admin → File saved to storage
  2. Processing triggered → File parsed and chunked
  3. Embeddings generated → Vectors created for each chunk
  4. Index updated → Ready for retrieval

[Document Upload] → [Parser] → [Chunker] → [Embedder] → [Vector Store]
                                                              │
                                                              ▼
                                                       [RAG Retrieval]
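
To make the chunk, embed, index, retrieve sequence concrete, here is a self-contained toy version. Everything in it is an assumption for illustration: the word-window chunker, the bag-of-words `embed` function (a real deployment would call an embedding model), and the in-memory `VectorStore` class standing in for the actual index.

```python
import math
import re
from collections import Counter
from typing import Dict, List, Tuple


def chunk_text(text: str, max_words: int = 50, overlap: int = 10) -> List[str]:
    """Split text into overlapping windows of at most `max_words` words."""
    if overlap >= max_words:
        raise ValueError("overlap must be smaller than max_words")
    words = text.split()
    step = max_words - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # the last window already reached the end of the text
    return chunks


def embed(text: str) -> Dict[str, float]:
    """Toy embedder: L2-normalised bag-of-words counts (not a real model)."""
    counts = Counter(re.findall(r"[a-z0-9]+", text.lower()))
    norm = math.sqrt(sum(c * c for c in counts.values())) or 1.0
    return {word: c / norm for word, c in counts.items()}


class VectorStore:
    """In-memory stand-in for the vector index."""

    def __init__(self) -> None:
        self._rows: List[Tuple[str, Dict[str, float]]] = []

    def add_document(self, text: str, **chunk_kwargs: int) -> int:
        """Chunk, embed, and index a document; returns the chunk count."""
        chunks = chunk_text(text, **chunk_kwargs)
        self._rows.extend((c, embed(c)) for c in chunks)
        return len(chunks)

    def search(self, query: str, k: int = 3) -> List[str]:
        """Return up to k chunks ranked by cosine similarity to the query."""
        q = embed(query)
        scored = [
            (sum(w * vec.get(term, 0.0) for term, w in q.items()), chunk)
            for chunk, vec in self._rows
        ]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [chunk for score, chunk in scored[:k] if score > 0]
```

Since both vectors are normalised, the dot product in `search` is the cosine similarity. Overlapping windows are one common way to avoid splitting a relevant sentence across two chunks.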

Analytics Flow

Session data is captured throughout:

[Session Start] ──▶ [Conversation] ──▶ [Session End]
       │                  │                  │
       ▼                  ▼                  ▼
 [Start Time]        [Messages]         [Duration]
 [IP Address]        [Topics]           [Email (opt)]
 [User Agent]        [Citations]        [Feedback]
                          │
                          ▼
                 [Database Storage]
                          │
                          ▼
                  [Admin Dashboard]
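
One way to picture the record that accumulates over a session is the sketch below. The `SessionRecord` class and its field names are hypothetical, chosen to mirror the boxes in the diagram; the real database schema may differ.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Any, Dict, List, Optional


@dataclass
class SessionRecord:
    """Hypothetical shape for one analytics row; field names are illustrative."""

    # Captured at session start
    started_at: datetime
    ip_address: str
    user_agent: str
    # Accumulated during the conversation
    messages: List[str] = field(default_factory=list)
    topics: List[str] = field(default_factory=list)
    citations: List[str] = field(default_factory=list)
    # Captured at session end
    ended_at: Optional[datetime] = None
    email: Optional[str] = None      # optional, as in the diagram
    feedback: Optional[str] = None

    def end(self, at: datetime, email: Optional[str] = None,
            feedback: Optional[str] = None) -> None:
        """Close the session and attach the end-of-session fields."""
        if at < self.started_at:
            raise ValueError("session cannot end before it starts")
        self.ended_at = at
        self.email = email
        self.feedback = feedback

    @property
    def duration_seconds(self) -> Optional[float]:
        """Duration is derived, so it is None until the session has ended."""
        if self.ended_at is None:
            return None
        return (self.ended_at - self.started_at).total_seconds()

    def to_row(self) -> Dict[str, Any]:
        """Flatten to a dict ready for a database insert."""
        return {
            "started_at": self.started_at.isoformat(),
            "ip_address": self.ip_address,
            "user_agent": self.user_agent,
            "message_count": len(self.messages),
            "topics": list(self.topics),
            "citations": list(self.citations),
            "duration_seconds": self.duration_seconds,
            "email": self.email,
            "feedback": self.feedback,
        }
```

Deriving the duration from the two timestamps, rather than storing it separately, avoids the stored value drifting out of sync with the start and end times.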

Security Considerations

  • API keys are never exposed to the client
  • Session tokens are short-lived and scoped
  • The database contains no sensitive credentials
  • Audio is processed but not stored (unless storage is configured)
  • Analytics capture (IP address, user agent, optional email) can be configured for privacy compliance