## Overview

Omi’s chat system is an agentic AI pipeline that lets users hold intelligent conversations about their recorded memories, calendar events, health data, and more. This document provides a complete technical walkthrough of how questions flow through the system:

- **Classifies**: determines whether personal context is needed
- **Routes**: picks the Simple, Agentic, or Persona path
- **Tools**: 22+ integrated data sources
- **Retrieves**: vector search & metadata filters
- **Cites**: links to source conversations
- **Streams**: real-time thinking & response
## System Architecture Diagram
## The Three Routing Paths
- No Context
- Agentic
- Persona
### Path 1: No Context Conversation
**When triggered:** Simple greetings, general advice, brainstorming questions

**Classification criteria** (from the `requires_context()` function):

- Greetings: “Hi”, “Hello”, “How are you?”
- General knowledge: “What’s the capital of France?”
- Advice without personal context: “Tips for productivity”
### Classification Logic
The `requires_context()` function determines the routing path:
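The original code isn’t reproduced here; below is a minimal sketch of what this classifier might look like, assuming the OpenAI Python client and the `gpt-4.1-mini` model listed under LLM Models Used. The prompt wording is illustrative, not the actual prompt.

```python
# Hypothetical sketch of requires_context(); the real implementation in the
# Omi backend may differ.
from openai import OpenAI

client = OpenAI()

CLASSIFIER_PROMPT = """Decide whether answering the user's message requires \
their personal context (conversations, memories, calendar, or health data).
Greetings, general knowledge, and generic advice do NOT require context.
Reply with exactly one word: true or false.

Message: {message}"""


def requires_context(message: str) -> bool:
    """Return True when the question should take the context-aware path."""
    resp = client.chat.completions.create(
        model="gpt-4.1-mini",  # fast classification model (see LLM Models Used)
        messages=[{"role": "user", "content": CLASSIFIER_PROMPT.format(message=message)}],
        temperature=0,
    )
    return resp.choices[0].message.content.strip().lower() == "true"
```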
## The Agentic Tool System
### How Tool Calling Works
The LangGraph ReAct agent follows this cycle (sketched in code below):

1. **Receive Question**: the system prompt provides tool descriptions, the user’s timezone, and citation instructions
2. **Decide Tools**: the LLM autonomously decides which tool(s) to call based on question intent
3. **Execute Tools**: tool calls are executed and the results are returned to the agent
4. **Synthesize or Continue**: the agent synthesizes a response, or makes additional tool calls if more context is needed
5. **Generate Answer**: the final answer is generated with proper [1][2] citations linking to source conversations
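As a rough sketch of this cycle, LangGraph’s prebuilt ReAct agent wires the decide/execute/synthesize loop together. The stub tool and prompt below are placeholders, not Omi’s actual definitions from `backend/utils/retrieval/agentic.py`.

```python
# Minimal sketch of the ReAct loop using LangGraph's prebuilt agent.
from langchain_core.tools import tool
from langchain_openai import ChatOpenAI
from langgraph.prebuilt import create_react_agent

@tool
def get_memories_tool(limit: int = 10) -> str:
    """Return personal facts extracted about the user."""
    return "stub: user memories"  # real version reads from Firestore

agent = create_react_agent(
    ChatOpenAI(model="gpt-5.1"),  # main agentic model (see LLM Models Used)
    tools=[get_memories_tool],    # the real agent loads 22+ tools dynamically
    prompt="You are Omi. Cite sources as [1][2]. User timezone: America/Los_Angeles.",
)

# The agent loops: decide tool -> execute -> synthesize, or call more tools.
result = agent.invoke({"messages": [("user", "What do you know about me?")]})
print(result["messages"][-1].content)
```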
### Available Tools (22+)

Tools are loaded dynamically based on the user’s enabled integrations and installed apps.
#### Conversation & Memory Tools

Core tools for retrieving the user’s conversations and extracted memories.
| Tool | Purpose | Key Parameters |
|---|---|---|
| `get_conversations_tool` | Retrieve by date range | `start_date`, `end_date`, `limit`, `include_transcript` |
| `vector_search_conversations_tool` | Semantic search | `query`, `start_date`, `end_date`, `limit` |
| `search_conversations_tool` | Full-text keyword search | `query`, `limit`, `offset` |
| `get_memories_tool` | Personal facts about the user | `limit`, `offset` |
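For illustration, here is a hedged sketch of how one of these might be declared as a LangChain tool. The parameter names come from the table above; the body is a stub standing in for the real Pinecone/Firestore lookup in `conversation_tools.py`.

```python
# Illustrative shape of vector_search_conversations_tool; the actual
# definition lives in backend/utils/retrieval/tools/conversation_tools.py.
from typing import Optional
from langchain_core.tools import tool

@tool
def vector_search_conversations_tool(
    query: str,
    start_date: Optional[str] = None,  # ISO-8601; applied as a metadata filter
    end_date: Optional[str] = None,
    limit: int = 5,
) -> str:
    """Semantic search over the user's conversations."""
    # Real implementation: embed `query`, filter Pinecone by uid/date,
    # then hydrate the matching conversations from Firestore.
    return "stub: formatted conversation summaries"
```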
#### Action Item Tools

Manage tasks and to-dos extracted from conversations.
| Tool | Purpose |
|---|---|
| `get_action_items_tool` | Retrieve pending tasks |
| `create_action_item_tool` | Create a new task |
| `update_action_item_tool` | Mark complete / update |
#### Calendar Tools (Google Calendar)

Full CRUD operations on the user’s Google Calendar.
| Tool | Purpose |
|---|---|
| `get_calendar_events_tool` | Fetch events by date/person |
| `create_calendar_event_tool` | Create meetings with attendees |
| `update_calendar_event_tool` | Modify existing events |
| `delete_calendar_event_tool` | Cancel meetings |
#### Integration Tools

Connect to external services for richer context.
| Tool | Service | Purpose |
|---|---|---|
| `get_gmail_messages_tool` | Gmail | Search emails |
| `get_whoop_sleep_tool` | Whoop | Sleep data |
| `get_whoop_recovery_tool` | Whoop | Recovery scores |
| `get_whoop_workout_tool` | Whoop | Workout history |
| `search_notion_pages_tool` | Notion | Search workspace |
| `get_twitter_tweets_tool` | Twitter/X | Recent tweets |
| `get_github_pull_requests_tool` | GitHub | Open PRs |
| `get_github_issues_tool` | GitHub | Open issues |
| `perplexity_search_tool` | Perplexity | Web search |
#### Dynamic App Tools

Third-party apps can define custom tools that become available when users enable them.
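A hedged sketch of how such a loader might work; the manifest shape and helper functions here are assumptions, and the real loader is `backend/utils/retrieval/tools/app_tools.py`.

```python
# Sketch of dynamic tool loading from enabled apps; the manifest shape and
# helpers are hypothetical.
from functools import partial
from langchain_core.tools import Tool

def get_enabled_apps(uid: str) -> list[dict]:
    """Stub: the real version reads the user's enabled apps from Firestore."""
    return [{"name": "demo_app", "tools": [
        {"name": "demo_tool", "description": "Example app-defined tool"},
    ]}]

def call_app_endpoint(app: dict, spec: dict, query: str) -> str:
    """Stub: the real version proxies the call to the app's webhook."""
    return f"stub response from {app['name']}.{spec['name']}"

def load_app_tools(uid: str) -> list[Tool]:
    """Build LangChain tools from each enabled app's declared tool specs."""
    return [
        Tool(name=spec["name"], description=spec["description"],
             func=partial(call_app_endpoint, app, spec))
        for app in get_enabled_apps(uid)
        for spec in app.get("tools", [])
    ]
```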
### Safety Guards
## Vector Search Deep Dive
### Configuration
| Setting | Value |
|---|---|
| Database | Pinecone (serverless) |
| Embedding Model | `text-embedding-3-large` (OpenAI) |
| Vector Dimensions | 3,072 |
| Namespace | `"ns1"` |
| Vector ID Format | `{uid}-{conversation_id}` |
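For reference, one-time setup of a serverless index with these settings might look like the following; the index name, metric, cloud, and region are assumptions.

```python
# One-time Pinecone index setup matching the configuration above.
from pinecone import Pinecone, ServerlessSpec

pc = Pinecone(api_key="PINECONE_API_KEY")
pc.create_index(
    name="omi-conversations",  # hypothetical index name
    dimension=3072,            # text-embedding-3-large output size
    metric="cosine",           # assumed similarity metric
    spec=ServerlessSpec(cloud="aws", region="us-east-1"),
)
```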
### What Gets Embedded vs. Stored as Metadata
| Data | Embedded? | Metadata? |
|---|---|---|
| Title | Yes | No |
| Overview/Summary | Yes | No |
| Action Items | Yes | No |
| Full Transcript | No (too large) | No |
| People Mentioned | No | Yes |
| Topics | No | Yes |
| Entities | No | Yes |
| Dates Mentioned | No | Yes |
| `created_at` | No | Yes (Unix timestamp) |
### Vector Creation (Write Path)
Vectors are created ONCE during initial processing, not on every edit. Reprocessed conversations do NOT create new vectors.
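A hedged sketch of this write path, combining the configuration and embedding/metadata tables above; the client setup and field names are assumptions, and the real code lives in `backend/database/vector_db.py`.

```python
# Sketch of the write path: embed the summary fields once, store the
# filterable fields as Pinecone metadata.
from openai import OpenAI
from pinecone import Pinecone

openai_client = OpenAI()
index = Pinecone(api_key="PINECONE_API_KEY").Index("omi-conversations")  # hypothetical name

def upsert_conversation_vector(uid: str, conversation: dict) -> None:
    # Only title, overview, and action items are embedded (transcript is too large).
    text = "\n".join([
        conversation["title"],
        conversation["overview"],
        *conversation.get("action_items", []),
    ])
    embedding = openai_client.embeddings.create(
        model="text-embedding-3-large",  # 3,072 dimensions
        input=text,
    ).data[0].embedding

    index.upsert(
        vectors=[{
            "id": f"{uid}-{conversation['id']}",  # {uid}-{conversation_id}
            "values": embedding,
            "metadata": {
                "uid": uid,
                "people": conversation.get("people", []),
                "topics": conversation.get("topics", []),
                "entities": conversation.get("entities", []),
                "dates": conversation.get("dates_mentioned", []),
                "created_at": conversation["created_at"],  # Unix timestamp
            },
        }],
        namespace="ns1",
    )
```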
### Vector Query (Read Path)
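Continuing the write-path sketch above, the query side embeds the question and applies the user and date constraints as Pinecone metadata filters. Again an assumption-laden sketch, not the real code.

```python
# Sketch of the read path: embed the question, filter by user and date range.
def vector_search(uid: str, query: str, start_ts: int, end_ts: int, limit: int = 5) -> list[str]:
    embedding = openai_client.embeddings.create(
        model="text-embedding-3-large",
        input=query,
    ).data[0].embedding

    results = index.query(
        vector=embedding,
        top_k=limit,
        namespace="ns1",
        filter={
            "uid": {"$eq": uid},
            "created_at": {"$gte": start_ts, "$lte": end_ts},
        },
    )
    # Vector IDs are {uid}-{conversation_id}: strip the prefix, then fetch
    # the full conversations from Firestore.
    return [m.id.removeprefix(f"{uid}-") for m in results.matches]
```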
## Memories System
Memories are distinct from Conversations: they are structured facts extracted about the user over time.

### Memory Categories
| Category | Examples |
|---|---|
| `interesting` | Hobbies, opinions, stories |
| `system` | Preferences, habits |
| `manual` | User-defined facts |
### Extraction Rules
### Memory Retrieval in Chat
## Chat Sessions & Context
### Session Structure
### Context Window
## Citation System
The LLM generates citations in `[1][2]` format:
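A minimal sketch of how post-processing might map these bracketed indices back to the conversations retrieved during the turn; the function name is hypothetical.

```python
# Map [n] citation markers in the answer back to retrieved conversation IDs,
# producing the memories_id list saved with the message.
import re

def extract_cited_ids(answer: str, retrieved_ids: list[str]) -> list[str]:
    """retrieved_ids[i] is the conversation behind citation [i + 1]."""
    cited = {int(n) for n in re.findall(r"\[(\d+)\]", answer)}
    return [retrieved_ids[i - 1] for i in sorted(cited) if 0 < i <= len(retrieved_ids)]

# From the example flow below:
# extract_cited_ids("...with John[1]... schedule[1][2]...", ["conv_456", "conv_789"])
# -> ["conv_456", "conv_789"]
```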
## System Prompt Structure
The main system prompt includes tool descriptions, the user’s timezone and current datetime, and citation instructions.

### DateTime Formatting Rules
### Conversation Retrieval Strategy
The system prompt guides the LLM through a 5-step strategy (sketched below):

1. **Assess the question**: determine its type (temporal, topic, person, etc.)
2. **Choose the primary tool**: `get_conversations` for date-based questions, `vector_search` for topic-based ones
3. **Apply filters**: use `start_date`/`end_date` when temporal bounds are known
4. **Request transcripts**: only when detailed content is needed
5. **Cite sources**: always cite the conversations used in the answer
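As a sketch, this strategy could be embedded in the prompt roughly as follows; the wording is paraphrased, and the real text lives in `backend/utils/llm/chat.py`.

```python
# Paraphrased sketch of the retrieval-strategy section of the system prompt.
RETRIEVAL_STRATEGY = """\
When the user asks about their past conversations:
1. Assess the question type: temporal, topic, or person.
2. Prefer get_conversations for date-based questions and
   vector_search_conversations for topic-based ones.
3. Pass start_date/end_date whenever temporal bounds are known.
4. Request transcripts (include_transcript=true) only when detailed content is needed.
5. Cite every conversation you use as [1], [2], ... in your answer.
"""
```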
## LLM Models Used
| Model | Use Case | Location |
|---|---|---|
| `gpt-4.1-mini` | Fast classification, date extraction | `requires_context()`, filters |
| `gpt-4.1` | Medium complexity, initial QA | QA with RAG context |
| `gpt-5.1` | Agentic workflows with tool calling | Main chat agent |
| `text-embedding-3-large` | Vector embeddings (3,072 dims) | Pinecone queries |
| Gemini Flash 1.5 | Persona responses | Via OpenRouter |
| Claude 3.5 Sonnet | Persona responses | Via OpenRouter |
## Streaming Response Format
The backend streams responses in Server-Sent Events (SSE) format (illustrated below):

- Loading indicators with tool names
- Streaming response text
- Final message with linked memories
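The exact event payloads aren’t documented here; below is a hedged sketch of the stream shape, assuming a FastAPI endpoint (as in `backend/routers/chat.py`) and hypothetical event types.

```python
# Sketch of the SSE stream: thinking indicator -> text deltas -> final message.
import json
from fastapi.responses import StreamingResponse

async def stream_chat(answer_chunks, linked_memory_ids):
    async def events():
        # 1. Loading indicator naming the tool being executed
        yield f"data: {json.dumps({'type': 'thinking', 'tool': 'vector_search_conversations_tool'})}\n\n"
        # 2. Incremental response text
        async for chunk in answer_chunks:
            yield f"data: {json.dumps({'type': 'text', 'delta': chunk})}\n\n"
        # 3. Final message with linked memories
        yield f"data: {json.dumps({'type': 'done', 'memories_id': linked_memory_ids})}\n\n"
    return StreamingResponse(events(), media_type="text/event-stream")
```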
## Key File Locations
| Component | File Path |
|---|---|
| Chat Router | `backend/routers/chat.py` |
| LangGraph Router | `backend/utils/retrieval/graph.py` |
| Agentic System | `backend/utils/retrieval/agentic.py` |
| Tools Directory | `backend/utils/retrieval/tools/` |
| Conversation Tools | `backend/utils/retrieval/tools/conversation_tools.py` |
| Memory Tools | `backend/utils/retrieval/tools/memory_tools.py` |
| Calendar Tools | `backend/utils/retrieval/tools/calendar_tools.py` |
| App Tools Loader | `backend/utils/retrieval/tools/app_tools.py` |
| LLM Clients | `backend/utils/llm/clients.py` |
| Chat Prompts | `backend/utils/llm/chat.py` |
| Vector Database | `backend/database/vector_db.py` |
## Example: Question Flow
User asks: “What did I discuss with John yesterday about the project?”

**1. Classification**

- `requires_context()` → `TRUE` (temporal + person + topic reference)
- Route to: `agentic_context_dependent_conversation`

**2. Agent Decides Tools**

- System prompt provides: current datetime, tool descriptions
- Agent thinks: “Need conversations from yesterday about the project with John”
- Agent calls `vector_search_conversations_tool` with:
  - `query`: “John project discussion”
  - `start_date`: “2024-01-19T00:00:00-08:00”
  - `end_date`: “2024-01-19T23:59:59-08:00”
**3. Tool Execution**

- Embed query → `[0.012, -0.034, 0.056, ...]`
- Query Pinecone with `uid` filter + date range
- Fetch full conversations from Firestore
- Format for LLM context
**4. Response Generation**

LLM synthesizes the answer with citations:

> “Yesterday you discussed the Q1 roadmap with John[1]. He mentioned the frontend refactoring is ahead of schedule[1][2]…”
**5. Post-Processing**

- Extract citations → `memories_id: ["conv_456", "conv_789"]`
- Save message to Firestore
- Stream final response with linked conversation cards