Agents provide enhanced chat experiences by automatically selecting between Simple Chat and RAG Chat based on document availability and dynamically assembling tools.
Processing Flow
1. Agent Selection and Routing
- User submits query with selected agent
- Python API routes request to Service Controller
- Service Controller selects appropriate model service
- Agent ID and configuration must be fully loaded
- Agent metadata (tools, prompt, document presence) verified
2. Document-Based Path Selection
With Documents (RAG Chat):- Agent’s custom prompt
- RAG Tool
- Web Analysis Tool
- Image Generation Tool (GPT models only)
- Web Search Tool (GPT-4.1 search only)
- Agent’s prompt
- Simple Chat Tool
- Image Generation Tool (GPT models only)
- Web Analysis Tool
- Web Search Tool (GPT-4.1 search only)
- Path selection handled by Agent Routing Layer
- Document presence tracked via
agent.doc_ids
or similar flag
3. Context Construction
Combined context elements:- Chat History - Previous conversation messages
- Agent Prompt - Custom agent instructions
- User Query - Current user input
- Overflow handled using rolling window strategy
- Context trimmed or batched as needed
4. Response Generation and Storage
Processing:- LLM generates response streamed live to user
- MongoDB storage via Cost Callback tracker
- LLM response content
- Agent ID and query metadata
- Token cost and usage metrics
- Model configuration details
5. RAG Implementation (Document-Based Agents)
Document Processing:- Text extraction from uploaded files
- Content split into chunks
- Embedding generation using embedding model
- Storage in Qdrant (or Pinecone)
- RAG Tool queries Qdrant for similar chunks
- Retrieved chunks used as LLM context
- Enhanced responses based on document content
- Consistent embedding model for upload and retrieval
- Top-k vector search with agent-level metadata filtering
Architecture

Agent Processing Architecture
Tool Activation Matrix
Tool | Simple Chat | RAG Chat | Model Requirement |
---|---|---|---|
RAG Tool | ❌ | ✅ | Any |
Simple Chat Tool | ✅ | ❌ | Any |
Web Analysis Tool | ✅ | ✅ | Any |
Image Generation | ✅ | ✅ | GPT models only |
Web Search | ✅ | ✅ | GPT-4.1 search only |
Key Components
- Agent Routing Layer: Path selection logic
- Service Controller: Model service management
- RAG Tool: Document-based context retrieval
- LLM Chain: Response generation pipeline
- Cost Callback: Token usage and pricing tracking
Tool activation depends on both selected agent and model. Validate tool permissions and document presence during implementation.
Troubleshooting
Agent RAG Not Triggering
- Confirm agent has active documents linked in backend
- Verify embeddings generated and stored correctly in Qdrant
- Check model selection supports required tools
Agent Prompt Issues
- Review agent metadata and prompt formatting
- Verify context builder includes agent prompt and history
- Check LLM Chain handler logs for prompt construction issues