This diagram showcases the dual-service processing pipeline that handles incoming user queries using both Python (LLM + streaming) and Node.js (chat orchestration, storage).
This architecture separates compute-heavy AI logic from chat state management to ensure scalability, maintainability, and fast streaming UX.