Conversation Flow
End-to-end flow from user message to AI response, including streaming.
Services Involved
llm-core and completion-service always; agent-service and mcp-client-service only when the agent has tools
Simple Conversation (No Tools)
1. Client Sends Message
Client calls POST /api/v1/conversation on llm-core.
{
  "message": "Explain Kubernetes pods",
  "conversationId": "uuid-or-null",
  "agentId": "uuid"
}
If conversationId is null, a new conversation is created.
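As a concrete illustration, a minimal TypeScript client call might look like the following. The base URL is a placeholder and error handling is omitted; only the endpoint and body fields come from this doc.

// Hypothetical client call; BASE_URL is a placeholder, not a real host.
const BASE_URL = "https://llm-core.internal.example.com";

async function sendMessage(
  message: string,
  conversationId: string | null, // null starts a new conversation
  agentId: string,
): Promise<Response> {
  return fetch(`${BASE_URL}/api/v1/conversation`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ message, conversationId, agentId }),
  });
}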
2. LLM Core Resolves Context
llm-core does the following (sketched in code after the list):
- Creates or loads the conversation
- Stores the user message
- Resolves the agent configuration (model, instructions, tools)
- Loads conversation history
- Fetches any attached documents or files
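A condensed TypeScript sketch of this step. The Store interface and every name in it are hypothetical; this only mirrors the five bullets above, not the real llm-core internals.

// Illustrative types; the actual llm-core internals are not documented here.
interface Agent { model: string; instructions: string; tools: string[] }
interface Message { role: "user" | "assistant" | "tool"; content: string }
interface Conversation { id: string; messages: Message[] }

interface Store {
  loadConversation(id: string): Promise<Conversation>;
  createConversation(agentId: string): Promise<Conversation>;
  appendMessage(conversationId: string, msg: Message): Promise<void>;
  resolveAgent(agentId: string): Promise<Agent>;
  loadAttachments(conversationId: string): Promise<string[]>;
}

async function resolveContext(
  db: Store,
  req: { message: string; conversationId: string | null; agentId: string },
) {
  // Create or load the conversation
  const conversation = req.conversationId
    ? await db.loadConversation(req.conversationId)
    : await db.createConversation(req.agentId);

  // Store the user message
  await db.appendMessage(conversation.id, { role: "user", content: req.message });

  // Resolve the agent configuration and load history + attachments
  const agent = await db.resolveAgent(req.agentId);
  const history = conversation.messages;
  const attachments = await db.loadAttachments(conversation.id);

  return { conversation, agent, history, attachments };
}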
3. Call Completion Service
If the agent has no tools configured, llm-core calls completion-service directly:
llm-core --POST /api/v1/completions--> completion-service
completion-service routes the request to the appropriate LLM provider (OpenAI, Anthropic, Google, etc.) based on the model name.
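How the routing works is not spelled out here; a plausible sketch is prefix matching on the model name, where the prefixes below are assumptions:

// Assumed routing rule: provider inferred from the model name prefix.
// The actual mapping inside completion-service may differ.
type Provider = "openai" | "anthropic" | "google";

function routeByModel(model: string): Provider {
  if (model.startsWith("gpt-")) return "openai";
  if (model.startsWith("claude-")) return "anthropic";
  if (model.startsWith("gemini-")) return "google";
  throw new Error(`No provider registered for model: ${model}`);
}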
4. Stream Response
For streaming (default):
- completion-service returns NDJSON chunks
- llm-core converts them to SSE events (a conversion sketch follows this list)
- Client receives text chunks in real time
For non-streaming:
- completion-service returns the full response
- llm-core wraps it and returns JSON
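A minimal sketch of the streaming conversion, assuming each NDJSON line carries a { delta } text chunk and the SSE event name is message (both assumptions):

// Reads NDJSON lines from completion-service and re-emits them as SSE frames.
async function* ndjsonToSse(body: ReadableStream<Uint8Array>): AsyncGenerator<string> {
  const reader = body.getReader();
  const decoder = new TextDecoder();
  let buffer = "";

  for (;;) {
    const { done, value } = await reader.read();
    if (done) break;
    buffer += decoder.decode(value, { stream: true });

    // Each complete line is one NDJSON chunk
    let newline: number;
    while ((newline = buffer.indexOf("\n")) >= 0) {
      const line = buffer.slice(0, newline).trim();
      buffer = buffer.slice(newline + 1);
      if (!line) continue;
      const chunk = JSON.parse(line) as { delta?: string }; // assumed chunk shape
      yield `event: message\ndata: ${JSON.stringify(chunk)}\n\n`;
    }
  }
}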
5. Store and Return
llm-core stores the assistant message, updates the conversation, and finalizes the response.
Conversation with Agent Tools
When the agent has tools configured, llm-core delegates to agent-service instead.
1-2. Same as Above
Client sends message, llm-core resolves context.
3. Call Agent Service
llm-core --POST /api/v1/agent/run--> agent-service
The request includes: messages, model, tool names, instructions, and a stream flag.
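A representative request body; the field names, model, and tool names are illustrative assumptions based on the list above:

{
  "messages": [{ "role": "user", "content": "Explain Kubernetes pods" }],
  "model": "gpt-4o",
  "tools": ["search_docs", "fetch_url"],
  "instructions": "You are a helpful assistant.",
  "stream": true
}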
4. Agentic Loop
agent-service runs the agentic loop (sketched in code after the trace):
agent-service --GET /mcp/v1/tools/list-tools--> mcp-client-service
agent-service --POST /api/v1/completions--> completion-service
(LLM returns tool calls)
agent-service --POST /mcp/v1/tools/call-tool--> mcp-client-service
(tool results fed back)
agent-service --POST /api/v1/completions--> completion-service
(repeat until the LLM returns plain text or 15 iterations are reached)
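The same loop in sketch form. The client interfaces and message shapes are hypothetical; only the endpoints and the 15-iteration cap come from the trace above.

// Illustrative agentic loop; the client objects and types are assumptions.
interface ToolCall { name: string; arguments: Record<string, unknown> }
interface CompletionResult { text?: string; toolCalls?: ToolCall[] }
interface ChatMessage { role: string; content: string }

const MAX_ITERATIONS = 15;

async function runAgenticLoop(
  mcp: { listTools(): Promise<string[]>; callTool(call: ToolCall): Promise<string> },
  llm: { complete(messages: ChatMessage[], tools: string[]): Promise<CompletionResult> },
  messages: ChatMessage[],
): Promise<string> {
  const tools = await mcp.listTools(); // GET /mcp/v1/tools/list-tools

  for (let i = 0; i < MAX_ITERATIONS; i++) {
    const result = await llm.complete(messages, tools); // POST /api/v1/completions

    // Plain text (no tool calls) means the model is done
    if (!result.toolCalls?.length) return result.text ?? "";

    // Execute each tool call and feed the result back into the history
    for (const call of result.toolCalls) {
      const output = await mcp.callTool(call); // POST /mcp/v1/tools/call-tool
      messages.push({ role: "tool", content: output });
    }
  }
  return "Iteration limit reached."; // the 15-iteration cap from the trace
}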
5. Stream Response
agent-service streams SSE events back through llm-core to the client (a sample stream follows the list). Events include:
- Text chunks from the LLM
- Tool call notifications
- Tool result summaries
- Final response
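A hypothetical excerpt of such a stream; the event names and payload shapes are assumptions chosen to match the four event kinds above:

event: text
data: {"delta": "A pod is the smallest deployable unit..."}

event: tool_call
data: {"name": "search_docs", "arguments": {"query": "kubernetes pods"}}

event: tool_result
data: {"name": "search_docs", "summary": "3 matching documents"}

event: done
data: {"finalText": "A pod is the smallest deployable unit in Kubernetes..."}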
6. Store and Return
Same as the simple flow: llm-core stores everything and finalizes.
Token Usage Tracking
After every completion (in both the simple and agent flows), completion-service publishes a transaction event to RabbitMQ. llm-core consumes the event and stores it in model_transactions.
completion-service --[transaction queue]--> llm-core
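The event payload is not specified here; a plausible shape, with every field an assumption, might be:

{
  "conversationId": "uuid",
  "model": "gpt-4o",
  "promptTokens": 1250,
  "completionTokens": 340,
  "totalTokens": 1590
}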
Full Diagram
Client
  |
  | POST /api/v1/conversation
  v
llm-core
  |
  |--- No tools? --> completion-service --> LLM API --> response
  |
  |--- Has tools? --> agent-service
  |                     |
  |                     +--> mcp-client-service (list tools)
  |                     |
  |                     +--> completion-service (LLM call)
  |                     |      |
  |                     |      v
  |                     |    LLM API
  |                     |      |
  |                     |      v
  |                     |    tool_calls?
  |                     |      |
  |                     |      yes --> mcp-client-service (call tool)
  |                     |                |
  |                     |                v
  |                     |              MCP tool server
  |                     |                |
  |                     |                v
  |                     |              results --> loop back to LLM
  |                     |
  |                     +--> final text response
  |
  v
llm-core stores messages
  |
  v
Client (SSE stream)