Conversation Flow
End-to-end flow from user message to AI response, including streaming.
Services Involved
llm-core and completion-service always; agent-service and mcp-client-service only when the agent has tools
Simple Conversation (No Tools)
1. Client Sends Message
Client calls POST /api/v1/conversation on llm-core.
{
  "message": "Explain Kubernetes pods",
  "conversationId": "uuid-or-null",
  "agentId": "uuid"
}
If conversationId is null, a new conversation is created.
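As a concrete illustration, a minimal TypeScript client call might look like the following. The base URL is a placeholder and error handling is omitted; only the endpoint and body fields come from this doc.

// Hypothetical client call; BASE_URL is a placeholder, not a real host.
const BASE_URL = "https://llm-core.internal.example.com";

async function sendMessage(
  message: string,
  conversationId: string | null, // null starts a new conversation
  agentId: string,
): Promise<Response> {
  return fetch(`${BASE_URL}/api/v1/conversation`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ message, conversationId, agentId }),
  });
}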
2. LLM Core Resolves Context
llm-core does the following (sketched in code after the list):
- Creates or loads the conversation
- Stores the user message
- Resolves the agent configuration (model, instructions, tools)
- Loads conversation history
- Fetches any attached documents or files
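A condensed TypeScript sketch of this step. The Store interface and every name in it are hypothetical; this only mirrors the five bullets above, not the real llm-core internals.

// Illustrative types; the actual llm-core internals are not documented here.
interface Agent { model: string; instructions: string; tools: string[] }
interface Message { role: "user" | "assistant" | "tool"; content: string }
interface Conversation { id: string; messages: Message[] }

interface Store {
  loadConversation(id: string): Promise<Conversation>;
  createConversation(agentId: string): Promise<Conversation>;
  appendMessage(conversationId: string, msg: Message): Promise<void>;
  resolveAgent(agentId: string): Promise<Agent>;
  loadAttachments(conversationId: string): Promise<string[]>;
}

async function resolveContext(
  db: Store,
  req: { message: string; conversationId: string | null; agentId: string },
) {
  // Create or load the conversation
  const conversation = req.conversationId
    ? await db.loadConversation(req.conversationId)
    : await db.createConversation(req.agentId);

  // Store the user message
  await db.appendMessage(conversation.id, { role: "user", content: req.message });

  // Resolve the agent configuration and load history + attachments
  const agent = await db.resolveAgent(req.agentId);
  const history = conversation.messages;
  const attachments = await db.loadAttachments(conversation.id);

  return { conversation, agent, history, attachments };
}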
3. Call Completion Service
If the agent has no tools configured, llm-core calls completion-service directly:
llm-core --POST /api/v1/completions--> completion-service
completion-service routes the request to the appropriate LLM provider (OpenAI, Anthropic, Google, etc.) based on the model name.
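How the routing works is not spelled out here; a plausible sketch is prefix matching on the model name, where the prefixes below are assumptions:

// Assumed routing rule: provider inferred from the model name prefix.
// The actual mapping inside completion-service may differ.
type Provider = "openai" | "anthropic" | "google";

function routeByModel(model: string): Provider {
  if (model.startsWith("gpt-")) return "openai";
  if (model.startsWith("claude-")) return "anthropic";
  if (model.startsWith("gemini-")) return "google";
  throw new Error(`No provider registered for model: ${model}`);
}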
4. Stream Response
For streaming (default):
- completion-service returns NDJSON chunks
- llm-core converts them to SSE events (a conversion sketch follows this list)
- Client receives text chunks in real time
For non-streaming:
- completion-service returns the full response
- llm-core wraps it and returns JSON
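A minimal sketch of the streaming conversion, assuming each NDJSON line carries a { delta } text chunk and the SSE event name is message (both assumptions):

// Reads NDJSON lines from completion-service and re-emits them as SSE frames.
async function* ndjsonToSse(body: ReadableStream<Uint8Array>): AsyncGenerator<string> {
  const reader = body.getReader();
  const decoder = new TextDecoder();
  let buffer = "";

  for (;;) {
    const { done, value } = await reader.read();
    if (done) break;
    buffer += decoder.decode(value, { stream: true });

    // Each complete line is one NDJSON chunk
    let newline: number;
    while ((newline = buffer.indexOf("\n")) >= 0) {
      const line = buffer.slice(0, newline).trim();
      buffer = buffer.slice(newline + 1);
      if (!line) continue;
      const chunk = JSON.parse(line) as { delta?: string }; // assumed chunk shape
      yield `event: message\ndata: ${JSON.stringify(chunk)}\n\n`;
    }
  }
}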
5. Store and Return
llm-core stores the assistant message, updates the conversation, and finalizes the response.
Conversation with Agent Tools
When the agent has tools configured, llm-core delegates to agent-service instead.
1-2. Same as Above
Client sends message, llm-core resolves context.
3. Call Agent Service
llm-core --POST /api/v1/agent/run--> agent-service
The request includes: messages, model, tool names, instructions, and a stream flag.
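A representative request body; the field names, model, and tool names are illustrative assumptions based on the list above:

{
  "messages": [{ "role": "user", "content": "Explain Kubernetes pods" }],
  "model": "gpt-4o",
  "tools": ["search_docs", "fetch_url"],
  "instructions": "You are a helpful assistant.",
  "stream": true
}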
4. Agentic Loop
agent-service runs the agentic loop (sketched in code after the trace):
agent-service --GET /mcp/v1/tools/list-tools--> mcp-client-service
agent-service --POST /api/v1/completions--> completion-service
(LLM returns tool calls)
agent-service --POST /mcp/v1/tools/call-tool--> mcp-client-service
(tool results fed back)
agent-service --POST /api/v1/completions--> completion-service
(repeat until the LLM returns plain text or 15 iterations are reached)
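The same loop in sketch form. The client interfaces and message shapes are hypothetical; only the endpoints and the 15-iteration cap come from the trace above.

// Illustrative agentic loop; the client objects and types are assumptions.
interface ToolCall { name: string; arguments: Record<string, unknown> }
interface CompletionResult { text?: string; toolCalls?: ToolCall[] }
interface ChatMessage { role: string; content: string }

const MAX_ITERATIONS = 15;

async function runAgenticLoop(
  mcp: { listTools(): Promise<string[]>; callTool(call: ToolCall): Promise<string> },
  llm: { complete(messages: ChatMessage[], tools: string[]): Promise<CompletionResult> },
  messages: ChatMessage[],
): Promise<string> {
  const tools = await mcp.listTools(); // GET /mcp/v1/tools/list-tools

  for (let i = 0; i < MAX_ITERATIONS; i++) {
    const result = await llm.complete(messages, tools); // POST /api/v1/completions

    // Plain text (no tool calls) means the model is done
    if (!result.toolCalls?.length) return result.text ?? "";

    // Execute each tool call and feed the result back into the history
    for (const call of result.toolCalls) {
      const output = await mcp.callTool(call); // POST /mcp/v1/tools/call-tool
      messages.push({ role: "tool", content: output });
    }
  }
  return "Iteration limit reached."; // the 15-iteration cap from the trace
}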
5. Stream Response
agent-service streams SSE events back through llm-core to the client (a sample stream follows the list). Events include:
- Text chunks from the LLM
- Tool call notifications
- Tool result summaries
- Final response
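A hypothetical excerpt of such a stream; the event names and payload shapes are assumptions chosen to match the four event kinds above:

event: text
data: {"delta": "A pod is the smallest deployable unit..."}

event: tool_call
data: {"name": "search_docs", "arguments": {"query": "kubernetes pods"}}

event: tool_result
data: {"name": "search_docs", "summary": "3 matching documents"}

event: done
data: {"finalText": "A pod is the smallest deployable unit in Kubernetes..."}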
6. Store and Return
Same as the simple flow: llm-core stores everything and finalizes.
Token Usage Tracking
After every completion (in both the simple and agent flows), completion-service publishes a transaction event to RabbitMQ. llm-core consumes the event and stores it in model_transactions.
completion-service --[transaction queue]--> llm-core
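The event payload is not specified here; a plausible shape, with every field an assumption, might be:

{
  "conversationId": "uuid",
  "model": "gpt-4o",
  "promptTokens": 1250,
  "completionTokens": 340,
  "totalTokens": 1590
}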
Full Diagram
Client
  |
  | POST /api/v1/conversation
  v
llm-core
  |
  |--- No tools? --> completion-service --> LLM API --> response
  |
  |--- Has tools? --> agent-service
  |                     |
  |                     +--> mcp-client-service (list tools)
  |                     |
  |                     +--> completion-service (LLM call)
  |                     |      |
  |                     |      v
  |                     |    LLM API
  |                     |      |
  |                     |      v
  |                     |    tool_calls?
  |                     |      |
  |                     |      yes --> mcp-client-service (call tool)
  |                     |                |
  |                     |                v
  |                     |              MCP tool server
  |                     |                |
  |                     |                v
  |                     |              results --> loop back to LLM
  |                     |
  |                     +--> final text response
  |
  v
llm-core stores messages
  |
  v
Client (SSE stream)