Agent Execution Flow
Detailed view of how agent-service runs the agentic loop with MCP tool calling.
Services Involved
agent-service, completion-service, mcp-client-service, MCP tool servers
Steps
1. Receive Request
llm-core calls POST /api/v1/agent/run on agent-service with:
{
  "model": "gpt-4o",
  "messages": [
    { "role": "system", "content": "You are a helpful assistant..." },
    { "role": "user", "content": "Find documents about kubernetes and summarize them" }
  ],
  "stream": true,
  "tools": ["search_documents", "web_search"],
  "toolChoice": "auto"
}
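For reference, a minimal TypeScript sketch of how this request body could be typed inside agent-service. The field names come from the example above; the type names and comments are illustrative, not from the actual codebase.

// Illustrative types for the /api/v1/agent/run body; names are not from the codebase.
type Role = "system" | "user" | "assistant" | "tool";

interface ChatMessage {
  role: Role;
  content: string;
  toolCallId?: string; // set on role "tool" messages (see step 6)
}

interface AgentRunRequest {
  model: string;          // e.g. "gpt-4o"
  messages: ChatMessage[];
  stream: boolean;        // emit SSE events while the loop runs
  tools: string[];        // MCP tool names the agent is allowed to use
  toolChoice: string;     // "auto" in the example above
}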
2. Discover Tools
Agent-service calls mcp-client-service:
GET /mcp/v1/tools/list-tools
Header: x-jeen-mcp-service-secret: <secret>
This returns all available MCP tools with their schemas. Agent-service then filters the list down to the tools named in the request (search_documents and web_search in this example).
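A rough sketch of the discovery-and-filter step. The endpoint and secret header come from this doc; the hostname, environment variable, helper name, and the assumption that list-tools returns a body shaped like { tools: [...] } are illustrative.

interface McpTool {
  name: string;
  description: string;
  inputSchema: Record<string, unknown>;
}

// Fetch every registered MCP tool, then keep only the ones the request asked for.
async function discoverTools(requested: string[]): Promise<McpTool[]> {
  const res = await fetch("http://mcp-client-service/mcp/v1/tools/list-tools", {
    headers: { "x-jeen-mcp-service-secret": process.env.MCP_SERVICE_SECRET ?? "" },
  });
  if (!res.ok) throw new Error(`list-tools failed with status ${res.status}`);

  // Assumed response shape: { tools: McpTool[] }
  const { tools } = (await res.json()) as { tools: McpTool[] };
  const allowed = new Set(requested);
  return tools.filter((tool) => allowed.has(tool.name));
}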
3. First LLM Call
Agent-service sends messages + filtered tool schemas to completion-service:
POST /api/v1/completions
{
  "model": "gpt-4o",
  "messages": [...],
  "tools": [
    { "name": "search_documents", "description": "...", "inputSchema": {...} },
    { "name": "web_search", "description": "...", "inputSchema": {...} }
  ],
  "stream": true
}
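A minimal sketch of that call, assuming the payload shown above; the hostname and helper name are placeholders.

// Forward the conversation plus the filtered tool schemas to completion-service.
async function callCompletion(
  model: string,
  messages: Array<{ role: string; content: string; toolCallId?: string }>,
  tools: Array<{ name: string; description: string; inputSchema: Record<string, unknown> }>,
): Promise<Response> {
  return fetch("http://completion-service/api/v1/completions", {
    method: "POST",
    headers: { "content-type": "application/json" },
    body: JSON.stringify({ model, messages, tools, stream: true }),
  });
}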
4. LLM Decides to Call Tools
The LLM response includes one or more tool calls, for example:
{
  "type": "tool_call",
  "id": "call_abc123",
  "name": "search_documents",
  "arguments": {
    "queries": ["kubernetes architecture overview"]
  }
}
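A small sketch of how such a tool call might be represented once parsed. The id is what later links the tool result back to this call via toolCallId (step 6); if the arguments arrive as a JSON string, which is common with streamed responses, they have to be parsed before execution. All names here are illustrative.

interface ToolCall {
  id: string;                         // e.g. "call_abc123"
  name: string;                       // e.g. "search_documents"
  arguments: Record<string, unknown>; // e.g. { queries: ["..."] }
}

// Assumed raw shape with stringified arguments, as streamed tool calls often arrive.
function parseToolCall(raw: { id: string; name: string; arguments: string }): ToolCall {
  return { id: raw.id, name: raw.name, arguments: JSON.parse(raw.arguments) };
}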
5. Execute Tool
Agent-service calls mcp-client-service:
POST /mcp/v1/tools/call-tool
{
  "name": "search_documents",
  "arguments": {
    "queries": ["kubernetes architecture overview"]
  }
}
mcp-client-service acts as a gateway and routes the call to the mcp-rag-tool server, which invokes the RAG service: the query is embedded, a vector search is run, the results are reranked, and the relevant document chunks are returned.
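A sketch of a single tool execution against mcp-client-service. The endpoint comes from this doc; reusing the service-secret header from step 2 and the { content } response shape are assumptions.

// Execute one tool call through mcp-client-service and return its textual result.
async function executeTool(name: string, args: Record<string, unknown>): Promise<string> {
  const res = await fetch("http://mcp-client-service/mcp/v1/tools/call-tool", {
    method: "POST",
    headers: {
      "content-type": "application/json",
      "x-jeen-mcp-service-secret": process.env.MCP_SERVICE_SECRET ?? "",
    },
    body: JSON.stringify({ name, arguments: args }),
  });
  if (!res.ok) throw new Error(`call-tool ${name} failed with status ${res.status}`);

  // Assumed response shape: { content: string }
  const { content } = (await res.json()) as { content: string };
  return content;
}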
6. Feed Results Back
Tool results are appended to the message history:
{
  "role": "tool",
  "content": "Found 3 relevant chunks:\n1. Kubernetes uses a master-worker...\n2. ...",
  "toolCallId": "call_abc123"
}
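A one-line sketch of that append step, assuming the message shape shown above.

// Wrap a tool result as a "tool" message and append it to the running history.
function appendToolResult(
  messages: Array<{ role: string; content: string; toolCallId?: string }>,
  toolCallId: string,
  content: string,
): void {
  messages.push({ role: "tool", content, toolCallId });
}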
7. Next LLM Call
Agent-service sends the updated messages (now including tool results) back to completion-service.
The LLM may:
- Return text: the loop ends and the text is the final response.
- Call more tools: the loop goes back to step 5.
8. Loop Terminates
The loop ends when:
- The LLM returns a text response (no tool calls)
- The iteration limit (15) is reached
- An error occurs
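Putting steps 3 through 8 together, a minimal sketch of the loop. Only the 15-iteration cap comes from this doc; the helper signatures, the LLM turn shape, and throwing when the limit is hit are illustrative assumptions.

interface Msg { role: string; content: string; toolCallId?: string }
interface Call { id: string; name: string; arguments: Record<string, unknown> }
interface LlmTurn { text?: string; toolCalls?: Call[] }

const MAX_ITERATIONS = 15; // iteration limit from step 8

// runLlm stands in for the completion-service call (steps 3 and 7) and
// runTool for the mcp-client-service call-tool request (step 5).
async function agentLoop(
  messages: Msg[],
  runLlm: (messages: Msg[]) => Promise<LlmTurn>,
  runTool: (call: Call) => Promise<string>,
): Promise<string> {
  for (let i = 0; i < MAX_ITERATIONS; i++) {
    const turn = await runLlm(messages);

    // No tool calls: the text is the final response and the loop ends.
    if (!turn.toolCalls || turn.toolCalls.length === 0) return turn.text ?? "";

    // Otherwise execute each requested tool and feed the results back into
    // the history before the next LLM call (in practice the assistant's
    // tool-call turn is appended here as well).
    for (const call of turn.toolCalls) {
      const result = await runTool(call);
      messages.push({ role: "tool", content: result, toolCallId: call.id });
    }
  }
  throw new Error(`agent loop stopped at the ${MAX_ITERATIONS}-iteration limit`);
}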
Streaming During the Loop
When streaming, each iteration emits SSE events:
- Tool call events (so the client knows a tool is being called)
- Tool result events (so the client sees intermediate results)
- Text delta events (as the LLM generates text)
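As an illustration, one way those events could be framed as SSE. The event names and payload fields are assumptions; only the three event categories come from the list above.

import type { ServerResponse } from "node:http";

type AgentEvent =
  | { type: "tool_call"; name: string; arguments: unknown }      // a tool is being called
  | { type: "tool_result"; toolCallId: string; content: string } // intermediate result
  | { type: "text_delta"; delta: string };                       // chunk of generated text

// Standard SSE framing: an "event:" line, a "data:" line, then a blank line.
function emitEvent(res: ServerResponse, event: AgentEvent): void {
  res.write(`event: ${event.type}\n`);
  res.write(`data: ${JSON.stringify(event)}\n\n`);
}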
Diagram
agent-service
      |
      +--------------------------+
      |                          |
  Iteration 1            mcp-client-service
      |                          |
      v                          |  GET list-tools
completion-service               |
      |                          v
      v                     Tool registry
   LLM API                       |
      |                          |
      v                          |
 tool_calls? ---yes------------->|  POST call-tool
      |                          |
      no                         v
      |        mcp-rag-tool / mcp-web-search / ...
      v                          |
 Final text                      v
      |                     Tool result
      v                          |
 Return to client <-----+--------+
                        |
             Iteration 2...N (max 15)