Agent Service

Runs agentic loops with tool calling. Receives a request with messages, a model, and allowed tool names, then iterates: call LLM, check for tool calls, execute tools via MCP, feed results back, repeat until done.

Tech: NestJS 11
Port: 4000
Auth: JWT (HS256), API Key, Public
Database: None (stateless)
Max iterations: 15

Endpoints

Method	Path	Auth	Description
POST	`/api/v1/agent/run`	Public	Execute an agentic flow
GET	`/api/v1/health`	Public	Health check

POST /api/v1/agent/run

Request

{
  "model": "gpt-4o",
  "messages": [
    { "role": "user", "content": "Search for documents about kubernetes" }
  ],
  "stream": true,
  "instructions": "You are a helpful assistant with access to tools.",
  "tools": ["search_documents", "web_search"],
  "toolChoice": "auto",
  "context": { "userId": "uuid", "conversationId": "uuid" }
}

Field	Type	Required	Description
model	string	Yes	LLM model name
messages	MessageInput[]	Yes	Conversation history
stream	boolean	No	If true, the response is an SSE stream; if false, a single JSON response. Default: true.
instructions	string	No	System instructions prepended to messages
tools	string[]	No	Allowed MCP tool names. If omitted, no tools are available.
toolChoice	ToolChoice	No	Tool selection strategy (auto, none, required)
context	object	No	Tracking metadata passed through
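
The request fields above could be modeled roughly as follows. This is a minimal sketch, not the service's actual DTOs: the names `AgentRunRequest` and `normalizeRequest` are hypothetical, and the shape of `MessageInput` is inferred from the single example message. Only the field names, the `toolChoice` values, and the `stream` default come from the table.

```typescript
// Hypothetical types; field names and allowed values follow the table above.
type ToolChoice = "auto" | "none" | "required";

interface MessageInput {
  role: string; // the example request uses "user"; other roles are assumed
  content: string;
}

interface AgentRunRequest {
  model: string;
  messages: MessageInput[];
  stream?: boolean;
  instructions?: string;
  tools?: string[];
  toolChoice?: ToolChoice;
  context?: Record<string, unknown>;
}

// Checks the two required fields and applies the documented defaults:
// stream defaults to true, and an omitted tools list means no tools.
function normalizeRequest(
  req: AgentRunRequest
): AgentRunRequest & { stream: boolean; tools: string[] } {
  if (!req.model) throw new Error("model is required");
  if (!Array.isArray(req.messages)) throw new Error("messages is required");
  return { ...req, stream: req.stream ?? true, tools: req.tools ?? [] };
}
```

Calling `normalizeRequest` with only `model` and `messages` yields `stream: true` and an empty `tools` array.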

Execution Flow

1. Fetch tools -- Calls MCP client service GET /mcp/v1/tools/list-tools to get tool schemas. Filters to only the tools listed in the tools field.
2. Call LLM -- Sends messages + tool schemas to completion service POST /api/v1/completions.
3. Check response -- If the LLM returns text only (no tool calls), the loop ends.
4. Execute tools -- If the LLM returns tool calls, each tool is executed via MCP POST /mcp/v1/tools/call-tool.
5. Append results -- Tool results are added to the message history as tool-result messages.
6. Repeat -- Go to step 2. Continue until the LLM produces a final text response or the iteration limit (15) is reached.
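
The loop above can be sketched with the three HTTP calls injected as functions. This is an illustration under assumptions, not the service's code: `AgentDeps`, `runAgent`, and the message and tool-call shapes are hypothetical, and how the real service reports hitting the iteration limit is not documented here.

```typescript
// All interface shapes below are assumptions made for the sketch.
interface ToolSchema { name: string; [key: string]: unknown }
interface ToolCall { id: string; name: string; arguments: unknown }
interface Message { role: string; content: string; toolCallId?: string }
interface LlmReply { text: string; toolCalls: ToolCall[] }

interface AgentDeps {
  listTools(): Promise<ToolSchema[]>; // stands in for GET /mcp/v1/tools/list-tools
  complete(messages: Message[], tools: ToolSchema[]): Promise<LlmReply>; // POST /api/v1/completions
  callTool(call: ToolCall): Promise<string>; // POST /mcp/v1/tools/call-tool
}

interface AgentResult { text: string; iterations: number; hitLimit: boolean; messages: Message[] }

const MAX_ITERATIONS = 15; // documented iteration limit

async function runAgent(
  deps: AgentDeps,
  messages: Message[],
  allowedTools: string[] = []
): Promise<AgentResult> {
  // Step 1: fetch all schemas, keep only the tools named in the request.
  const allowed = new Set(allowedTools);
  const tools = (await deps.listTools()).filter((t) => allowed.has(t.name));
  const history = [...messages];

  for (let i = 1; i <= MAX_ITERATIONS; i++) {
    // Step 2: call the LLM with the history and the filtered tool schemas.
    const reply = await deps.complete(history, tools);
    // Step 3: a reply without tool calls ends the loop.
    if (reply.toolCalls.length === 0) {
      return { text: reply.text, iterations: i, hitLimit: false, messages: history };
    }
    // Recording the assistant turn is an assumption; the source only mentions tool results.
    history.push({ role: "assistant", content: reply.text });
    // Steps 4 and 5: execute each tool call and append its result.
    for (const call of reply.toolCalls) {
      const result = await deps.callTool(call);
      history.push({ role: "tool", content: result, toolCallId: call.id });
    }
    // Step 6: fall through to the next iteration.
  }
  return { text: "", iterations: MAX_ITERATIONS, hitLimit: true, messages: history };
}
```

Injecting the dependencies keeps the loop testable without a running completion or MCP service.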

Streaming Response

When stream: true, the service returns an SSE stream. During the agentic loop:

LLM text chunks are forwarded as SSE events in real time
Tool call events are emitted so the client can show tool execution status
Tool results are emitted after execution
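
On the client side, the stream can be split into events with a small parser. The sketch below follows the generic SSE wire format (`event:` and `data:` fields, events separated by a blank line). The event names this service emits and the payload schema are not documented here, so the names in the usage note are placeholders, and `parseSse` is a hypothetical helper that assumes it receives whole events rather than partial chunks.

```typescript
interface SseEvent {
  event: string; // event type; "message" when no event: field is present
  data: string;  // raw payload, left unparsed because its schema is not documented
}

// Parses a string containing zero or more complete SSE events.
function parseSse(text: string): SseEvent[] {
  const events: SseEvent[] = [];
  for (const block of text.split(/\r?\n\r?\n/)) {
    let event = "message";
    const data: string[] = [];
    for (const line of block.split(/\r?\n/)) {
      if (line.startsWith("event:")) {
        event = line.slice("event:".length).trim();
      } else if (line.startsWith("data:")) {
        // A single leading space after the colon is not part of the value.
        data.push(line.slice("data:".length).replace(/^ /, ""));
      }
    }
    // Blocks without data lines (comments, trailing blanks) produce no event.
    if (data.length > 0) events.push({ event, data: data.join("\n") });
  }
  return events;
}
```

A client could then dispatch on `event` to render text chunks, tool call status, and tool results separately, for example with placeholder names such as `text`, `tool_call`, and `tool_result`.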

Non-Streaming Response

When stream: false, the service runs the full loop and returns the final result as JSON.

Inter-Service Communication

Target	Protocol	Purpose
completion-service	HTTP	`POST /api/v1/completions` -- LLM reasoning (streaming NDJSON and non-streaming)
mcp-client-service	HTTP	`GET /mcp/v1/tools/list-tools` -- Discover available tools
mcp-client-service	HTTP	`POST /mcp/v1/tools/call-tool` -- Execute tool calls

Environment Variables

Variable	Description
COMPLETION_SERVICE_URL	Base URL of completion service
COMPLETION_SERVICE_SECRET	Bearer token for completion service
COMPLETION_SERVICE_TIMEOUT	Request timeout (default: 60s)
MCP_SERVICE_URL	Base URL of MCP client service
MCP_SERVICE_SECRET	`x-jeen-mcp-service-secret` header value
MCP_SERVICE_TIMEOUT	Request timeout (default: 60s)
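
One way to turn these variables into typed configuration is sketched below. It is an assumption-laden illustration: `loadConfig` and the config shape are hypothetical, the timeout variables are assumed to hold milliseconds (the source only states the 60s default), and the URL and secret variables are assumed to be mandatory. The header names follow the table above.

```typescript
interface ServiceConfig {
  baseUrl: string;
  timeoutMs: number;
  headers: Record<string, string>;
}

interface AgentConfig {
  completion: ServiceConfig;
  mcp: ServiceConfig;
}

type Env = Record<string, string | undefined>;

const DEFAULT_TIMEOUT_MS = 60000; // documented default of 60s

function requireVar(env: Env, name: string): string {
  const value = env[name];
  if (!value) throw new Error("Missing environment variable: " + name);
  return value;
}

// Assumption: the timeout value is a number of milliseconds.
function parseTimeout(env: Env, name: string): number {
  const raw = env[name];
  if (raw === undefined || raw === "") return DEFAULT_TIMEOUT_MS;
  const ms = Number(raw);
  if (!isFinite(ms) || ms <= 0) throw new Error("Invalid timeout in " + name);
  return ms;
}

function loadConfig(env: Env): AgentConfig {
  return {
    completion: {
      baseUrl: requireVar(env, "COMPLETION_SERVICE_URL"),
      timeoutMs: parseTimeout(env, "COMPLETION_SERVICE_TIMEOUT"),
      // The completion service secret is sent as a bearer token.
      headers: { Authorization: "Bearer " + requireVar(env, "COMPLETION_SERVICE_SECRET") },
    },
    mcp: {
      baseUrl: requireVar(env, "MCP_SERVICE_URL"),
      timeoutMs: parseTimeout(env, "MCP_SERVICE_TIMEOUT"),
      // The MCP secret is sent in a custom header.
      headers: { "x-jeen-mcp-service-secret": requireVar(env, "MCP_SERVICE_SECRET") },
    },
  };
}
```

In a Node.js process this would typically be called once at startup as `loadConfig(process.env)`, so a missing variable fails fast instead of surfacing on the first request.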
