Agent Service
Runs agentic loops with tool calling. Receives a request with messages, a model, and allowed tool names, then iterates: call LLM, check for tool calls, execute tools via MCP, feed results back, repeat until done.
- Tech: NestJS 11
- Port: 4000
- Auth: JWT (HS256), API Key, Public
- Database: None (stateless)
- Max iterations: 15
Endpoints
| Method | Path | Auth | Description |
|---|---|---|---|
| POST | /api/v1/agent/run | Public | Execute an agentic flow |
| GET | /api/v1/health | Public | Health check |
POST /api/v1/agent/run
Request
{
  "model": "gpt-4o",
  "messages": [
    { "role": "user", "content": "Search for documents about kubernetes" }
  ],
  "stream": true,
  "instructions": "You are a helpful assistant with access to tools.",
  "tools": ["search_documents", "web_search"],
  "toolChoice": "auto",
  "context": { "userId": "uuid", "conversationId": "uuid" }
}
| Field | Type | Required | Description |
|---|---|---|---|
| model | string | Yes | LLM model name |
| messages | MessageInput[] | Yes | Conversation history |
| stream | boolean | No | Default: true. If true, the response is an SSE stream; if false, a single JSON response |
| instructions | string | No | System instructions prepended to messages |
| tools | string[] | No | Allowed MCP tool names. If omitted, no tools available. |
| toolChoice | ToolChoice | No | Tool selection strategy (auto, none, required) |
| context | object | No | Tracking metadata passed through |
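The request fields above can be written down as TypeScript types. This is a minimal sketch: the field names and defaults come from the table, while the shapes of MessageInput and ToolChoice and the helpers validateAgentRunRequest and withDefaults are assumptions made for illustration, not the service's actual DTOs.

```typescript
// Field names follow the request table. The shapes of MessageInput and
// ToolChoice are assumed here; the real DTOs may differ.
type ToolChoice = "auto" | "none" | "required";

interface MessageInput {
  role: string;
  content: string;
}

interface AgentRunRequest {
  model: string;
  messages: MessageInput[];
  stream?: boolean;
  instructions?: string;
  tools?: string[];
  toolChoice?: ToolChoice;
  context?: Record<string, unknown>;
}

const TOOL_CHOICES: string[] = ["auto", "none", "required"];

// Hypothetical helper: returns a list of problems, empty when the body looks valid.
function validateAgentRunRequest(body: unknown): string[] {
  if (typeof body !== "object" || body === null) {
    return ["body must be a JSON object"];
  }
  const b = body as Record<string, unknown>;
  const errors: string[] = [];

  const model = b["model"];
  if (typeof model !== "string" || model === "") {
    errors.push("model is required");
  }
  if (!Array.isArray(b["messages"])) {
    errors.push("messages must be an array");
  }
  const stream = b["stream"];
  if (stream !== undefined && typeof stream !== "boolean") {
    errors.push("stream must be a boolean");
  }
  const tools = b["tools"];
  if (tools !== undefined) {
    if (!Array.isArray(tools) || !tools.every((t) => typeof t === "string")) {
      errors.push("tools must be an array of strings");
    }
  }
  const toolChoice = b["toolChoice"];
  if (toolChoice !== undefined) {
    if (typeof toolChoice !== "string" || TOOL_CHOICES.indexOf(toolChoice) === -1) {
      errors.push("toolChoice must be auto, none, or required");
    }
  }
  return errors;
}

// Applies the documented defaults: stream is true, and omitting tools means no tools.
function withDefaults(req: AgentRunRequest): AgentRunRequest & { stream: boolean; tools: string[] } {
  return { ...req, stream: req.stream ?? true, tools: req.tools ?? [] };
}
```

The table does not state a default for toolChoice, so the sketch leaves it untouched.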
Execution Flow
1. Fetch tools -- Calls the MCP client service (GET /mcp/v1/tools/list-tools) to get tool schemas, then keeps only the tools listed in the tools field.
2. Call LLM -- Sends the messages and tool schemas to the completion service (POST /api/v1/completions).
3. Check response -- If the LLM returns text only (no tool calls), the loop ends.
4. Execute tools -- If the LLM returns tool calls, each tool is executed via MCP (POST /mcp/v1/tools/call-tool).
5. Append results -- Tool results are added to the message history as tool-result messages.
6. Repeat -- Go to step 2. Continue until the LLM produces a final text response or the iteration limit (15) is reached.
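The flow above can be sketched as a loop over injected dependencies. Everything in this example is illustrative: the AgentDeps interface, the message and tool-call shapes, sequential tool execution, and the behaviour when the limit is hit are assumptions, since this document does not specify them. Only the order of steps and the limit of 15 come from the description.

```typescript
// All shapes below are assumptions for illustration; the real payloads of
// the completion and MCP services are not specified in this document.
interface ToolSchema { name: string; description?: string; }
interface ToolCall { id: string; name: string; arguments: Record<string, unknown>; }
interface Message { role: string; content: string; toolCallId?: string; }
interface LlmReply { text: string; toolCalls: ToolCall[]; }

interface AgentDeps {
  // Stands in for GET /mcp/v1/tools/list-tools
  listTools: () => Promise<ToolSchema[]>;
  // Stands in for POST /api/v1/completions
  complete: (messages: Message[], tools: ToolSchema[]) => Promise<LlmReply>;
  // Stands in for POST /mcp/v1/tools/call-tool
  callTool: (call: ToolCall) => Promise<string>;
}

interface AgentResult { text: string; iterations: number; hitLimit: boolean; }

const MAX_ITERATIONS = 15;

async function runAgentLoop(
  deps: AgentDeps,
  messages: Message[],
  allowedTools: string[] = [],
): Promise<AgentResult> {
  // Step 1: fetch schemas and keep only the allowed tools.
  // With no allowed tools, the lookup is skipped and the LLM gets no tools.
  const tools = allowedTools.length > 0
    ? (await deps.listTools()).filter((t) => allowedTools.indexOf(t.name) !== -1)
    : [];

  const history: Message[] = messages.slice();

  for (let iteration = 1; iteration <= MAX_ITERATIONS; iteration++) {
    // Step 2: call the LLM with the current history and tool schemas.
    const reply = await deps.complete(history, tools);

    // Step 3: a text-only reply ends the loop.
    if (reply.toolCalls.length === 0) {
      return { text: reply.text, iterations: iteration, hitLimit: false };
    }

    // Steps 4 and 5: run each tool (sequentially in this sketch) and
    // append the results to the history as tool-result messages.
    history.push({ role: "assistant", content: reply.text });
    for (const call of reply.toolCalls) {
      const output = await deps.callTool(call);
      history.push({ role: "tool", content: output, toolCallId: call.id });
    }
    // Step 6: loop back to step 2.
  }

  // Assumption: reaching the limit returns an empty text with a flag set.
  return { text: "", iterations: MAX_ITERATIONS, hitLimit: true };
}
```

Passing the three service calls in as functions keeps the loop testable with fakes and independent of any particular HTTP client.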
Streaming Response
When stream: true, the service returns an SSE stream. During the agentic loop:
- LLM text chunks are forwarded as SSE events in real time
- Tool call events are emitted so the client can show tool execution status
- Tool results are emitted after execution
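To illustrate the three kinds of events listed above, here is a small sketch of how they could be serialized as SSE frames. The event names (text, tool_call, tool_result) and payload fields are hypothetical, since this document does not define the wire format; only the event/data framing follows the standard SSE text format.

```typescript
// Hypothetical event shapes; the service's real event names and payloads
// are not documented here.
type AgentEvent =
  | { type: "text"; delta: string }
  | { type: "tool_call"; id: string; name: string }
  | { type: "tool_result"; id: string; output: string };

// Serializes one event as an SSE frame: an "event:" line, a "data:" line,
// and a blank line that terminates the frame.
function toSseFrame(event: AgentEvent): string {
  const { type, ...payload } = event;
  // JSON.stringify without indentation yields a single line, so one
  // "data:" line is enough.
  return `event: ${type}\ndata: ${JSON.stringify(payload)}\n\n`;
}

const frame = toSseFrame({ type: "tool_call", id: "call_1", name: "web_search" });
// frame === 'event: tool_call\ndata: {"id":"call_1","name":"web_search"}\n\n'
```

A client can use the event name to decide whether to append text, show a tool-in-progress indicator, or display a tool result.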
Non-Streaming Response
When stream: false, the service runs the full loop and returns the final result as JSON.
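A non-streaming call might be assembled as follows. This is a sketch under stated assumptions: HttpRequestSpec and buildNonStreamingRun are hypothetical names, the base URL assumes a local instance on port 4000, and the shape of the JSON result is not documented here, so the example stops at building the request.

```typescript
// Hypothetical request description; kept free of runtime-specific types.
interface HttpRequestSpec {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Builds a POST /api/v1/agent/run request with stream set to false.
function buildNonStreamingRun(baseUrl: string, model: string, prompt: string): HttpRequestSpec {
  return {
    // Strip trailing slashes so the path is not doubled up.
    url: `${baseUrl.replace(/\/+$/, "")}/api/v1/agent/run`,
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      model,
      messages: [{ role: "user", content: prompt }],
      stream: false,
    }),
  };
}

// Usage, assuming a runtime with a global fetch (for example Node 18+):
//   const spec = buildNonStreamingRun("http://localhost:4000", "gpt-4o", "Hello");
//   const res = await fetch(spec.url, { method: spec.method, headers: spec.headers, body: spec.body });
//   const result = await res.json(); // result shape is not specified in this document
```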
Inter-Service Communication
| Target | Protocol | Purpose |
|---|---|---|
| completion-service | HTTP | POST /api/v1/completions -- LLM reasoning (streaming and non-streaming NDJSON) |
| mcp-client-service | HTTP | GET /mcp/v1/tools/list-tools -- Discover available tools |
| mcp-client-service | HTTP | POST /mcp/v1/tools/call-tool -- Execute tool calls |
Environment Variables
| Variable | Description |
|---|---|
| COMPLETION_SERVICE_URL | Base URL of completion service |
| COMPLETION_SERVICE_SECRET | Bearer token for completion service |
| COMPLETION_SERVICE_TIMEOUT | Request timeout (default: 60s) |
| MCP_SERVICE_URL | Base URL of MCP client service |
| MCP_SERVICE_SECRET | x-jeen-mcp-service-secret header value |
| MCP_SERVICE_TIMEOUT | Request timeout (default: 60s) |
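The variables above could be loaded and turned into outbound headers roughly like this. The helper names (loadConfig, completionHeaders, mcpHeaders) are hypothetical, and the sketch assumes the timeout variables hold a plain number of seconds; the table only states the 60s default, not the value format.

```typescript
type Env = Record<string, string | undefined>;

interface ServiceConfig {
  baseUrl: string;
  secret: string;
  timeoutSeconds: number;
}

interface AgentConfig {
  completion: ServiceConfig;
  mcp: ServiceConfig;
}

function requireVar(env: Env, name: string): string {
  const value = env[name];
  if (value === undefined || value === "") {
    throw new Error(`Missing environment variable ${name}`);
  }
  return value;
}

// Assumption: the value is a number of seconds; falls back to the documented 60s.
function parseTimeout(raw: string | undefined): number {
  if (raw === undefined || raw === "") return 60;
  const seconds = Number(raw);
  if (!isFinite(seconds) || seconds <= 0) {
    throw new Error(`Invalid timeout value: ${raw}`);
  }
  return seconds;
}

function loadConfig(env: Env): AgentConfig {
  return {
    completion: {
      baseUrl: requireVar(env, "COMPLETION_SERVICE_URL"),
      secret: requireVar(env, "COMPLETION_SERVICE_SECRET"),
      timeoutSeconds: parseTimeout(env["COMPLETION_SERVICE_TIMEOUT"]),
    },
    mcp: {
      baseUrl: requireVar(env, "MCP_SERVICE_URL"),
      secret: requireVar(env, "MCP_SERVICE_SECRET"),
      timeoutSeconds: parseTimeout(env["MCP_SERVICE_TIMEOUT"]),
    },
  };
}

// The completion service secret is sent as a bearer token.
function completionHeaders(cfg: ServiceConfig): Record<string, string> {
  return { Authorization: `Bearer ${cfg.secret}` };
}

// The MCP client service secret is sent in the x-jeen-mcp-service-secret header.
function mcpHeaders(cfg: ServiceConfig): Record<string, string> {
  return { "x-jeen-mcp-service-secret": cfg.secret };
}
```

In a Node process this would typically be called as loadConfig(process.env) at startup, so a missing variable fails fast instead of surfacing on the first request.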