
Agent Service

Runs agentic loops with tool calling. The service receives a request with messages, a model, and allowed tool names, then iterates: call the LLM, check for tool calls, execute tools via MCP, feed the results back, and repeat until the LLM returns a final text response or the iteration limit is reached.

  • Tech: NestJS 11
  • Port: 4000
  • Auth: JWT (HS256), API Key, Public
  • Database: None (stateless)
  • Max iterations: 15

Endpoints

| Method | Path | Auth | Description |
| --- | --- | --- | --- |
| POST | /api/v1/agent/run | Public | Execute an agentic flow |
| GET | /api/v1/health | Public | Health check |

POST /api/v1/agent/run

Request

{
  "model": "gpt-4o",
  "messages": [
    { "role": "user", "content": "Search for documents about kubernetes" }
  ],
  "stream": true,
  "instructions": "You are a helpful assistant with access to tools.",
  "tools": ["search_documents", "web_search"],
  "toolChoice": "auto",
  "context": { "userId": "uuid", "conversationId": "uuid" }
}

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| model | string | Yes | LLM model name |
| messages | MessageInput[] | Yes | Conversation history |
| stream | boolean | No | Default: true. SSE streaming or JSON response |
| instructions | string | No | System instructions prepended to messages |
| tools | string[] | No | Allowed MCP tool names. If omitted, no tools available. |
| toolChoice | ToolChoice | No | Tool selection strategy (auto, none, required) |
| context | object | No | Tracking metadata passed through |
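
The request fields above could be modelled roughly as follows. This is a minimal sketch, not the service's actual DTOs: the shapes of `MessageInput` and `ToolChoice` are assumptions inferred from the example body, and `normalizeRequest` is a hypothetical helper that applies only the defaults the field descriptions state (`stream` defaults to true, omitted `tools` means no tools).

```typescript
// Hypothetical types mirroring the documented request fields.
// The real MessageInput and ToolChoice definitions may be richer.
type ToolChoice = "auto" | "none" | "required";

interface MessageInput {
  role: string; // e.g. "user"; the full set of roles is not documented here
  content: string;
}

interface AgentRunRequest {
  model: string;
  messages: MessageInput[];
  stream?: boolean;
  instructions?: string;
  tools?: string[];
  toolChoice?: ToolChoice;
  context?: Record<string, unknown>;
}

// Hypothetical helper: checks the two required fields and fills in the
// documented defaults. No default is applied to toolChoice because the
// source does not state one.
function normalizeRequest(
  req: AgentRunRequest,
): AgentRunRequest & { stream: boolean; tools: string[] } {
  if (!req.model) {
    throw new Error("model is required");
  }
  if (!Array.isArray(req.messages)) {
    throw new Error("messages is required");
  }
  return { ...req, stream: req.stream ?? true, tools: req.tools ?? [] };
}
```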

Execution Flow

  1. Fetch tools -- Calls MCP client service GET /mcp/v1/tools/list-tools to get tool schemas. Filters to only the tools listed in the tools field.
  2. Call LLM -- Sends messages + tool schemas to completion service POST /api/v1/completions.
  3. Check response -- If the LLM returns text only (no tool calls), the loop ends.
  4. Execute tools -- If the LLM returns tool calls, each tool is executed via MCP POST /mcp/v1/tools/call-tool.
  5. Append results -- Tool results are added to the message history as tool-result messages.
  6. Repeat -- Go to step 2. Continue until the LLM produces a final text response or the iteration limit (15) is reached.
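
The six steps above can be sketched as a single loop. This is an illustration under stated assumptions rather than the service's implementation: the `AgentDeps` interface stands in for the HTTP calls to the MCP client service and the completion service, the `assistant` and `tool` role names are guesses at what tool-result messages look like, tools are executed one at a time, and returning the last text with a `hitLimit` flag when 15 iterations are exhausted is an assumed behaviour that the source does not specify.

```typescript
interface ToolSchema { name: string }
interface ToolCall { id: string; name: string; arguments: unknown }
interface LlmReply { text: string; toolCalls: ToolCall[] }
interface LoopMessage { role: string; content: string; toolCallId?: string }

// Hypothetical seams for the three documented HTTP calls.
interface AgentDeps {
  listTools(): Promise<ToolSchema[]>; // GET /mcp/v1/tools/list-tools
  complete(messages: LoopMessage[], tools: ToolSchema[]): Promise<LlmReply>; // POST /api/v1/completions
  callTool(call: ToolCall): Promise<string>; // POST /mcp/v1/tools/call-tool
}

const MAX_ITERATIONS = 15;

async function runAgent(
  messages: LoopMessage[],
  allowedTools: string[],
  deps: AgentDeps,
): Promise<{ text: string; iterations: number; hitLimit: boolean }> {
  // Step 1: fetch all tool schemas, keep only the allowed ones.
  const tools = (await deps.listTools()).filter((t) => allowedTools.includes(t.name));
  const history = [...messages];
  let lastText = "";

  for (let i = 1; i <= MAX_ITERATIONS; i++) {
    // Step 2: call the LLM with the current history and tool schemas.
    const reply = await deps.complete(history, tools);
    lastText = reply.text;

    // Step 3: text only means the loop is finished.
    if (reply.toolCalls.length === 0) {
      return { text: reply.text, iterations: i, hitLimit: false };
    }

    // Steps 4 and 5: execute each tool call and append its result.
    history.push({ role: "assistant", content: reply.text });
    for (const call of reply.toolCalls) {
      const result = await deps.callTool(call);
      history.push({ role: "tool", content: result, toolCallId: call.id });
    }
    // Step 6: fall through to the next iteration.
  }

  // Assumed behaviour when the iteration limit is reached.
  return { text: lastText, iterations: MAX_ITERATIONS, hitLimit: true };
}
```

Injecting the three calls keeps the loop itself free of HTTP details, so it can be exercised with in-memory stubs.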

Streaming Response

When stream: true, the service returns an SSE stream. During the agentic loop:

  • LLM text chunks are forwarded as SSE events in real-time
  • Tool call events are emitted so the client can show tool execution status
  • Tool results are emitted after execution
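
A client consuming this stream needs to split it into SSE frames. The sketch below parses the standard SSE wire format (frames separated by a blank line, with `event:` and `data:` fields). The event names and payload shapes the service emits are not documented here, so the `tool_call` name in the usage note is purely hypothetical.

```typescript
interface SseEvent {
  event: string; // "message" when the frame has no event field
  data: string;
}

// Parses every complete frame in the buffer and returns the unparsed tail,
// which the caller should prepend to the next chunk it receives.
function parseSse(buffer: string): { events: SseEvent[]; rest: string } {
  const events: SseEvent[] = [];
  const frames = buffer.split(/\r?\n\r?\n/);
  const rest = frames.pop() ?? ""; // the last piece may be an incomplete frame

  for (const frame of frames) {
    let event = "message";
    const data: string[] = [];
    for (const line of frame.split(/\r?\n/)) {
      if (line.startsWith(":")) continue; // SSE comment line
      const idx = line.indexOf(":");
      const field = idx === -1 ? line : line.slice(0, idx);
      let value = idx === -1 ? "" : line.slice(idx + 1);
      if (value.startsWith(" ")) value = value.slice(1); // strip one leading space
      if (field === "event") event = value;
      else if (field === "data") data.push(value);
    }
    if (data.length > 0) {
      events.push({ event, data: data.join("\n") });
    }
  }
  return { events, rest };
}
```

For example, feeding it `'event: tool_call\ndata: {"name":"web_search"}\n\ndata: hel'` yields one complete event and leaves `data: hel` in `rest` until the remainder of that frame arrives.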

Non-Streaming Response

When stream: false, the service runs the full loop and returns the final result as JSON.
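
A non-streaming call might be assembled as shown below. This is a sketch only: `buildRunRequest` is a hypothetical helper, the base URL is an assumption (only the port, 4000, is documented), and the shape of the JSON result is not specified in this section, so the example stops at building the request.

```typescript
interface RunRequestInit {
  method: string;
  headers: Record<string, string>;
  body: string;
}

// Hypothetical helper: builds the URL and request options for a
// non-streaming run. The endpoint is documented as Public, so no auth
// header is added here.
function buildRunRequest(
  baseUrl: string,
  prompt: string,
  tools: string[],
): { url: string; init: RunRequestInit } {
  return {
    url: baseUrl.replace(/\/+$/, "") + "/api/v1/agent/run",
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({
        model: "gpt-4o",
        messages: [{ role: "user", content: prompt }],
        stream: false, // ask for a single JSON response instead of SSE
        tools,
      }),
    },
  };
}
```

The result can be passed to `fetch(url, init)` followed by `response.json()`, for instance with `buildRunRequest("http://localhost:4000", "Search for documents about kubernetes", ["search_documents"])`, where `localhost` is an assumed host.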

Inter-Service Communication

| Target | Protocol | Purpose |
| --- | --- | --- |
| completion-service | HTTP | POST /api/v1/completions -- LLM reasoning (streaming and non-streaming NDJSON) |
| mcp-client-service | HTTP | GET /mcp/v1/tools/list-tools -- Discover available tools |
| mcp-client-service | HTTP | POST /mcp/v1/tools/call-tool -- Execute tool calls |

Environment Variables

| Variable | Description |
| --- | --- |
| COMPLETION_SERVICE_URL | Base URL of completion service |
| COMPLETION_SERVICE_SECRET | Bearer token for completion service |
| COMPLETION_SERVICE_TIMEOUT | Request timeout (default: 60s) |
| MCP_SERVICE_URL | Base URL of MCP client service |
| MCP_SERVICE_SECRET | x-jeen-mcp-service-secret header value |
| MCP_SERVICE_TIMEOUT | Request timeout (default: 60s) |
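
One way these variables could be turned into per-service client settings is sketched below. `loadConfig` and `ServiceConfig` are hypothetical names, and the unit of the timeout variables is an assumption: the source gives only the default (60s), so the sketch treats the value as a number of seconds. The header names follow the table: a Bearer token for the completion service and `x-jeen-mcp-service-secret` for the MCP client service.

```typescript
interface ServiceConfig {
  baseUrl: string;
  timeoutMs: number;
  headers: Record<string, string>;
}

const DEFAULT_TIMEOUT_SECONDS = 60; // documented default

// Hypothetical loader. Pass process.env (or any plain record) as `env`.
function loadConfig(
  env: Record<string, string | undefined>,
): { completion: ServiceConfig; mcp: ServiceConfig } {
  const required = (name: string): string => {
    const value = env[name];
    if (!value) throw new Error(`Missing environment variable: ${name}`);
    return value;
  };

  // Assumption: the timeout variables hold a number of seconds.
  const timeoutMs = (name: string): number => {
    const raw = env[name];
    if (raw === undefined || raw === "") return DEFAULT_TIMEOUT_SECONDS * 1000;
    const seconds = Number(raw);
    if (!Number.isFinite(seconds) || seconds <= 0) {
      throw new Error(`Invalid timeout in ${name}: ${raw}`);
    }
    return seconds * 1000;
  };

  return {
    completion: {
      baseUrl: required("COMPLETION_SERVICE_URL"),
      timeoutMs: timeoutMs("COMPLETION_SERVICE_TIMEOUT"),
      headers: { Authorization: `Bearer ${required("COMPLETION_SERVICE_SECRET")}` },
    },
    mcp: {
      baseUrl: required("MCP_SERVICE_URL"),
      timeoutMs: timeoutMs("MCP_SERVICE_TIMEOUT"),
      headers: { "x-jeen-mcp-service-secret": required("MCP_SERVICE_SECRET") },
    },
  };
}
```

Failing fast on a missing URL or secret surfaces misconfiguration at startup rather than on the first agent run.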