Database Architecture
The platform uses PostgreSQL as the primary database. Four separate databases serve different service groups. Redis is used for caching and sessions.
Document Database
Shared by: document-service, parser-service, embedding-service, rag-service
Default name: document_service (configurable via PG_NAME env var per service). This database stores all document-related data including parsed content, vector embeddings, and processing job state.
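
As a minimal sketch of the per-service override described above, the lookup below falls back to the documented default when PG_NAME is absent. The function name `resolve_db_name` is hypothetical, and treating an empty value as unset is an assumption, not confirmed behavior of any of the services.

```python
import os

# Default database name stated in this section.
DEFAULT_DB_NAME = "document_service"


def resolve_db_name(env=None):
    """Return the database name, preferring PG_NAME when it is set and non-empty.

    `env` defaults to the process environment; a plain dict can be passed for testing.
    """
    env = os.environ if env is None else env
    # Assumption: an empty PG_NAME is treated the same as a missing one.
    return env.get("PG_NAME") or DEFAULT_DB_NAME
```

Passing a dict instead of reading `os.environ` directly keeps the lookup easy to exercise in isolation, for example `resolve_db_name({"PG_NAME": "docs_staging"})`.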
documents
| Column | Type | Description |
|---|---|---|
| id | UUID (PK) | Document identifier |
| userId | UUID | Uploader |
| status | ENUM | PENDING_UPLOAD, UPLOADED, PROCESSING, PROCESSED, FAILED |
| fileName | VARCHAR | Original file name |
| fileSize | BIGINT | Size in bytes |
| storagePath | VARCHAR | Path in blob storage |
| storageType | VARCHAR | azure_blob or s3 |
| contentType | VARCHAR | MIME type |
| metadata | JSONB | Custom metadata |
| folderId | UUID (FK) | Parent folder |
| parsingTechniqueId | UUID | Selected parsing method |
| createdAt | TIMESTAMPTZ | -- |
| updatedAt | TIMESTAMPTZ | -- |
| deletedAt | TIMESTAMPTZ | Soft delete |
chunks
| Column | Type | Description |
|---|---|---|
| id | UUID (PK) | Chunk identifier |
| documentId | UUID (FK) | Parent document |
| chunkIndex | INT | Position in document |
| content | TEXT | Raw chunk text |
| translatedContent | TEXT | English translation (if applicable) |
| contentHash | VARCHAR | Deduplication hash |
| contentType | ENUM | text, heading, table, code, list |
| pageNumber | INT | Source page |
| charCount | INT | Character count |
| wordCount | INT | Word count |
| metadata | JSONB | Extra data |
| createdAt | TIMESTAMPTZ | -- |
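
The derived columns in the chunks table (contentHash, charCount, wordCount) could be computed roughly as follows. This is a sketch only: the use of SHA-256, the whitespace normalization before hashing, and the helper name `chunk_fields` are all assumptions, since the document does not say how the hash is produced.

```python
import hashlib


def chunk_fields(content: str) -> dict:
    """Build the derived columns for one chunk row from its raw text."""
    # Assumption: whitespace is collapsed before hashing so that chunks that
    # differ only in spacing produce the same deduplication hash.
    normalized = " ".join(content.split())
    return {
        "content": content,
        "contentHash": hashlib.sha256(normalized.encode("utf-8")).hexdigest(),
        "charCount": len(content),           # counts every character, spaces included
        "wordCount": len(content.split()),   # whitespace-delimited tokens
    }
```

Two chunks with equal contentHash values can then be treated as duplicates without comparing the full text.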
embeddings
| Column | Type | Description |
|---|---|---|
| id | UUID (PK) | Embedding identifier |
| chunkId | UUID (FK) | Parent chunk |
| documentId | UUID (FK) | Parent document |
| embedding | VECTOR(1024) | pgvector embedding |
| modelName | VARCHAR | Model used for generation |
| createdAt | TIMESTAMPTZ | -- |
An HNSW index on the embedding column supports fast approximate cosine similarity search.
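
To make the index and query shape concrete, the sketch below pairs pgvector statements (held as strings) with an exact, brute-force version of the same ranking. The index name, the `%(query)s` and `%(k)s` placeholder style, and the helper names are illustrative assumptions; the actual migration and query code are not shown in this document.

```python
import math

# pgvector statements kept as strings. Index name and placeholders are hypothetical.
CREATE_INDEX_SQL = (
    "CREATE INDEX embeddings_embedding_hnsw_idx "
    "ON embeddings USING hnsw (embedding vector_cosine_ops);"
)
# `<=>` is pgvector's cosine distance operator (smaller means more similar).
NEAREST_SQL = (
    "SELECT id FROM embeddings "
    "ORDER BY embedding <=> %(query)s::vector LIMIT %(k)s;"
)


def cosine_distance(a, b):
    """Cosine distance = 1 - cosine similarity, for two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def nearest(query, rows, k=5):
    """Exact ranking over (id, vector) pairs; the HNSW index approximates this."""
    return sorted(rows, key=lambda row: cosine_distance(query, row[1]))[:k]
```

The brute-force `nearest` function is useful mainly as a reference when checking recall of the approximate index on a small sample.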
folders
| Column | Type | Description |
|---|---|---|
| id | UUID (PK) | Folder identifier |
| folderName | VARCHAR | Display name |
| parentId | UUID (FK) | Parent folder (self-referencing) |
| folderType | ENUM | document, agent, default |
| userId | UUID | Owner |
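
Because parentId is self-referencing, a folder's full path can be recovered by walking up to the root. The following is a minimal in-memory sketch, assuming root folders have a null parentId and that rows are available as a dict keyed by id; `folder_path` is a hypothetical helper, not part of document-service.

```python
def folder_path(folder_id, folders):
    """Return 'root/child/...' for a folder by following parentId links.

    `folders` maps id -> {"folderName": str, "parentId": id or None}.
    """
    names = []
    seen = set()
    current = folder_id
    while current is not None:
        # Guard against a corrupted hierarchy that loops back on itself.
        if current in seen:
            raise ValueError("cycle detected in folder hierarchy")
        seen.add(current)
        row = folders[current]
        names.append(row["folderName"])
        current = row["parentId"]
    return "/".join(reversed(names))
```

In the database itself, the same walk would typically be expressed as a recursive query rather than one lookup per level.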
parsing_techniques
| Column | Type | Description |
|---|---|---|
| id | UUID (PK) | Technique identifier |
| name | VARCHAR | Display name |
| description | TEXT | Description |
| icon | VARCHAR | Icon reference |
| category | VARCHAR | Grouping |
| isEnabled | BOOLEAN | Active flag |
parsing_jobs (parser-service)
| Column | Type | Description |
|---|---|---|
| id | UUID (PK) | Job identifier |
| status | ENUM | queued, processing, completed, failed |
| fileName | VARCHAR | File being parsed |
| documentId | UUID | Linked document |
| userId | UUID | Requesting user |
| parserMethod | VARCHAR | Parser backend used |
| resultData | JSONB | Parsed output |
| errorMessage | TEXT | Failure details |
| idempotencyKey | VARCHAR | Prevents duplicate jobs |
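
The idempotencyKey column suggests that a repeated request with the same key should return the existing job rather than create a second one. The sketch below shows that pattern with an in-memory store; the class `InMemoryJobStore` and its method are hypothetical, and the real parser-service presumably relies on a database lookup or unique constraint instead.

```python
import uuid


class InMemoryJobStore:
    """Toy stand-in for the parsing_jobs table, keyed by idempotencyKey."""

    def __init__(self):
        self._by_key = {}

    def create_job(self, idempotency_key, file_name, document_id):
        """Return (job, created). A repeated key yields the original job."""
        existing = self._by_key.get(idempotency_key)
        if existing is not None:
            return existing, False
        job = {
            "id": str(uuid.uuid4()),
            "status": "queued",  # first value of the status enum listed above
            "fileName": file_name,
            "documentId": document_id,
            "idempotencyKey": idempotency_key,
        }
        self._by_key[idempotency_key] = job
        return job, True
```

A client that retries after a timeout can safely resend the same key and receive the job it already created.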
llm-core DB
Owned by: llm-core only
conversations
| Column | Type | Description |
|---|---|---|
| id | UUID (PK) | Conversation identifier |
| userId | UUID | Owner |
| title | VARCHAR | Conversation title |
| metadata | JSONB | Custom data |
| createdAt | TIMESTAMPTZ | -- |
| updatedAt | TIMESTAMPTZ | -- |
| deletedAt | TIMESTAMPTZ | Soft delete |
messages
| Column | Type | Description |
|---|---|---|
| id | UUID (PK) | Message identifier |
| conversationId | UUID (FK) | Parent conversation |
| role | ENUM | user, assistant, system, tool |
| createdAt | TIMESTAMPTZ | -- |
message_contents
Stores the actual content blocks of a message (text, tool calls, tool results).
agents
| Column | Type | Description |
|---|---|---|
| id | UUID (PK) | Agent identifier |
| userId | UUID | Creator |
| name | VARCHAR | Agent name |
| type | ENUM | simple, cortex, workflow, system, organization |
| instructions | TEXT | System prompt |
| model | VARCHAR | Default model |
| settings | JSONB | Agent-specific config |
| folderId | UUID (FK) | Parent folder |
models
Core model definitions (GPT-4, Claude, Gemini, etc.) with capabilities and configuration.
providers
LLM provider configurations (OpenAI, Azure, Anthropic, Google, Mistral, Jamba, Ollama, vLLM, Remote).
models_providers (junction)
Maps which models are available from which providers, with provider-specific configuration.
canvases / canvas_versions
Interactive editable content areas within conversations, with version history.
templates
Predefined prompt templates with categories and configuration.
model_transactions
| Column | Type | Description |
|---|---|---|
| id | UUID (PK) | Transaction identifier |
| model | VARCHAR | Model used |
| provider | VARCHAR | Provider used |
| inputTokens | INT | Tokens in prompt |
| outputTokens | INT | Tokens in response |
| userId | UUID | Requesting user |
| createdAt | TIMESTAMPTZ | -- |
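
One likely use of model_transactions is usage reporting. As an illustration only, the function below sums input and output tokens per (provider, model) pair from rows shaped like the table above; the function name `usage_by_model` and the grouping key are assumptions.

```python
from collections import defaultdict


def usage_by_model(transactions):
    """Sum inputTokens and outputTokens per (provider, model) pair."""
    totals = defaultdict(lambda: {"inputTokens": 0, "outputTokens": 0})
    for tx in transactions:
        bucket = totals[(tx["provider"], tx["model"])]
        bucket["inputTokens"] += tx["inputTokens"]
        bucket["outputTokens"] += tx["outputTokens"]
    # Convert to a plain dict so missing keys raise instead of silently appearing.
    return dict(totals)
```

Grouping by userId instead of model follows the same shape and would give per-user consumption.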
conversions
Text transformation records (grammar correction, translation, summarization).
admin-base DB
Owned by: admin-base-ms only
organizations
| Column | Type | Description |
|---|---|---|
| id | UUID (PK) | Organization identifier |
| name | VARCHAR | Organization name |
| externalId | VARCHAR | External system ID |
| isActive | BOOLEAN | Active flag |
roles
| Column | Type | Description |
|---|---|---|
| id | UUID (PK) | Role identifier |
| name | VARCHAR | Role name |
| organizationId | UUID (FK) | Parent organization |
| parentId | UUID (FK) | Parent role (hierarchy) |
| level | INT | Role hierarchy level |
| isSuperAdmin | BOOLEAN | Super admin flag |
| isSystemRole | BOOLEAN | System-defined role |
| applicationType | VARCHAR | Application scope |
permissions
Links actions to modules and features. Supports conditions and inverted permissions.
role_permissions (junction)
Maps roles to permissions (many-to-many).
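
The document does not spell out how inverted permissions are evaluated. One common interpretation, sketched below purely as an assumption, is that an inverted permission acts as an explicit deny that overrides any matching grant. The field names (`action`, `module`, `inverted`) and the function `is_allowed` are hypothetical, and conditions are left out for brevity.

```python
def is_allowed(permissions, action, module):
    """Evaluate a role's permission rows for one action on one module.

    Assumed semantics: any matching inverted permission denies access,
    otherwise at least one matching permission is required to allow it.
    """
    matches = [
        p for p in permissions
        if p["action"] == action and p["module"] == module
    ]
    if any(p.get("inverted", False) for p in matches):
        return False
    return bool(matches)
```

With deny taking precedence, a restriction added to a role cannot be accidentally cancelled by a broader grant on the same role.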
Organization configuration tables
Each table stores per-organization settings for a specific resource type:
| Table | What it configures |
|---|---|
| organization_settings | General settings (name, help center, default role) |
| organization_languages | Available languages per org |
| organization_models | Available AI models per org |
| organization_agents | Available agents per org |
| organization_connectors | Available connectors per org |
| organization_templates | Available templates per org |
| organization_workflows | Available workflows per org |
| organization_parsing | Available parsing techniques per org |
user-base DB
Owned by: user-base-ms only
users
| Column | Type | Description |
|---|---|---|
| id | UUID (PK) | User identifier |
| email | VARCHAR(255) | Unique email |
| username | VARCHAR(50) | Unique username |
| firstName | VARCHAR(100) | First name |
| lastName | VARCHAR(100) | Last name |
| avatarUrl | VARCHAR(500) | Profile image URL |
| metadata | JSONB | Custom data |
| zitadelUserId | VARCHAR(255) | External IdP user ID |
| organizationId | UUID | Linked organization |
| isOwner | BOOLEAN | Organization owner flag |
| preferredLanguage | VARCHAR(10) | FK to languages.code |
| createdAt | TIMESTAMPTZ | -- |
| updatedAt | TIMESTAMPTZ | -- |
| deletedAt | TIMESTAMPTZ | Soft delete |
Check constraint: owners have no organizationId, and non-owners must have one.
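
The rule in this check constraint can be mirrored in application code so that invalid rows are rejected before they reach the database. A minimal sketch follows; the function name is hypothetical and the database constraint remains the source of truth.

```python
def satisfies_owner_constraint(is_owner, organization_id):
    """Mirror of the users check constraint.

    Owners must have no organizationId; non-owners must have one.
    """
    if is_owner:
        return organization_id is None
    return organization_id is not None
```

Only two of the four combinations are valid: an owner without an organization, and a non-owner with one.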
Preference tables
| Table | Purpose |
|---|---|
| tags | User-defined tags with color and target type |
| tag_entities | Links tags to target entities (conversations, documents, agents) |
| favorites | User favorites by target type/ID |
| pins | Pinned items by target type/ID |
| searches | Search history with deduplication |
| activity_logs | User activity tracking with hide support |
Sharing tables
| Table | Purpose |
|---|---|
| features | Shareable resources (conversation, source/folder/document, agent) |
| feature_shares | Grants access to a feature for a subject (user/group) with a role |
| feature_link_settings | Link sharing configuration (access level, role, expiry, allowed users) |
| feature_locks | Editing locks with TTL (default 5 min) |
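
For feature_locks, the TTL determines when an editing lock stops blocking other users. The sketch below assumes a lock stores the time it was taken and is considered expired once the TTL has elapsed; the names `lock_expires_at` and `is_lock_active` are hypothetical, and only the 5-minute default comes from the table above.

```python
from datetime import datetime, timedelta, timezone

# Default TTL stated in the sharing tables above.
DEFAULT_LOCK_TTL = timedelta(minutes=5)


def lock_expires_at(locked_at, ttl=DEFAULT_LOCK_TTL):
    """Return the moment at which a lock taken at `locked_at` expires."""
    return locked_at + ttl


def is_lock_active(locked_at, now, ttl=DEFAULT_LOCK_TTL):
    """True while the lock still blocks other editors."""
    return now < lock_expires_at(locked_at, ttl)


# Usage: a lock taken at 12:00 UTC is still active at 12:04 but not at 12:05.
taken = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
still_held = is_lock_active(taken, taken + timedelta(minutes=4))
```

Passing `now` explicitly, rather than reading the clock inside the function, keeps the expiry logic deterministic.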
Reference tables
| Table | Purpose |
|---|---|
| user_roles | Maps users to roles from admin-base-ms |
| integration_tools | External tool configurations (key, name, base URL, enabled) |
| langflow_users | Langflow account credentials per user |
| connectors | Available connectors (provider, type, config schema) |
| languages | Language catalog (code + display name) |
Redis
| Service | Key Pattern | Purpose | TTL |
|---|---|---|---|
| auth-service | zitadel/{provider}/{orgId}/{userId} | ZITADEL sessions | Configurable |
| llm-core | Various | Conversation search cache, Langflow flow cache | Configurable |
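
The auth-service key pattern can be assembled with a small helper such as the one below. The function name and the validation rules (non-empty segments, no embedded slashes) are assumptions added for illustration; only the `zitadel/{provider}/{orgId}/{userId}` layout comes from the table above.

```python
def zitadel_session_key(provider, org_id, user_id):
    """Build a Redis key following zitadel/{provider}/{orgId}/{userId}."""
    parts = (provider, org_id, user_id)
    for part in parts:
        # Assumption: segments must be non-empty and must not contain the
        # separator, otherwise two different users could map to the same key.
        if not part or "/" in part:
            raise ValueError("invalid key segment: %r" % (part,))
    return "zitadel/{}/{}/{}".format(*parts)
```

For example, `zitadel_session_key("google", "org-1", "user-9")` returns `zitadel/google/org-1/user-9`.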