## Interactive API documentation
All Python-based services expose a Swagger UI for exploring and testing the API:

| Service | Swagger UI URL | OpenAPI JSON |
|---|---|---|
| RAG | http://localhost:8001/docs | http://localhost:8001/openapi.json |
| Crawler | http://localhost:8002/docs | http://localhost:8002/openapi.json |
## RAG API
The RAG API handles document indexing and search; it is the engine behind the knowledge base.

### Upload a document
Set `sync=true` to wait for indexing to complete before the response returns.
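A minimal upload sketch using only the standard library. The route (`/documents`) and the `file` form field are assumptions, since this doc names the operation but not the path; the `sync` query parameter is documented above.

```python
import json
import os
import urllib.request
import uuid

RAG_BASE = "http://localhost:8001"

def build_upload_request(file_path: str, sync: bool = False) -> urllib.request.Request:
    """Build a multipart/form-data POST against the (assumed) /documents route."""
    boundary = uuid.uuid4().hex
    with open(file_path, "rb") as f:
        file_bytes = f.read()
    filename = os.path.basename(file_path)
    body = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        f"Content-Type: application/octet-stream\r\n\r\n"
    ).encode() + file_bytes + f"\r\n--{boundary}--\r\n".encode()
    # sync=true blocks until indexing completes before the response returns
    url = f"{RAG_BASE}/documents?sync={'true' if sync else 'false'}"
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
        method="POST",
    )

if __name__ == "__main__":
    req = build_upload_request("notes.md", sync=True)
    with urllib.request.urlopen(req) as resp:
        print(json.load(resp))
```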
### Check document statuses
A document's status is one of `queued`, `running`, `completed`, or `failed`.
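A polling sketch built on the four statuses above. The status route (`/documents/{id}/status`) and the `status` response field are assumptions not confirmed by this doc.

```python
import json
import time
import urllib.request

RAG_BASE = "http://localhost:8001"

# The two states a document can end in; queued and running are transient.
TERMINAL_STATUSES = {"completed", "failed"}

def is_terminal(status: str) -> bool:
    """True once indexing has finished, successfully or not."""
    return status in TERMINAL_STATUSES

def wait_for_document(doc_id: str, poll_seconds: float = 2.0) -> str:
    """Poll the (assumed) status route until the document leaves queued/running."""
    while True:
        with urllib.request.urlopen(f"{RAG_BASE}/documents/{doc_id}/status") as resp:
            status = json.load(resp)["status"]
        if is_terminal(status):
            return status
        time.sleep(poll_seconds)
```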
### Search the knowledge base
The `file_ids` parameter is required and scopes the search to specific documents.
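A search sketch. The route (`/search`) and the JSON body shape are assumptions; the doc only guarantees that `file_ids` is required, which the helper enforces.

```python
import json
import urllib.request

RAG_BASE = "http://localhost:8001"

def build_search_payload(query: str, file_ids: list) -> bytes:
    """file_ids is required and scopes the search to specific documents."""
    if not file_ids:
        raise ValueError("file_ids must name at least one document")
    return json.dumps({"query": query, "file_ids": file_ids}).encode()

def search(query: str, file_ids: list) -> dict:
    req = urllib.request.Request(
        f"{RAG_BASE}/search",  # assumed route
        data=build_search_payload(query, file_ids),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```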
### Delete a document
### Get document content
### Compare documents
## Crawler API
### Register a website for crawling
`scan_interval` is specified in seconds; the minimum value is 60.
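A registration sketch. The route (`/websites`) is an assumption; the 60-second floor on `scan_interval` comes from the doc above and is validated client-side here.

```python
import json
import urllib.request

CRAWLER_BASE = "http://localhost:8002"

def build_registration(url: str, scan_interval: int = 3600) -> bytes:
    """scan_interval is in seconds; the documented minimum is 60."""
    if scan_interval < 60:
        raise ValueError("scan_interval must be at least 60 seconds")
    return json.dumps({"url": url, "scan_interval": scan_interval}).encode()

def register_website(url: str, scan_interval: int = 3600) -> dict:
    req = urllib.request.Request(
        f"{CRAWLER_BASE}/websites",  # assumed route
        data=build_registration(url, scan_interval),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```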
### Fetch page content

### Get website info

### Deregister a website

### List website URLs
## Platform API
The Platform service exposes a public API at `/api/v1/*` for programmatic access to your data. Authenticate using an API key from Settings > API Keys.
### OpenAI-compatible chat completions
The platform provides an interface fully compatible with the OpenAI Chat Completions API. Any client or SDK that supports OpenAI (Python, Node, curl, LiteLLM, etc.) can connect by pointing `base_url` at your Tale instance.
#### Quick start
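A stdlib-only quick-start sketch. The base URL and API key are placeholders for your own instance and key; the endpoint path and body fields are the ones documented under Endpoints below.

```python
import json
import urllib.request

BASE_URL = "https://your-tale-instance.example.com"  # placeholder: your instance
API_KEY = "your-api-key"  # placeholder: from Settings > API Keys

def build_chat_request(message: str, model: str = "chat-agent") -> urllib.request.Request:
    """Build a POST to /api/v1/chat/completions; model is the agent slug."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": message}],
    }).encode()
    return urllib.request.Request(
        f"{BASE_URL}/api/v1/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

def chat(message: str) -> str:
    with urllib.request.urlopen(build_chat_request(message)) as resp:
        # Standard Chat Completions response shape
        return json.load(resp)["choices"][0]["message"]["content"]
```

With the official OpenAI Python SDK, the equivalent is `OpenAI(base_url=f"{BASE_URL}/api/v1", api_key=API_KEY)` followed by `client.chat.completions.create(...)` (the SDK appends `/chat/completions` to `base_url`).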
#### Authentication
All requests require a Bearer token in the `Authorization` header:
#### Headers
| Header | Required | Description |
|---|---|---|
| `Authorization` | Yes | `Bearer <api-key>` |
| `X-Organization-Slug` | No | Organization slug. Auto-resolved if the user belongs to one org. |
| `X-Thread-Id` | No | Reuse a conversation thread across requests. |
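A small helper assembling the header set from the table above; the key, slug, and thread-id values are placeholders.

```python
from typing import Optional

def auth_headers(
    api_key: str,
    org_slug: Optional[str] = None,
    thread_id: Optional[str] = None,
) -> dict:
    """Authorization is required; the two X-* headers are optional."""
    headers = {"Authorization": f"Bearer {api_key}"}
    if org_slug:  # only needed when the user belongs to more than one org
        headers["X-Organization-Slug"] = org_slug
    if thread_id:  # reuse a conversation thread across requests
        headers["X-Thread-Id"] = thread_id
    return headers
```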
#### Endpoints
`POST /api/v1/chat/completions`
Send a chat message and receive a response. Supports streaming and tool calling.

Request body:

| Field | Type | Description |
|---|---|---|
| `model` | string | Required. Agent slug (e.g., `chat-agent`). |
| `messages` | array | Required. Conversation messages with `role` and `content`. |
| `stream` | boolean | Enable SSE streaming. Default: `false`. |
| `temperature` | number | Sampling temperature (0–2). |
| `max_tokens` | number | Maximum tokens to generate. |
| `top_p` | number | Nucleus sampling parameter. |
| `frequency_penalty` | number | Penalize repeated tokens. |
| `presence_penalty` | number | Penalize tokens already present. |
| `stop` | string or array | Stop sequences. |
| `response_format` | object | Set `{"type": "json_object"}` for JSON mode. |
| `tools` | array | Tool definitions for client-side tool calling. |
| `tool_choice` | string or object | `"auto"`, `"required"`, `"none"`, or `{"type": "function", "function": {"name": "..."}}`. |
- **Agent mode** (no `tools`): The agent uses its pre-configured server-side tools (RAG, web search, etc.) and auto-executes them. The response contains the final text.
- **Client tool mode** (`tools` provided): Only the client-defined tools are available. The model returns `tool_calls` for the client to execute. Send results back with `role: "tool"` messages.
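A sketch of the client-tool-mode round trip described above. Field names follow the standard OpenAI Chat Completions schema; the local handler registry is illustrative, not part of the platform API.

```python
import json

def handle_tool_calls(assistant_message: dict, handlers: dict) -> list:
    """Execute each tool_call locally and build the role:"tool" follow-up messages.

    handlers maps a tool name to a local Python callable. The returned messages
    are appended (after the assistant message itself) to the conversation and
    sent back in the next /api/v1/chat/completions request.
    """
    follow_ups = []
    for call in assistant_message.get("tool_calls", []):
        fn = call["function"]
        # arguments arrive as a JSON-encoded string in the OpenAI schema
        result = handlers[fn["name"]](**json.loads(fn["arguments"]))
        follow_ups.append({
            "role": "tool",
            "tool_call_id": call["id"],  # ties the result to the originating call
            "content": json.dumps(result),
        })
    return follow_ups
```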