
Monitoring

All Tale services expose a Prometheus /metrics endpoint on the internal Docker network. To enable access from outside, set a bearer token in your .env file:
METRICS_BEARER_TOKEN=your-secret-token-here
Metrics are then available at these endpoints:
Service           Metrics endpoint
Crawler           https://yourdomain.com/metrics/crawler
RAG               https://yourdomain.com/metrics/rag
Platform (Bun)    https://yourdomain.com/metrics/platform
Convex            https://yourdomain.com/metrics/convex
Note: The Convex backend exposes over 260 built-in metrics covering query latency, mutation throughput, and scheduler performance.
When the token is unset, all /metrics/* endpoints return 401.
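With the token set, an endpoint can be queried directly with curl; a minimal check (the token and hostname are the placeholders from the examples above):

```shell
# Fetch crawler metrics using the bearer token from .env.
# TOKEN and the hostname are placeholders; substitute your own values.
TOKEN="your-secret-token-here"
curl -fsS -H "Authorization: Bearer $TOKEN" \
  https://yourdomain.com/metrics/crawler | head -n 5
```

A 401 response here means the token does not match METRICS_BEARER_TOKEN, or the token is unset.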

Prometheus scrape config

scrape_configs:
  - job_name: tale-crawler
    scheme: https
    metrics_path: /metrics/crawler
    authorization:
      credentials: your-secret-token-here
    static_configs:
      - targets: ['your-tale-host.com']

  # Repeat for: tale-rag, tale-platform, tale-convex
  # changing metrics_path accordingly
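Following the comment above, each additional job changes only job_name and metrics_path; for example, the RAG job would look like this (a sketch):

```yaml
  - job_name: tale-rag
    scheme: https
    metrics_path: /metrics/rag
    authorization:
      credentials: your-secret-token-here
    static_configs:
      - targets: ['your-tale-host.com']
```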

Error tracking

Tale supports Sentry and compatible alternatives such as GlitchTip for error tracking. Set your DSN in .env:
SENTRY_DSN=https://your-key@your-sentry-host/project-id
If SENTRY_DSN is not set, error tracking is off and errors only appear in Docker logs.
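Before restarting services, it can help to sanity-check the DSN's shape; a rough pattern match (a sketch, not full validation):

```shell
# Roughly verify the DSN looks like https://<key>@<host>/<project-id>.
DSN="https://your-key@your-sentry-host/project-id"
case "$DSN" in
  https://*@*/*) echo "DSN looks well-formed" ;;
  *) echo "DSN malformed" >&2; exit 1 ;;
esac
```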

Viewing logs

All service logs go to Docker stdout with automatic rotation at 10 MB per file, keeping 3 files per service.
# Stream all service logs
docker compose logs -f

# Stream logs for a specific service
docker compose logs -f rag

# View recent logs without streaming
docker compose logs --tail=100 platform
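The rotation described above corresponds to Docker's json-file logging driver options; reproducing it for a single service in docker-compose.yml looks roughly like this (a sketch; the service name is an assumption):

```yaml
services:
  rag:
    logging:
      driver: json-file
      options:
        max-size: 10m    # rotate each log file at 10 MB
        max-file: "3"    # keep at most 3 rotated files
```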

Database backups

To create a database snapshot:
docker exec tale-db pg_dump -U tale tale > backup-$(date +%Y%m%d).sql
To restore from a backup:
docker exec -i tale-db psql -U tale tale < backup-20260101.sql
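For scheduled backups, the snapshot command can be wrapped with simple retention; a sketch assuming backups are written to the current directory and kept for 14 days:

```shell
# Create a dated snapshot, then prune anything older than 14 days.
backup="backup-$(date +%Y%m%d).sql"
docker exec tale-db pg_dump -U tale tale > "$backup"
find . -maxdepth 1 -name 'backup-*.sql' -mtime +14 -delete
```

Run from cron or a systemd timer, this keeps roughly two weeks of daily snapshots.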

Health checks

Each service has a health check endpoint:
Endpoint                        What it checks
GET /health                     Proxy is running and listening
GET /api/health                 Platform is up and Convex backend is reachable
http://localhost:8001/health    RAG service is running and database pool is connected
http://localhost:8002/health    Crawler service and browser engine are ready
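The endpoints above can be polled in one pass; a sketch (the hostname follows the earlier examples, and ports match the table — adjust both for your deployment):

```shell
# Report OK/FAIL for each health endpoint.
for url in \
  https://yourdomain.com/health \
  https://yourdomain.com/api/health \
  http://localhost:8001/health \
  http://localhost:8002/health
do
  # -f: treat HTTP errors as failures; -m 5: cap the wait at 5 seconds
  if curl -fsS -m 5 "$url" >/dev/null 2>&1; then
    echo "OK   $url"
  else
    echo "FAIL $url"
  fi
done
```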

Container health validation

To validate that all containers are healthy after a deployment or configuration change, run the container smoke test:
bun run docker:test
This builds all images, starts them on non-conflicting ports, validates health endpoints and inter-service connectivity, and then tears everything down. It is the same test that runs in CI on every pull request. For image-level validation (OCI labels, no secrets, size budgets):
bun run docker:test:image

Image size monitoring

Each container image has a size budget enforced by CI. Current sizes and budgets:
Service     Current size    Budget
Crawler     ~1.85 GB        2.1 GB
RAG         ~515 MB         600 MB
Platform    ~2.58 GB        2.9 GB
DB          ~1.06 GB        1.2 GB
Proxy       ~88 MB          100 MB
If an image exceeds its budget after a change, bun run docker:test:image will fail. See the container architecture page for details on multi-stage build strategies that keep images lean.
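To compare local images against these budgets by hand, the standard docker images listing can be filtered (the "tale" name prefix is an assumption; match it to your compose project):

```shell
# List repository and size for all tale-* images.
docker images --format '{{.Repository}}\t{{.Size}}' | grep tale
```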
Last modified on April 6, 2026