Rate limits

deepface.dev applies rate limiting and queue controls at the gateway.

Enforcement points

  • Per-account requests per minute
  • Maximum inflight compute requests
  • Queue depth
  • Queue timeout
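
A client can avoid hitting the inflight limit by capping its own concurrency. The following is a minimal sketch, not part of any deepface.dev client: `InflightLimiter`, `slot`, and the `max_inflight` value are hypothetical names, and the actual limit for an account is not stated here.

```python
import threading
from contextlib import contextmanager
from typing import Iterator, Optional


class InflightLimiter:
    """Client-side cap on concurrent requests (hypothetical helper)."""

    def __init__(self, max_inflight: int) -> None:
        # max_inflight is an assumed value; use the limit that applies to your account.
        self._slots = threading.BoundedSemaphore(max_inflight)

    @contextmanager
    def slot(self, timeout: Optional[float] = None) -> Iterator[None]:
        """Hold one inflight slot for the duration of the with-block."""
        # Wait locally instead of sending a request the gateway would queue or reject.
        if not self._slots.acquire(timeout=timeout):
            raise TimeoutError("no inflight slot became free in time")
        try:
            yield
        finally:
            self._slots.release()
```

Wrapping each compute request in `with limiter.slot(timeout=...):` keeps the number of concurrent calls at or below the chosen cap, and the local timeout plays a role similar to the gateway's queue timeout.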

Retry behavior

429 rate_limited responses include the following headers:
  • Retry-After
  • RateLimit-Limit
  • RateLimit-Remaining
  • RateLimit-Reset

503 queue_full and 503 queue_timeout indicate temporary capacity pressure. Treat them as retryable with backoff unless your own SLA policy says otherwise.
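
The retry guidance above can be sketched as follows. This is an illustration under stated assumptions, not the deepface.dev client: `call_with_retry`, `retry_delay`, the `send` callable, and the `base`, `cap`, and `max_attempts` defaults are all hypothetical, and `Retry-After` is assumed to carry a number of seconds.

```python
import random
import time
from typing import Callable, Mapping, Tuple

Response = Tuple[int, Mapping[str, str]]  # (status code, response headers)

# 429 rate_limited, 503 queue_full, and 503 queue_timeout are treated as retryable.
RETRYABLE_STATUSES = {429, 503}


def retry_delay(headers: Mapping[str, str], attempt: int,
                base: float = 0.5, cap: float = 30.0) -> float:
    """Return how many seconds to wait before the next attempt."""
    # Header lookup is case-sensitive here; a real client should normalize names.
    retry_after = headers.get("Retry-After")
    if retry_after is not None:
        try:
            return max(0.0, float(retry_after))
        except ValueError:
            pass  # not a plain number of seconds; fall back to backoff
    # Exponential backoff with full jitter (assumed defaults, tune as needed).
    return random.uniform(0.0, min(cap, base * 2 ** attempt))


def call_with_retry(send: Callable[[], Response], max_attempts: int = 5,
                    sleep: Callable[[float], None] = time.sleep) -> Response:
    """Call send() until it returns a non-retryable status or attempts run out."""
    for attempt in range(max_attempts):
        status, headers = send()
        if status not in RETRYABLE_STATUSES or attempt == max_attempts - 1:
            return status, headers
        sleep(retry_delay(headers, attempt))
    raise ValueError("max_attempts must be at least 1")
```

Here `send` is any zero-argument function that performs one request and returns its status code and headers, so the same loop works with whichever HTTP library you use. Passing a custom `sleep` makes the loop easy to test without real delays.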

Capacity planning

If you need higher sustained throughput, dedicated queue capacity, or custom limits, contact the deepface.dev team before scaling traffic abruptly.