
feat(quota): intelligent quota tracking across 7 providers + router integration #210

Merged
github-actions[bot] merged 6 commits into main from feat/quota-tracker-phase1
Apr 18, 2026

@typelicious
Collaborator

Summary

End-to-end quota-tracking overhaul for the 7-provider load-balancer workflow (Kilo, DeepSeek, Blackbox, Anthropic Pro, OpenAI Plus, Qwen, Gemini). Adds:

  • Unified QuotaStatus abstraction covering three package types: credits, rolling_window, daily
  • Background balance poller for providers with real balance APIs (DeepSeek stable, Kilo best-effort)
  • Passive header-capture middleware for providers that fold rate-limits into response headers (Anthropic / OpenAI / OpenRouter)
  • Operator cockpit: /api/quotas JSON + self-contained /dashboard/quotas HTML widget (60s polling, alert-sorted)
  • Router integration: use-or-lose boost now steers traffic to packages with tight expiry + real burn pressure (the whole Kilo-credits use case)
  • Install-script safety: detects parallel brew/systemd installs before clobbering config, backs up any existing file

Commits

| Commit | Phase | Delta |
|---|---|---|
| 6fb7cd4 | install-safety | faigate-install refuses to run alongside an existing brew or user-level install; defensive timestamped backups before any write |
| d73114e | 1 · tracker | quota_tracker.py (500 LOC) + 11-package catalog template covering all 7 providers |
| e5f2903 | 2 · poller | quota_poller.py + quota_poll config block + lifespan wiring (disabled by default) |
| 9bda708 | 3 · headers | quota_headers.py dialect-aware parser (OpenAI / Anthropic / OpenRouter) + hook in ProviderBackend.complete |
| baf0e9b | 4 · cockpit | GET /api/quotas + GET /dashboard/quotas |
| 1442c33 | 5 · routing | Router picks up QuotaStatus.alert == "use_or_lose" with a +3 scoring boost |

Activation (operator)

  1. Set FAIGATE_PROVIDER_METADATA_DIR to a directory containing packages/catalog.v1.json (template at docs/examples/fusionaize-metadata-repo/packages/catalog.v1.json).
  2. For provider-API balance polling, set quota_poll.enabled: true in config.yaml and export DEEPSEEK_API_KEY / KILO_API_KEY in faigate.env.
  3. Open /dashboard/quotas in a browser.

Header-capture and router integration are always on — no flag needed.

Test plan

  • ruff check + ruff format clean across all changed files
  • python3 -c "from faigate import main; print('ok')" imports green
  • GET /api/quotas end-to-end with example catalog → 11 packages, by_alert: {ok: 11}
  • Header parser unit-tested against 3 dialects + empty case
  • Manual: enable quota_poll + real DEEPSEEK_API_KEY, confirm poller logs + catalog file updates (blocked on operator step 2 above)
  • Manual: hit Anthropic bridge under load, confirm /dashboard/quotas shows updated header_snapshots section

🤖 Generated with Claude Code

André Lange and others added 6 commits April 17, 2026 23:38
Prevents the failure mode where running ./scripts/faigate-install on a
system with an existing Homebrew-managed faigate creates a second,
conflicting user-level LaunchAgent (com.fusionaize.faigate) that fails
to start (exit 78 EX_CONFIG) while clobbering nothing but confusing the
operator about which install is authoritative.

Changes:
- new faigate_detect_existing_install() checks for: brew formula,
  brew service, active com.fusionaize.faigate LaunchAgent, existing
  /opt/homebrew/etc/faigate or /usr/local/etc/faigate configs, and
  systemd faigate.service unit; refuses install if any found
- --force flag to override (discouraged, documented in --help)
- faigate_backup_if_exists() makes a timestamped .bak copy before any
  config/env write even though -f guards already prevent overwrite,
  as defense-in-depth for edge cases
- systemd path now backs up /etc/systemd/system/faigate.service before
  install -m 644 replaces it

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…late

Introduces the single-source-of-truth abstraction for quota accounting
across all routable providers, covering three package types:

  - credits        (DeepSeek, Kilo, Blackbox)      — balance + expiry
  - rolling_window (Claude Pro, OpenAI Plus)       — e.g. 40 msg / 5h
  - daily          (Qwen, Gemini flash/pro, Antigravity, Gemini-CLI)

Key pieces:

  * QuotaStatus dataclass: unified view (remaining, ratio, alert, source,
    confidence, reset_at, days_until_expiry, burn_per_day, projected_days_left).
  * SQLite-backed local counter with optional per-model weights
    (Opus ~5x Sonnet against the Pro pool).
  * EWMA-style burn-rate over last 7 days of requests.
  * Use-or-lose classifier: days_until_expiry < projected_days_left → urgent boost.
  * Catalog template for FAIGATE_PROVIDER_METADATA_DIR covering 11 packages
    across 7 providers, with _notes explaining every field.
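
A minimal sketch of how the burn rate and the use-or-lose classifier could fit together. This is not the code in quota_tracker.py: the smoothing factor `alpha`, the field types, and the exact alert rules are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class QuotaStatus:
    # Subset of the fields listed above; the types are assumptions.
    remaining: float
    ratio: float
    alert: str
    days_until_expiry: Optional[float]
    burn_per_day: float
    projected_days_left: Optional[float]


def ewma_burn(daily_usage: Sequence[float], alpha: float = 0.3) -> float:
    """Exponentially weighted mean of per-day usage, oldest day first."""
    window = list(daily_usage)[-7:]  # only the last 7 days count
    if not window:
        return 0.0
    burn = window[0]
    for used in window[1:]:
        burn = alpha * used + (1 - alpha) * burn
    return burn


def classify(remaining: float, total: float,
             days_until_expiry: Optional[float],
             burn_per_day: float) -> QuotaStatus:
    ratio = remaining / total if total > 0 else 0.0
    # Runway at the current burn rate; None means "no measurable burn".
    projected = remaining / burn_per_day if burn_per_day > 0 else None

    if remaining <= 0:
        alert = "exhausted"
    elif (days_until_expiry is not None and projected is not None
          and days_until_expiry < projected):
        # The package expires before the balance would run out.
        alert = "use_or_lose"
    else:
        alert = "ok"  # the watch / topup tiers are omitted in this sketch
    return QuotaStatus(remaining, ratio, alert,
                       days_until_expiry, burn_per_day, projected)
```

With 40 credits left, a burn of 2 per day, and 10 days to expiry, the projected runway is 20 days, so the package is classified as `use_or_lose`.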

The local-count path is deliberately heuristic for Pro/Plus subscriptions:
Anthropic and OpenAI do not expose quota APIs for consumer plans. The limits
are community-reported and will be tuned once the first real 429s land.
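
For illustration, a weighted local counter over SQLite might look like the sketch below. The table name, column layout, and weight values are hypothetical; the text above only states that Opus costs roughly 5x Sonnet against the Pro pool.

```python
import sqlite3
import time
from typing import Optional

# Hypothetical weights: only the ~5x Opus/Sonnet ratio comes from the text.
MODEL_WEIGHTS = {"opus": 5.0, "sonnet": 1.0}


def open_db(path: str = ":memory:") -> sqlite3.Connection:
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS quota_events ("
        "package TEXT NOT NULL, ts REAL NOT NULL, weight REAL NOT NULL)"
    )
    return conn


def record_request(conn: sqlite3.Connection, package: str, model: str,
                   now: Optional[float] = None) -> None:
    """Store one request, weighted by model; unknown models count as 1."""
    ts = time.time() if now is None else now
    weight = MODEL_WEIGHTS.get(model, 1.0)
    conn.execute("INSERT INTO quota_events VALUES (?, ?, ?)",
                 (package, ts, weight))
    conn.commit()


def used_in_window(conn: sqlite3.Connection, package: str,
                   window_seconds: float,
                   now: Optional[float] = None) -> float:
    """Sum of weights recorded inside the rolling window ending at `now`."""
    end = time.time() if now is None else now
    row = conn.execute(
        "SELECT COALESCE(SUM(weight), 0) FROM quota_events "
        "WHERE package = ? AND ts > ?",
        (package, end - window_seconds),
    ).fetchone()
    return float(row[0])
```

One Opus request plus one Sonnet request inside the same 5h window would count as 6 units against the pool under these weights.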

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds the background task that refreshes ``used_credits`` for packages
marked ``source: api_poll`` in the external catalog, and wires it into
the FastAPI lifespan.

Poller (faigate/quota_poller.py)
  * DeepSeek: ``GET /user/balance`` — stable schema, balance_infos[USD].
  * Kilo: probes four candidate endpoints, parses whichever returns a
    payload with recognizable balance fields (schema is a moving target).
  * Atomic persistence: writes ``catalog.v1.json.tmp`` and os.replace()s —
    the poller is the only writer, preserving the envelope (_notes etc).
  * Fast lane: packages with expiry_date ≤14 days away get polled every
    15m instead of the 1h default, so use-or-lose alerts stay fresh.
  * Never crashes the gateway: missing API keys log a warning and skip;
    network errors leave stale ``used_credits`` untouched.
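
The atomic-persistence step can be sketched as follows. The catalog layout (a top-level `packages` list whose entries carry an `id`) and the function name are assumptions; only the `.tmp` file plus `os.replace()` pattern and the preserved envelope come from the description above.

```python
import json
import os
from pathlib import Path


def update_used_credits(catalog_path: Path, package_id: str,
                        used_credits: float) -> None:
    # Load the whole envelope so keys such as _notes survive the rewrite.
    envelope = json.loads(catalog_path.read_text(encoding="utf-8"))
    for pkg in envelope.get("packages", []):  # assumed layout
        if pkg.get("id") == package_id:
            pkg["used_credits"] = used_credits

    # Write a sibling temp file first, e.g. catalog.v1.json.tmp.
    tmp_path = catalog_path.with_name(catalog_path.name + ".tmp")
    with open(tmp_path, "w", encoding="utf-8") as fh:
        json.dump(envelope, fh, indent=2)
        fh.flush()
        os.fsync(fh.fileno())

    # os.replace swaps the file in one step, so a reader never sees a
    # half-written catalog.
    os.replace(tmp_path, catalog_path)
```

Because the temp file lives in the same directory as the catalog, the final rename stays on one filesystem, which is what makes the swap atomic.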

Config (faigate/config.py)
  * New ``quota_poll`` section: enabled (default false), on_startup,
    interval_seconds, fast_lane_interval_seconds.
  * Disabled by default — opt-in because it requires operator-provided
    API keys (DEEPSEEK_API_KEY, KILO_API_KEY).
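
As a rough sketch, the cadence choice between the default and fast-lane intervals could look like this. The dataclass mirrors the four `quota_poll` keys named above; the `on_startup` default and the helper name `poll_interval` are assumptions.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

FAST_LANE_DAYS = 14  # packages expiring within this many days poll faster


@dataclass
class QuotaPollConfig:
    enabled: bool = False                  # opt-in, as described above
    on_startup: bool = True                # assumed default
    interval_seconds: int = 3600           # 1h default
    fast_lane_interval_seconds: int = 900  # 15m fast lane


def poll_interval(cfg: QuotaPollConfig, expiry_date: Optional[date],
                  today: date) -> int:
    """Pick the polling cadence for one package."""
    if expiry_date is not None and (expiry_date - today).days <= FAST_LANE_DAYS:
        return cfg.fast_lane_interval_seconds
    return cfg.interval_seconds
```

A package expiring 10 days from today would be polled every 900 seconds; one without an expiry date stays on the hourly cadence.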

Lifespan (faigate/main.py)
  * Startup warmup + long-running asyncio task (``faigate-quota-poll``).
  * Clean shutdown: cancel + await the task alongside the existing
    provider-source-refresh task.
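
The warmup, background task, and clean shutdown might be wired roughly as below. `poll_once` and `poll_forever` are placeholder names, and the sketch uses only asyncio and contextlib so it runs without FastAPI; an async context manager of this shape is what FastAPI accepts as its `lifespan` argument.

```python
import asyncio
import contextlib
from contextlib import asynccontextmanager


async def poll_once() -> None:
    """Placeholder for one refresh pass over api_poll packages."""
    await asyncio.sleep(0)


async def poll_forever(interval_seconds: float) -> None:
    while True:
        await asyncio.sleep(interval_seconds)
        try:
            await poll_once()
        except Exception:
            # A failed pass must not kill the loop; a real poller would log.
            pass


@asynccontextmanager
async def lifespan(app):
    await poll_once()  # startup warmup
    task = asyncio.create_task(poll_forever(3600), name="faigate-quota-poll")
    try:
        yield
    finally:
        # Clean shutdown: cancel, then await so the task actually finishes.
        task.cancel()
        with contextlib.suppress(asyncio.CancelledError):
            await task
```

Awaiting the cancelled task matters: without it the event loop can close while the task is still pending, which produces a "Task was destroyed but it is pending" warning.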

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Mines rate-limit hints from provider response headers so rolling-window
packages (Anthropic Pro, OpenAI Plus) get a free, near-realtime quota
signal without an extra API call.

Module (faigate/quota_headers.py)
  * parse_headers(): dialect-aware parser for three header families —
    x-ratelimit-* (OpenAI / DeepSeek / OpenRouter), anthropic-ratelimit-*
    (Anthropic), plus a fallback retry-after capture for the rest.
  * HeaderSnapshot dataclass: limit_requests, remaining_requests,
    reset_requests_at, token-budget siblings, retry_after, raw dict.
  * _parse_reset() accepts both seconds-delta and ISO-8601.
  * In-process latest-snapshot store (thread-safe) for dashboard lookup.
  * record_response_headers(): passive observer — parses, stores, and
    opportunistically refreshes the provider's rolling_window package in
    the external catalog iff the entry has source=header_capture (opt-in
    so we don't override operator-chosen local_count).
  * Never raises; parse/apply errors log at DEBUG.

Wiring (faigate/providers.py)
  * Two call sites in ProviderBackend.complete success path:
    – OpenAI-compat path (covers DeepSeek, OpenRouter, etc.)
    – Codex responses-API path (OpenAI Plus with OAuth)
  * Both wrapped in try/except so a broken parser can't break a request.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Exposes the full quota state through two new surfaces:

API (GET /api/quotas)
  * Returns QuotaStatus for every package in the external catalog plus
    the latest header-capture snapshot per provider (for diagnostics).
  * Includes an aggregated by_alert map and has_use_or_lose /
    has_exhausted flags for quick operator triage.
  * Never errors — missing catalog → empty list, SQLite path falls back
    to config.metrics.db_path.
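
The aggregation behind the response could be sketched like this. The `by_alert`, `has_use_or_lose`, `has_exhausted`, `header_snapshots`, and `count` keys appear elsewhere in this PR; the `packages` key and the function name are assumptions.

```python
from collections import Counter
from typing import Iterable, Mapping


def build_quota_payload(statuses: Iterable[Mapping],
                        header_snapshots: Mapping[str, Mapping]) -> dict:
    """Assemble the JSON body from per-package status dicts."""
    packages = list(statuses)
    by_alert = Counter(p.get("alert", "ok") for p in packages)
    return {
        "count": len(packages),
        "packages": packages,  # assumed key name
        "by_alert": dict(by_alert),
        "has_use_or_lose": by_alert.get("use_or_lose", 0) > 0,
        "has_exhausted": by_alert.get("exhausted", 0) > 0,
        "header_snapshots": dict(header_snapshots),
    }
```

Passing an empty list (the missing-catalog case) produces `count: 0` and an empty `by_alert` map instead of an error.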

Dashboard (GET /dashboard/quotas)
  * Self-contained HTML page, no build step, no deps.
  * Polls /api/quotas every 60s.
  * Renders each package as a color-coded progress bar sorted by alert
    urgency (use_or_lose / exhausted first, then topup / watch, ok last).
  * Shows remaining, total, expiry countdown, burn/day, runway days,
    source + confidence per package.
  * Deliberately a separate page from the main /dashboard — zero risk of
    breaking the existing 116KB dashboard HTML.
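
The alert ordering can be expressed compactly. The dashboard itself sorts in the browser; this Python rendering of the rule is for illustration only, and the relative order inside each tier (for example use_or_lose before exhausted) is an assumption.

```python
# Lower rank renders first; order within a tier is assumed.
ALERT_RANK = {"use_or_lose": 0, "exhausted": 1, "topup": 2, "watch": 3, "ok": 4}


def sort_for_dashboard(packages):
    """Stable sort by urgency; unknown alert values sink to the bottom."""
    return sorted(
        packages,
        key=lambda p: ALERT_RANK.get(p.get("alert"), len(ALERT_RANK)),
    )
```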

End-to-end smoke test passed with the 11-package example catalog:
  count: 11, all packages render, by_alert classification works.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Extends the package scoring block in Router._score_provider to use the
unified QuotaStatus from quota_tracker instead of raw catalog fields.

Changes:
  * rolling_window and daily package types now contribute to
    package_score (up to +5, same ceiling as credits — no type gets to
    dominate routing just by existing).
  * Cross-type urgency boost: when quota_tracker classifies a package as
    `use_or_lose`, the provider gets a flat +3 on top of the existing
    expiry_score. Strictly more informed than the raw "days_left <= 7"
    rule because the classifier combines expiry with projected burn rate.
  * Legacy credits path unchanged — quota_tracker failure silently falls
    back, so Router behaviour on a stripped test harness is identical.

Net effect: Kilo credits with a 10-day expiry and real burn pressure win
ties against DeepSeek credits with 6 months of runway, which is the
whole point of the use-or-lose signal.
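
The tie-break can be illustrated with a toy scoring function. How Router._score_provider derives the base package score is not shown in this PR, so `base_score` is simply passed in here; only the +5 ceiling and the flat +3 boost come from the description above.

```python
PACKAGE_SCORE_CAP = 5.0   # same ceiling for credits, rolling_window, daily
USE_OR_LOSE_BOOST = 3.0   # flat urgency boost


def package_score(base_score: float, alert: str,
                  expiry_score: float = 0.0) -> float:
    """Capped package score plus expiry score plus the urgency boost."""
    score = min(base_score, PACKAGE_SCORE_CAP) + expiry_score
    if alert == "use_or_lose":
        score += USE_OR_LOSE_BOOST
    return score


# Two providers with the same base score: the use_or_lose one wins the tie.
kilo = package_score(4.0, "use_or_lose")
deepseek = package_score(4.0, "ok")
```

Here `kilo` evaluates to 7.0 and `deepseek` to 4.0, so the expiring Kilo credits are preferred.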

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@github-actions github-actions bot enabled auto-merge (squash) April 18, 2026 00:26
@github-actions github-actions bot merged commit bc62892 into main Apr 18, 2026
15 checks passed
typelicious pushed a commit that referenced this pull request Apr 18, 2026
Bump version for the quota-tracking feature release (#210).

Highlights:
  * Unified QuotaStatus across credits / rolling_window / daily
  * Balance poller (DeepSeek + Kilo) with fast-lane cadence for
    expiring credits
  * Passive header-capture middleware (OpenAI / Anthropic / OpenRouter)
  * Operator cockpit: /api/quotas + /dashboard/quotas
  * Router picks up use-or-lose alerts for informed tie-breaking
  * faigate-install refuses to clobber an existing brew install

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
github-actions bot pushed a commit that referenced this pull request Apr 18, 2026
Bump version for the quota-tracking feature release (#210).

Highlights:
  * Unified QuotaStatus across credits / rolling_window / daily
  * Balance poller (DeepSeek + Kilo) with fast-lane cadence for
    expiring credits
  * Passive header-capture middleware (OpenAI / Anthropic / OpenRouter)
  * Operator cockpit: /api/quotas + /dashboard/quotas
  * Router picks up use-or-lose alerts for informed tie-breaking
  * faigate-install refuses to clobber an existing brew install

Co-authored-by: André Lange <andre.lange@typelicious.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>