feat(quota): intelligent quota tracking across 7 providers + router integration #210
Merged
github-actions[bot] merged 6 commits into main on Apr 18, 2026
Prevents the failure mode where running ./scripts/faigate-install on a system with an existing Homebrew-managed faigate creates a second, conflicting user-level LaunchAgent (com.fusionaize.faigate). That agent fails to start (exit 78, EX_CONFIG) and, although it clobbers nothing, leaves the operator unsure which install is authoritative.
Changes:
- new faigate_detect_existing_install() checks for: brew formula, brew service, active com.fusionaize.faigate LaunchAgent, existing /opt/homebrew/etc/faigate or /usr/local/etc/faigate configs, and systemd faigate.service unit; refuses install if any found
- --force flag to override (discouraged, documented in --help)
- faigate_backup_if_exists() makes a timestamped .bak copy before any config/env write even though -f guards already prevent overwrite, as defense-in-depth for edge cases
- systemd path now backs up /etc/systemd/system/faigate.service before install -m 644 replaces it
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…late
Introduces the single-source-of-truth abstraction for quota accounting
across all routable providers, covering three package types:
- credits (DeepSeek, Kilo, Blackbox) — balance + expiry
- rolling_window (Claude Pro, OpenAI Plus) — e.g. 40 msg / 5h
- daily (Qwen, Gemini flash/pro, Antigravity, Gemini-CLI)
Key pieces:
* QuotaStatus dataclass: unified view (remaining, ratio, alert, source,
confidence, reset_at, days_until_expiry, burn_per_day, projected_days_left).
* SQLite-backed local counter with optional per-model weights
(Opus ~5x Sonnet against the Pro pool).
* EWMA-style burn-rate over last 7 days of requests.
* Use-or-lose classifier: days_until_expiry < projected_days_left → urgent boost.
* Catalog template for FAIGATE_PROVIDER_METADATA_DIR covering 11 packages
across 7 providers, with _notes explaining every field.
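The burn-rate and use-or-lose pieces can be sketched as follows. This is a minimal illustration, not the code in quota_tracker.py: the function names, the smoothing factor `alpha`, the per-day input format, and the handling of an idle package are all assumptions.

```python
from typing import Optional, Sequence


def ewma_burn_per_day(daily_usage: Sequence[float], alpha: float = 0.5) -> float:
    """Exponentially weighted average of per-day usage, oldest day first.

    alpha is a hypothetical smoothing factor; recent days weigh more.
    """
    if not daily_usage:
        return 0.0
    rate = daily_usage[0]
    for used in daily_usage[1:]:
        rate = alpha * used + (1 - alpha) * rate
    return rate


def projected_days_left(remaining: float, burn_per_day: float) -> Optional[float]:
    """Days until the balance reaches zero at the current burn; None if idle."""
    if burn_per_day <= 0:
        return None
    return remaining / burn_per_day


def is_use_or_lose(days_until_expiry: Optional[float],
                   days_left: Optional[float]) -> bool:
    """True when the package expires before the balance would be used up."""
    if days_until_expiry is None:
        return False  # no expiry date, so nothing can be lost
    if days_left is None:
        return True   # assumption: an idle package with an expiry counts as at risk
    return days_until_expiry < days_left
```

With these definitions, a package that expires in 10 days but would last 60 days at the current burn is flagged, while one that expires in 180 days with 30 days of runway is not.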
The local-count path is deliberately heuristic for Pro/Plus subscriptions,
because Anthropic and OpenAI do not expose quota APIs for consumer plans.
The numbers are community-reported and will be tuned once the first real 429s land.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
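A weighted local counter of the kind described in this commit could look roughly like the sketch below. It uses only the standard-library sqlite3 module; the table name `quota_events`, the function names, and the weight values are assumptions for illustration, not the schema in quota_tracker.py.

```python
import sqlite3
import time
from typing import Mapping, Optional

# Hypothetical weights: one Opus request counts as ~5 Sonnet requests.
MODEL_WEIGHTS = {"opus": 5.0, "sonnet": 1.0}


def open_counter(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the counter database."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS quota_events ("
        "package TEXT NOT NULL, ts REAL NOT NULL, weight REAL NOT NULL)"
    )
    return conn


def record_request(conn: sqlite3.Connection, package: str, model: str,
                   ts: Optional[float] = None,
                   weights: Mapping[str, float] = MODEL_WEIGHTS) -> None:
    """Store one request, weighted by model (unknown models weigh 1.0)."""
    weight = weights.get(model, 1.0)
    when = time.time() if ts is None else ts
    conn.execute("INSERT INTO quota_events VALUES (?, ?, ?)",
                 (package, when, weight))
    conn.commit()


def used_in_window(conn: sqlite3.Connection, package: str,
                   window_seconds: float, now: Optional[float] = None) -> float:
    """Sum the weights recorded for a package inside the rolling window."""
    now = time.time() if now is None else now
    row = conn.execute(
        "SELECT COALESCE(SUM(weight), 0) FROM quota_events "
        "WHERE package = ? AND ts > ?",
        (package, now - window_seconds),
    ).fetchone()
    return float(row[0])
```

For a "40 msg / 5h" package, `used_in_window(conn, package, 5 * 3600)` compared against 40 gives the remaining budget; old rows simply fall out of the window instead of being reset.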
Adds the background task that refreshes ``used_credits`` for packages
marked ``source: api_poll`` in the external catalog, and wires it into
the FastAPI lifespan.
Poller (faigate/quota_poller.py)
* DeepSeek: ``GET /user/balance`` — stable schema, balance_infos[USD].
* Kilo: probes four candidate endpoints, parses whichever returns a
payload with recognizable balance fields (schema is a moving target).
* Atomic persistence: writes ``catalog.v1.json.tmp`` and os.replace()s —
the poller is the only writer, preserving the envelope (_notes etc).
* Fast lane: packages with expiry_date ≤14 days away get polled every
15m instead of the 1h default, so use-or-lose alerts stay fresh.
* Never crashes the gateway: missing API keys log a warning and skip;
network errors leave stale ``used_credits`` untouched.
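The atomic-persistence and fast-lane bullets above can be sketched like this. The function names and parameter defaults are assumptions chosen to mirror the numbers in the commit message (1h default, 15m fast lane, 14-day threshold); the real poller may differ.

```python
import json
import os
from datetime import date
from pathlib import Path
from typing import Optional


def write_catalog_atomically(path: Path, catalog: dict) -> None:
    """Write to a sibling .tmp file, then swap it in with os.replace().

    Passing the whole loaded document back in keeps envelope keys
    such as _notes intact.
    """
    tmp = path.with_name(path.name + ".tmp")
    with open(tmp, "w", encoding="utf-8") as fh:
        json.dump(catalog, fh, indent=2)
        fh.flush()
        os.fsync(fh.fileno())
    os.replace(tmp, path)  # atomic when tmp and path share a filesystem


def poll_interval_seconds(expiry: Optional[date], today: date,
                          default: int = 3600, fast_lane: int = 900,
                          fast_lane_days: int = 14) -> int:
    """Pick the fast-lane interval for packages that expire soon."""
    if expiry is not None and (expiry - today).days <= fast_lane_days:
        return fast_lane
    return default
```

Readers of the catalog either see the old file or the complete new one, never a half-written JSON document.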
Config (faigate/config.py)
* New ``quota_poll`` section: enabled (default false), on_startup,
interval_seconds, fast_lane_interval_seconds.
* Disabled by default — opt-in because it requires operator-provided
API keys (DEEPSEEK_API_KEY, KILO_API_KEY).
Lifespan (faigate/main.py)
* Startup warmup + long-running asyncio task (``faigate-quota-poll``).
* Clean shutdown: cancel + await the task alongside the existing
provider-source-refresh task.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
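The startup and shutdown handling for the poll task can be illustrated with plain asyncio. This is a sketch under assumptions: `poll_forever` and `quota_poll_lifespan` are hypothetical names, and only the task name `faigate-quota-poll` comes from the commit message. In a FastAPI app the same start/cancel/await pattern would sit inside the lifespan context manager.

```python
import asyncio
import contextlib
import logging
from typing import AsyncIterator, Awaitable, Callable

log = logging.getLogger("faigate.quota_poller")


async def poll_forever(poll_once: Callable[[], Awaitable[None]],
                       interval_seconds: float) -> None:
    """Run poll_once repeatedly; a failed poll is logged, not fatal."""
    while True:
        try:
            await poll_once()
        except Exception:  # CancelledError is a BaseException and passes through
            log.warning("quota poll failed", exc_info=True)
        await asyncio.sleep(interval_seconds)


@contextlib.asynccontextmanager
async def quota_poll_lifespan(
    poll_once: Callable[[], Awaitable[None]],
    interval_seconds: float = 3600.0,
) -> AsyncIterator[asyncio.Task]:
    """Start the background task on entry, cancel and await it on exit."""
    task = asyncio.create_task(
        poll_forever(poll_once, interval_seconds), name="faigate-quota-poll"
    )
    try:
        yield task
    finally:
        task.cancel()
        with contextlib.suppress(asyncio.CancelledError):
            await task
```

Awaiting the cancelled task before returning is what makes the shutdown clean: the event loop does not close with the task still pending.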
Mines rate-limit hints from provider response headers so rolling-window
packages (Anthropic Pro, OpenAI Plus) get a free, near-realtime quota
signal without an extra API call.
Module (faigate/quota_headers.py)
* parse_headers(): dialect-aware parser for three header families —
x-ratelimit-* (OpenAI / DeepSeek / OpenRouter), anthropic-ratelimit-*
(Anthropic), plus a fallback retry-after capture for the rest.
* HeaderSnapshot dataclass: limit_requests, remaining_requests,
reset_requests_at, token-budget siblings, retry_after, raw dict.
* _parse_reset() accepts both seconds-delta and ISO-8601.
* In-process latest-snapshot store (thread-safe) for dashboard lookup.
* record_response_headers(): passive observer — parses, stores, and
opportunistically refreshes the provider's rolling_window package in
the external catalog iff the entry has source=header_capture (opt-in
so we don't override operator-chosen local_count).
* Never raises; parse/apply errors log at DEBUG.
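A reset parser that accepts both a seconds delta and an ISO-8601 timestamp, plus a small dialect-aware lookup, might look like the sketch below. It is not the code in quota_headers.py: the function names are hypothetical, and only two header names are checked here (`x-ratelimit-remaining-requests` and `anthropic-ratelimit-requests-remaining`).

```python
from datetime import datetime, timedelta, timezone
from typing import Mapping, Optional


def parse_reset(value: str, now: Optional[datetime] = None) -> Optional[datetime]:
    """Accept a seconds delta ("30", "1.5") or an ISO-8601 timestamp."""
    now = now or datetime.now(timezone.utc)
    text = value.strip()
    try:
        return now + timedelta(seconds=float(text))
    except (ValueError, OverflowError):
        pass
    try:
        # fromisoformat() before Python 3.11 rejects a trailing "Z"
        parsed = datetime.fromisoformat(text.replace("Z", "+00:00"))
    except ValueError:
        return None  # unknown format: report nothing rather than raise
    return parsed if parsed.tzinfo else parsed.replace(tzinfo=timezone.utc)


def remaining_requests(headers: Mapping[str, str]) -> Optional[int]:
    """Look up the remaining-requests value in either header family."""
    lowered = {k.lower(): v for k, v in headers.items()}
    for name in ("x-ratelimit-remaining-requests",
                 "anthropic-ratelimit-requests-remaining"):
        raw = lowered.get(name)
        if raw is not None:
            try:
                return int(raw)
            except ValueError:
                return None
    return None
```

Both helpers return None on anything unexpected, which matches the "never raises" rule: a provider changing its header format degrades the signal but cannot fail a request.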
Wiring (faigate/providers.py)
* Two call sites in ProviderBackend.complete success path:
– OpenAI-compat path (covers DeepSeek, OpenRouter, etc.)
– Codex responses-API path (OpenAI Plus with OAuth)
* Both wrapped in try/except so a broken parser can't break a request.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Exposes the full quota state through two new surfaces:
API (GET /api/quotas)
* Returns QuotaStatus for every package in the external catalog plus
the latest header-capture snapshot per provider (for diagnostics).
* Includes an aggregated by_alert map and has_use_or_lose /
has_exhausted flags for quick operator triage.
* Never errors — missing catalog → empty list, SQLite path falls back
to config.metrics.db_path.
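The aggregated part of the response can be sketched as a small reduction over the per-package statuses. The function name `summarize_alerts` and the dict-shaped input are assumptions; the keys `count`, `by_alert`, `has_use_or_lose` and `has_exhausted` follow the names used in this commit message.

```python
from collections import Counter
from typing import Iterable, Mapping


def summarize_alerts(statuses: Iterable[Mapping[str, object]]) -> dict:
    """Count packages per alert level and derive the two triage flags."""
    by_alert = Counter(str(s.get("alert", "ok")) for s in statuses)
    return {
        "count": sum(by_alert.values()),
        "by_alert": dict(by_alert),
        "has_use_or_lose": by_alert["use_or_lose"] > 0,
        "has_exhausted": by_alert["exhausted"] > 0,
    }
```

An empty catalog yields a count of 0 and an empty `by_alert` map, so the missing-catalog case needs no special branch.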
Dashboard (GET /dashboard/quotas)
* Self-contained HTML page, no build step, no deps.
* Polls /api/quotas every 60s.
* Renders each package as a color-coded progress bar sorted by alert
urgency (use_or_lose / exhausted first, then topup / watch, ok last).
* Shows remaining, total, expiry countdown, burn/day, runway days,
source + confidence per package.
* Deliberately a separate page from the main /dashboard — zero risk of
breaking the existing 116KB dashboard HTML.
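The sort order used for the progress bars can be expressed as a rank table. The dashboard itself is an HTML page, so the Python below only illustrates the ordering; the tie between use_or_lose and exhausted, and placing unknown alert values last, are assumptions.

```python
from typing import Iterable, List, Mapping

# Lower rank renders first; levels sharing a rank keep their input order.
ALERT_RANK = {"use_or_lose": 0, "exhausted": 0, "topup": 1, "watch": 1, "ok": 2}


def sort_by_urgency(packages: Iterable[Mapping[str, object]]) -> List[Mapping[str, object]]:
    """Return packages ordered from most to least urgent alert."""
    return sorted(
        packages,
        key=lambda p: ALERT_RANK.get(str(p.get("alert")), len(ALERT_RANK)),
    )
```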
End-to-end smoke test passed with the 11-package example catalog:
count: 11, all packages render, by_alert classification works.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Extends the package scoring block in Router._score_provider to use the
unified QuotaStatus from quota_tracker instead of raw catalog fields.
Changes:
* rolling_window and daily package types now contribute to
package_score (up to +5, same ceiling as credits — no type gets to
dominate routing just by existing).
* Cross-type urgency boost: when quota_tracker classifies a package as
`use_or_lose`, the provider gets a flat +3 on top of the existing
expiry_score. Strictly more informed than the raw "days_left <= 7"
rule because the classifier combines expiry with projected burn rate.
* Legacy credits path unchanged — quota_tracker failure silently falls
back, so Router behaviour on a stripped test harness is identical.
Net effect: Kilo credits with a 10-day expiry and real burn pressure win
ties against DeepSeek credits with 6 months of runway, which is the
whole point of the use-or-lose signal.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
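The scoring rule from this commit can be illustrated with a small function. Only the +5 ceiling and the flat +3 boost come from the commit message; the function name, the linear mapping from remaining ratio to score, and the way expiry_score is passed in are assumptions, not the code in Router._score_provider.

```python
def quota_contribution(ratio_remaining: float, expiry_score: float, alert: str,
                       ceiling: float = 5.0,
                       use_or_lose_boost: float = 3.0) -> float:
    """Quota-related part of a provider score.

    package_score is capped at `ceiling` for every package type;
    a use_or_lose alert adds a flat boost on top of expiry_score.
    """
    clamped = min(max(ratio_remaining, 0.0), 1.0)
    package_score = ceiling * clamped  # assumed linear mapping
    total = package_score + expiry_score
    if alert == "use_or_lose":
        total += use_or_lose_boost
    return total
```

Two providers with the same remaining ratio and expiry_score then differ by exactly 3 when only one of them is classified as use_or_lose, which is enough to break the tie.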
typelicious pushed a commit that referenced this pull request Apr 18, 2026
Bump version for the quota-tracking feature release (#210).
Highlights:
* Unified QuotaStatus across credits / rolling_window / daily
* Balance poller (DeepSeek + Kilo) with fast-lane cadence for expiring credits
* Passive header-capture middleware (OpenAI / Anthropic / OpenRouter)
* Operator cockpit: /api/quotas + /dashboard/quotas
* Router picks up use-or-lose alerts for informed tie-breaking
* faigate-install refuses to clobber an existing brew install
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
github-actions bot pushed a commit that referenced this pull request Apr 18, 2026
Bump version for the quota-tracking feature release (#210).
Highlights:
* Unified QuotaStatus across credits / rolling_window / daily
* Balance poller (DeepSeek + Kilo) with fast-lane cadence for expiring credits
* Passive header-capture middleware (OpenAI / Anthropic / OpenRouter)
* Operator cockpit: /api/quotas + /dashboard/quotas
* Router picks up use-or-lose alerts for informed tie-breaking
* faigate-install refuses to clobber an existing brew install
Co-authored-by: André Lange <andre.lange@typelicious.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Summary
End-to-end quota-tracking overhaul for the 7-provider load-balancer workflow (Kilo, DeepSeek, Blackbox, Anthropic Pro, OpenAI Plus, Qwen, Gemini). Adds:
- `credits`, `rolling_window`, `daily`
- `/api/quotas` JSON + self-contained `/dashboard/quotas` HTML widget (60s polling, alert-sorted)
Commits
- 6fb7cd4
- d73114e `quota_tracker.py` (500 LOC) + 11-package catalog template covering all 7 providers
- e5f2903 `quota_poller.py` + `quota_poll` config block + lifespan wiring (disabled by default)
- 9bda708 `quota_headers.py` dialect-aware parser (OpenAI / Anthropic / OpenRouter) + hook in `ProviderBackend.complete`
- baf0e9b `GET /api/quotas` + `GET /dashboard/quotas`
- 1442c33 `QuotaStatus.alert == "use_or_lose"` with a +3 scoring boost
Activation (operator)
1. Set `FAIGATE_PROVIDER_METADATA_DIR` to a directory containing `packages/catalog.v1.json` (template at `docs/examples/fusionaize-metadata-repo/packages/catalog.v1.json`).
2. Set `quota_poll.enabled: true` in `config.yaml` and export `DEEPSEEK_API_KEY` / `KILO_API_KEY` in `faigate.env`.
3. Open `/dashboard/quotas` in a browser.
Header-capture and router integration are always on; no flag needed.
Test plan
- `ruff check` + `ruff format` clean across all changed files
- `python3 -c "from faigate import main; print('ok')"` imports green
- `GET /api/quotas` end-to-end with example catalog → 11 packages, `by_alert: {ok: 11}`
- `DEEPSEEK_API_KEY`, confirm poller logs + catalog file updates (blocked on operator step 2 above)
- `/dashboard/quotas` shows updated `header_snapshots` section
🤖 Generated with Claude Code