Skip to content

Coverage

TokenOps instruments AI usage on three planes. Which ones a given client supports is the whole integration story — there is no single "install," just these three surfaces, in order of decreasing passivity.

PlaneWhat it needsWhat you get
Passive readTokenOps reads logs/DB the client already writesPer-turn token attribution, zero wiring
MCP (tokenops serve)The client is an MCP hostThe agent calls TokenOps (budget, headroom, coach, status)
Proxy (base-URL override)The client lets you set its API base URLGround-truth token/cost accounting + live optimization

The atomic unit is a turn (one request→response), stamped agent_id = <client>:<project> and workflow_id = <client>:<project>:<session>, so everything rolls up turn → session → project.

Clients

ClientPassive readMCPProxyNotes
Claude Code~/.claude/projectsANTHROPIC_BASE_URLreference integration; also the read-guard hook
Codex CLI~/.codex/sessionsOPENAI_BASE_URLreader surfaces OpenAI's official rate-limit %
opencode✅ SQLite store✅ per-provider baseURLreader is multi-provider
Gemini CLI(no token log)✅ base-URL overrideits logs.json records prompts only — no token data
Desktop apps(if MCP host)(no base-URL override)MCP tools only; Anthropic cookie for Max %
Jules / hostedout of reach — see Boundaries

Providers (proxy plane)

Every provider below is routable through the proxy. Bind one with tokenops provider set <name> (omit the URL to use the built-in preset); tokenops provider list shows them all.

ProviderPreset base URL
openaihttps://api.openai.com
anthropichttps://api.anthropic.com
geminihttps://generativelanguage.googleapis.com
mistralhttps://api.mistral.ai
groqhttps://api.groq.com/openai
deepseekhttps://api.deepseek.com
xaihttps://api.x.ai
perplexityhttps://api.perplexity.ai
fireworkshttps://api.fireworks.ai/inference
cerebrashttps://api.cerebras.ai
togetherhttps://api.together.xyz
openrouterhttps://openrouter.ai/api
coherehttps://api.cohere.com
ollamahttp://localhost:11434 (local, no key)
lmstudiohttp://localhost:1234 (local, no key)
litellmhttp://localhost:4000 (self-hosted gateway)
vercelhttps://ai-gateway.vercel.sh

OpenRouter is the universal fallback for ground truth. Any client with no local reader but a base-URL override can route through OpenRouter-via-TokenOps and every turn becomes visible.

Cohere is not OpenAI-wire-format — it has a dedicated normalizer for its v2 (/v2/chat) and v1 (/v1/chat) shapes — but auth is a Bearer header so the passthrough proxy handles it.

Pricing ships for the single-model-family providers (groq/deepseek/xai/perplexity/cerebras/cohere). Fireworks, Together, and OpenRouter multiplex arbitrary third-party models under namespaced names, so a static rate card can't price them accurately — token counts are still metered; attach $ cost via a pricing override (below).

Pricing overrides & drift

The built-in rate card is public list prices captured at a point in time — they drift, and negotiated/enterprise rates differ. Layer your own rates over the defaults without touching the binary:

yaml
# ~/.config/tokenops/config.yaml
pricing:
  path: ~/.config/tokenops/pricing-override.yaml   # or env TOKENOPS_PRICING_PATH

The override file uses the same schema as the built-in catalog (USD per million tokens; * = prefix match, longest wins) and is merged over the defaults — a matching provider+model row replaces that rate, a new row adds one. This is also how you price the multiplexers: copy the model string TokenOps records (RequestModel) verbatim and give it the underlying model's rate. A ready-to-copy example lives at docs/examples/pricing-override.yaml.

How an agent bootstraps setup

The MCP surface is self-describing. An agent calls tokenops_status and gets back signal_quality.level plus blockers[] and next_actions[] — the exact commands to upgrade — and tokenops_data_sources reports which planes are live. So an agent on a fresh install can tell you what to run to give it better data.

Boundaries

These are honest limits of a local-first, no-telemetry tool — not gaps to fill:

  • Gemini CLI cannot be metered passively. Its logs.json is a prompt log with no token usage. Use the proxy plane instead.
  • AWS Bedrock is not proxy-metered. It requires SigV4 request signing; the proxy is pure passthrough with no per-provider auth hook.
  • Jules and fully-hosted agents are out of reach. No local logs, no MCP host you control, no proxy you can insert. TokenOps can only instrument agents that run where you can read their logs, mount an MCP server, or sit in front of their API.

Apache 2.0 licensed.