OpenAPI

Tokenwise Public API

OpenAPI 3.1 spec for read-only access to your workspace data. Use any OpenAPI client (Postman, Insomnia, openapi-typescript) to generate types or test calls.

Get the spec

The canonical JSON spec is at https://tokenwisehq.com/openapi.json. It’s served with CORS open and a 1-hour CDN cache. Drop it into any tool that speaks OpenAPI 3.1.

Quick start with openapi-typescript:

npx openapi-typescript https://tokenwisehq.com/openapi.json -o ./tokenwise.d.ts
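If you would rather inspect the spec from a script than generate types, the sketch below lists the operations in an already-parsed OpenAPI document. The helper names `load_spec` and `list_operations` are illustrative and not part of Tokenwise; only the `paths` → method → `summary` layout comes from the OpenAPI format itself.

```python
import json
from urllib.request import urlopen

SPEC_URL = "https://tokenwisehq.com/openapi.json"

# HTTP methods that OpenAPI allows as keys of a path item.
HTTP_METHODS = {"get", "put", "post", "delete", "options", "head", "patch", "trace"}


def load_spec(url=SPEC_URL):
    """Download and parse the spec (performs a network request)."""
    with urlopen(url, timeout=10) as resp:
        return json.load(resp)


def list_operations(spec):
    """Return (METHOD, path, summary) tuples for every operation in the spec."""
    ops = []
    for path, item in spec.get("paths", {}).items():
        for method, op in item.items():
            if method in HTTP_METHODS:  # skip non-operation keys such as "parameters"
                ops.append((method.upper(), path, op.get("summary", "")))
    # Sort by path, then method, so the output is stable.
    return sorted(ops, key=lambda o: (o[1], o[0]))
```

Calling `list_operations(load_spec())` should give you a compact overview of the endpoints the spec describes.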

Authentication

Every request needs an Authorization: Bearer tw_api_* header. Mint keys in Settings → API Keys → Public REST. We hash your API key on receipt and only store the hash; the secret stays in your possession.
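As a minimal sketch of the header described above, the helper below builds an authenticated GET request without sending it. `build_request` is a hypothetical name, and the base URL is assumed to be the same host that serves the spec.

```python
import os
from urllib.request import Request

# Assumption: the API is served from the same host as the spec.
BASE_URL = "https://tokenwisehq.com"


def build_request(path, api_key):
    """Build (but do not send) an authenticated GET request for `path`."""
    # Keys are documented as tw_api_*; catch an obviously wrong value early.
    if not api_key.startswith("tw_api_"):
        raise ValueError("expected a key starting with tw_api_")
    return Request(
        BASE_URL + path,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Accept": "application/json",
        },
    )


if __name__ == "__main__":
    # Read the key from the environment rather than hard-coding it.
    req = build_request("/api/v1/public/requests", os.environ["TW_API_KEY"])
    print(req.get_method(), req.full_url)
```

Passing the returned object to `urllib.request.urlopen` sends the call; keeping construction separate makes the header logic easy to check offline.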

Rate limits

Limits apply per key, per hour, and vary by plan. Every response carries X-RateLimit-Limit, X-RateLimit-Remaining, and X-RateLimit-Reset. 429 responses also include Retry-After (in seconds).

Plan	Requests/hour
Trial / Indie	1,000
Pro	10,000
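One way a client might act on these headers is sketched below. The function names and the 60-second fallback are assumptions for illustration; the format of X-RateLimit-Reset is not stated on this page, so the sketch deliberately does not interpret it.

```python
# Assumption: used only when a 429 arrives without a usable Retry-After value.
DEFAULT_BACKOFF_SECONDS = 60.0


def _lowercase_keys(headers):
    """HTTP header names are case-insensitive, so normalize before lookup."""
    return {k.lower(): v for k, v in headers.items()}


def seconds_to_wait(status, headers):
    """Return how long to pause before retrying; 0.0 means no pause is needed."""
    if status != 429:
        return 0.0
    value = _lowercase_keys(headers).get("retry-after", "")
    try:
        return max(0.0, float(value))  # Retry-After is documented as seconds
    except ValueError:
        return DEFAULT_BACKOFF_SECONDS


def remaining_requests(headers):
    """Return X-RateLimit-Remaining as an int, or None if absent or malformed."""
    value = _lowercase_keys(headers).get("x-ratelimit-remaining")
    try:
        return int(value)
    except (TypeError, ValueError):
        return None
```

A caller would typically `time.sleep(seconds_to_wait(status, headers))` before retrying, and could slow down voluntarily once `remaining_requests` gets close to zero.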

Endpoints

Method	Path	Summary
`GET`	`/api/v1/public/requests`	List logged LLM requests
`GET`	`/api/v1/public/metrics`	Aggregate cost / latency / error metrics
`GET`	`/api/v1/public/evals`	List recent eval-run scores

Full request/response schemas and parameters live in the JSON spec; we keep this page short on purpose. For a richer browser, paste the spec URL into editor.swagger.io.

Proxy rule types

The proxy-rule write endpoints (POST /api/proxy-rules and PATCH /api/proxy-rules/{id}) take a rule type and a config JSON object whose shape is discriminated by type. The available types:

Type	config shape	What it does
`model_override`	`{ fromModel, toModel }`	Swap one model for another at the edge.
`max_tokens_cap`	`{ maxTokens }`	Clamp `max_tokens` on outbound requests.
`context_trim`	`{ maxMessages }`	Trim the message history to the last N turns.
`cache_enable`	`{ ttlSeconds, mode?, similarityThreshold? }`	Turn on edge cache lookups for matching requests.
`fallback_chain`	`{ candidates[], onStatus?, onTimeoutMs?, maxAttempts? }`	Ordered (provider, model) candidates tried on upstream failure.
`compress`	`{ strength: "light" | "medium" | "aggressive", profile: 1, tags?: string[] }`	In-house compressor that slims system prompts + tool output before the upstream call. Cache-safe (runs upstream of cache lookup; never reads the user’s last message). Watchdog auto-pauses on a >10% quality drop.
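Because the config shape depends on the rule type, a client can check a payload locally before sending it. The sketch below transcribes the required and optional keys from the table above (keys marked with `?` are treated as optional). `check_rule_config` is a hypothetical helper, and whether the server itself rejects unknown keys is not stated here, so treat the "unexpected key" message as a local lint only.

```python
# (required keys, optional keys) per rule type, transcribed from the table above.
CONFIG_KEYS = {
    "model_override": ({"fromModel", "toModel"}, set()),
    "max_tokens_cap": ({"maxTokens"}, set()),
    "context_trim": ({"maxMessages"}, set()),
    "cache_enable": ({"ttlSeconds"}, {"mode", "similarityThreshold"}),
    "fallback_chain": ({"candidates"}, {"onStatus", "onTimeoutMs", "maxAttempts"}),
    "compress": ({"strength", "profile"}, {"tags"}),
}
COMPRESS_STRENGTHS = {"light", "medium", "aggressive"}


def check_rule_config(rule_type, config):
    """Return a list of problems found in `config`; an empty list means it looks fine."""
    if rule_type not in CONFIG_KEYS:
        return [f"unknown rule type: {rule_type}"]
    required, optional = CONFIG_KEYS[rule_type]
    present = set(config)
    problems = [f"missing key: {k}" for k in sorted(required - present)]
    problems += [f"unexpected key: {k}" for k in sorted(present - required - optional)]
    # The compress type additionally restricts `strength` to three values.
    if rule_type == "compress" and "strength" in config:
        if config["strength"] not in COMPRESS_STRENGTHS:
            allowed = ", ".join(sorted(COMPRESS_STRENGTHS))
            problems.append(f"strength must be one of: {allowed}")
    return problems
```

The check only covers key names and the `strength` enum; value types and ranges would still need to be validated against the JSON spec.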

Creating a compress rule

Minimal payload, with strength and profile spelled out as medium and the stable in-house profile (profile: 1):

curl https://tokenwisehq.com/api/proxy-rules \
  -H "Authorization: Bearer $TW_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "type": "compress",
    "name": "Trim system prompt — chat-support",
    "tag": "chat-support",
    "enabled": true,
    "config": {
      "strength": "medium",
      "profile": 1,
      "tags": ["chat-support"]
    }
  }'

Workspace-level defaults (default strength + kill switch) live on the workspace record and are managed in Settings → General → Compression defaults. Per-tag overrides (off / light / medium / aggressive) live on the workspace_tags row and are managed in Tags.

Versioning policy

We won’t remove or rename a field without a 90-day deprecation warning. New fields appear without notice; clients should tolerate unknown keys. Breaking changes ship a new path prefix (/api/v2/public/…); /v1 stays alive indefinitely.
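Since new fields can appear at any time, a client should pick out the keys it knows and ignore the rest. A minimal sketch of that pattern follows; the `RequestRecord` fields are hypothetical placeholders, not the real response schema, which lives in the JSON spec.

```python
from dataclasses import dataclass, fields


@dataclass
class RequestRecord:
    # Hypothetical field names, for illustration only.
    id: str
    model: str


def from_payload(cls, payload):
    """Build `cls` from a JSON object, silently dropping keys it does not declare."""
    known = {f.name for f in fields(cls)}
    return cls(**{k: v for k, v in payload.items() if k in known})


if __name__ == "__main__":
    # "brandNewField" stands in for a field added after this client was written.
    record = from_payload(
        RequestRecord, {"id": "r1", "model": "m", "brandNewField": 1}
    )
    print(record)
```

Filtering before construction means an added field never raises a `TypeError` in the client, while a removed or renamed field still fails loudly, which matches the deprecation guarantees described above.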

Need something this API doesn’t cover?

Email [email protected] with the use case. We’re a small team and we read every message.