Adeline 1 ships. The API ships with it.

Our compaction model goes live today, and the proxy now works with Claude Code, Codex, and any OpenAI or Anthropic SDK.

Last Saturday we said Adeline would ship this week. It has. Adeline 1 is the compaction model now. The Claude Code curl from note 001 still works, and condense.chat now also exposes a route per provider so any OpenAI or Anthropic client (Codex included) is a drop-in. Point your base_url and nothing else changes.

Before getting to Adeline and the API, last week in numbers.

First week in beta

Our first week in beta, measured on the live proxy. Aggregates across every account, no per-user breakdown and nothing identifiable. This is what real Claude Code and SDK traffic looked like on the way upstream.

$730

saved this week

The “money saved” figure each account sees on its dashboard, summed across the week.

65%

fewer input tokens

2.26B tokens of conversation compressed to 788M on the wire before they reached the provider.

8.2k

requests through the proxy

Real beta traffic, Claude Code and SDK sessions routed through condense.chat this week.

Adeline 1

Adeline zero was a placeholder, an open model with a compaction adapter, enough to prove the proxy worked end to end while we built the real thing. Adeline 1 is the real thing. It resolves the compacted rewrite in a handful of parallel passes instead of token by token. That is where the latency win comes from. The full mechanism is a later post.

If you are already running through the proxy, there is nothing to do. Adeline 1 is the default compaction engine as of today; your next session uses it automatically.

How Adeline 1 compares

We promised the hard numbers would land in this post, so here they are. We ran Adeline 1 head to head against Claude Opus 4.7, the model Adeline learns to imitate, and against The Token Company’s bear-1.2, compacting long contexts none of them saw in training. Two metrics per track: how many input tokens it cut, and whether the answer survived the cut.

94.2%

facts kept on agent traces

At 90.2% input token reduction. Within a point of Claude Opus 4.7 on faithfulness, while compacting slightly harder.

90.2% vs 7.6%

input token reduction on agent traces

Adeline 1 against The Token Company’s bear-1.2 on the same input.

method	LongBench v2		CoderForge (SWE)
method	accuracy	reduction	faithfulness	reduction
Adeline 1	27.5%	76%	94.2%	90.2%
Claude Opus 4.7	30%	82%	95.0%	87.6%
The Token Companybear-1.2	25%	13%	72.5%	7.6%

Method. Two tracks of unseen, 30k–100k-token inputs. Long-document QA is LongBench v2 — the long-context benchmark The Token Company publishes on — scored by whether a downstream model answers the question correctly from the compacted context alone. Agentic coding sessions are CoderForge SWE traces, scored by atomic-fact recall against the original. Token reduction is measured in Claude tokens; higher means more compression. The Token Company runs at its 0.30 “light” setting; Opus and Adeline at their normal compaction.

On the agent sessions Adeline is built for, it keeps 94.2% of the facts while cutting 90.2% of the tokens — within a point of the Opus teacher it learned from, and far ahead of a service that strips only ~8%. On adversarial long-document QA — hard enough that every method lands in the 25–30% range — Adeline stays close to Opus and beats The Token Company, while compressing several times harder.

One proxy, both SDKs

Until now condense.chat was an Anthropic-shaped endpoint. As of this release the proxy carries a route for each provider, and it is a true drop-in: keep your existing SDK, keep your upstream key, change one line.

The whole integration is three steps:

Mint a key. Grab an ak_… token from your dashboard.
Point your base_url at the provider route — /openai/v1 or /anthropic — instead of the provider's own URL.
Send the key in the X-Condense-Auth-Token header. Your upstream provider key keeps flowing through untouched; condense never stores it.

OpenAI — point the client at the /openai/v1 route:

pythonfrom openai import OpenAI

client = OpenAI(
    base_url="https://api.condense.chat/openai/v1",
    api_key="sk-...",
    default_headers={"X-Condense-Auth-Token": "ak_..."},
)

Anthropic — point it at the /anthropic route:

pythonfrom anthropic import Anthropic

client = Anthropic(
    base_url="https://api.condense.chat/anthropic",
    api_key="sk-ant-...",
    default_headers={"X-Condense-Auth-Token": "ak_..."},
)

Your upstream provider key is still forwarded verbatim and never stored. The only thing condense needs is your own key — an ak_… token from your dashboard — in the X-Condense-Auth-Token header. Endpoints we do not compact (model lists, embeddings, anything else on the provider's surface) pass straight through untouched, so the SDK behaves exactly as it would talking to the provider directly.

Compact, or just rewrite

A new header, X-Condense-Function, picks what the proxy does with a request:

proxy (the default) — compact the conversation, forward it upstream, stream the answer back. This is the normal path.
rewrite — compact the conversation and hand the rewritten request body straight back to you as JSON, without calling the provider. You see exactly what would have gone upstream, and can inspect it, cache it, or route it yourself.

bashcurl https://api.condense.chat/anthropic/v1/messages \
  -H "X-Condense-Auth-Token: ak_..." \
  -H "X-Condense-Function: rewrite" \
  -H "content-type: application/json" \
  -d @conversation.json

Same two-axis choice on the OpenAI route. Unknown header values fall back to proxy, so the safe default is always the working one.

Full API reference Provider routes, headers, auth, and copy-paste examples for curl and both SDKs.

Read the API docs

See you next week.