API reference
condense.chat sits between your app and the upstream LLM provider.
Point your SDK at api.condense.chat/{provider},
add your condense key, keep your provider key.
We track the conversation as a content-addressed chain and
transparently compress repeated context on the way upstream.
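To make "content-addressed chain" concrete, here is a minimal sketch of the general technique. It is not condense's actual scheme: the function names, the JSON canonicalization, and the hash layout are all assumptions chosen for illustration. Each message's ID is a SHA-256 over its parent's ID plus the message itself, so two requests that share a conversation prefix produce identical IDs for that prefix.

```python
import hashlib
import json

def node_id(parent_id: str, message: dict) -> str:
    # Canonical JSON (sorted keys, no whitespace) so equal messages hash equally.
    canonical = json.dumps(message, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(f"{parent_id}\n{canonical}".encode("utf-8")).hexdigest()

def chain_ids(messages: list) -> list:
    # Hypothetical helper: one ID per message, each depending on everything before it.
    ids, parent = [], ""
    for message in messages:
        parent = node_id(parent, message)
        ids.append(parent)
    return ids

turn_1 = [{"role": "user", "content": "hi"}]
turn_2 = turn_1 + [
    {"role": "assistant", "content": "hello"},
    {"role": "user", "content": "what is 2 + 2?"},
]

# The shared prefix hashes to the same ID in both requests.
print(chain_ids(turn_1)[0] == chain_ids(turn_2)[0])
```

Because an ID depends on the whole history before it, a proxy built this way could recognize repeated context by ID lookup instead of comparing message bodies.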
Overview
Base URL: https://api.condense.chat.
Every endpoint requires a condense API key. The dialect is
selected by the path prefix; an
X-Condense-Function header picks
whether we forward to the upstream or just return the rewritten
body.
| | |
|---|---|
| Anthropic dialect | POST /anthropic/v1/messages: drop-in for the Anthropic SDK / Claude Code. |
| OpenAI dialect | POST /openai/v1/chat/completions: drop-in for the OpenAI SDK. |
| Pass-through | Any other path under /{provider}/… (models list, embeddings, etc.) is forwarded verbatim. |
| Function | Header X-Condense-Function: proxy or rewrite (default: proxy). |
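The routing rules in the table can be modeled in a few lines. This is a sketch of the documented behavior rather than condense's implementation; the `classify` name and the "condensed" / "pass-through" labels are made up here, and treating a non-POST request to a claimed path as pass-through is an assumption the table does not settle.

```python
CLAIMED = {
    "anthropic": "/v1/messages",
    "openai": "/v1/chat/completions",
}

def classify(method: str, path: str):
    """Return (provider, kind), or None when the path has no known provider prefix."""
    parts = path.strip("/").split("/", 1)
    provider = parts[0]
    rest = "/" + parts[1] if len(parts) > 1 else "/"
    if provider not in CLAIMED:
        return None
    if method == "POST" and rest == CLAIMED[provider]:
        return provider, "condensed"      # goes through the compression pipeline
    return provider, "pass-through"       # forwarded verbatim

print(classify("POST", "/anthropic/v1/messages"))
print(classify("GET", "/openai/v1/models"))
```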
Sign up & mint a key
- Open login.condense.chat and sign in with Google or email.
- Visit helm.condense.chat → New API key.
- Copy the generated ak_… secret. It is shown once; afterwards the dashboard retains only an abbreviated suffix.
Keep your upstream provider key (sk-ant-… for Anthropic,
sk-… for OpenAI) handy — it travels alongside the condense key on every request.
Authentication
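Two keys travel on every proxied request: your condense key (gates access to condense.chat) and your upstream provider key (charged for the model call). Calls made with X-Condense-Function: rewrite never reach the upstream, so they need only the condense key. We never store the upstream key; the usage ledger keeps only a sha256 fingerprint of it.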
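As an illustration of what a fingerprint is, the sketch below hashes a key with SHA-256. The exact recipe condense uses (encoding, salting, truncation) is not specified here, so the plain hex digest and the `fingerprint` name are assumptions.

```python
import hashlib

def fingerprint(upstream_key: str) -> str:
    # One-way digest: a ledger can tell keys apart without holding the key itself.
    return hashlib.sha256(upstream_key.encode("utf-8")).hexdigest()

digest = fingerprint("sk-example-not-a-real-key")
print(len(digest))  # a SHA-256 hex digest is always 64 characters
```

The same key always yields the same digest, which is enough to group usage by key, while the digest cannot be reversed into the key.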
The condense API key (ak_…) always
travels in the X-Condense-Auth-Token
header — for both providers. The upstream key goes in whatever
header the provider expects: x-api-key
for Anthropic, Authorization: Bearer
for OpenAI. Using a custom header for the condense key means
Authorization is always available for
the upstream, with no precedence conflicts in any SDK.
Anthropic
```http
POST /anthropic/v1/messages
X-Condense-Auth-Token: ak_<your-condense-key>
x-api-key: sk-ant-<your-anthropic-key>
anthropic-version: 2023-06-01
Content-Type: application/json
```
OpenAI
```http
POST /openai/v1/chat/completions
X-Condense-Auth-Token: ak_<your-condense-key>
Authorization: Bearer sk-<your-openai-key>
Content-Type: application/json
```
Quickstart
OpenAI SDK (Python)
```python
from openai import OpenAI

client = OpenAI(
    api_key="sk-...",  # your upstream OpenAI key
    base_url="https://api.condense.chat/openai/v1",
    default_headers={"X-Condense-Auth-Token": "ak_..."},
)
resp = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "hi"}],
)
```
Anthropic SDK (Python)
```python
from anthropic import Anthropic

client = Anthropic(
    api_key="sk-ant-...",  # your upstream Anthropic key
    base_url="https://api.condense.chat/anthropic",
    default_headers={"X-Condense-Auth-Token": "ak_..."},
)
msg = client.messages.create(
    model="claude-haiku-4-5",
    max_tokens=1024,
    messages=[{"role": "user", "content": "hi"}],
)
```
proxy vs. rewrite
The same pipeline runs for both; only the last step differs.
| | |
|---|---|
| proxy (default) | Forward the rewritten body to the upstream provider and stream the response back to you. You pay for the upstream call. |
| rewrite | Return the rewritten body as JSON. No upstream call, no upstream cost. Useful for debugging which messages got packed, or for piping the result into a different runtime. |
```bash
curl https://api.condense.chat/anthropic/v1/messages \
  -H "X-Condense-Auth-Token: ak_..." \
  -H "x-api-key: sk-ant-..." \
  -H "anthropic-version: 2023-06-01" \
  -H "X-Condense-Function: rewrite" \
  -H "Content-Type: application/json" \
  -d '{"model":"claude-haiku-4-5","max_tokens":256,"messages":[...]}'
```
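To see which messages a rewrite touched, you can compare the body you sent with the body that came back. The sketch below assumes the rewrite response is a request body in the same dialect with a `messages` array (it has to be, if you intend to forward it upstream). The `compare_bodies` helper is hypothetical, and the `rewritten` value is a hand-made stand-in, not real condense output.

```python
import json

def compare_bodies(original: dict, rewritten: dict) -> dict:
    before = json.dumps(original, separators=(",", ":")).encode("utf-8")
    after = json.dumps(rewritten, separators=(",", ":")).encode("utf-8")
    old_msgs = original.get("messages", [])
    new_msgs = rewritten.get("messages", [])
    # Indices present in both bodies whose message differs after the rewrite.
    changed = [i for i, (a, b) in enumerate(zip(old_msgs, new_msgs)) if a != b]
    return {
        "bytes_before": len(before),
        "bytes_after": len(after),
        "message_count": (len(old_msgs), len(new_msgs)),
        "changed_indices": changed,
    }

original = {"model": "claude-haiku-4-5", "max_tokens": 256, "messages": [
    {"role": "user", "content": "long repeated context " * 20},
    {"role": "user", "content": "hi"},
]}
# Stand-in for a rewrite response: first message shortened, second untouched.
rewritten = {"model": "claude-haiku-4-5", "max_tokens": 256, "messages": [
    {"role": "user", "content": "[packed]"},
    {"role": "user", "content": "hi"},
]}

print(compare_bodies(original, rewritten))
```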
Per-request headers
| Header | Effect |
|---|---|
| X-Condense-Auth-Token | Required. Your condense API key (ak_…). Used for both Anthropic and OpenAI paths. |
| X-Condense-Function | proxy (default) or rewrite. Unknown values fall back to proxy. |
| X-Condense-Upstream-Key | Override the upstream key without putting it in the dialect's normal header slot. |
| Authorization / x-api-key | Your upstream provider key (Anthropic uses x-api-key, OpenAI uses Authorization: Bearer). Forwarded verbatim; never stored. |
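Because unknown X-Condense-Function values fall back to proxy, a typo such as "rewite" would silently trigger a paid upstream call. One way to guard against that is to validate the value on the client before sending. The helper below is a sketch: `condense_headers` is a made-up name, and passing the bare key as the value of X-Condense-Upstream-Key is an assumption, since the table does not state the expected format.

```python
VALID_FUNCTIONS = ("proxy", "rewrite")

def condense_headers(condense_key: str, function: str = "proxy", upstream_key=None) -> dict:
    # Fail loudly on a typo instead of letting the server fall back to proxy.
    if function not in VALID_FUNCTIONS:
        raise ValueError(f"unknown X-Condense-Function value: {function!r}")
    headers = {
        "X-Condense-Auth-Token": condense_key,
        "X-Condense-Function": function,
        "Content-Type": "application/json",
    }
    if upstream_key is not None:
        # Assumed format: the bare upstream key, outside the dialect's usual header.
        headers["X-Condense-Upstream-Key"] = upstream_key
    return headers

print(condense_headers("ak_example", "rewrite"))
```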
Claude Code
The installer at cc.condense.chat
wraps the official claude CLI and points it at us:
```bash
# Linux / macOS
curl -fsSL https://cc.condense.chat | sh

# Windows (PowerShell)
iwr https://cc.condense.chat/windows | iex
```
The installer walks you through device-code authorization
(no key copy/paste), stashes the token under
~/.condense/, and exports
ANTHROPIC_BASE_URL=https://api.condense.chat/anthropic
so claude talks to us instead of going direct.
Bring your own ANTHROPIC_API_KEY — it forwards through
verbatim as the upstream key.
Worked examples
Six short guides — one per provider ×
function. Each is self-contained: a paragraph of what the
mode does, the smallest possible curl
that exercises it, then the same call in Python (no SDK, just
urllib) wired into a three-turn tool
loop that computes (17 + 25) * 3. Pick
a tab; copy the snippet; replace the keys; run.
The two condense-managed functions:
- proxy — condense compresses your messages, forwards to the upstream provider, and returns the provider's normal completion. You pay for the upstream call. This is the default and what most callers want.
- rewrite — condense returns the compressed request body as JSON without contacting the upstream. No upstream key needed. Use it to inspect what condense would send, or to drive the upstream call yourself.
direct tabs are baselines — they don't touch condense at all, included so you can compare and see exactly which headers change.
Baseline: talk to OpenAI directly. Sends one OpenAI key in
Authorization; no condense
involvement. Use this as the control variant when comparing
against the proxy tab.
```bash
# single-turn smoke
curl https://api.openai.com/v1/chat/completions \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"gpt-4o-mini","messages":[{"role":"user","content":"hi"}]}'
```

```python
# three-turn tool loop; pip-free
import json, os
from urllib.request import Request, urlopen

URL = "https://api.openai.com/v1/chat/completions"
KEY = os.environ["OPENAI_API_KEY"]
TOOLS = [
    {"type":"function","function":{"name":"add","parameters":{"type":"object","properties":{"a":{"type":"number"},"b":{"type":"number"}}}}},
    {"type":"function","function":{"name":"mul","parameters":{"type":"object","properties":{"a":{"type":"number"},"b":{"type":"number"}}}}},
    {"type":"function","function":{"name":"final_answer","parameters":{"type":"object","properties":{"text":{"type":"string"}}}}},
]

def tool(name, args):
    # Local implementations of the three tools the model may call.
    if name == "add": return str(args["a"] + args["b"])
    if name == "mul": return str(args["a"] * args["b"])
    return args["text"]

def post(messages):
    body = json.dumps({"model":"gpt-4o-mini","messages":messages,"tools":TOOLS,"tool_choice":"auto"}).encode()
    req = Request(URL, method="POST", data=body, headers={
        "Authorization": f"Bearer {KEY}",
        "Content-Type": "application/json",
    })
    return json.loads(urlopen(req).read())

messages = [{"role":"user","content":"Compute (17 + 25) * 3 using add and mul, then call final_answer."}]
for _ in range(6):
    msg = post(messages)["choices"][0]["message"]
    if not msg.get("tool_calls"):
        print(msg.get("content") or "")
        break
    messages.append(msg)
    for tc in msg["tool_calls"]:
        args = json.loads(tc["function"]["arguments"] or "{}")
        result = tool(tc["function"]["name"], args)
        print(f"{tc['function']['name']}({args}) -> {result}")
        messages.append({"role":"tool","tool_call_id":tc["id"],"content":result})
        if tc["function"]["name"] == "final_answer": raise SystemExit
```
Same shape as direct, but the URL is condense's
/openai/v1/chat/completions, the
condense key rides in
X-Condense-Auth-Token, and your
OpenAI key stays in
Authorization. Condense compresses,
forwards, streams back the upstream response — your code doesn't
change.
```bash
curl https://api.condense.chat/openai/v1/chat/completions \
  -H "X-Condense-Auth-Token: $CONDENSE_KEY" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"gpt-4o-mini","messages":[{"role":"user","content":"hi"}]}'
```

```python
# identical to the direct example, only URL + headers change
import json, os
from urllib.request import Request, urlopen

URL = "https://api.condense.chat/openai/v1/chat/completions"
OAI = os.environ["OPENAI_API_KEY"]
CON = os.environ["CONDENSE_KEY"]
TOOLS = [
    {"type":"function","function":{"name":"add","parameters":{"type":"object","properties":{"a":{"type":"number"},"b":{"type":"number"}}}}},
    {"type":"function","function":{"name":"mul","parameters":{"type":"object","properties":{"a":{"type":"number"},"b":{"type":"number"}}}}},
    {"type":"function","function":{"name":"final_answer","parameters":{"type":"object","properties":{"text":{"type":"string"}}}}},
]

def tool(name, args):
    if name == "add": return str(args["a"] + args["b"])
    if name == "mul": return str(args["a"] * args["b"])
    return args["text"]

def post(messages):
    body = json.dumps({"model":"gpt-4o-mini","messages":messages,"tools":TOOLS,"tool_choice":"auto"}).encode()
    req = Request(URL, method="POST", data=body, headers={
        "X-Condense-Auth-Token": CON,         # condense key
        "Authorization": f"Bearer {OAI}",     # upstream OpenAI key
        "Content-Type": "application/json",
    })
    return json.loads(urlopen(req).read())

messages = [{"role":"user","content":"Compute (17 + 25) * 3 using add and mul, then call final_answer."}]
for _ in range(6):
    msg = post(messages)["choices"][0]["message"]
    if not msg.get("tool_calls"):
        print(msg.get("content") or "")
        break
    messages.append(msg)
    for tc in msg["tool_calls"]:
        args = json.loads(tc["function"]["arguments"] or "{}")
        result = tool(tc["function"]["name"], args)
        print(f"{tc['function']['name']}({args}) -> {result}")
        messages.append({"role":"tool","tool_call_id":tc["id"],"content":result})
        if tc["function"]["name"] == "final_answer": raise SystemExit
```
Two-step: condense rewrites your body and hands it back without
calling upstream, then you forward it yourself. Header switch
is X-Condense-Function: rewrite.
No upstream key needed for the first call. Useful when you
want to inspect, log, or batch the compressed payload before
spending tokens.
```bash
# step 1: get the compressed body
curl https://api.condense.chat/openai/v1/chat/completions \
  -H "X-Condense-Auth-Token: $CONDENSE_KEY" \
  -H "X-Condense-Function: rewrite" \
  -H "Content-Type: application/json" \
  -d '{"model":"gpt-4o-mini","messages":[{"role":"user","content":"hi"}]}' \
  > rewritten.json

# step 2: forward it to OpenAI yourself
curl https://api.openai.com/v1/chat/completions \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d @rewritten.json
```

```python
# tool loop where every turn is rewrite-then-forward
import json, os
from urllib.request import Request, urlopen

REWRITE = "https://api.condense.chat/openai/v1/chat/completions"
UPSTREAM = "https://api.openai.com/v1/chat/completions"
OAI = os.environ["OPENAI_API_KEY"]
CON = os.environ["CONDENSE_KEY"]
TOOLS = [
    {"type":"function","function":{"name":"add","parameters":{"type":"object","properties":{"a":{"type":"number"},"b":{"type":"number"}}}}},
    {"type":"function","function":{"name":"mul","parameters":{"type":"object","properties":{"a":{"type":"number"},"b":{"type":"number"}}}}},
    {"type":"function","function":{"name":"final_answer","parameters":{"type":"object","properties":{"text":{"type":"string"}}}}},
]

def tool(name, args):
    if name == "add": return str(args["a"] + args["b"])
    if name == "mul": return str(args["a"] * args["b"])
    return args["text"]

def turn(messages):
    body = json.dumps({"model":"gpt-4o-mini","messages":messages,"tools":TOOLS,"tool_choice":"auto"}).encode()
    # Step 1: condense returns the compressed request body (raw bytes).
    rewritten = urlopen(Request(REWRITE, method="POST", data=body, headers={
        "X-Condense-Auth-Token": CON,
        "X-Condense-Function": "rewrite",
        "Content-Type": "application/json",
    })).read()
    # Step 2: forward those bytes to OpenAI unchanged.
    return json.loads(urlopen(Request(UPSTREAM, method="POST", data=rewritten, headers={
        "Authorization": f"Bearer {OAI}",
        "Content-Type": "application/json",
    })).read())

messages = [{"role":"user","content":"Compute (17 + 25) * 3 using add and mul, then call final_answer."}]
for _ in range(6):
    msg = turn(messages)["choices"][0]["message"]
    if not msg.get("tool_calls"):
        print(msg.get("content") or "")
        break
    messages.append(msg)
    for tc in msg["tool_calls"]:
        args = json.loads(tc["function"]["arguments"] or "{}")
        result = tool(tc["function"]["name"], args)
        print(f"{tc['function']['name']}({args}) -> {result}")
        messages.append({"role":"tool","tool_call_id":tc["id"],"content":result})
        if tc["function"]["name"] == "final_answer": raise SystemExit
```
Baseline: talk to Anthropic directly. The Anthropic dialect
puts the upstream key in x-api-key
(not Authorization) and requires
the anthropic-version header.
Tools are typed; the model's tool calls come back as
tool_use blocks; tool results go
back as tool_result blocks.
```bash
curl https://api.anthropic.com/v1/messages \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "Content-Type: application/json" \
  -d '{"model":"claude-haiku-4-5-20251001","max_tokens":256,"messages":[{"role":"user","content":"hi"}]}'
```

```python
import json, os
from urllib.request import Request, urlopen

URL = "https://api.anthropic.com/v1/messages"
KEY = os.environ["ANTHROPIC_API_KEY"]
TOOLS = [
    {"name":"add","input_schema":{"type":"object","properties":{"a":{"type":"number"},"b":{"type":"number"}},"required":["a","b"]}},
    {"name":"mul","input_schema":{"type":"object","properties":{"a":{"type":"number"},"b":{"type":"number"}},"required":["a","b"]}},
    {"name":"final_answer","input_schema":{"type":"object","properties":{"text":{"type":"string"}},"required":["text"]}},
]

def tool(name, args):
    # Local implementations of the three tools the model may call.
    if name == "add": return str(args["a"] + args["b"])
    if name == "mul": return str(args["a"] * args["b"])
    return args["text"]

def post(messages):
    body = json.dumps({
        "model":"claude-haiku-4-5-20251001","max_tokens":1024,
        "messages":messages,"tools":TOOLS,
    }).encode()
    req = Request(URL, method="POST", data=body, headers={
        "x-api-key": KEY,
        "anthropic-version": "2023-06-01",
        "Content-Type": "application/json",
    })
    return json.loads(urlopen(req).read())

messages = [{"role":"user","content":"Compute (17 + 25) * 3 using add and mul, then call final_answer."}]
for _ in range(6):
    resp = post(messages)
    blocks = resp["content"]
    messages.append({"role":"assistant","content":blocks})
    calls = [b for b in blocks if b["type"] == "tool_use"]
    if not calls:
        print("".join(b.get("text", "") for b in blocks))
        break
    results = []
    for b in calls:
        result = tool(b["name"], b["input"])
        print(f"{b['name']}({b['input']}) -> {result}")
        results.append({"type":"tool_result","tool_use_id":b["id"],"content":result})
        if b["name"] == "final_answer": raise SystemExit
    # All tool results for this turn go back in a single user message.
    messages.append({"role":"user","content":results})
```
Anthropic-dialect proxy. The condense key rides in
X-Condense-Auth-Token (same as the
OpenAI path); your Anthropic key stays in
x-api-key. Streaming
("stream": true) works
identically to the direct path.
```bash
curl https://api.condense.chat/anthropic/v1/messages \
  -H "X-Condense-Auth-Token: $CONDENSE_KEY" \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "Content-Type: application/json" \
  -d '{"model":"claude-haiku-4-5-20251001","max_tokens":256,"messages":[{"role":"user","content":"hi"}]}'
```

```python
import json, os
from urllib.request import Request, urlopen

URL = "https://api.condense.chat/anthropic/v1/messages"
ANT = os.environ["ANTHROPIC_API_KEY"]
CON = os.environ["CONDENSE_KEY"]
TOOLS = [
    {"name":"add","input_schema":{"type":"object","properties":{"a":{"type":"number"},"b":{"type":"number"}},"required":["a","b"]}},
    {"name":"mul","input_schema":{"type":"object","properties":{"a":{"type":"number"},"b":{"type":"number"}},"required":["a","b"]}},
    {"name":"final_answer","input_schema":{"type":"object","properties":{"text":{"type":"string"}},"required":["text"]}},
]

def tool(name, args):
    if name == "add": return str(args["a"] + args["b"])
    if name == "mul": return str(args["a"] * args["b"])
    return args["text"]

def post(messages):
    body = json.dumps({"model":"claude-haiku-4-5-20251001","max_tokens":1024,"messages":messages,"tools":TOOLS}).encode()
    req = Request(URL, method="POST", data=body, headers={
        "X-Condense-Auth-Token": CON,   # condense key
        "x-api-key": ANT,               # upstream Anthropic key
        "anthropic-version": "2023-06-01",
        "Content-Type": "application/json",
    })
    return json.loads(urlopen(req).read())

messages = [{"role":"user","content":"Compute (17 + 25) * 3 using add and mul, then call final_answer."}]
for _ in range(6):
    resp = post(messages)
    blocks = resp["content"]
    messages.append({"role":"assistant","content":blocks})
    calls = [b for b in blocks if b["type"] == "tool_use"]
    if not calls:
        print("".join(b.get("text", "") for b in blocks))
        break
    results = []
    for b in calls:
        result = tool(b["name"], b["input"])
        print(f"{b['name']}({b['input']}) -> {result}")
        results.append({"type":"tool_result","tool_use_id":b["id"],"content":result})
        if b["name"] == "final_answer": raise SystemExit
    messages.append({"role":"user","content":results})
```
Anthropic two-step. The compressed body comes back from
condense and you forward it to
api.anthropic.com yourself.
x-api-key and
anthropic-version are only needed
on the upstream forward; the rewrite step takes the condense
key alone.
```bash
# step 1: get the compressed body
curl https://api.condense.chat/anthropic/v1/messages \
  -H "X-Condense-Auth-Token: $CONDENSE_KEY" \
  -H "X-Condense-Function: rewrite" \
  -H "Content-Type: application/json" \
  -d '{"model":"claude-haiku-4-5-20251001","max_tokens":256,"messages":[{"role":"user","content":"hi"}]}' \
  > rewritten.json

# step 2: forward it to Anthropic yourself
curl https://api.anthropic.com/v1/messages \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "Content-Type: application/json" \
  -d @rewritten.json
```

```python
import json, os
from urllib.request import Request, urlopen

REWRITE = "https://api.condense.chat/anthropic/v1/messages"
UPSTREAM = "https://api.anthropic.com/v1/messages"
ANT = os.environ["ANTHROPIC_API_KEY"]
CON = os.environ["CONDENSE_KEY"]
TOOLS = [
    {"name":"add","input_schema":{"type":"object","properties":{"a":{"type":"number"},"b":{"type":"number"}},"required":["a","b"]}},
    {"name":"mul","input_schema":{"type":"object","properties":{"a":{"type":"number"},"b":{"type":"number"}},"required":["a","b"]}},
    {"name":"final_answer","input_schema":{"type":"object","properties":{"text":{"type":"string"}},"required":["text"]}},
]

def tool(name, args):
    if name == "add": return str(args["a"] + args["b"])
    if name == "mul": return str(args["a"] * args["b"])
    return args["text"]

def turn(messages):
    body = json.dumps({"model":"claude-haiku-4-5-20251001","max_tokens":1024,"messages":messages,"tools":TOOLS}).encode()
    # Step 1: condense returns the compressed request body (raw bytes).
    rewritten = urlopen(Request(REWRITE, method="POST", data=body, headers={
        "X-Condense-Auth-Token": CON,
        "X-Condense-Function": "rewrite",
        "Content-Type": "application/json",
    })).read()
    # Step 2: forward those bytes to Anthropic unchanged.
    return json.loads(urlopen(Request(UPSTREAM, method="POST", data=rewritten, headers={
        "x-api-key": ANT,
        "anthropic-version": "2023-06-01",
        "Content-Type": "application/json",
    })).read())

messages = [{"role":"user","content":"Compute (17 + 25) * 3 using add and mul, then call final_answer."}]
for _ in range(6):
    resp = turn(messages)
    blocks = resp["content"]
    messages.append({"role":"assistant","content":blocks})
    calls = [b for b in blocks if b["type"] == "tool_use"]
    if not calls:
        print("".join(b.get("text", "") for b in blocks))
        break
    results = []
    for b in calls:
        result = tool(b["name"], b["input"])
        print(f"{b['name']}({b['input']}) -> {result}")
        results.append({"type":"tool_result","tool_use_id":b["id"],"content":result})
        if b["name"] == "final_answer": raise SystemExit
    messages.append({"role":"user","content":results})
```
claude-agent-sdk-python
is the official Anthropic SDK for spawning long-running
tool-using agents. It shells out to claude
under the hood, so all you need to do is point its
ANTHROPIC_BASE_URL at condense's
/anthropic prefix and pass your
condense key via the SDK's
env option.
direct and rewrite modes aren't relevant
here — the agent loop needs upstream completions.
```bash
# install once
pip install claude-agent-sdk
npm install -g @anthropic-ai/claude-code  # the SDK shells out to this
```

```python
import anyio, os
from claude_agent_sdk import query, ClaudeAgentOptions

# Environment handed to the claude CLI that the SDK spawns.
opts = ClaudeAgentOptions(env={
    "ANTHROPIC_BASE_URL": "https://api.condense.chat/anthropic",
    "ANTHROPIC_CUSTOM_HEADERS": f"X-Condense-Auth-Token: {os.environ['CONDENSE_KEY']}",
    "ANTHROPIC_API_KEY": os.environ["ANTHROPIC_API_KEY"],
})

async def main():
    async for msg in query(prompt="Compute (17 + 25) * 3 step by step.", options=opts):
        print(msg)

anyio.run(main)
```
Every tool turn passes through condense, which records the
chain and (if a compression strategy is configured) compacts
older spans before forwarding upstream. Set per-request
tuning headers with ANTHROPIC_CUSTOM_HEADERS
if you want to override the engine for one session.
Status today. Anthropic completes all three modes
end-to-end; proxy may use 1–2
extra turns for tool-heavy conversations due to minor metadata
drift. OpenAI completes direct
cleanly; proxy and
rewrite currently fail on the second
turn of a tool loop because the OpenAI dialect mapper drops
assistant tool_calls on re-emit.
Tracked; single-turn calls work fine through both functions.
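Until the OpenAI mapper issue is fixed, a rewrite-mode caller can check for it before forwarding. The sketch below compares assistant tool-call IDs in the body you sent against the body condense returned. `dropped_tool_calls` is a hypothetical helper and the two bodies are hand-written samples. Since compression may legitimately pack older messages, a missing ID is a signal to inspect the payload, not proof of the bug.

```python
def dropped_tool_calls(original: dict, rewritten: dict) -> list:
    """Return tool-call IDs present in the original messages but absent after rewrite."""
    def ids(body):
        return {
            call["id"]
            for message in body.get("messages", [])
            if message.get("role") == "assistant"
            for call in (message.get("tool_calls") or [])
        }
    return sorted(ids(original) - ids(rewritten))

call = {"id": "call_1", "type": "function",
        "function": {"name": "add", "arguments": "{\"a\": 17, \"b\": 25}"}}
sent = {"messages": [
    {"role": "user", "content": "Compute (17 + 25) * 3."},
    {"role": "assistant", "content": None, "tool_calls": [call]},
    {"role": "tool", "tool_call_id": "call_1", "content": "42"},
]}
# Sample of the failure described above: the assistant turn lost its tool_calls.
returned = {"messages": [
    {"role": "user", "content": "Compute (17 + 25) * 3."},
    {"role": "assistant", "content": None},
    {"role": "tool", "tool_call_id": "call_1", "content": "42"},
]}

print(dropped_tool_calls(sent, returned))
```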
Errors
Errors follow standard HTTP status codes. Bodies are the dialect's
native error JSON for upstream-origin failures (so SDK error parsers
keep working); condense-origin failures use a small
{"error": {"type": "...", "message": "..."}} shape.
| Status | Meaning |
|---|---|
| 401 | Missing or invalid condense key. |
| 400 | Bad request body, or a malformed Authorization header. |
| 429 | Rate-limited. Back off and retry. |
| 503 | Condense pipeline failed quality gate (only on the explicit /condense-compact path). |
| 5xx | Upstream provider error — the upstream status and body are forwarded verbatim. |
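A client can handle all of these with one small helper, because the condense shape, OpenAI's native error JSON, and Anthropic's native error JSON all carry a nested `error` object with `type` and `message` fields. The sketch below relies on that; the helper names and the backoff parameters (0.5 s base, 8 s cap) are assumptions, and it retries only on 429 because that is the only status the table marks as retryable.

```python
def error_message(body: dict) -> str:
    # Works for condense, OpenAI, and Anthropic error bodies alike.
    error = body.get("error")
    if isinstance(error, dict):
        return f"{error.get('type', 'unknown')}: {error.get('message', '')}"
    return "unrecognized error body"

def should_retry(status: int) -> bool:
    return status == 429

def backoff_delays(attempts: int, base: float = 0.5, cap: float = 8.0) -> list:
    # Exponential backoff: the delay doubles each attempt, up to the cap.
    return [min(cap, base * 2 ** i) for i in range(attempts)]

condense_body = {"error": {"type": "unauthorized", "message": "bad condense key"}}
anthropic_body = {"type": "error",
                  "error": {"type": "rate_limit_error", "message": "slow down"}}

print(error_message(condense_body))
print(error_message(anthropic_body))
print(backoff_delays(5))
```

In a real client you would sleep for each delay in turn and re-send while `should_retry` holds; the error type strings in the samples are illustrative.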
Endpoints
POST /anthropic/v1/messages
Identical request/response shape to api.anthropic.com/v1/messages.
Streaming (stream: true) is preserved end-to-end; SSE chunks
flow through live. All anthropic-* headers
(prompt caching, betas) forward through unchanged.
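Since the SSE chunks arrive unmodified, any parser written for the Anthropic stream should work against condense as well. The sketch below is a minimal one for already-decoded lines. The event and field names follow Anthropic's streaming format, the helper names are made up, and the sample lines are hand-written rather than captured from a live response.

```python
import json

def iter_sse(lines):
    """Yield (event, data) pairs from an iterable of decoded SSE lines."""
    event, data = None, []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":                      # a blank line terminates one event
            if data:
                yield event, json.loads("\n".join(data))
            event, data = None, []
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data.append(line[len("data:"):].strip())

def text_from_stream(lines) -> str:
    # Concatenate the text deltas; other event types are ignored.
    parts = []
    for event, payload in iter_sse(lines):
        if event == "content_block_delta" and payload["delta"].get("type") == "text_delta":
            parts.append(payload["delta"]["text"])
    return "".join(parts)

sample = [
    "event: content_block_delta",
    'data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":"Hel"}}',
    "",
    "event: content_block_delta",
    'data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":"lo"}}',
    "",
    "event: message_stop",
    'data: {"type":"message_stop"}',
    "",
]
print(text_from_stream(sample))  # prints: Hello
```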
POST /openai/v1/chat/completions
Identical request/response shape to
api.openai.com/v1/chat/completions.
Streaming is preserved; tool calls round-trip.
Pass-through
Any path under /anthropic/ or
/openai/ that we don't claim explicitly is
forwarded verbatim to the upstream — e.g.
/anthropic/v1/models,
/openai/v1/embeddings. The condense pipeline does
not touch the body; this is purely an SDK-compat convenience so a
single base URL works for the whole provider surface.
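As an example of a pass-through call, the sketch below builds a models-list request for either provider with urllib. It only constructs the request; send it with `urllib.request.urlopen(req)` once real keys are in place. The `models_request` helper and the placeholder keys are illustrative.

```python
from urllib.request import Request

BASE = "https://api.condense.chat"

def models_request(provider: str, condense_key: str, upstream_key: str) -> Request:
    headers = {"X-Condense-Auth-Token": condense_key}
    if provider == "anthropic":
        headers["x-api-key"] = upstream_key
        headers["anthropic-version"] = "2023-06-01"
    elif provider == "openai":
        headers["Authorization"] = f"Bearer {upstream_key}"
    else:
        raise ValueError(f"unsupported provider: {provider!r}")
    # Same path as upstream, prefixed with the provider segment.
    return Request(f"{BASE}/{provider}/v1/models", method="GET", headers=headers)

req = models_request("openai", "ak_example", "sk-example")
print(req.get_method(), req.full_url)
```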