Olmsted AI | The Sifter Agentic Proxy

Spend less on every coding agent.
Change nothing about how your team works.

Olmsted AI builds Sifter, a drop-in local gateway that sits between your coding agents and the model providers you already pay for. It cuts cost where prompt caching can’t reach, keeps every IDE and workflow exactly as-is, and never locks you into a vendor.

  • ~20 ms median proxy overhead
  • Local-first, keys never logged
  • Claude Code · Cursor · Codex · Copilot

Speaks the native protocol of: Claude Code · Cursor · OpenAI Codex · GitHub Copilot BYOK · VS Code Custom Endpoint

Featured product

The Sifter Agentic Proxy

A local proxy built for the agentic era. Sifter sits between the tools your engineers already use and the providers you already pay for, applying its optimization layer to every request that passes through. Responses come back unchanged, so your team notices a smaller bill and nothing else.

Drop-in, not disruptive

Point one base URL at Sifter. No new IDE, no plugin to roll out, no change to how engineers prompt. Adoption is a config line, not a migration.

No vendor lock-in

Run the providers and models you already pay for, and switch between them freely. Sifter sits in front of all of them, so you are never tied to a single vendor or a single price.

Local-first and private

Runs entirely on your own infrastructure. Your provider keys stay in your environment; Sifter validates a local token and never logs or persists credentials.

The outcome

Lower spend on every coding agent, automatically.

Prompt caching only discounts the parts of a request that repeat. It was never built to touch the much larger share of spend that piles up across a real coding session. That gap is where Sifter works, and it stacks on top of the caching you already have.

What you actually get

  • Lower cost per completed task across your whole engineering fleet, not just a lower price per token.
  • Quality held constant. Optimizations only stay on when they prove out at equal output quality.
  • Zero workflow change. Same tools, same prompts, same providers your team already uses.
  • Proof on your own bill before you scale it, measured in real provider dollars.

Sifter runs a proprietary optimization layer in the gateway position between your agents and your providers. It works automatically on every request, with no tuning and no involvement from your engineers.

You see the result as a lower monthly bill and a lower cost per shipped change. The how stays under the hood; the savings show up where you measure them.

Real-time optimization

Sifter evaluates every request as it passes through and handles it in the most cost-efficient way. It is always on, and your engineers never have to think about it.

Always on

Leaner sessions

Sifter keeps long agent sessions efficient, so you stop paying to reprocess the same material as a task wears on. The longer engineers work, the more it saves.

Compounds over a task

No wasted spend

Redundant and low-value work is caught before it ever reaches your provider, so you are not paying twice for a result you already have.

Pay once, not twice

How we prove it

Real dollars, on your own workload.

We measure savings the only way that matters: actual provider spend with Sifter on versus off, on the same work and the same agent, at equal output quality. You see the number on your own bill before you roll it out, with no synthetic benchmarks and no token-math theater.
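
The on-versus-off comparison comes down to simple arithmetic, sketched below. This is an illustration rather than Sifter's actual measurement code: the `Run` fields, the use of task success rate as the quality check, and every number are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Run:
    """One measurement window on the same workload and the same agent."""
    provider_spend_usd: float  # taken from the provider bill, not estimated from tokens
    tasks_completed: int
    tasks_attempted: int

    @property
    def cost_per_task(self) -> float:
        return self.provider_spend_usd / self.tasks_completed

    @property
    def success_rate(self) -> float:
        return self.tasks_completed / self.tasks_attempted


def savings_fraction(off: Run, on: Run) -> float:
    """Fractional drop in cost per completed task, valid only at equal quality."""
    if on.success_rate < off.success_rate:
        raise ValueError("quality dropped with the gateway on; comparison is not valid")
    return (off.cost_per_task - on.cost_per_task) / off.cost_per_task


# Placeholder numbers for illustration only, not measured results.
gateway_off = Run(provider_spend_usd=120.0, tasks_completed=40, tasks_attempted=50)
gateway_on = Run(provider_spend_usd=90.0, tasks_completed=40, tasks_attempted=50)
print(f"{savings_fraction(gateway_off, gateway_on):.0%}")  # prints 25%
```

Dividing by completed tasks rather than tokens keeps the comparison honest: a run that spends less but finishes fewer tasks does not count as a saving.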

From internal validation

  • ~20 ms: Added latency per request, effectively invisible to engineers
  • 90%: Task success on the validation gate, with no loss of quality
  • 100%: Of requests measured for real, dollar-level cost
  • 0: Changes required to how your team already works

Figures from Olmsted AI’s internal validation runs. Savings depend on workload, session length, and model mix; Sifter quantifies them on your own stack before rollout.

One gateway, every surface

Works with the agents and providers you already use

Sifter meets each coding agent on its native protocol and routes to whichever upstream you choose: Anthropic, Azure Foundry, OpenAI, Azure OpenAI, a local model, or AWS Bedrock.

Coding Agent × Upstream Provider Support

Sifter gateway support map after Azure Foundry, AWS Bedrock, OpenAI Chat, and Responses paths were wired into the newer transform stack.

| Coding agent surface | Anthropic API (/v1/messages, Claude direct) | OpenAI-compatible (Chat Completions, /chat/completions) | OpenAI Responses (/responses, stateful/tool output) | Azure OpenAI (OpenAI-compatible endpoint shape) | Azure Foundry (Claude / OpenAI; Messages / Chat / Responses) | AWS Bedrock (Converse / Claude, AWS SDK auth) | Local / Ollama (OpenAI-compatible, developer smoke) |
|---|---|---|---|---|---|---|---|
| Claude Code (Anthropic Messages client) | Full | N/T | N/T | N/T | Full | Full | N/T |
| GitHub Copilot CLI / BYOK (OpenAI-compatible client path) | Partial | Full | Compat | Compat | Full | Full | Compat |
| VS Code custom endpoint (OpenAI-compatible IDE endpoint) | Partial | Full | Compat | Compat | Full | Full | Compat |
| Cursor (OpenAI, Responses, or Anthropic path) | Full | Full | Full | Compat | Full | Full | Compat |
| OpenAI Codex (Responses-first agent path) | N/T | Partial | Full | Compat | Full | Full | Partial |
| Generic OpenAI-compatible agent (any client using OpenAI-shaped APIs) | N/T | Full | Full | Compat | Full | Full | Compat |
| Direct provider smoke clients (curl, SDK, benchmark harnesses) | Full | Full | Full | Compat | Full | Full | Compat |

Legend

  • Full: Native route shape or first-class SDK translation in the newer transform stack.
  • Compat: Supported through an OpenAI-compatible endpoint or provider-specific base URL.
  • Partial: Possible for smoke or limited flows, but not the primary hardened path.
  • N/T: Not targeted or not the intended protocol pairing for that coding-agent surface.

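
Meeting each agent on its native protocol starts with recognizing which protocol a request speaks. The sketch below shows one way a gateway could do that from the request path, using the three route shapes named in the matrix. It is illustrative only: the protocol labels and the `detect_protocol` function are hypothetical names, not Sifter's API.

```python
from typing import Optional

# Route suffixes taken from the support matrix; the labels on the right are
# hypothetical names chosen for this example.
ROUTES = {
    "/v1/messages": "anthropic-messages",
    "/chat/completions": "openai-chat-completions",
    "/responses": "openai-responses",
}


def detect_protocol(path: str) -> Optional[str]:
    """Return the protocol label for a request path, or None if unrecognized."""
    clean = path.split("?", 1)[0].rstrip("/")  # drop query string and trailing slash
    for suffix, protocol in ROUTES.items():
        if clean.endswith(suffix):
            return protocol
    return None


print(detect_protocol("/v1/messages"))          # prints anthropic-messages
print(detect_protocol("/v1/chat/completions"))  # prints openai-chat-completions
```
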
Three steps to live

Drop in, run locally, watch the bill drop

No migration, no GPU, no change to how engineers work. Sifter is a local binary your team points an existing tool at.

Point your agent at Sifter

Set your agent’s base URL (the same setting you would use for any custom endpoint) to Sifter’s local address. Works with Claude Code, Cursor, Codex, Copilot, and VS Code.
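
As a rough sketch of that one-line change, the helper below builds an environment for launching an agent through the gateway. `ANTHROPIC_BASE_URL` is the variable Claude Code reads and `OPENAI_BASE_URL` is the one the official OpenAI SDKs read. The address, the port, and the `/v1` suffix are assumptions for illustration; use whatever local address your Sifter install reports. Other tools expose the same setting in their own configuration.

```python
# Hypothetical local address; the real host and port come from your Sifter setup.
SIFTER_BASE_URL = "http://127.0.0.1:8080"


def agent_env(base_env: dict, base_url: str = SIFTER_BASE_URL) -> dict:
    """Return a copy of base_env with agent base URLs pointed at the gateway."""
    env = dict(base_env)  # copy, so the caller's environment is left untouched
    env["ANTHROPIC_BASE_URL"] = base_url       # read by Claude Code
    env["OPENAI_BASE_URL"] = f"{base_url}/v1"  # read by the OpenAI SDKs; /v1 is assumed
    return env


# Example: launch an agent with the adjusted environment.
# import os, subprocess
# subprocess.run(["claude"], env=agent_env(os.environ))
```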

Sifter runs locally

It forwards to your provider with your own key, applies its optimization layer, and returns the provider’s response unchanged. Your keys never leave your environment.
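
The credential handling described here can be pictured with a small sketch. It is not Sifter's implementation. It assumes the agent presents its local token in the `x-api-key` header (the header the Anthropic API uses for keys), that header names arrive lowercased, and that the gateway holds the real provider key. The function names are hypothetical.

```python
import hmac

# Header names that carry credentials and must never reach logs.
SECRET_HEADERS = {"x-api-key", "authorization"}


def swap_credentials(incoming: dict, local_token: str, provider_key: str) -> dict:
    """Check the caller's local token, then attach the real provider key."""
    presented = incoming.get("x-api-key", "")
    # Constant-time comparison avoids leaking the token through timing.
    if not hmac.compare_digest(presented.encode(), local_token.encode()):
        raise PermissionError("unknown local token")
    # Drop caller-supplied credentials before adding the provider key.
    upstream = {k: v for k, v in incoming.items() if k.lower() not in SECRET_HEADERS}
    upstream["x-api-key"] = provider_key
    return upstream


def redact(headers: dict) -> dict:
    """Return a copy of the headers that is safe to write to a log."""
    return {k: ("[redacted]" if k.lower() in SECRET_HEADERS else v)
            for k, v in headers.items()}


incoming = {"x-api-key": "local-dev-token", "anthropic-version": "2023-06-01"}
upstream = swap_credentials(incoming, "local-dev-token", "sk-provider-key")
print(redact(upstream))  # the provider key never appears in the printed copy
```

The request body and the provider's response are not touched by this step, which matches the promise that responses come back unchanged.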

See the savings

A local dashboard shows your real-dollar savings as they accrue, so you have proof in hand before you roll it out across the team.

If your team can set one environment variable, they can adopt Sifter. Point an existing tool at the local endpoint and keep working exactly as before. Everything that lowers your bill happens quietly inside the gateway, with nothing for engineers to learn or manage.

  • Drop-in: One base-URL change
  • Local-first: Runs on your machine
  • Keys protected: Never logged or stored
  • Unchanged output: Responses returned as-is
  • No GPU: Just a local binary

Bring Olmsted AI to your toolchain.

See a real-dollar cost readout on your own workload before you commit. We’ll run the gateway-on versus gateway-off comparison on your stack and show you the dollars.