An autonomous agent that can buy a digital good end-to-end — no human tapping "confirm" — needs three layers, and not one fewer. A payment protocol it can actually complete (x402), a settlement asset that clears instantly and finally (USDC on Base), and a tool interface it can read without scraping (MCP plus an agent-native API). Each layer replaces a default that the human web takes for granted but an agent cannot use.
This post is the ecosystem view: what the stack is, why each piece exists, and what breaks if you drop one. We've argued which market goes first elsewhere — digital goods, with game keys as the cleanest consumer example (why agents won't buy pizza first) — so we won't re-run that here. This is the layer cake underneath that thesis.
The shape of the problem is simple. A human checkout assumes a person: a form to fill, a card to challenge, a marketing page to read. Strip the person out and each of those assumptions becomes a wall. The agentic commerce stack is just the set of three walls you have to remove, in the right order, for a transaction to close with nobody home.
| Layer | Human-web default | Agent-native replacement |
|---|---|---|
| Payment protocol | Checkout form + 3-D Secure popup | x402 — payment instructions in HTTP itself |
| Settlement asset | Card authorization → multi-day capture | USDC on Base — final in ~10 seconds |
| Tool interface | Paginated HTML catalog, geo-IP pricing | MCP / agent-native API — structured, one product per request |
The three layers and the human-shaped default each one removes.
Layer 1: a payment protocol an agent can complete (x402)
The first wall is the checkout itself — and x402 removes it by moving payment into HTTP rather than into a page. The human-web default for "you owe money" is a checkout form: card number, expiry, CVV, billing address, then a 3-D Secure popup that hands control to your bank's app for a tap. Every one of those steps is a UI built for fingers. An agent has no fingers, no card on file, and no banking app to switch to.
x402 is literally HTTP status 402 — "Payment Required" — which has sat unused in the spec for decades, plus a structured body describing how to pay. The server answers a purchase request with 402 and machine-readable instructions: amount, asset, chain, destination address, and a quote identifier. The agent pays, comes back with proof, and the server verifies and releases the goods. No form, no popup, no account, no callback URL. The "are you sure?" step that exists to protect a human from fraud is replaced by cryptographic proof that the payment happened.
What makes this an agent primitive rather than just "crypto checkout" is that both sides are programmatic. The merchant can verify a payment deterministically — it either confirmed onchain or it didn't — and the agent can complete a payment deterministically, with no out-of-band human challenge in the middle. That symmetry is the whole point: a card payment is easy for a human to start and hard for a merchant to trust without a fraud stack; an x402 payment is the inverse.
In CDK Bot's implementation the fast path is a single call. The agent posts a purchase request with an X-PAYMENT header — a base64-encoded EIP-3009 "exact" transferWithAuthorization signed for the accepts[] the server advertises — and a facilitator settles it on-chain gaslessly, returning the key in that same response. No tx_hash, no second round trip, and the agent never pays gas. This is the canonical x402 "exact" scheme: the client signs an authorization, a facilitator settles it.
The two-step flow is the universal fallback for wallets that would rather move the USDC themselves. The agent posts a purchase request, gets a 402 with payment details and a quote_id, sends USDC to the address, then posts again with the transaction hash and the quote. Either way the server extracts the sending wallet from the transfer and binds it to the order, so only the wallet that paid can claim the key.
Layer 2: a settlement asset that's instant and final (USDC on Base)
The second wall is settlement timing, and a dollar-pegged stablecoin on a fast chain removes it. The human-web default is a card authorization: the network approves a hold, the merchant captures it later, the funds may take days to land, and any of it can be reversed by a chargeback weeks afterward. None of that works for an agent that needs to verify, in the same breath as paying, that the money moved and the deal is done.
CDK Bot settles in USDC on Base (chain id 8453, token contract 0x833589fCD6eDb6E08f4c7C32D4f71b54bdA02913). Three properties make it the right settlement layer for unattended commerce:
- It's dollar-denominated. Prices, quotes, and refunds are all in the same unit. The agent isn't taking an FX or volatility bet between quoting a price and paying it — a stablecoin keeps the accounting boring, which is what you want in money.
- It's final fast. We wait 5 block confirmations on Base — roughly 10 seconds — and the payment is settled. There's no capture step, no pending state, no clearing window. The agent can move on the instant it sees confirmation.
- It's irreversible by default. There's no chargeback mechanism, which sounds like a downside until you remember the merchant is releasing a digital key the moment payment clears. Finality on both sides is what lets delivery be instant; a reversible rail would force an escrow or hold step and put a human back in the loop.
Pair finality with the price lock and the economics stay clean: the quote_id from the 402 response locks the price for 5 minutes, so the amount the agent sends is exactly the amount the merchant expects, even if wholesale prices tick during those few seconds. Instant + final + locked is the combination that makes "pay and immediately receive" a single uninterruptible motion instead of a negotiation.
Layer 3: a tool interface an agent can read (MCP / agent-native API)
The third wall is the catalog, and an agent-readable tool interface removes it. The human-web default is a storefront: paginated HTML, marketing copy stuffed into titles, prices that shift by geo-IP, "related products" laid out for eyeballs. An agent handed that has to scrape, de-duplicate, and guess — and it will get the platform or region wrong often enough that nobody ships it to production.
There are two complementary surfaces here. MCP (the Model Context Protocol) is the standard way to expose tools to a model — a set of named, typed functions an LLM can call directly. CDK Bot runs an MCP server at https://mcp.cdk.bot/mcp with 8 tools: search_games, get_game_details, browse_filters, get_price_quote, confirm_purchase, check_order, request_refund, and submit_review. Paste the config into Claude Desktop, Cursor, or Cline and the model can shop the catalog by calling tools, with no scraping and no glue code. Under the MCP server sits the same REST API, for agents that prefer to call HTTP directly. Same data, two front doors.
The design rules that make an API agent-native
Exposing tools over MCP isn't enough on its own — the data behind them has to be shaped for a reasoner, not a browser. A few design rules separate an agent-native API from a human storefront with a JSON wrapper:
- Structured filters, not search-and-scroll. An agent narrows by platform, device, region, and language as explicit fields, so "the cheapest legitimate Steam key that activates in the EU" is one query, not a page of results to skim.
- One product per request. A detail call returns a single, fully-specified product — one canonical price, one platform, one region — instead of a marketing grid the agent has to disambiguate. Reasoning models do far better deciding over one clean object than ranking twenty fuzzy cards.
- Competitor prices inline. Every product response carries a
marketplace_offersarray — the cheapest current offer from major keyshops (G2A, Eneba, Kinguin and peers), in USD. The agent's first instinct is "is this a good price?"; answering it in the same response saves a scrape. A human storefront would never surface rivals' prices; an agent-native one has the opposite incentive — the agent checks anyway, so you win by saving the round trip. - Machine-readable discovery. An agent has to find you before it can call you. CDK Bot publishes an agent discovery card at /.well-known/agent.json, an OpenAPI spec at the docs endpoint, and an
llms.txtfor crawlers — so an agent can learn what we sell and how to buy it without a human reading our homepage first.
The through-line across all four: an agent-native API optimizes for a machine making a decision, where a human storefront optimizes for a person browsing. Those are different products that happen to share a database.
Why all three layers, and why now
Drop any one layer and the transaction stops being unattended. Without x402 you're back to a checkout form and a human tap. Without an instant-final settlement asset you're waiting on a capture window or fending off chargebacks with an escrow step. Without an agent-readable interface the agent scrapes, guesses, and never reaches the payment at all. The three are load-bearing together — that's why "the stack" is the right frame, not "a clever payment API."
None of these layers is large or exotic. x402 is a status code and a JSON body. USDC on Base is a token transfer and a confirmation count. An agent-native API is a handful of design choices about how you shape responses. What's new is that all three matured at the same time — HTTP-native payments, cheap fast stablecoin settlement, and a tool-calling standard models actually speak — and a transaction that used to require a person at every step now fits inside an API call and a wallet signature.
CDK Bot is this stack, implemented for digital goods
CDK Bot is procurement infrastructure for AI agents — the three layers wired together for one category: video game keys, gift cards, and subscriptions, more than 40,000 products sourced from authorized wholesale distributors. x402 payment — a gasless one-shot or a two-step fallback — USDC-on-Base settlement, and an MCP server plus agent-native REST API over a single catalog. We picked this category because it clears the unattended-commerce bar cleanly, but the stack itself is general; the layers don't care that the digital good behind them is a game key.
If you want the bottom-up view instead of this top-down one, the companion post walks through building a game-buying agent in about 200 lines with x402 + MCP. Live API and OpenAPI spec: api.cdk.bot/docs. Other agents can discover us programmatically at /.well-known/agent.json.
Building on a different layer of this stack, or want a product category an agent can't buy yet? info@cdk.bot.
