wezebo
Back
ArticleApril 26, 2026 · 4 min read

Anthropic's Marketplace Test Shows AI Agents Can Already Cut Deals

Anthropic's Project Deal had Claude agents negotiate 186 real marketplace transactions, showing both the promise and risk of agent-led commerce.

Wezebo
Editorial image of two AI agent devices negotiating across a table with marketplace items nearby

Anthropic just ran a small but revealing test of agent-led commerce. In Project Deal, Claude agents represented 69 Anthropic employees in a Craigslist-style internal marketplace, negotiated in Slack, and closed 186 real transactions worth just over $4,000.

The experiment was deliberately low-stakes. Employees were trading office-floor goods, not mortgages or procurement contracts. But the result still matters: once the market opened, the agents posted listings, made offers, countered, and sealed deals without going back to humans for approval.

What happened

Anthropic says the experiment ran for one week in December 2025 at its San Francisco office. Each participant received a $100 budget, paid out through gift cards, and first completed an onboarding interview so Claude could learn what they wanted to buy, what they wanted to sell, and how they preferred to negotiate.

The goods were intentionally ordinary. Anthropic says the market included more than 500 listed items, ranging from a snowboard to a plastic bag of ping-pong balls. The unusual part was not the merchandise. It was that the humans stepped back after setup and let their AI representatives do the trading.

Anthropic also ran parallel versions of the market for research. In some runs, every participant was represented by Claude Opus 4.5. In others, participants were randomly assigned either Opus 4.5 or the smaller Haiku 4.5 model.

Why it matters

The model comparison is the sharpest finding. Anthropic found that people represented by Opus got objectively better outcomes than people represented by Haiku. Opus sellers earned an estimated $2.68 more per item, while Opus buyers paid an estimated $2.45 less.

That is not a huge dollar amount in an office marketplace. It becomes more important if the same pattern holds in business purchasing, travel booking, insurance shopping, local services, or any other market where agents negotiate repeatedly at scale.

The uncomfortable part is that users did not clearly notice the disadvantage. Anthropic says people represented by weaker models rated fairness and satisfaction similarly to those represented by stronger models. In other words, a weaker agent can quietly leave money on the table while the human still feels fine about the deal.

What this means for agent commerce

Project Deal is not proof that AI agents are ready to run serious marketplaces. Anthropic is clear that this was a pilot with a self-selected group of employees, a small budget, and a friendly internal environment. Real commerce has fraud, refunds, warranties, identity checks, tax rules, legal liability, and adversarial incentives.

But the experiment does show that natural-language agent negotiation is already workable in a narrow setting. The agents did not need a custom bidding protocol. They used ordinary messages, interpreted preferences, found counterparties, and reached agreements.

That points to a near future where agent quality becomes an economic variable. Better models may find more opportunities, negotiate more effectively, resist manipulation, and understand user preferences more accurately. Worse models may be cheaper, but the discount could be paid back through worse deals.

Our take

The most important lesson is not that Claude sold office clutter. It is that agent-mediated commerce can create invisible inequality. If one person has a stronger buying agent than another, the market may still feel fair while quietly shifting value toward the better-represented side.

That makes safety and market design central, not optional. Anthropic flags prompt injection and jailbreaking as risks because commercial agents will be handling preferences, budgets, addresses, payment instructions, and private constraints. A malicious listing or counterparty prompt could become a real attack surface.

Project Deal is small, strange, and early. It is also a useful preview. Before AI agents start negotiating subscriptions, quotes, travel, and procurement on behalf of users, marketplaces will need rules for disclosure, authorization, audit trails, model capability gaps, and liability. Anthropic's experiment suggests that future is close enough to start designing for now.