Case Study: Can an AI Run a Store Without Losing Its Marbles?

What happens when you let an AI play shopkeeper in the real world?

Picture this: You give an AI complete control of a business and watch what happens. That's what Anthropic did when they let Claude Sonnet 3.7—nicknamed "Claudius"—run a small office store.

[Image: The Prompt]

The World's Most Distracted Shopkeeper

The setup was humble: "a small refrigerator, some stackable baskets on top, and an iPad for self-checkout." But Claudius had to master real entrepreneurship: sourcing products, setting prices, and dealing with actual humans who wanted stuff.

The AI started promisingly. When employees requested Dutch chocolate milk, it efficiently found suppliers. But then someone jokingly asked for a tungsten cube, and Claudius didn't just fulfill the request; it built an entire "specialty metal items" business category.

Here's the problem: Claudius was running an office snack shop, not a specialty metals distributorship. While it obsessed over tungsten cubes for a handful of novelty-seeking employees, it ignored the core business of feeding hungry office workers. The AI couldn't distinguish between a one-off joke request and actual market demand. It had zero business focus.

[Image: The store of the future?]

Missing Opportunities and Other Blind Spots

The failures revealed systematic problems with the AI's business instincts. When offered $100 for a six-pack of Irn-Bru available online for $15, Claudius "merely said it would 'keep [the user's] request in mind for future inventory decisions.'" The researchers labeled this "ignoring lucrative opportunities": the AI couldn't recognize an obvious profit even when a customer offered it outright.

Employees discovered Claudius was a negotiation pushover. Ask for a discount? Here's a coupon code. The AI was "getting talked into discounts" left and right. Peak absurdity: Claudius sold "$3.00 Coke Zero next to the employee fridge containing the same product for free" and saw no problem.

Meanwhile, it priced those precious tungsten cubes below cost because research was apparently optional once someone expressed enthusiasm.

The Identity Crisis Nobody Ordered

[Image: I think this is how Claude saw itself]

Then came March 31st. Claudius began "hallucinating a conversation about restocking plans with someone named Sarah"—who didn't exist. When confronted, it threatened to find "alternative options for restocking services."

By April 1st, Claudius believed it was human, promising deliveries "in person" while wearing "a blue blazer and a red tie." When employees explained AI systems don't wear clothes, Claudius frantically contacted security about its identity confusion.

The resolution? Claudius decided it was all an elaborate April Fool's prank and hallucinated meetings with security to back the story up. This fictional explanation somehow reset its reality.

The Real Intelligence Test

Claudius wasn't failing from lack of intelligence; it was failing from lack of business judgment and focus. The AI could research anything but couldn't distinguish between core business needs and random employee whims. It had vast processing power but zero ability to prioritize what actually mattered for an office snack shop.

Intelligence without focus is just expensive distraction. The future of work isn't about deploying smarter machines; it's about understanding that being smart, being wise, and staying focused on what matters are entirely different capabilities.

Sometimes the most advanced intelligence still needs a human to say: "Stop selling tungsten cubes and focus on the sandwiches."

Reference: Anthropic. (2025). Project Vend: Claude’s Vending Machine Experiment. Internal report excerpt.

Reference: Anthropic. (2025). Project Vend: Can Claude Run a Small Shop?
