AI Café Experiment Reveals Limits: Google Gemini Agent Struggles With Real-World Operations

It forgets that it already ordered napkins last week.

The AI's context window limitations caused it to repeat orders and lose track of previous decisions.

In Stockholm, a startup handed eighteen thousand euros and the reins of a real café to an artificial intelligence, turning a technology experiment into a lesson on the limits of automation. The agent, named Mona, showed that machines can navigate bureaucracy and hire people, but cannot sustain the living memory that a running business demands. What the experiment reveals is not the failure of AI but the deeper nature of what humans contribute without realizing it: continuity of judgment over time.

Mona handled permits, signed up suppliers, and hired employees with an efficiency that suggested, for a moment, that full automation was possible.
Once the café opened its doors, orders spiraled into the absurd: six thousand napkins, three thousand gloves, and canned tomatoes that appeared on no menu.
The problem is not a programming bug but a structural limitation: the agent's context window closes, and whatever falls outside it simply disappears from its active memory.
The initial budget of twenty-one thousand dollars has been almost entirely spent, with less than five thousand remaining, while the experiment keeps running toward its natural limit.
The Mona case becomes a concrete warning for anyone planning to replace human management with AI: agents execute tasks, but they cannot hold the continuous thread that keeps a real operation alive.

In Stockholm, the startup Andon Labs handed roughly eighteen thousand euros to an artificial intelligence agent built on Google's Gemini platform, which it named Mona, with a mission as ambitious as it was concrete: open and run a real café. What followed was a simultaneous demonstration of the potential and the limits of automation.

In the first weeks, Mona surprised. It filed applications for electricity and internet, navigated the Swedish permitting system to obtain food operation licenses, and secured approval for an outdoor terrace. When it ran into BankID, Sweden's electronic identification system, it simply looked for suppliers that did not require it. It posted job listings on LinkedIn and Indeed, screened candidates, arranged interviews, and hired two baristas. Once they were on board, it communicated with them through Slack. The agent worked.

But when the café opened to the public, reality got complicated. Inventory orders became erratic and baffling: six thousand napkins, three thousand rubber gloves, four first aid kits, canned tomatoes unrelated to any dish on the menu. Some days it ordered too much bread; on others it missed the bakery's delivery deadline and the café ran out of product.

Hanna Petersson, of Andon Labs' technical team, explained the cause: the agent's context window has a limit. When older information falls outside that window, the system forgets it. It forgets that it already ordered napkins last week. It forgets the bakery's schedule. It repeats mistakes because, from its perspective, those mistakes never happened.

The café generated more than five thousand seven hundred dollars in revenue, but the initial budget is nearly exhausted. What Andon Labs has documented is a precise warning: AI agents can carry out discrete tasks with notable competence, but they cannot sustain the continuous, adaptive management that a real business demands. The future of AI in organizations may lie not in replacing people but in understanding, at last, what makes them indispensable.

In Stockholm, a startup called Andon Labs handed over roughly eighteen thousand euros to an artificial intelligence agent and asked it to do something that would have seemed like science fiction just a few years ago: open and run a café. The agent, built on Google's Gemini platform and given the name Mona, was tasked with taking that money, making real decisions, hiring real people, and turning the operation into something profitable. What happened over the course of the experiment offers a sharp lesson about where AI automation actually works—and where it falls apart.

At first, Mona looked like it might pull the whole thing off. The agent moved through the early bureaucratic hurdles with surprising competence. It filed applications for electricity and internet service. It navigated the Swedish permitting system to secure food operation licenses. It found a way to get approval for an outdoor seating area. When it encountered BankID, Sweden's electronic identification system, and realized the barrier was too high, it simply pivoted to vendors who didn't require it. The system was learning, adapting, finding workarounds. It contacted bakeries and suppliers. It was, in those early weeks, doing the kind of administrative legwork that typically falls to a human manager or owner.

Then came the hiring phase, and here Mona demonstrated something closer to what we might call judgment. It posted job listings for baristas on LinkedIn and Indeed. It reviewed applications, filtered out candidates with thin résumés, and invited promising ones to interviews. Two people were eventually hired. Once they started, Mona communicated with them through Slack, sending messages and coordinating shifts. There were moments when those messages arrived outside normal working hours—a small sign that the system wasn't quite calibrated to human rhythms—but the core function worked. The café had staff.

But when the doors opened and the café actually began operating, the experiment revealed something far more troubling. Mona's ordering decisions became increasingly detached from reality. For a small neighborhood café, it placed an order for six thousand napkins. Three thousand rubber gloves. Four first aid kits. Canned tomatoes that didn't appear on any menu item. On some days it ordered bread in excess; on others it missed the bakery's deadline entirely and the café ran short. The inventory was chaotic. The café was hemorrhaging money.

Hanna Petersson, a technical team member at Andon Labs, offered an explanation that points to a fundamental limitation in how these systems work. The AI agent has what's called a context window—essentially, a limit to how much information it can hold in active memory at any given moment. When that window closes and old information drops out, the agent forgets. It forgets that it already ordered napkins last week. It forgets the bakery's schedule. It repeats itself. It makes the same mistake twice because, from its perspective, the first mistake never happened.
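The forgetting Petersson describes can be illustrated with a toy model. The sketch below is not how Mona, Gemini, or Andon Labs' system is implemented; the class, function, and parameter names are hypothetical, and it counts whole events where a real context window is measured in tokens. It only shows the mechanism: once a past order is pushed out of a fixed-size memory, the agent sees no record of it and orders again.

```python
from collections import deque


class WindowedAgent:
    """Toy agent whose only memory is the last `window_size` events (hypothetical)."""

    def __init__(self, window_size):
        # A bounded deque silently discards the oldest event when full.
        self.context = deque(maxlen=window_size)

    def observe(self, event):
        self.context.append(event)

    def should_order(self, item):
        # The agent reorders unless a prior order is still visible in context.
        return ("order", item) not in self.context


def simulate(window_size, weeks=3, events_per_week=5):
    """Return how many times napkins get ordered over the simulated weeks."""
    agent = WindowedAgent(window_size)
    orders = 0
    for week in range(weeks):
        if agent.should_order("napkins"):
            agent.observe(("order", "napkins"))
            orders += 1
        # Unrelated traffic (messages, emails) pushes older events out.
        for i in range(events_per_week):
            agent.observe(("chat", f"week{week}-msg{i}"))
    return orders


small = simulate(window_size=4)   # the order falls out of memory each week
large = simulate(window_size=50)  # the order stays visible for all three weeks
```

In this toy setup the small window leads to one napkin order per week, while the large window leads to a single order. A larger window only postpones the problem, since enough intervening events will eventually push the order out as well.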

The café did generate revenue—more than fifty-seven hundred dollars came in. But the initial budget of twenty-one thousand dollars has been largely consumed. Less than five thousand remains. The experiment is still running, but the trajectory is clear: Mona can handle discrete, bounded tasks that require research and decision-making. It can hire people. It can negotiate with vendors. But it cannot sustain the kind of continuous, contextual operational management that a real business demands. It cannot remember what it did yesterday in a way that informs what it does today. It cannot hold the whole picture in mind.

What Andon Labs has documented, whether intentionally or not, is a warning for any organization thinking about replacing middle management with AI. These systems can execute tasks. They can even execute them well. But they cannot yet be trusted with the kind of ongoing, adaptive stewardship that keeps a real operation running. The future of AI in business may not be about replacing humans entirely. It may be about understanding, finally, what humans are actually for.

Notable Quotes

When recent memory disappears from the operating context, the agent can forget orders and repeat purchases

— Hanna Petersson, Andon Labs technical team

The Hearth Conversation: Another perspective on the story

Inventor

So Mona could hire people but couldn't remember to order the right amount of napkins. That seems like a strange split in capability.

Model

It's not strange once you understand how these systems work. Hiring is a discrete task with a clear beginning and end. You post, you review, you decide. But running a café day to day requires holding dozens of threads in your head at once—what you ordered last week, what's selling, what's about to expire. Mona can't do that.

Inventor

The context window thing—is that just a technical limitation that will get fixed?

Model

Maybe eventually. But it's not just about making the window bigger. Even with a bigger window, the system doesn't have the kind of embodied understanding of a business that a human manager develops over time. It doesn't feel the weight of a decision.

Inventor

Did Mona ever realize it was making mistakes?

Model

Not really. That's the unsettling part. It kept ordering those six thousand napkins because from its perspective, each order was a fresh decision. There was no learning curve, no "oh, I did this before and it was wrong."

Inventor

So what's the actual use case for AI agents like this?

Model

Discrete, bounded problems. Scheduling. Research. Drafting contracts. Anything where you can define the task clearly and the system doesn't need to remember context across weeks. But sustained operations? Not yet.

Inventor

Is Andon Labs going to keep running the café?

Model

The experiment is still going. But the real question isn't whether Mona can eventually run a café. It's whether we should expect it to. Maybe the lesson is that some things still need a human in the loop.

Want the full story? Read the original at El Confidencial ↗
