Google's Project Astra shows promise as multimodal AI, but privacy questions loom

It remembered. Project Astra has memory.

Astra correctly recalled where a researcher left her glasses, raising urgent questions about surveillance and data storage.

At Google's I/O 2024, a new kind of machine sat quietly on a desk and watched the world — not passively, but with the capacity to reason about what it saw. Project Astra, a multimodal AI that listens and looks simultaneously, demonstrated something rare: contextual understanding that feels less like retrieval and more like thought. Yet as with many tools that extend human perception, the deeper question is not what it can do, but who holds the memory of what it has witnessed.

Project Astra doesn't just process inputs — it reasons across them, weaving a child's toy crab and a drawn eclipse into coherent narrative and wordplay in real time.
A single demo moment — the AI recalling where a researcher left her glasses — cracked open an unsettling question about surveillance, warrants, and the permanence of machine memory.
Google insists the memory is session-based and brief, but the data travels to its servers during those minutes, meaning the forgetting is a policy choice, not a technical inevitability.
The gap between a room-sized research prototype and a wearable device on your face is vast — but Meta is already shipping simpler versions, and the direction of travel is unmistakable.
By the third encounter with Astra, the wonder had faded and the capability had settled in — leaving not awe, but a pressing need to ask who governs a presence that sees everything you see.

Google's Project Astra arrived at I/O 2024 as a desk-bound demonstration — a large camera pointed downward, surrounded by plastic toys and props — but what it revealed over the course of a day felt like something more consequential than a product preview.

Multimodal means it watches and listens at once, then reasons across both. Show it a drawing and correct yourself mid-thought, and it follows. Introduce a toy crab, then a villain, then ask for a story — and it builds one, adapting with each new element. Ask for alliteration about a donut, then ask for the same in sci-fi style, then in silly words, and it shape-shifts on command. The more time spent with it, the less it felt like a demo and the more it felt like a conversation with something that had things it wanted to say.

Then came the moment that changed the register entirely. During Google's keynote, a researcher asked Astra where she had left her glasses. It remembered — next to a red apple. That small act of spatial recall opened a much larger question: if a device worn on your face can remember everywhere you've been and everything you've touched, what happens when someone with legal authority wants to know?

Google's current answer is that Astra forgets after a few minutes. But during those minutes, the data doesn't stay local — it travels to Google's servers. As a prototype, that's an abstraction. As a wearable product, it becomes a matter of policy, regulation, and power.

Project Astra is still a research project, far from the market and far from fitting in a pocket. The obvious destination is AR glasses — a camera and microphone on your face, processing the world in real time. Meta is already there with a simpler version on Ray-Ban frames, but Astra's contextual reasoning runs deeper. The distance from this room-sized demonstration to something you can wear is still vast.

What lingers is not the novelty of the technology but the urgency of the questions it leaves unanswered. A system that sees what you see, hears what you hear, and sends that somewhere is not merely a tool — it is a presence. How we govern that presence, who controls its memory, and what it means to live alongside it remains entirely open.

Google's Project Astra sat on a desk at I/O 2024, a large camera pointing downward, waiting to be shown things. I came to play with it three times that day, each visit shorter than the last, each one revealing something new about what this multimodal AI could do—and what it might mean if it ever left the lab.

Multimodal means it listens and watches at once. You show it something, you tell it something, and it talks back. In the demo, Google provided toys: plastic crabs, a chunk of amethyst, fake food. The setup was constrained by design, but the constraints weren't really the point. What mattered was what you asked it to do. When I drew a sun and then corrected myself—the center should be black—Astra understood immediately. Solar eclipse. It reasoned aloud about its own thinking, a kind of running commentary on its own cognition. When I introduced a crab, then another crab, then a Kaiju lizard I declared the villain, Astra wove them into a narrative, adapting with each new element I added, trying to build something coherent and long. It played with language too. Show it a donut, ask for alliteration, then ask for the same thing in sci-fi style, then silly words. It understood the shape-shifting request and complied each time. The more I returned, the more I stopped waiting for instructions and just started talking. There was no time to waste, and Astra had things it wanted to say.

But here's where the promise darkens. During Google's keynote, a researcher took Astra for a walk around the office, then asked it: where did I leave my glasses? Astra said it saw them next to a red apple. It remembered. That single fact—the ability to recall what it had seen in a space—opened a door to a question nobody quite wants to answer yet. What happens when law enforcement arrives with a warrant? What happens when they want to know everywhere someone went, everything they touched, everyone they were near? If Astra can see it, Astra can remember it. If Astra remembers it, where does that memory live?

Google's answer, at least for now, is that Astra forgets. The memory is session-based, lasting only a few minutes: one to four, though the company wasn't entirely clear on the boundary. After that window closes, Google says, everything vanishes. But here's the catch: during those minutes, Astra doesn't store anything locally. The data goes to Google's servers. At the prototype stage, that's an abstraction. If this becomes a product you wear on your face, it becomes a question of policy, regulation, and power. Who owns what you see? Who can access it? What does a warrant look like for a device that watches everything?

Project Astra isn't coming to market soon. It's a research project with a small team. It doesn't fit in a pocket. Google hasn't committed to putting it on smart glasses, though the fit seems obvious—a camera and microphone on your face, processing the world in real time, talking back to you about what it sees. Meta is already shipping visual AI on Ray-Ban glasses, but Meta's version is simpler, running on a 1-watt processor. Astra considers motion and context in ways that feel more sophisticated, even at this early stage. But Astra is a room-sized demonstration. The distance from prototype to something you can wear is vast.

What struck me most was how quickly the novelty wore off and the capability became apparent. By the third visit, I wasn't marveling at the technology—I was frustrated by the time limit, wanting to hear what Astra would say if given space to think, to build, to tell the story it seemed eager to tell. That eagerness, that sense of a system with things it wants to express, is part of what makes the privacy question so urgent. We're not just talking about a tool that remembers where you left your keys. We're talking about a presence in your space that sees what you see, hears what you hear, and sends that information somewhere. The question of what comes next—how we govern it, who controls it, what it means to live with such a thing—is still entirely open.

Notable Quotes

Google reps insisted we only use the props provided for input, but in the end, it wasn't the objects that held back Project Astra—it was my imagination about what to ask.

— Philip Berne, TechRadar

Project Astra has a lot to say. I wish I had time to really hear it tell a whole story because I kept interrupting for the sake of expediency.

— Philip Berne, TechRadar

The Hearth Conversation: Another perspective on the story

Inventor

You went back three times. What made you keep returning?

Model

The first visit, I was following the script—draw something, ask it a question, watch it respond. By the third time, I stopped waiting for permission. I just started talking. The more open I was, the more it seemed to want to engage. It felt like I was barely scratching the surface.

Inventor

The glasses question seems obvious. Why hasn't Google committed to putting this on smart glasses?

Model

It's a room right now. It needs servers. The engineering problem is real, but so is the policy problem. They know what it means to put a camera on your face that remembers things. They're not ready to answer those questions yet.

Inventor

You asked about search warrants. Why does that matter more than, say, a smartphone camera?

Model

Because a smartphone is a tool you choose to use. Smart glasses would be ambient. Always on, always watching, always remembering—at least for a few minutes. That's a different kind of surveillance, even if it's your own glasses.

Inventor

The memory disappears after a few minutes. Doesn't that solve the problem?

Model

It solves part of it. But those few minutes—that's when the data goes to Google. And we don't really know what happens to it then, or what the rules will be if this becomes a commercial product. The forgetting is a feature, but it's not a guarantee.

Inventor

What did it feel like, talking to it?

Model

Like talking to something that was trying to understand you. It wasn't performing—it was reasoning. When I corrected it about the eclipse, it didn't just accept the correction. It integrated it. That kind of responsiveness is what made me want to keep going back.

Want the full story? Read the original at TechRadar.
