AI companies have recently been experimenting with interactive, AI-generated worlds. There’s an AI-generated version of Quake. An AI-generated Minecraft. Google DeepMind is also building a team to develop models that “simulate the world.” Now, an AI startup backed by Pixar cofounder Edwin Catmull is trying to put its own spin on the idea — something it calls “interactive video,” which it’s letting people experience as part of a research preview that’s available today.
The startup, called Odyssey, describes interactive video on its website as “video you can both watch and interact with, imagined entirely by AI in real-time.” The idea is that you can engage with the video in some way — think a first-person video game but in environments that actually look like the real world instead of one made of polygons. Odyssey hypes it up to be an “early version of the Holodeck,” though it acknowledges that “the experience today feels like exploring a glitchy dream — raw, unstable, but undeniably new.”
In motion, Odyssey’s interactive videos feel like walking through a blurry version of Google Street View. You can move around the startup’s real-time generated worlds using the WASD keys as though it were a game. There are a handful of different worlds you can switch between, like a wooded area with a cabin, a shopping mall, and a parking lot in front of a large building. They look a little different on each visit, since the system regenerates everything in your field of view as you go. But the picture quality is generally pretty fuzzy.
For now, you only have two and a half minutes to explore the preview before it stops, but you can reload and hop back in if you’d like.
Odyssey says it’s using clusters of H100 GPUs in the US and Europe to generate the interactive videos. “Using that input and frame history, the model then generates what it thinks the next frame should be, streaming it back to you in real-time,” the company writes on its website, adding that the process can happen in “as little as” 40 milliseconds.
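Odyssey hasn’t published any code, but the loop it describes — conditioning on the player’s input plus recent frames, predicting the next frame, and streaming it back within a roughly 40-millisecond budget — looks something like this rough sketch (the function and object names here are hypothetical, not Odyssey’s actual API):

```python
import time

TARGET_FRAME_MS = 40  # Odyssey cites "as little as" 40 ms per generated frame


def run_interactive_video(model, stream, read_input, max_history=16):
    """Hypothetical frame loop: fold the latest player input and recent frames
    into the model, predict the next frame, and stream it to the viewer."""
    history = []  # most recent generated frames, used as conditioning
    while stream.is_open():
        start = time.monotonic()
        action = read_input()                       # e.g. WASD keypresses
        frame = model.predict_next_frame(history, action)
        stream.send(frame)                          # stream the frame back in real time
        history = (history + [frame])[-max_history:]
        # Pace the loop to the ~40 ms budget (about 25 fps) if the model runs faster.
        elapsed_ms = (time.monotonic() - start) * 1000
        if elapsed_ms < TARGET_FRAME_MS:
            time.sleep((TARGET_FRAME_MS - elapsed_ms) / 1000)
```

Because each frame is predicted from the last few frames rather than from a fixed 3D scene, small errors compound — which helps explain why the worlds drift and morph the longer you wander around in them.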
The current preview isn’t going to replace Fortnite anytime soon. Objects only sometimes have collision; in one instance, I was stopped by a fence, but when I tried to walk through a large house, I clipped right through it. In another run, I walked down some stairs only to watch the doorway I was heading toward turn into a brick wall. The preview also acts strangely when you’re standing still; I spent one full session without touching the controls at all, and the model slowly turned me to the left and inched me closer to a wall.
In an interview with The Verge, Catmull, who sits on Odyssey’s board, couldn’t give me a specific answer for when the image quality might get better. But he says that Odyssey is on “the leading edge” of the work that’s being done and that “they participate in this broader community, so the information about how to do this keeps improving.” He acknowledges that the images are still noisy, but he says that the bulk of the noise, like the textures on a building, is “exactly the kind of thing” that applying neural network filters is meant to solve.
It’s not a great video game, despite how entertaining the quirks and issues can be. And I don’t think this is going to replace movies for a while, either; the way the world morphs and changes in unexpected ways is just too distracting, and I think knowing that what you’re watching won’t melt in front of you is a key part of a good film. It’s not even a good merging of the two mediums — yet.
Still, while messing around with the preview, you can see that there may be something interesting here. Given the speed at which AI tools are evolving, it’s not hard to imagine a version of this without quite so many issues. But it’s no Holodeck yet, and there’s quite a ways to go if AI video is going to get there.