Oasis AI represents a breakthrough in real-time AI-generated gaming, combining advanced diffusion models with specialized neural architectures. The system processes player inputs and generates high-quality frames at 25 FPS through a novel two-stage architecture.
Diffusion, the underlying technology powering Oasis, is a class of deep learning model that generates images by gradually denoising random noise. Unlike traditional game engines that render pre-built assets, Oasis generates every frame from scratch using a technique called diffusion forcing. This technique allows the system to keep visuals consistent from frame to frame while responding to player actions in real time.
The diffusion process works by taking a noise pattern and gradually refining it into a coherent image, guided by the recent frame context and the player's actions. What makes Oasis notable is that it performs this computation in just 0.04 seconds per frame, which works out to 25 frames per second of gameplay. Traditional diffusion models, by comparison, typically take several seconds to generate a single image.
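The refinement loop described above can be illustrated with a toy sketch. This is not the Oasis model: the learned denoiser is replaced here by a hypothetical `denoise_step` that simply pulls each value toward a known target, and the names `generate` and `max_error` are assumptions made for illustration. It only shows the shape of the idea, starting from noise and refining over a fixed number of steps.

```python
import random


def denoise_step(noisy, target, strength):
    """Move each value a fraction `strength` of the way toward `target`.

    Stand-in for a learned denoiser: a real model predicts the clean
    image from context and actions instead of being handed the target.
    """
    return [n + strength * (t - n) for n, t in zip(noisy, target)]


def generate(target, steps=8, seed=0):
    """Start from Gaussian noise and refine it over `steps` iterations."""
    rng = random.Random(seed)
    frame = [rng.gauss(0.0, 1.0) for _ in target]  # pure noise
    for _ in range(steps):
        frame = denoise_step(frame, target, strength=0.5)
    return frame


def max_error(frame, target):
    """Largest absolute difference between a frame and the target."""
    return max(abs(f - t) for f, t in zip(frame, target))


if __name__ == "__main__":
    target = [0.2, 0.8, 0.5, 0.1]  # tiny four-"pixel" image
    print(max_error(generate(target, steps=0), target))  # error of raw noise
    print(max_error(generate(target, steps=8), target))  # error after refinement
```

Each step halves the remaining error in this toy, so more steps give a cleaner result at a higher cost. Real-time generation depends on keeping the number of steps small.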
One of the most fascinating aspects of Oasis is its context memory system. The game maintains a rolling memory of the last 24 frames, which serves as the context for generating each new frame. This limitation creates some unique characteristics in the gameplay experience. For example, when a player performs a 360-degree rotation, the landscape might subtly change as new frames are generated with limited reference to the earlier context.
This 24-frame context window is a deliberate design choice that balances computational efficiency with visual consistency. While it might occasionally lead to dynamic changes in the environment, it also enables the game to introduce variety and surprise elements naturally. Players might notice new items appearing or subtle landscape modifications, making each playthrough unique and unpredictable.
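A rolling window like the one described above can be sketched with a bounded queue. The class name `FrameContext` and its methods are hypothetical; only the window size of 24 frames comes from the text. Once the window is full, each new frame pushes the oldest one out, which is why scenery seen more than 24 frames ago is no longer available as reference.

```python
from collections import deque

CONTEXT_FRAMES = 24  # window size stated in the text


class FrameContext:
    """Keeps only the most recent `size` frames (hypothetical helper)."""

    def __init__(self, size=CONTEXT_FRAMES):
        self._frames = deque(maxlen=size)

    def push(self, frame):
        """Add a new frame; the oldest is dropped once the window is full."""
        self._frames.append(frame)

    def window(self):
        """Return the current context, oldest frame first."""
        return list(self._frames)


ctx = FrameContext()
for i in range(30):
    ctx.push(f"frame-{i}")

# After 30 frames, frames 0 through 5 have been forgotten.
print(ctx.window()[0], ctx.window()[-1])
```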
Oasis implements a sophisticated command processing system that interprets player actions and translates them into visual changes. When a player issues a command, whether it's movement, building, or interaction with objects, the system processes these inputs through multiple transformer layers. These transformers are neural networks specifically trained to understand the relationship between actions and their visual consequences in the game world.
The command system uses a natural language understanding model that can process complex instructions and translate them into game actions. This allows for intuitive interactions like "build a house" or "plant a tree," which the system can interpret and execute in real-time while maintaining visual consistency with the surrounding environment.
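Before any neural network can act on player input, that input has to be turned into numbers. The sketch below shows one simple way this could be done: a fixed-length vector with one slot per action plus two slots for mouse movement. The action names, the layout of the vector, and the function `encode_action` are all assumptions for illustration and are not taken from Oasis.

```python
# Hypothetical action vocabulary; the real input set is not given in the text.
ACTIONS = ("forward", "back", "left", "right", "jump", "attack", "use")


def encode_action(pressed, mouse_dx=0.0, mouse_dy=0.0):
    """Encode pressed actions and mouse movement as a flat list of floats.

    The first len(ACTIONS) entries are 1.0 for pressed actions and 0.0
    otherwise; the last two entries are the mouse deltas.
    """
    unknown = set(pressed) - set(ACTIONS)
    if unknown:
        raise ValueError(f"unknown actions: {sorted(unknown)}")
    vec = [1.0 if name in pressed else 0.0 for name in ACTIONS]
    return vec + [float(mouse_dx), float(mouse_dy)]


print(encode_action({"forward", "jump"}, mouse_dx=0.5, mouse_dy=-0.25))
```

A vector like this could be supplied alongside the frame context as conditioning for the next frame.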
Transformer-based model that processes player inputs and maintains world state through sophisticated attention mechanisms.
Specialized diffusion model optimized for gaming speeds, generating high-quality frames in milliseconds.
Neural physics predictor that maintains realistic object interactions and world dynamics.
The heart of Oasis is its transformer-based architecture, which consists of multiple specialized layers.
To achieve real-time performance, Oasis employs several innovative optimization techniques. The diffusion forcing algorithm uses a specialized scheduling system that prioritizes visual elements based on their importance to the current frame. This allows the system to maintain high visual quality while meeting the demanding performance requirements of real-time gameplay.
The system also utilizes adaptive resolution scaling, dynamically adjusting the detail level of different elements based on their significance in the current scene. This ensures that computational resources are allocated efficiently, maintaining smooth gameplay while maximizing visual quality where it matters most.
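The text does not say how Oasis scores significance or assigns detail, so the following is only a minimal sketch of the general idea: split a fixed compute budget across scene regions in proportion to an importance weight. The region names, weights, and the function `allocate_detail` are hypothetical.

```python
def allocate_detail(importance, budget):
    """Split `budget` across regions in proportion to their importance.

    `importance` maps a region name to a non-negative weight.
    Returns a dict mapping each region to its share of the budget.
    """
    total = sum(importance.values())
    if total <= 0:
        raise ValueError("at least one region needs a positive weight")
    return {name: budget * weight / total for name, weight in importance.items()}


# Hypothetical scene: the player's hand matters most, the sky least.
shares = allocate_detail({"player_hand": 3, "terrain": 2, "sky": 1}, budget=60)
print(shares)
```

Proportional allocation is the simplest policy; a production system would likely also enforce a minimum level of detail per region.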
The current implementation of Oasis represents just the beginning of what's possible with AI-generated gaming environments. Future developments will focus on expanding the context window, improving visual consistency across longer time periods, and introducing more complex interaction possibilities. The team is also working on enhancing the system's ability to maintain persistent elements in the game world while still allowing for dynamic, AI-driven changes.
Novel diffusion architecture that achieves real-time performance through parallel inference and adaptive sampling.
Sophisticated system for maintaining world consistency using both short-term and long-term memory mechanisms.
Custom-tuned for Sohu AI chips, enabling 4K resolution and serving 10x more users.
Oasis is an open-world video game built entirely by artificial intelligence (AI). This project is a major step toward creating complex and interactive virtual worlds, all generated in real-time by AI.
Thanks to Decart's technology, Oasis achieves real-time speed, allowing for playable framerates and instant interactivity. Each frame is generated within 0.04 seconds, a significant improvement over other AI models that may take up to 20 seconds to create just one second of video.
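The figures quoted above can be checked with a few lines of arithmetic. The function names are hypothetical, and the speedup comparison assumes both systems produce video at the same frame rate, which the text does not state.

```python
FRAME_TIME_S = 0.04          # per-frame generation time quoted in the text
SLOW_S_PER_VIDEO_S = 20.0    # "up to 20 seconds" per second of video


def frames_per_second(frame_time_s):
    """Frame rate implied by a per-frame generation time."""
    return 1.0 / frame_time_s


def speedup(slow_s_per_video_s, frame_time_s):
    """How many times faster real-time generation is than the slow case.

    Assumes the video plays at the frame rate implied by `frame_time_s`,
    so one second of video costs exactly one second to generate.
    """
    fps = frames_per_second(frame_time_s)
    fast_s_per_video_s = fps * frame_time_s
    return slow_s_per_video_s / fast_s_per_video_s


print(round(frames_per_second(FRAME_TIME_S), 6))
print(round(speedup(SLOW_S_PER_VIDEO_S, FRAME_TIME_S), 6))
```

Under these assumptions, 0.04 seconds per frame corresponds to 25 frames per second and about a 20x speedup over the slowest case mentioned.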
Optimized for Sohu, a specialized AI chip from Etched, Oasis can scale to much larger AI models, even reaching 4K resolution, while serving 10x more users on massive next-generation models.
Join our Discord server to share creations, get help, and connect with other players.
i tried ai generated minecraft pic.twitter.com/GJQ3d1IM03
— phrett (@phr3tt) November 2, 2024
They're speed running AI Minecraft now pic.twitter.com/uaeTcnbyo8
— Rhys (@RhysSullivan) November 4, 2024
i don't think people are getting how insane @DecartAI and @Etched's realtime AI game is
— andrew gao (@itsandrewgao) November 1, 2024
What a surreal experience.
— Dreaming Tulpa 🥓👑 (@dreamingtulpa) November 1, 2024
Oasis (aka Minecraft) is the first playable, realtime, open-world AI model.