
InSpatio-WorldFM
An Open-Source Real-Time Generative Frame Model for Spatial Intelligence — delivering multi-view consistent 3D world modeling on consumer GPUs.

Overview
World models that run where the world is.
InSpatio-WorldFM is a generative frame model designed from the ground up for spatial intelligence — not language, not 2D vision, but the full 3D structure of the physical world.
By rethinking the architecture around real-time spatial constraints, WorldFM achieves multi-view consistent generation while remaining deployable on consumer hardware, removing the dependency on data-center-scale compute for world modeling.
WorldFM is released as open source to accelerate research and enable a new generation of spatially aware AI applications in robotics, autonomous systems, and embodied AI.
Specifications
Key Capabilities
Real-Time Generation
Generates spatially consistent frames at real-time speeds — no expensive data center hardware required.
Multi-View Consistency
Maintains geometric and semantic consistency across multiple viewpoints, enabling coherent 3D scene understanding.
Edge-Device Ready
Optimized to run on consumer-grade GPUs, bringing frontier world model capabilities to edge deployments.
Spatial Reasoning
Understands depth, geometry, and physical layout — powering downstream tasks in robotics, simulation, and XR.
Why World Models Need to Be 3D
Most AI systems reason about the world in 2D — processing pixels without understanding where objects exist in physical space. This works for classification, but fails for robotics, autonomous systems, and embodied AI where spatial relationships, depth, and physical dynamics matter. A true world model builds a persistent 3D representation of the environment, enabling prediction, planning, and interaction — not just recognition.
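To make the gap between 2D pixels and 3D structure concrete, the sketch below lifts a pixel with a known depth into world space and reprojects it into a second camera. It is a generic pinhole-camera illustration, not WorldFM's implementation: the `Camera` class, its fields, and the numbers used are all assumptions chosen for the example.

```python
import math
from dataclasses import dataclass


@dataclass
class Camera:
    """Hypothetical pinhole camera: intrinsics plus a simple pose
    (yaw about the vertical Y axis, then a translation)."""
    fx: float
    fy: float
    cx: float
    cy: float
    yaw: float = 0.0  # radians
    tx: float = 0.0   # camera position in world coordinates
    ty: float = 0.0
    tz: float = 0.0

    def unproject(self, u, v, depth):
        """Lift pixel (u, v) with metric depth to a 3D world point."""
        # Pixel -> camera-space point.
        x = (u - self.cx) / self.fx * depth
        y = (v - self.cy) / self.fy * depth
        z = depth
        # Camera -> world: rotate about Y, then translate.
        c, s = math.cos(self.yaw), math.sin(self.yaw)
        return (c * x + s * z + self.tx, y + self.ty, -s * x + c * z + self.tz)

    def project(self, point):
        """Project a 3D world point to (u, v, depth) in this camera."""
        dx = point[0] - self.tx
        dy = point[1] - self.ty
        dz = point[2] - self.tz
        # World -> camera: apply the inverse (transposed) rotation.
        c, s = math.cos(self.yaw), math.sin(self.yaw)
        x = c * dx - s * dz
        z = s * dx + c * dz
        if z <= 0:
            raise ValueError("point is behind the camera")
        return (self.fx * x / z + self.cx, self.fy * dy / z + self.cy, z)


def reprojection_error(cam_a, cam_b, pixel_a, depth_a, pixel_b):
    """Pixel distance between where view A predicts a point should land
    in view B and where it was actually observed in view B."""
    world = cam_a.unproject(pixel_a[0], pixel_a[1], depth_a)
    u, v, _ = cam_b.project(world)
    return math.hypot(u - pixel_b[0], v - pixel_b[1])


# Two views of the same scene: B sits 0.5 m to the right of A.
cam_a = Camera(fx=500.0, fy=500.0, cx=320.0, cy=240.0)
cam_b = Camera(fx=500.0, fy=500.0, cx=320.0, cy=240.0, tx=0.5)

# The centre pixel of A at 2 m depth should appear shifted left in B.
u_b, v_b, depth_b = cam_b.project(cam_a.unproject(320.0, 240.0, 2.0))
```

A pair of generated views is geometrically consistent when `reprojection_error` stays small for corresponding pixels. A system that only processes pixels has no depth to unproject with, so it cannot run this check at all.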
A Real-Time World Model for Consumer Hardware
Most existing world models require data center infrastructure to run. WorldFM is designed to break this constraint with a more efficient architecture that supports real-time 3D world modeling on consumer GPUs. This makes frontier world model capabilities accessible to researchers, developers, and edge deployments without cloud dependency.

World Model Applications: Robotics, Embodied AI & Simulation
Downstream Applications
Get Started
Access the model on GitHub
Model weights, inference code, training documentation, and benchmarks are available in the repository. For research access, technical questions, or collaboration:
Frequently Asked Questions
What is a world model in AI?
A world model is an AI system that builds an internal representation of the physical structure of the real world, enabling it to predict, simulate, and interact with 3D environments. Unlike 2D video models that only process pixels, world models represent scene depth, geometry, and physical dynamics.
How is InSpatio-WorldFM different from other world models?
WorldFM is a generative frame model designed specifically for real-time 3D spatial reasoning that runs on consumer GPUs — not data center hardware. It achieves multi-view consistent scene generation and is fully open source under Apache-2.0.
How are world models used in robotics and embodied AI?
World models give robots a persistent 3D understanding of their environment, enabling them to predict action outcomes, plan long-horizon tasks, and transfer skills to new environments. Embodied AI systems use world models to simulate physical interactions before real-world deployment.
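The idea of predicting action outcomes before acting can be sketched with a toy planner. Here `predict` is a hypothetical stand-in for a learned world model (a hand-written grid update, not WorldFM's API), and the planner simply imagines every short action sequence and keeps the one that ends closest to the goal without hitting an obstacle.

```python
from itertools import product

# Hypothetical action set for a robot on a 2D grid: (dx, dy) moves plus "stay".
ACTIONS = [(1, 0), (-1, 0), (0, 1), (0, -1), (0, 0)]


def predict(state, action):
    """Stand-in for a world model: predict the next state for an action.
    A learned model would replace this hand-written rule."""
    return (state[0] + action[0], state[1] + action[1])


def plan(start, goal, obstacles, horizon=4):
    """Imagine every action sequence of length `horizon` and return the
    collision-free sequence whose final state is closest to the goal."""
    best_seq, best_cost = None, float("inf")
    for seq in product(ACTIONS, repeat=horizon):
        state = start
        collided = False
        for action in seq:
            state = predict(state, action)  # simulate, do not execute
            if state in obstacles:
                collided = True
                break
        if collided:
            continue
        # Manhattan distance from the imagined final state to the goal.
        cost = abs(state[0] - goal[0]) + abs(state[1] - goal[1])
        if cost < best_cost:
            best_seq, best_cost = seq, cost
    return best_seq, best_cost


# The direct path from (0, 0) to (2, 0) is blocked at (1, 0),
# so the planner has to find a detour inside its imagined rollouts.
best_seq, best_cost = plan(start=(0, 0), goal=(2, 0), obstacles={(1, 0)})
```

Exhaustive search is only workable for a tiny action set and short horizon. Practical planners sample or optimize over sequences instead, but the loop is the same: roll the model forward, score the imagined outcome, then act.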
Can WorldFM run on consumer hardware?
Yes. InSpatio-WorldFM is specifically optimized for consumer-grade GPUs, making it one of the few real-time 3D world models that do not require data center infrastructure.