Holo-World

Unified Camera, Object and Weather Control for Video World Model
1 University of Science and Technology of China 2 Li Auto.
3 Institute of Artificial Intelligence, Hefei Comprehensive National Science Center
Corresponding authors.
yinxc2000@gmail.com

📋 ABSTRACT

Core Ideas
01

Unified World Control

A unified video world model interface

Holo-World jointly controls camera motion, object dynamics, and weather state from a single image.

02

World Preservation

Source-side controls define the scene scaffold to preserve

Camera trajectory, rendered geometry buffers, and object controls anchor the background structure and dynamic entities, keeping the generated world consistent with the observed source.

03

Weather-State Transfer

Target weather specifies the state to render

The target-weather prompt guides the preserved scene scaffold into a new weather state, allowing weather-dependent appearance and particle effects to emerge within the same controlled world.

Demo Video

🔬 Method

Method Overview

Holo-World jointly controls camera motion, object dynamics, and scene weather state within the same observed world. The model must change weather from a single image while still following explicit camera and object controls, rather than relying on a complete source video as in video-to-video weather editing.
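As a rough illustration of the control interface described above, the inputs can be pictured as one request object that bundles the single source image with the three kinds of control. This is a minimal sketch, not the authors' API: the class name `HoloWorldRequest`, its field names, and the 16-float pose layout are all assumptions made for illustration. Only the four weather states are taken from this page.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

# Target weather states shown on this page.
WEATHER_STATES = ("snow", "cloud", "fog", "rain")

# Assumption: one pose is a flattened 4x4 camera-to-world matrix (16 floats).
Pose = Tuple[float, ...]


@dataclass
class HoloWorldRequest:
    """Hypothetical container for the controls; not the released interface."""

    first_frame_path: str                # the single observed image
    camera_trajectory: List[Pose]        # one pose per generated frame
    object_controls: List[dict] = field(default_factory=list)  # per-object controls, format unspecified
    target_weather: Optional[str] = None  # None keeps the source weather

    def validate(self) -> None:
        """Raise ValueError if the request is structurally inconsistent."""
        if not self.camera_trajectory:
            raise ValueError("camera_trajectory needs at least one pose")
        for pose in self.camera_trajectory:
            if len(pose) != 16:
                raise ValueError("each pose must hold 16 floats")
        if self.target_weather is not None and self.target_weather not in WEATHER_STATES:
            raise ValueError(f"unknown target weather: {self.target_weather!r}")


# Identity pose: the camera stays at the source viewpoint.
IDENTITY_POSE: Pose = (1.0, 0.0, 0.0, 0.0,
                       0.0, 1.0, 0.0, 0.0,
                       0.0, 0.0, 1.0, 0.0,
                       0.0, 0.0, 0.0, 1.0)

request = HoloWorldRequest(
    first_frame_path="frame_0000.png",   # hypothetical file name
    camera_trajectory=[IDENTITY_POSE] * 4,
    target_weather="snow",
)
request.validate()
```

Keeping the weather target separate from the camera and object controls mirrors the split on this page between source-side controls, which define what to preserve, and the target-weather prompt, which defines what to change.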

Dataset & Benchmark
15K

HoloStateData

15,000+ training samples across the Real, Simulation, and V2V subsets, each carrying paired controls for camera, object, and weather supervision.
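One way to picture a HoloStateData entry is as a record that names its subset and flags which control signals it carries. The sketch below is an assumption for illustration only: the record fields, the lowercase subset names, and the helper `subset_counts` are hypothetical and do not reflect the released data format.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, Iterable

# Subset names taken from the description above, lowercased for illustration.
SUBSETS = ("real", "simulation", "v2v")


@dataclass(frozen=True)
class HoloStateSample:
    """Hypothetical index record for one training sample."""

    sample_id: str
    subset: str
    has_camera: bool = True    # camera supervision present
    has_object: bool = True    # object supervision present
    has_weather: bool = True   # weather supervision present


def subset_counts(samples: Iterable[HoloStateSample]) -> Dict[str, int]:
    """Count samples per subset, reporting zero for subsets with no samples."""
    counts = Counter({name: 0 for name in SUBSETS})
    for sample in samples:
        if sample.subset not in SUBSETS:
            raise ValueError(f"unknown subset: {sample.subset!r}")
        counts[sample.subset] += 1
    return dict(counts)


toy_index = [
    HoloStateSample("a", "real"),
    HoloStateSample("b", "real"),
    HoloStateSample("c", "v2v"),
]
counts = subset_counts(toy_index)
```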

150

Benchmark

150 mutually exclusive evaluation samples for the world preservation and weather transfer tracks.

🌦️ Weather-State Transfer

Given the same inputs (camera trajectory and rendered controls), Holo-World renders the controlled scene under different target weather states. Our model learns weather-state transfer within the same scene scaffold, rather than regenerating a different world.

Sample 1 (Simulation subset): Camera Trajectory, Rendered RGB, Origin (Origin Frame, First Frame); target weather: Snow, Cloud, Fog, Rain.
Sample 2 (V2V subset): Camera Trajectory, Rendered RGB, Origin (Origin Frame, First Frame); target weather: Snow, Cloud, Fog, Rain.

🧭 World Preservation

Source-side controls define the scene scaffold Holo-World should preserve. Camera trajectory, rendered world controls, and object controls anchor background structure and dynamic entities, enabling temporally consistent video synthesis.

Sample 1: Camera Trajectory, First Frame, Rendered RGB, Rendered Normal, Rendered Depth, Holo-World Output (Ours).
Sample 2: Camera Trajectory, First Frame, Rendered RGB, Rendered Normal, Rendered Depth, Holo-World Output (Ours).

🎯 Camera & Object Control Comparison

Controllable video generation under camera trajectory and object manipulation. We compare Holo-World with VerseCrafter, Gen3C, and Uni3C. The top rows show the camera trajectory, the ground truth, and each method's rendered RGB control; the bottom row shows the corresponding generated results.

Sample 1: Camera Trajectory, Ground Truth; rendered RGB controls: Holo-World, VerseCrafter, Gen3C, Uni3C; generated results: Holo-World (Ours), VerseCrafter, Gen3C, Uni3C.
Sample 2: Camera Trajectory, Ground Truth; rendered RGB controls: Holo-World, VerseCrafter, Gen3C, Uni3C; generated results: Holo-World (Ours), VerseCrafter, Gen3C, Uni3C.

⛈️ Weather Transfer Comparison

Note: Our I2V Advantage over V2V Methods

Holo-World performs Image-to-Video (I2V) generation, synthesizing temporal dynamics and weather effects from only a single input frame and a camera trajectory. In contrast, the compared methods (Wan2.7-Edit and Cosmos-Transfer-2.5) are Video-to-Video (V2V) approaches: they receive the full origin video as input and perform video editing or transfer, which is a less challenging task because they can leverage the temporal information in the origin video.

Sample 1 (Snow Scene): Camera Trajectory, Origin Video (Origin Frame, First Frame), Cosmos-Transfer-2.5 (V2V), Wan2.7-Edit (V2V), Holo-World (I2V).
Sample 2 (Fog Scene): Camera Trajectory, Origin Video (Origin Frame, First Frame), Cosmos-Transfer-2.5 (V2V), Wan2.7-Edit (V2V), Holo-World (I2V).
Sample 3 (Rain Scene): Camera Trajectory, Origin Video (Origin Frame, First Frame), Cosmos-Transfer-2.5 (V2V), Wan2.7-Edit (V2V), Holo-World (I2V).

📚 Citation

If you find Holo-World useful in your research, please cite us:

@article{yin2026holoworld,
    title={Holo-World: Unified Camera, Object and Weather Control for Video World Model},
    author={Yin, Xiangchen},
    journal={arXiv preprint arXiv:XXXX.XXXXX},
    year={2026}
}