Improve model card: Add metadata, links, and sample usage
#21
by nielsr (HF Staff) - opened
README.md CHANGED
---
license: apache-2.0
library_name: diffusers
pipeline_tag: text-to-3d
---
# CVD-STORM: Cross-View Video Diffusion with Spatial-Temporal Reconstruction Model for Autonomous Driving

This repository hosts model checkpoints for **CVD-STORM**, a cross-view video diffusion model developed within the [OpenDWM (Open Driving World Models)](https://github.com/SenseTime-FVG/OpenDWM) project. It uses a spatial-temporal reconstruction Variational Autoencoder (VAE) to generate long-term, multi-view videos with 4D reconstruction capabilities under various control inputs. The model is designed for autonomous driving scenarios and can produce auxiliary outputs such as depth estimates and LiDAR data.
CVD-STORM was presented in the paper [CVD-STORM: Cross-View Video Diffusion with Spatial-Temporal Reconstruction Model for Autonomous Driving](https://huggingface.co/papers/2510.07944).

Project Page: https://sensetime-fvg.github.io/CVD-STORM/
Code Repository: https://github.com/SenseTime-FVG/OpenDWM
+
## Sample Usage: Layout Conditioned LiDAR Generation with Diffusion Pipeline
|
| 17 |
+
|
| 18 |
+
This example demonstrates how to generate LiDAR data based on layout conditions using the Diffusion pipeline.
|
| 19 |
+
|
| 20 |
+
1. **Download LiDAR VAE and LiDAR Diffusion generation model checkpoint.** (Refer to the [OpenDWM GitHub repository](https://github.com/SenseTime-FVG/OpenDWM) for download links under the "LiDAR Models" section).
|
| 21 |
+
2. **Prepare the dataset.** (e.g., [`nuscenes_scene-0627_lidar_package.zip`](https://huggingface.co/datasets/wzhgba/opendwm-data/resolve/main/nuscenes_scene-0627_lidar_package.zip?download=true)).
|
| 22 |
+
3. **Modify the configuration file.** Update the values of `json_file`, `autoencoder_ckpt_path`, and `diffusion_model_ckpt_path` to the correct paths of your dataset and checkpoints in a JSON config file (e.g., `examples/lidar_diffusion_temporal_preview.json` from the OpenDWM repo).
|
| 23 |
+
4. **Run the following command** to generate LiDAR data according to the reference frame autoregressively:
|
| 24 |
+
|
| 25 |
+
```bash
|
| 26 |
+
PYTHONPATH=src python3 -m torch.distributed.run --nnodes 1 --nproc-per-node 2 --node-rank 0 --master-addr 127.0.0.1 --master-port 29000 src/dwm/preview.py -c examples/lidar_diffusion_temporal_preview.json -o output/temporal_diffusion
|
| 27 |
+
```
|
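Step 3 can also be scripted. The sketch below recursively overwrites the three keys named above wherever they appear in a nested JSON config; the nested structure and all file paths are placeholders (the real preview config may differ), so adjust them to your setup.

```python
import json

def set_config_values(node, updates):
    """Recursively overwrite matching keys anywhere in a nested config."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key in updates:
                node[key] = updates[key]
            else:
                set_config_values(value, updates)
    elif isinstance(node, list):
        for item in node:
            set_config_values(item, updates)

# Hypothetical structure -- the real lidar_diffusion_temporal_preview.json may nest differently.
config = {
    "dataset": {"json_file": "CHANGE_ME"},
    "autoencoder_ckpt_path": "CHANGE_ME",
    "diffusion_model_ckpt_path": "CHANGE_ME",
}

# Placeholder paths -- point these at your own dataset and checkpoints.
set_config_values(config, {
    "json_file": "/data/nuscenes_scene-0627_lidar_package/scene.json",
    "autoencoder_ckpt_path": "/checkpoints/lidar_vae.pth",
    "diffusion_model_ckpt_path": "/checkpoints/lidar_diffusion.pth",
})
print(json.dumps(config, indent=2))
```

Write the patched dict back to the config file with `json.dump` before running the preview command.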