GrainGS

Anchor-based Dynamic Gaussian Splatting with Gradient Disentanglement for Efficient and High-Fidelity Dynamic Scene Reconstruction

Code: github.com/I2-Multimedia-Lab/GrainGS

[Figure: GrainGS pipeline]

Overview of GrainGS. Input SfM points are encoded into canonical anchors and lifted into canonical base Gaussians by a static prediction module. A deformation network (DeformNet) predicts per-Gaussian temporal offsets [Δx, Δs, Δr] from the canonical Gaussian and a time embedding γ(t), while a canonical view-MLP and an appearance-residual MLP model time-varying color and opacity. A stop-gradient on the canonical→deformation path keeps the canonical base Gaussians clean.

Overview

GrainGS is a dynamic-scene Gaussian Splatting pipeline built on the anchor-based Scaffold-GS representation. It extends the static scaffold to 4D dynamic reconstruction with a canonical anchor representation, Gaussian-level temporal deformation, and temporal appearance modeling.

The central observation behind GrainGS is gradient entanglement: in a naive anchor-based dynamic formulation, the gradients of the deformation network flow back into the canonical (base) Gaussians and gradually corrupt the canonical representation. GrainGS resolves this with two ingredients: a warm-up phase that first establishes a clean canonical base, and a stop-gradient that decouples the deformation branch from the canonical branch during joint training.

The main entry point for dynamic training is train_dynamic.py. The code supports D-NeRF-style Blender datasets, COLMAP-style dynamic scenes, and Nerfies/HyperNeRF-style dynamic data loaders.

Core design:

  • Input SfM points become canonical anchors A = {x_a, s_a, f_a} (position, influence radius, feature), and a static prediction module generates the canonical base Gaussians.
  • A deformation network predicts Gaussian-level temporal offsets for position, rotation, and scale.
  • The view-dependent MLPs that model canonical appearance remain time-independent.
  • Optional temporal appearance embeddings model time-varying color and opacity residuals.
  • A stop-gradient and a warm-up schedule prevent deformation gradients from corrupting the canonical base.
  • Anchor growing and pruning follow the Scaffold-GS style densification process.

Relevant files:

  • train_dynamic.py: dynamic-scene training.
  • render_dynamic.py: dynamic-scene rendering.
  • metrics.py: PSNR, SSIM, and LPIPS evaluation on rendered results.
  • scene/dynamic_dataset_loader.py: dynamic dataset readers with temporal fid support.
  • scene/deform_model.py and utils/time_utils.py: deformation model.
  • scene/canonical_cell.py: canonical anchor / cell representation.
  • gaussian_renderer/__init__.py: rendering and deformation integration.

Method

Anchor-based Canonical Representation

GrainGS represents the static structure of a dynamic scene with anchors derived from the input SfM points. Each anchor A^i = {x_a^i, s_a^i, f_a^i} stores a position, an influence radius, and a learned feature. A static prediction module decodes the anchors into the canonical base Gaussians x_can, which act as the time-independent spatial backbone shared across all frames.

[Figure: GrainGS warm-up phase]

Warm-up phase. GrainGS first establishes clean canonical base Gaussians from the anchors (position, influence radius, and feature) before enabling temporal deformation, so the canonical geometry is well-formed when the deformation branch is switched on.

Gradient Entanglement

When the deformation network is trained jointly with the canonical branch from the start, its gradients backpropagate through the selected Gaussians and into the canonical base. This entangles the canonical and dynamic objectives and corrupts the canonical representation: the canonical model is forced to absorb motion it should not represent.

[Figure: Gradient entanglement problem]

Gradient entanglement. Without intervention, gradients from the deformation network flow back into the canonical base Gaussians and produce a corrupted canonical representation (note the ghosting in the canonical figure).

GrainGS addresses this by (1) running a warm-up phase that optimizes only the canonical/static representation for the first N iterations, where N is the value passed to --warm_up, and (2) applying a stop-gradient on the canonical→deformation path so that the per-Gaussian temporal offsets are learned without back-propagating into, and degrading, the canonical base. The result is a clean canonical base that the deformation and appearance-residual networks operate on, as shown in the pipeline figure above.
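The interaction between the warm-up phase and the stop-gradient can be illustrated with a minimal PyTorch sketch. It is not the repository's implementation: TinyDeformNet, deform_positions, the additive position update, and the choice to model only the position offset Δx are all assumptions made for illustration.

```python
import torch
import torch.nn as nn


class TinyDeformNet(nn.Module):
    """Hypothetical stand-in for DeformNet: maps (x_can, t) to a position offset."""

    def __init__(self, hidden=32):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(4, hidden), nn.ReLU(), nn.Linear(hidden, 3))

    def forward(self, x_can, t):
        # Broadcast the scalar time to one column per Gaussian.
        t_col = torch.full((x_can.shape[0], 1), float(t),
                           dtype=x_can.dtype, device=x_can.device)
        return self.mlp(torch.cat([x_can, t_col], dim=-1))


def deform_positions(x_can, t, deform_net, iteration, warm_up, stop_gradient=True):
    """Return positions at time t (assumed additive update, positions only)."""
    if iteration < warm_up:
        # Warm-up: the deformation branch is off, only the canonical base trains.
        return x_can
    # Stop-gradient: the network sees x_can as a constant, so no gradient
    # reaches the canonical base through the deformation network.
    net_input = x_can.detach() if stop_gradient else x_can
    return x_can + deform_net(net_input, t)


if __name__ == "__main__":
    torch.manual_seed(0)
    x_can = torch.randn(5, 3, requires_grad=True)
    net = TinyDeformNet()
    deform_positions(x_can, 0.5, net, iteration=5000, warm_up=3000).sum().backward()
    print(x_can.grad)  # identity-path gradient only
```

In this sketch the canonical positions still receive the loss gradient through the additive identity path, while the deformation network's parameters are trained as usual. With stop_gradient=False, the canonical gradient would additionally pick up a term routed through the deformation network, which is the entanglement described above.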

Installation

The provided environment uses Python 3.10 and PyTorch with CUDA 12.1. Other CUDA/PyTorch versions may also work, but may require editing environment.yml.

Clone this repository:

git clone https://github.com/I2-Multimedia-Lab/GrainGS.git
cd GrainGS

Create and activate the conda environment:

conda env create -f environment.yml
conda activate GrainGS

Install the local CUDA extensions:

pip install submodules/diff-gaussian-rasterization
pip install submodules/simple-knn

If extension compilation fails, check that the CUDA toolkit, PyTorch CUDA version, and compiler version are compatible.

Data

Create a data/ folder under the project root:

mkdir -p data

The default dynamic dataset root used by the scripts is:

data/
+-- DynamicScene/
    +-- D-Nerf/
    +-- DG-Mesh/

D-NeRF

D-NeRF scenes should follow the Blender/NeRF synthetic format:

data/
+-- DynamicScene/
    +-- D-Nerf/
        +-- bouncingballs/
        |   +-- train/
        |   |   +-- r_000.png
        |   |   +-- r_001.png
        |   |   +-- ...
        |   +-- test/
        |   |   +-- r_000.png
        |   |   +-- r_001.png
        |   |   +-- ...
        |   +-- transforms_train.json
        |   +-- transforms_test.json
        |   +-- transforms_val.json
        |   +-- points3d.ply
        +-- hellwarrior/
        +-- hook/
        +-- jumpingjacks/
        +-- lego/
        +-- mutant/
        +-- standup/
        +-- trex/

For Blender-style data, pass --is_blender --white_background. The loader reads timestamps from each frame's time field when present; otherwise it assigns normalized time by frame order.
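The timestamp rule described above can be sketched in a few lines of Python. The function name frame_times and the exact normalization (index divided by the number of frames minus one, giving values in [0, 1]) are assumptions; the actual loader in scene/dynamic_dataset_loader.py may normalize differently.

```python
def frame_times(frames):
    """Return one timestamp per frame entry of a transforms_*.json file.

    Uses the frame's own "time" field when present; otherwise falls back to
    a normalized time derived from the frame order (assumed: i / (n - 1)).
    """
    n = len(frames)
    times = []
    for i, frame in enumerate(frames):
        if "time" in frame:
            times.append(float(frame["time"]))
        else:
            # Single-frame scenes get time 0.0 to avoid dividing by zero.
            times.append(i / (n - 1) if n > 1 else 0.0)
    return times


if __name__ == "__main__":
    print(frame_times([{"time": 0.0}, {"time": 0.25}]))
    print(frame_times([{}, {}, {}]))
```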

DG-Mesh / COLMAP-Style Dynamic Scenes

The training scripts expect DG-Mesh scenes under:

data/
+-- DynamicScene/
    +-- DG-Mesh/
        +-- beagle/
        +-- bird/
        +-- duck/
        +-- girlwalk/
        +-- horse/
        +-- torus2sphere/

For generic COLMAP-style dynamic scenes, the loader recognizes this structure:

scene_name/
+-- images/
|   +-- 00000.png
|   +-- 00001.png
|   +-- ...
+-- sparse/
    +-- 0/
        +-- cameras.bin
        +-- images.bin
        +-- points3D.bin

Text COLMAP files are also supported:

sparse/0/
+-- cameras.txt
+-- images.txt
+-- points3D.txt

If points3D.ply does not exist, it is generated from COLMAP points the first time the scene is loaded.

Nerfies / HyperNeRF

The dynamic loader also supports Nerfies/HyperNeRF-style data when the scene contains:

scene_name/
+-- scene.json
+-- metadata.json
+-- dataset.json
+-- points.npy
+-- camera/
+-- rgb/
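The three layouts above can be told apart by the files they contain. The following sketch shows one way a loader could dispatch on them; detect_layout is a hypothetical helper, and the order of the checks is an assumption rather than the logic used by the repository's loader.

```python
from pathlib import Path


def detect_layout(scene_dir):
    """Guess the dataset layout of a scene folder from its marker files."""
    root = Path(scene_dir)
    if (root / "transforms_train.json").is_file():
        return "blender"   # D-NeRF / Blender synthetic format
    if (root / "dataset.json").is_file() and (root / "metadata.json").is_file():
        return "nerfies"   # Nerfies / HyperNeRF format
    if (root / "sparse").is_dir():
        return "colmap"    # COLMAP reconstruction (binary or text)
    return "unknown"


if __name__ == "__main__":
    print(detect_layout("data/DynamicScene/D-Nerf/bouncingballs"))
```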

Training

Train One Dynamic Scene

Use train_single.sh for common dynamic datasets:

bash train_single.sh dnerf bouncingballs
bash train_single.sh dgmesh beagle

This script writes logs and checkpoints to:

outputs/<dataset_type>/<scene_name>/

For example:

outputs/dnerf/bouncingballs/
+-- outputs.log
+-- cfg_args
+-- cameras.json
+-- input.ply
+-- results.json
+-- point_cloud/

Train D-NeRF Directly

python train_dynamic.py \
    -s data/DynamicScene/D-Nerf/bouncingballs \
    -m outputs/dnerf/bouncingballs \
    --is_blender \
    --white_background \
    --eval \
    --iterations 30000 \
    --save_iterations 7000 15000 30000 \
    --warm_up 3000

Train DG-Mesh / COLMAP Directly

python train_dynamic.py \
    -s data/DynamicScene/DG-Mesh/beagle \
    -m outputs/dgmesh/beagle \
    --eval \
    --iterations 30000 \
    --save_iterations 7000 15000 30000 \
    --warm_up 3000

Train All Scenes

bash scripts/train_all_dnerf.sh
bash scripts/train_all_dgmesh.sh

The all-scene scripts use the dataset lists defined inside each script.
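If a custom scene list is needed, the two direct commands shown earlier can also be assembled programmatically. This is a sketch only: build_train_command is a hypothetical helper that mirrors the flags from those commands, and it is not part of the repository or a description of what the shell scripts do internally.

```python
def build_train_command(dataset, scene, iterations=30000, warm_up=3000):
    """Build the train_dynamic.py argument list for one scene."""
    roots = {"dnerf": "D-Nerf", "dgmesh": "DG-Mesh"}
    cmd = [
        "python", "train_dynamic.py",
        "-s", f"data/DynamicScene/{roots[dataset]}/{scene}",
        "-m", f"outputs/{dataset}/{scene}",
    ]
    if dataset == "dnerf":
        # Blender-style data needs these two flags (see the D-NeRF section).
        cmd += ["--is_blender", "--white_background"]
    cmd += [
        "--eval",
        "--iterations", str(iterations),
        "--save_iterations", "7000", "15000", "30000",
        "--warm_up", str(warm_up),
    ]
    return cmd


if __name__ == "__main__":
    for scene in ["bouncingballs", "hook"]:
        # Pass the list to subprocess.run(cmd, check=True) to launch training.
        print(" ".join(build_train_command("dnerf", scene)))
```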

Quick Test

bash test_dynamic_train.sh

This runs a shorter D-NeRF training job on bouncingballs and writes results to outputs/test_bouncingballs.

Evaluation

Training includes periodic rendering and metric evaluation when --eval is enabled. The best PSNR, SSIM, and LPIPS values are saved in results.json, and detailed logs are saved in outputs.log.

Manual rendering:

python render_dynamic.py \
    -s data/DynamicScene/D-Nerf/bouncingballs \
    -m outputs/dnerf/bouncingballs \
    --iteration -1 \
    --skip_train

Compute metrics after rendering:

python metrics.py -m outputs/dnerf/bouncingballs
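For readers unfamiliar with the first of these metrics, PSNR is conventionally defined as 10·log10(MAX² / MSE). The sketch below computes it over flat lists of pixel values; it assumes intensities in [0, 1] and is not taken from metrics.py, whose exact implementation may differ.

```python
import math


def psnr(rendered, ground_truth, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel lists."""
    if not rendered or len(rendered) != len(ground_truth):
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((a - b) ** 2 for a, b in zip(rendered, ground_truth)) / len(rendered)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    print(psnr([0.0, 0.0], [0.1, 0.1]))
```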

Rendered images are saved under:

outputs/dnerf/bouncingballs/
+-- test/
    +-- ours_<iteration>/
        +-- renders/
        +-- gt/

The FPS reported by the renderer is measured with CUDA synchronization around rendering calls:

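import time
import torch

torch.cuda.synchronize()  # wait for pending GPU work before starting the clock
t_start = time.time()
# rendering
torch.cuda.synchronize()  # wait for the rendering kernels to finish
t_end = time.time()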
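The same pattern can be wrapped in a small helper that turns the elapsed time into frames per second. measure_fps, render_fn, and sync are hypothetical names; on a GPU, sync would typically be torch.cuda.synchronize. This sketch synchronizes once around the whole loop, whereas the renderer may synchronize around each call.

```python
import time


def measure_fps(render_fn, num_frames, sync=None):
    """Time num_frames calls to render_fn(i) and return frames per second."""
    if num_frames <= 0:
        raise ValueError("num_frames must be positive")
    if sync is not None:
        sync()  # flush pending asynchronous work before starting the clock
    t_start = time.perf_counter()
    for i in range(num_frames):
        render_fn(i)
    if sync is not None:
        sync()  # make sure all rendering work has actually finished
    elapsed = max(time.perf_counter() - t_start, 1e-12)  # guard against zero
    return num_frames / elapsed


if __name__ == "__main__":
    # CPU-only demo: a dummy "render" that just sleeps briefly.
    print(measure_fps(lambda i: time.sleep(0.001), 20))
```

Without the synchronization calls, asynchronous GPU launches would return before the work is done and the reported FPS would be inflated.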

Benchmark

Measure FPS and checkpoint storage size without saving rendered images:

bash scripts/benchmark_dynamic_fps_storage.sh \
    data/DynamicScene/D-Nerf/lego \
    outputs/dnerf_7/lego \
    outputs/benchmark_results.csv \
    test \
    5

Batch benchmark scripts:

bash scripts/benchmark_all_dnerf.sh
bash scripts/benchmark_all_dgmesh.sh

Static Scene Training

The repository also keeps static Scaffold-GS style training scripts. For example:

bash train.sh \
    -d blending/drjohnson \
    -l scaffold \
    --gpu 0 \
    --voxel_size 0.001 \
    --update_init_factor 16 \
    --appearance_dim 32 \
    --ratio 1

Static scenes are expected under:

data/
+-- StaticScene/

Common Arguments

Important dynamic training arguments:

  • -s, --source_path: input scene path.
  • -m, --model_path: output model path.
  • --eval: enable train/test split and evaluation.
  • --is_blender: use Blender/D-NeRF settings.
  • --white_background: use white background for Blender synthetic data.
  • --iterations: total training iterations.
  • --save_iterations: checkpoint save iterations.
  • --test_iterations: metric evaluation iterations.
  • --warm_up: canonical warm-up iterations before deformation training.
  • --time_appearance_dim: temporal appearance embedding dimension; set to 0 to disable.
  • --use_canonical_cell: enable the canonical cell architecture.
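To show how these flags combine on one command line, here is an argparse sketch that accepts them. It is not the parser used by train_dynamic.py: the argument types are guesses, treating --use_canonical_cell as a boolean switch is an assumption, and defaults are deliberately left unset because this README does not document them.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the documented dynamic-training flags."""
    p = argparse.ArgumentParser(description="Sketch of the documented flags")
    p.add_argument("-s", "--source_path", required=True)
    p.add_argument("-m", "--model_path", required=True)
    p.add_argument("--eval", action="store_true")
    p.add_argument("--is_blender", action="store_true")
    p.add_argument("--white_background", action="store_true")
    p.add_argument("--iterations", type=int)
    p.add_argument("--save_iterations", type=int, nargs="+")
    p.add_argument("--test_iterations", type=int, nargs="+")
    p.add_argument("--warm_up", type=int)
    p.add_argument("--time_appearance_dim", type=int)       # 0 disables it
    p.add_argument("--use_canonical_cell", action="store_true")  # assumed switch
    return p


if __name__ == "__main__":
    args = build_parser().parse_args([
        "-s", "data/DynamicScene/D-Nerf/bouncingballs",
        "-m", "outputs/dnerf/bouncingballs",
        "--is_blender", "--white_background", "--eval",
        "--iterations", "30000",
        "--save_iterations", "7000", "15000", "30000",
        "--warm_up", "3000",
    ])
    print(args)
```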

License

GrainGS inherits components from 3D Gaussian Splatting and Scaffold-GS, so their original license terms apply where relevant. Please refer to the license files included with the bundled submodules under submodules/.

Acknowledgement

This project is built on top of the excellent work of 3D Gaussian Splatting and Scaffold-GS. The dynamic reconstruction components are inspired by Deformable 3D Gaussian Splatting, 4D Gaussian Splatting, and 4D Scaffold-GS. We thank the authors for releasing their code.
