# romap-code

**Repository Path**: monkeycc/romap-code

## Basic Information

- **Project Name**: romap-code
- **Description**: No description available
- **Primary Language**: Python
- **License**: Not specified
- **Default Branch**: main
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2025-12-03
- **Last Updated**: 2025-12-03

## Categories & Tags

**Categories**: Uncategorized
**Tags**: None

## README

# RoMaP: Robust 3D-Masked Part-level Editing in 3D Gaussian Splatting with Regularized Score Distillation Sampling [ICCV 2025 Paper]

**Hayeon Kim\*, Ji Ha Jang\*, Se Young Chun†**
Seoul National University
\*Equal contribution, †Corresponding author

[📄 Paper (arXiv)](https://arxiv.org/abs/2507.11061) | [📽️ Project Page](https://janeyeon.github.io/romap/) | [🔁 BibTeX](#citation)

## Overview

**RoMaP** is a novel framework for fine-grained **part-level editing of 3D Gaussian Splatting (3DGS)**, enabling edit instructions such as:

> *"Turn his left eye blue and his right eye green"*
> *"Replace the nose with a croissant"*
> *"Give the hair a flame-texture style"*

Unlike existing baselines, which struggle with local edits due to inconsistent 2D segmentations and weak SDS guidance, **RoMaP** combines:

- ✅ Geometry-aware segmentation (`3D-GALP`)
- ✅ Gaussian prior removal and local masking
- ✅ Regularized SDS with `SLaMP` (Scheduled Latent Mixing and Part Editing) image supervision

## Visual Results

### Example 1: Open-Vocabulary Part Segmentation
![Segmentation](assets/segmentation.gif)

### Example 2: Controllable Part Editing
![Editing](assets/editing.gif)

## Key Features

- **3D-GALP**: Robust 3D segmentation based on spherical harmonics-aware label prediction
- **SLaMP editing**: Generates realistic part-edited 2D views to direct SDS in target-driven directions (coming soon)
- **Regularized SDS loss**: Anchored L1 + mask support + strong target control (coming soon)

## Installation

Make sure CUDA ≥ 11.8 is installed; the commands below set up a Python 3.9 environment with PyTorch 2.1.0 (cu118) and the remaining dependencies.

```bash
git clone https://github.com/janeyeon/RoMaP.git
cd RoMaP

conda create -n romap python=3.9
conda activate romap

pip install torch==2.1.0 torchvision==0.16.0 torchaudio==2.1.0 --index-url https://download.pytorch.org/whl/cu118
pip install ninja -U
pip install -r requirements.txt

cd gaussiansplatting/submodules/diff_gaussian_rasterization
pip install -e .
cd ../
git clone https://github.com/camenduru/simple-knn.git
cd simple-knn
pip install -e .
cd ../../../lgm/diff_gaussian_rasterization_lgm
pip install -e .
cd ../../

# pytorch3d
pip install "git+https://github.com/facebookresearch/pytorch3d.git@stable"

# tiny-cuda-nn (Torch bindings)
pip install git+https://github.com/NVlabs/tiny-cuda-nn/#subdirectory=bindings/torch

# nvdiffrast
pip install git+https://github.com/NVlabs/nvdiffrast

# nerfacc (by nerfstudio)
pip install git+https://github.com/nerfstudio-project/nerfacc

# PyTorch Lightning
conda install pytorch-lightning -c conda-forge

# libigl Python bindings
conda install -c conda-forge pyigl

pip install rembg onnxruntime einops trimesh wandb segmentation_refinement tyro roma xformers==0.0.23 imageio[ffmpeg] imageio[pyav] plyfile lightning sentencepiece
```
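After installation, you can optionally run a quick sanity check such as the sketch below. It only verifies that the CUDA build of PyTorch and the compiled extensions import correctly; the module names are assumptions inferred from the install commands above, so adjust them if your builds expose different names.

```python
# Optional sanity check -- not part of the RoMaP codebase.
# Module names are assumptions inferred from the install commands above.
import importlib

import torch

print(f"torch {torch.__version__}, CUDA available: {torch.cuda.is_available()}")

for name in [
    "diff_gaussian_rasterization",  # 3DGS rasterizer built from the submodule
    "tinycudann",                   # tiny-cuda-nn Torch bindings
    "nvdiffrast.torch",             # nvdiffrast
    "nerfacc",                      # nerfacc
    "pytorch3d",                    # PyTorch3D
]:
    try:
        importlib.import_module(name)
        print(f"[ok]      {name}")
    except Exception as exc:  # ImportError or a CUDA/JIT build failure
        print(f"[missing] {name}: {exc}")
```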
### Download Dataset

You can download the datasets from this [link](https://drive.google.com/drive/folders/1V3fHMUGB5y06pa1tnqnOEHGly-P7kjJa?usp=sharing).
This project uses the [NeRF-Art dataset](https://github.com/cassiePython/NeRF-Art), the [3D-OVS dataset](https://github.com/Kunhao-Liu/3D-OVS?tab=readme-ov-file), and a custom 3D Gaussian Splatting dataset created by us.

### For Reconstruction Scene Segmentation

```bash
sh run_recon_nerfart_yanan_seg.sh
```

- `prompt`: The main prompt for segmentation or editing, describing the target object.
- `seg_list`: A list of words specifying the parts to segment. It is recommended to list smaller (more specific) regions first, then larger ones. For compound words, you can include multiple terms in parentheses (e.g., `['sharp','eyes']`).
- `if_segment`: Set to `True` to perform segmentation on the scene.
- `ply_path`: A path (or list of paths) to pretrained PLY files used for initialization or further processing.
- `seg_softmax_list`: Softmax values for fine-grained control. Values between 0.1 and 0.2 yield good results in most cases; consider increasing the value when segmenting larger regions.
- `if_recon`: Set to `True` when working with a reconstruction scene.
- `rot_name`: To apply a custom camera-matrix transformation other than the default `transformation.json`, add a new entry to `rotation_dict` in `threestudio/data/multiview.py` and specify its name here. If unspecified, the default is used.
- `fov`: Explicitly sets the camera's field of view, overriding the default.

### For Generation Scene Segmentation

```bash
sh run_gen_woman_seg.sh
```

- `if_gen`: Set to `True` if the scene is a generation scene.

### For Complex Scene Segmentation (3D-OVS Dataset)

```bash
sh run_recon_3d_ovs_bench_seg.sh
```

- `dataroot`: The folder containing the desired point cloud and the corresponding `transforms.json`.
- `eval_interpolation`: For custom camera-matrix control, specify a list in which the first n-1 numbers indicate the camera-matrix views to interpolate between, and the last number defines how many intervals each view pair is divided into.

## Experimental Results

| Method | CLIP ↑ | CLIP-dir ↑ | B-VQA ↑ | TIFA ↑ |
| :-- | :-- | :-- | :-- | :-- |
| GaussCtrl | 0.182 | 0.044 | 0.190 | 0.432 |
| GaussianEditor | 0.179 | 0.087 | 0.370 | 0.571 |
| DGE | 0.201 | 0.095 | 0.497 | 0.565 |
| **RoMaP (Ours)** | **0.277** | **0.205** | **0.723** | **0.674** |

RoMaP consistently outperforms previous methods across all editing metrics, especially in:

- Part-level segmentation accuracy
- Capacity for drastic edits (e.g., "croissant nose", "jellyfish hair")
- Identity-preserving edits with complex structures

## Citation

```bibtex
@inproceedings{kim2025romap,
  title={Robust 3D-Masked Part-level Editing in 3D Gaussian Splatting with Regularized Score Distillation Sampling},
  author={Kim, Hayeon and Jang, Ji Ha and Chun, Se Young},
  booktitle={International Conference on Computer Vision (ICCV)},
  year={2025}
}
```

## Acknowledgements

We would like to express our gratitude to the developers of [threestudio](https://github.com/threestudio-project/threestudio) and [Rectified flow prior](https://github.com/yangxiaofeng/rectified_flow_prior), as our code is primarily based on these repositories.