SVGS: Single-View to 3D Object Editing via Gaussian Splatting

Pengcheng Xue1, Yan Tian1,2*, Qiutao Song1, Ziyi Wang1, Linyang He3, Weiping Ding4, Mahmoud Hassaballah5,6, Karen Egiazarian7, Weifa Yang8, Leszek Rutkowski9,10

1 Zhejiang Gongshang University
2 Zhejiang Key Laboratory of Big Data and Future E-Commerce Technology
3 Jianpei Technology Co., Ltd
4 Nantong University
5 Prince Sattam Bin Abdulaziz University
6 Qena University
7 Tampere University
8 University of Hong Kong
9 AGH University of Krakow
10 Systems Research Institute of the Polish Academy of Sciences

Abstract

Text-driven 3D editing has gained significant attention due to its convenience and user-friendliness. However, traditional 3D editing methods that rely on mesh and point cloud representations often struggle to describe complex objects accurately. Methods based on implicit 3D representations, such as Neural Radiance Fields (NeRF), can render complex objects effectively but suffer from slow processing and limited control over specific object regions. In addition, existing methods often struggle to maintain consistency and efficiency in multi-view editing. To address these limitations, we propose SVGS, a method that edits objects from text instructions via 3D Gaussian Splatting (GS). Given a single-view image and a text prompt, we first apply a 2D editing strategy based on a cross-domain diffusion model to edit the input image and generate multi-view consistent edited images. We then introduce a GS reconstruction framework that recovers the edited 3D object from these sparse edited images. We compare SVGS with existing baseline methods across a variety of scene settings; the results demonstrate that SVGS excels in both editing capability and processing speed, marking a significant advance in 3D editing technology.

Method

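The method figure did not survive extraction, so as a rough orientation only, below is a minimal sketch of the two-stage pipeline the abstract describes. InstructPix2Pix is used here as a stand-in for the 2D editing step (the paper's actual cross-domain diffusion model is not identified in this excerpt), and generate_multiview and fit_gaussians are hypothetical placeholders for the multi-view synthesis and GS reconstruction stages, not the authors' implementation.

```python
# Sketch of the SVGS two-stage pipeline described in the abstract.
# Stage 1: text-guided 2D editing of the single input view, then lifting
#          the edit to multi-view consistent images via diffusion.
# Stage 2: 3D Gaussian Splatting reconstruction from the edited sparse views.
#
# Assumptions: InstructPix2Pix stands in for the paper's 2D editing strategy;
# generate_multiview() and fit_gaussians() are hypothetical placeholders.

import torch
from PIL import Image
from diffusers import StableDiffusionInstructPix2PixPipeline


def edit_single_view(image: Image.Image, instruction: str) -> Image.Image:
    """Stage 1a: apply the text instruction to the single input view."""
    pipe = StableDiffusionInstructPix2PixPipeline.from_pretrained(
        "timbrooks/instruct-pix2pix", torch_dtype=torch.float16
    ).to("cuda")
    edited = pipe(
        instruction,
        image=image,
        num_inference_steps=20,
        image_guidance_scale=1.5,  # fidelity to the input view
        guidance_scale=7.5,        # strength of the text instruction
    ).images[0]
    return edited


def generate_multiview(edited_view: Image.Image, num_views: int = 6) -> list:
    """Stage 1b (placeholder): a cross-domain diffusion model would
    synthesize num_views multi-view consistent images of the edited object."""
    raise NotImplementedError("plug in a multi-view diffusion model here")


def fit_gaussians(views: list):
    """Stage 2 (placeholder): optimize a 3D Gaussian Splatting scene
    against the edited sparse views to obtain the edited 3D object."""
    raise NotImplementedError("plug in a GS reconstruction framework here")


def svgs_edit(image_path: str, instruction: str):
    """End-to-end: single view + text instruction -> edited 3D Gaussians."""
    view = Image.open(image_path).convert("RGB")
    edited = edit_single_view(view, instruction)
    views = generate_multiview(edited)
    return fit_gaussians(views)
```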

Example

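The example figure is likewise missing from this excerpt. As a purely hypothetical invocation of the sketch above (the file name and instruction are illustrative, not from the paper):

```python
# Edit a single photo of a chair into a wooden one and obtain
# an edited 3D Gaussian representation of the object.
gaussians = svgs_edit("chair.png", "make the chair out of wood")
```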