Text-driven 3D editing has gained significant attention due to its convenience and user-friendliness. However, traditional 3D editing methods that rely on mesh and point-cloud representations often struggle to accurately describe complex objects. In contrast, methods based on implicit 3D representations, such as Neural Radiance Fields (NeRF), can effectively render complex objects but suffer from slow processing speeds and limited control over specific object regions. Additionally, existing methods often struggle to maintain consistency and efficiency in multi-view editing. To address these limitations, we propose SVGS, a method that edits objects from text instructions via 3D Gaussian Splatting (GS). Specifically, given a single-view image and a text prompt, we first employ a 2D editing strategy based on a cross-domain diffusion model to edit the input image and generate multi-view consistent edited images. We then introduce a GS reconstruction framework that recovers the edited 3D object from these sparse edited views. We compare SVGS with existing baseline methods across various scene settings, and the results demonstrate that SVGS excels in both editing capability and processing speed, marking a significant advancement in 3D editing technology.