The automatic design of 3D tooth models plays a crucial role in dental digitization. However, current approaches struggle with compositional 3D tooth generation because both the layouts and the shapes of the missing teeth must be optimized. In addition, collisions are often ignored in 3D Gaussian-based compositional 3D generation, where objects may intersect one another because explicit geometric information about the object surfaces is absent. Motivated by graph generation with diffusion models and collision detection using 3D Gaussians, we propose DM-CFO, an approach for compositional tooth generation in which the layout of the missing teeth is progressively restored during the denoising phase under both text and graph constraints. The Gaussian parameters of each layout-guided tooth and of the entire jaw are then alternately updated using score distillation sampling (SDS). Furthermore, a regularization term based on the distances between the 3D Gaussians of neighboring teeth and those of the anchor tooth is introduced to penalize tooth intersections. Experimental results on three tooth-design datasets demonstrate that our approach significantly improves the multiview consistency and realism of the generated teeth compared with existing methods.
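To make the distance-based regularization concrete, the following is a minimal sketch of one possible form of such a penalty, not the formulation used in DM-CFO. It assumes each 3D Gaussian is approximated by a sphere with a center and a scalar radius, and it applies a squared hinge to every anchor-neighbor pair whose centers are closer than the sum of their radii. The function name `collision_penalty`, the spherical approximation, and the squared-hinge form are all illustrative assumptions.

```python
import numpy as np


def collision_penalty(anchor_means, anchor_radii, neighbor_means, neighbor_radii):
    """Squared-hinge overlap penalty between two sets of Gaussians.

    Hypothetical form: each Gaussian is treated as a sphere (center, radius).
    anchor_means:   (Na, 3) centers of the anchor tooth's Gaussians
    anchor_radii:   (Na,)   per-Gaussian radii of the anchor tooth
    neighbor_means: (Nn, 3) centers of a neighboring tooth's Gaussians
    neighbor_radii: (Nn,)   per-Gaussian radii of the neighboring tooth
    """
    # Pairwise center-to-center distances, shape (Na, Nn).
    diff = anchor_means[:, None, :] - neighbor_means[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)

    # Assumed minimum separation: the sum of the two radii.
    min_sep = anchor_radii[:, None] + neighbor_radii[None, :]

    # Only pairs closer than the minimum separation contribute.
    overlap = np.maximum(min_sep - dist, 0.0)

    # Average the squared overlap over all pairs.
    return float(np.mean(overlap ** 2))
```

The penalty is zero when the two teeth are well separated and grows as their Gaussians interpenetrate. In an actual optimization loop the same expression would be written in a differentiable framework and added to the SDS objective with a weighting coefficient.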
Table 2. Quantitative comparison across three dental datasets. Our method consistently achieves the best performance in layout accuracy, collision rate, and reconstruction fidelity.
Table 3. Inference time (in minutes) for different methods. Our method takes 4.8 minutes, slightly longer than ComboVerse (4.4) and GALA3D (4.2), while improving reconstruction fidelity.
| Method | MVDream | ControlNet | GALA3D | ComboVerse | Ours |
|---|---|---|---|---|---|
| Inference Time (minutes) | 2.5 | 3.5 | 4.2 | 4.4 | 4.8 |
Training loss curves in the ablation studies. The proposed components yield faster convergence and more stable optimization.
(a) Input Image
(b) Training Loss Curve
Qualitative comparison of ComboVerse, GALA3D, and our approach on the Shining3D tooth design dataset. Our method better preserves structural details and avoids shape artifacts under complex occlusions and missing regions.
The layout is progressively optimized as training iterations increase. The figure shows how our method refines the placement of the missing teeth step by step, guided by both textual and structural constraints.
Visualization of initial tooth layouts generated by different algorithms across various datasets. Our method demonstrates stronger geometric coherence and fewer overlaps.