DM-CFO: A Diffusion Model for Compositional 3D Tooth Generation with Collision-Free Optimization

Yan Tian
Zhejiang Gongshang University
Pengcheng Xue
Zhejiang Gongshang University
Weiping Ding
Nantong University
Mahmoud Hassaballah
Qena University
Karen Egiazarian
Tampere University
Aura Conci
Universidade Federal Fluminense
Abdulkadir Sengur
Firat University
Leszek Rutkowski
Poland and AGH University of Krakow

Abstract

The automatic design of a 3D tooth model plays a crucial role in dental digitization. However, current approaches face challenges in compositional 3D tooth generation because both the layouts and shapes of missing teeth need to be optimized. In addition, collision conflicts are often omitted in 3D Gaussian–based compositional 3D generation, where objects may intersect with each other due to the absence of explicit geometric information on the object surfaces. Motivated by graph generation through diffusion models and collision detection using 3D Gaussians, we propose an approach named DM-CFO for compositional tooth generation, where the layout of missing teeth is progressively restored during the denoising phase under both text and graph constraints. Then, the Gaussian parameters of each layout-guided tooth and the entire jaw are alternately updated using score distillation sampling (SDS). Furthermore, a regularization term based on the distances between the 3D Gaussians of neighboring teeth and the anchor tooth is introduced to penalize tooth intersections. Experimental results on three tooth-design datasets demonstrate that our approach significantly improves the multiview consistency and realism of the generated teeth compared with existing methods.

Method

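The abstract describes a regularization term, based on the distances between the 3D Gaussians of neighboring teeth and the anchor tooth, that penalizes tooth intersections. This page does not give the exact formulation, so the following is only a minimal sketch of one way such a term could look. It assumes a hinge-style penalty on distances between Gaussian centers. The function name `collision_penalty`, the `margin` parameter, and the choice to ignore Gaussian covariances and opacities are all assumptions made for illustration, not the authors' implementation.

```python
import numpy as np


def collision_penalty(anchor_centers, neighbor_centers, margin=0.5):
    """Hinge penalty on pairwise distances between Gaussian centers.

    anchor_centers:   (N, 3) centers of the anchor tooth's Gaussians.
    neighbor_centers: (M, 3) centers of one neighboring tooth's Gaussians.
    margin:           hypothetical minimum allowed center-to-center distance.
    """
    # Pairwise Euclidean distances, shape (N, M), via broadcasting.
    diff = anchor_centers[:, None, :] - neighbor_centers[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    # Only pairs closer than the margin contribute; separated pairs give 0.
    violation = np.maximum(0.0, margin - dist)
    return float(np.mean(violation ** 2))


# Two toy "teeth": small clouds of Gaussian centers (synthetic data).
rng = np.random.default_rng(0)
anchor = rng.normal(loc=[0.0, 0.0, 0.0], scale=0.1, size=(64, 3))
overlapping = rng.normal(loc=[0.1, 0.0, 0.0], scale=0.1, size=(64, 3))
separated = rng.normal(loc=[5.0, 0.0, 0.0], scale=0.1, size=(64, 3))

penalty_overlap = collision_penalty(anchor, overlapping)  # positive
penalty_apart = collision_penalty(anchor, separated)      # zero
```

In this sketch the penalty is zero once every pair of centers is at least `margin` apart, so it would only influence optimization while teeth are close enough to intersect. A term of this kind would be added to the main objective with a weighting coefficient, which is likewise not specified here.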

Quantitative Comparison

Table 2. Quantitative comparison across three dental datasets. Our method achieves the best layout accuracy, collision rate, and reconstruction fidelity on all three.

Inference Time Comparison

Table 3. Inference time (in minutes) for different methods. Our method takes 4.8 minutes, within 0.4 minutes of ComboVerse and 0.6 minutes of GALA3D, while improving reconstruction fidelity.

Methods                    MVDream   ControlNet   GALA3D   ComboVerse   Ours
Inference Time (minutes)   2.5       3.5          4.2      4.4          4.8

Ablation Study

Training loss curves from the ablation studies. The proposed components yield faster convergence and more stable optimization.

Figure 5: Ablation study visualization. (a) Tooth Number; (b) Training loss curve demonstrating improved convergence with proposed components.

Qualitative Comparison on Shining3D

We compare ComboVerse, GALA3D, and our approach qualitatively on the Shining3D tooth design dataset. Our method better preserves structural details and avoids shape artifacts under complex occlusions and missing regions.

Figure 7: Qualitative comparison between ComboVerse, GALA3D, and our method on the Shining3D tooth design dataset.

Layout Evolution During Training

The layout is progressively optimized as training iterations increase. This figure illustrates how our method refines the placement of missing teeth step by step, guided by both textual and structural constraints.

Figure 10: Layout refinement process across training iterations guided by both textual prompts and structural priors.

Layout Initialization Results

Visualization of initial tooth layouts generated by different algorithms across various datasets. Our method demonstrates stronger geometric coherence and fewer overlaps.

Figure 11: Comparison of initial tooth layouts across datasets. Our approach achieves better geometric alignment and avoids layout overlap.