My3DGen: A Scalable Personalized 3D Generative Model

1The University of North Carolina at Chapel Hill, 2The University of Maryland

My3DGen personalizes a large pretrained 3D generative model (EG3D) using only a few selfies (~50).


In recent years, generative 3D face models (e.g., EG3D) have been developed to synthesize photo-realistic faces. However, these models often fail to capture facial features unique to each individual, highlighting the importance of personalization. Some prior works have shown promise in personalizing generative face models, but they focus primarily on 2D settings. Moreover, these methods require fine-tuning and storing a large number of parameters for each user, hindering scalable personalization. Another challenge of personalization is the limited number of training images available per individual, which often leads to overfitting under full fine-tuning.

Our proposed approach, My3DGen, generates a personalized 3D prior of an individual using as few as 50 training images. My3DGen allows for novel view synthesis, semantic editing of a given face (e.g. adding a smile), and synthesizing novel appearances, all while preserving the original person's identity. We decouple the 3D facial features into global features and personalized features by freezing the pre-trained EG3D and training additional personalized weights through low-rank decomposition. As a result, My3DGen introduces only $\textbf{240K}$ personalized parameters per individual, leading to a $\textbf{127}\times$ reduction in trainable parameters compared to the $\textbf{30.6M}$ required for fine-tuning the entire parameter space. Despite this significant reduction in storage, our model preserves identity features without compromising the quality of downstream applications.
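The parameter savings come from the low-rank decomposition: instead of updating a frozen pretrained weight matrix directly, only two small factor matrices are trained per user. The sketch below illustrates the idea in NumPy (EG3D itself is a PyTorch model; the rank, scale, and layer size here are illustrative, not the paper's exact configuration):

```python
import numpy as np

def lora_update(W_frozen, A, B, scale=1.0):
    """Personalized weight W' = W_frozen + scale * (B @ A).

    W_frozen: (out, in) pretrained weight, kept frozen and shared by all users.
    A: (r, in), B: (out, r) -- the only per-user trainable parameters.
    """
    return W_frozen + scale * (B @ A)

rng = np.random.default_rng(0)
out_dim, in_dim, r = 512, 512, 1          # illustrative sizes, rank r << dim
W = rng.standard_normal((out_dim, in_dim))
A = rng.standard_normal((r, in_dim)) * 0.01
B = np.zeros((out_dim, r))                # B starts at zero, so W' == W before training

W_personal = lora_update(W, A, B)

# Per-layer trainable parameters: r*(in+out) for the factors vs. in*out for full fine-tuning.
print(r * (in_dim + out_dim), "vs", in_dim * out_dim)  # 1024 vs 262144
```

Summed over the personalized layers of EG3D, such low-rank factors account for the ~240K per-user parameters, while the 30.6M pretrained weights remain shared.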


Reconstruction comparison between pretrained (EG3D-PTI), full fine-tuning, and our method. Full fine-tuning requires 31M trainable parameters per identity, while our method requires only ~0.2M trainable parameters per identity.


Interpolating between anchor pairs. Interpolation is performed between two anchors on the extreme left and right column (shown in blue).

Downstream Applications

Animation - Interpolation

Interpolating between two latent codes. We can also animate novel appearances (expression, hairstyle, etc.) by interpolating the latent codes of two input anchors. Use the slider here to linearly interpolate between the left and right anchors (including the pose).


Left anchor


Right anchor
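The slider above corresponds to simple linear interpolation in latent space. A minimal sketch (the latent dimensionality and the number of slider steps are illustrative assumptions):

```python
import numpy as np

def lerp(w_left, w_right, t):
    """Linearly interpolate between two anchor latent codes.
    t = 0 returns the left anchor; t = 1 returns the right anchor."""
    return (1.0 - t) * w_left + t * w_right

rng = np.random.default_rng(0)
w_left = rng.standard_normal(512)    # hypothetical latent dimension
w_right = rng.standard_normal(512)

# Sweep the "slider" over five positions; each latent would be decoded
# by the personalized generator into one animation frame.
frames = [lerp(w_left, w_right, t) for t in np.linspace(0.0, 1.0, 5)]
```

Because both endpoints are anchors of the same identity, every intermediate latent decodes to a plausible face of that person.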

Image Synthesis

Sampling latent codes from $\alpha$-space. You can synthesize unlimited novel appearances without changing the model weights. Uncurated samples are shown below.

Kamala Harris
Barack Obama
Scarlett Johansson
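One simple way to sample from an anchor-defined $\alpha$-space is to draw convex combinations of the anchor latent codes, which keeps samples inside the region spanned by the training identity. This is a hedged sketch of that idea (a Dirichlet draw is one convenient way to get non-negative weights summing to one; the exact sampling scheme in the paper may differ):

```python
import numpy as np

def sample_alpha_space(anchors, rng):
    """Sample a novel latent as a convex combination of anchor codes.

    alpha >= 0 and sum(alpha) == 1, so the sample stays within the
    convex hull of the personalized anchors.
    """
    alpha = rng.dirichlet(np.ones(len(anchors)))  # convex weights
    return alpha @ anchors, alpha

rng = np.random.default_rng(1)
anchors = rng.standard_normal((50, 512))  # ~50 anchors, illustrative latent dim
w_new, alpha = sample_alpha_space(anchors, rng)
```

Each such sample decodes to a new appearance of the same person, with no change to the stored model weights.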

Semantic Editing

With semantic editing, you can manipulate attributes through the latent code without changing the model weights. The input image is shown on the left, followed by the multi-view reconstruction of the edited image. Using the same latent code, our method preserves the identity while editing the attributes.

Adding smile to Dwayne Johnson
Removing smile from Kamala Harris
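Attribute edits like these are typically realized by shifting the latent code along a learned semantic direction (e.g., a "smile" direction found with a method such as InterFaceGAN). The direction and strength below are placeholders, not values from the paper:

```python
import numpy as np

def edit_attribute(w, direction, strength):
    """Shift a latent along a unit attribute direction.
    Positive strength adds the attribute (e.g., a smile); negative removes it."""
    d = direction / np.linalg.norm(direction)
    return w + strength * d

rng = np.random.default_rng(2)
w = rng.standard_normal(512)           # illustrative latent dimension
smile_dir = rng.standard_normal(512)   # placeholder for a learned smile direction

w_smile = edit_attribute(w, smile_dir, +3.0)   # add smile
w_neutral = edit_attribute(w_smile, smile_dir, -3.0)  # undo the edit
```

Because only the latent code moves, the personalized generator weights are untouched, and identity is preserved across edits.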


@article{my3dgen,
        title={My3DGen: A Scalable Personalized 3D Generative Model},
        author={Luchao Qi and Jiaye Wu and Annie N. Wang and Shengze Wang and Roni Sengupta},
}