
StyleGAN2-ADA Pipeline for Image Projection

This guide provides a step-by-step explanation of how to align a face image, project it into the latent space of StyleGAN2-ADA, and visualize the results.

Requirements

Dependencies

  • Python 3.7+
  • PyTorch
  • Required libraries installed via requirements.txt in the repository
  • Kaggle environment with internet enabled
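Before running the pipeline, it can save time to sanity-check the environment. A minimal sketch using only the standard library; the module names checked below are assumptions based on the dependency list, so adjust them to match requirements.txt:

```python
import importlib.util
import sys

def check_env(required):
    """Return the subset of `required` module names that cannot be imported."""
    return [name for name in required if importlib.util.find_spec(name) is None]

# Module names here are assumptions based on the dependency list above.
missing = check_env(["torch", "numpy", "PIL"])
if sys.version_info < (3, 7):
    print("Python 3.7+ is required")
if missing:
    print("Missing modules:", missing)
```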

Models and Methods Used

  • Face Alignment: align_images.py uses dlib's shape_predictor_68_face_landmarks.dat model for precise facial alignment.
  • Image Projection: projector.py projects an aligned image into the latent space of StyleGAN2-ADA using pre-trained weights (ffhq.pkl from NVIDIA Labs).
  • Pre-trained Models:
    • Face landmark model: shape_predictor_68_face_landmarks.dat
    • StyleGAN2-ADA pre-trained weights: ffhq.pkl
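align_images.py expects the dlib landmark model to be available locally; depending on the fork, it may fetch the file itself. If you need to download it manually, here is a stdlib-only sketch (the dlib.net URL is the model's usual home, and the helper names are illustrative):

```python
import bz2
import urllib.request
from pathlib import Path

# The dlib landmark model is distributed bz2-compressed.
LANDMARKS_URL = "http://dlib.net/files/shape_predictor_68_face_landmarks.dat.bz2"

def local_name(url):
    """Derive the decompressed local filename from the download URL."""
    name = url.rsplit("/", 1)[-1]
    return name[:-4] if name.endswith(".bz2") else name

def fetch_landmark_model(url=LANDMARKS_URL, dest_dir="."):
    """Download and decompress the model if it is not already present."""
    dest = Path(dest_dir) / local_name(url)
    if not dest.exists():
        compressed = urllib.request.urlopen(url).read()
        dest.write_bytes(bz2.decompress(compressed))
    return dest
```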

Step-by-Step Execution

1. Clone the Repository

Clone the repository for StyleGAN2-ADA:

!git clone https://github.com/rkuo2000/stylegan2-ada-pytorch.git
%cd stylegan2-ada-pytorch

2. Prepare the Raw Images

Create a directory for raw images and copy the desired file:

!mkdir -p raw
!cp /kaggle/input/test-notebook-images/profile-image.jpg raw/example.jpg
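The same staging can be done from Python, which is handy when images come from somewhere other than a Kaggle dataset (the function name and paths are illustrative):

```python
import shutil
from pathlib import Path

def stage_raw_image(src, raw_dir="raw", name=None):
    """Copy a source image into the raw/ directory, creating it if needed."""
    raw = Path(raw_dir)
    raw.mkdir(parents=True, exist_ok=True)
    dest = raw / (name or Path(src).name)
    shutil.copy(src, dest)
    return dest
```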

Verify the file:

!ls raw

3. Align the Face Image

Run the face alignment script:

!python align_images.py raw aligned

  • Input: raw/example.jpg
  • Output: Aligned image saved as aligned/example_01.png
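The alignment script names each output after the source file plus a two-digit face index, which is how raw/example.jpg becomes aligned/example_01.png. A small helper that predicts the aligned filename (the naming convention is assumed from this fork's output; verify against your copy of align_images.py):

```python
from pathlib import Path

def aligned_name(raw_name, face_index=1):
    """Predict the aligned output filename for a raw image.

    Assumes the `<stem>_<NN>.png` convention used by align_images.py.
    """
    return f"{Path(raw_name).stem}_{face_index:02d}.png"

# aligned_name("example.jpg") -> "example_01.png"
```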

4. Verify Alignment

List the aligned directory to confirm output:

!ls aligned

5. Project the Image into Latent Space

Run the projection script:

!python projector.py --outdir=out --target=aligned/example_01.png \
    --network=https://nvlabs-fi-cdn.nvidia.com/stylegan2-ada-pytorch/pretrained/ffhq.pkl

  • Output:
    • Latent space projection results saved in the out/ directory
    • A video (proj.mp4) showing optimization progress
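Once projection finishes, you can check that the expected artifacts are present before moving on. A sketch; proj.png and projected_w.npz are assumptions about what this projector writes alongside the video, so confirm the names against your own out/ listing:

```python
from pathlib import Path

def missing_outputs(outdir="out", expected=("proj.mp4", "proj.png", "projected_w.npz")):
    """Return the expected output files that are absent from outdir."""
    out = Path(outdir)
    return [name for name in expected if not (out / name).exists()]
```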

Viewing Results

1. Inline Video Playback

Use the following command to view the progress video inline:

from IPython.display import Video
Video('out/proj.mp4', embed=True)

2. Download the Video

To download the video file, use:

from IPython.display import FileLink
FileLink('out/proj.mp4')

Click the generated link to download proj.mp4 to your local machine.


Adding Gradio for Runtime Image Upload

You can integrate Gradio to let users upload a photo and generate the GAN output (image and video) at runtime. Here is how to modify the pipeline:

Install Gradio

!pip install gradio

Update the Code

Add the following Python script to create a Gradio interface:

import gradio as gr
import os
import subprocess
from PIL import Image

def process_image(input_image):
    # Make sure the working directories exist, then save the upload
    os.makedirs("raw", exist_ok=True)
    os.makedirs("aligned", exist_ok=True)
    input_path = "raw/input_image.jpg"
    input_image.save(input_path)

    # Align the face (check=True raises if the script fails)
    subprocess.run(["python", "align_images.py", "raw", "aligned"], check=True)

    # Run projection against the pre-trained FFHQ network
    subprocess.run([
        "python", "projector.py", "--outdir=out", "--target=aligned/input_image_01.png",
        "--network=https://nvlabs-fi-cdn.nvidia.com/stylegan2-ada-pytorch/pretrained/ffhq.pkl"
    ], check=True)

    # Load the generated image and return it with the video path
    output_image_path = "out/proj.png"  # Adjust if necessary
    output_video_path = "out/proj.mp4"
    output_image = Image.open(output_image_path)

    return output_image, output_video_path

# Gradio Interface
demo = gr.Interface(
    fn=process_image,
    inputs=[gr.Image(type="pil", label="Upload an Image")],
    outputs=[
        gr.Image(type="pil", label="Generated Image"),
        gr.Video(label="Projection Video")
    ],
    title="StyleGAN2-ADA Image Projection",
    description="Upload a face image to generate GAN output and projection video."
)

demo.launch()

Running the Gradio Interface

Save the above script and run it in your environment. A Gradio web interface will open, allowing users to upload images and view the generated results. Note that projection runs a lengthy optimization loop, so each upload can take several minutes even on a GPU.
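Because the projection subprocess is the slow, failure-prone step, it can help to assemble and inspect its command line separately before launching it. A sketch whose flags mirror the projector.py invocation used earlier in this guide:

```python
FFHQ_PKL = ("https://nvlabs-fi-cdn.nvidia.com/stylegan2-ada-pytorch/"
            "pretrained/ffhq.pkl")

def projector_command(target, outdir="out", network=FFHQ_PKL):
    """Assemble the argv list for a projector.py run."""
    return [
        "python", "projector.py",
        f"--outdir={outdir}",
        f"--target={target}",
        f"--network={network}",
    ]

# e.g. subprocess.run(projector_command("aligned/input_image_01.png"), check=True)
```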


Notes

  1. Ensure internet access is enabled in your Kaggle notebook so the required models can be downloaded.
  2. Verify that input paths match your dataset and file structure.
  3. Outputs are saved in the following structure:
    • raw/: Original images
    • aligned/: Aligned face images
    • out/: Projection results and video
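The layout above can be created up front so the alignment and projection scripts never fail on a missing directory. A convenience sketch (the helper name is illustrative):

```python
from pathlib import Path

def make_workdirs(base=".", names=("raw", "aligned", "out")):
    """Create the pipeline's working directories and return their paths."""
    dirs = [Path(base) / n for n in names]
    for d in dirs:
        d.mkdir(parents=True, exist_ok=True)
    return dirs
```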
