# StyleGAN2-ADA Pipeline for Image Projection
This guide provides a step-by-step explanation of how to align a face image, project it into the latent space of StyleGAN2-ADA, and visualize the results.
## Requirements

### Dependencies

- Python 3.7+
- PyTorch
- Required libraries installed via `requirements.txt` in the repository (see the install snippet after this list)
- Kaggle environment with internet enabled
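Assuming you run this after cloning the repository (step 1 below), the dependencies can be installed in a single notebook cell:

```bash
# Install the repo's Python dependencies (run from inside the cloned repo)
!pip install -r requirements.txt
```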
### Models and Methods Used

- **Face Alignment:** `align_images.py` uses the `shape_predictor_68_face_landmarks.dat` model from DLib for precise facial alignment.
- **Image Projection:** `projector.py` projects an aligned image into the latent space of StyleGAN2 using a pre-trained model (`ffhq.pkl` from NVIDIA Labs).
- **Pre-trained Models:**
  - Face landmark model: `shape_predictor_68_face_landmarks.dat` (download sketch after this list)
  - StyleGAN2-ADA pre-trained weights: `ffhq.pkl`
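If `align_images.py` does not fetch the landmark model on its own, you can download it from DLib's site. Whether the script auto-downloads the model, and where it expects a local copy, is an assumption here, so check the script before relying on this:

```bash
# Fetch and unpack DLib's 68-point landmark model (assumed manual step;
# skip if align_images.py downloads it automatically)
!wget -q http://dlib.net/files/shape_predictor_68_face_landmarks.dat.bz2
!bzip2 -d shape_predictor_68_face_landmarks.dat.bz2
```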
## Step-by-Step Execution

### 1. Clone the Repository

Clone the repository for StyleGAN2-ADA:

```bash
!git clone https://github.com/rkuo2000/stylegan2-ada-pytorch.git
%cd stylegan2-ada-pytorch
```
### 2. Prepare the Raw Images

Create a directory for raw images and copy the desired file:

```bash
!mkdir -p raw
!cp /kaggle/input/test-notebook-images/profile-image.jpg raw/example.jpg
```

Verify the file:

```bash
!ls raw
```
### 3. Align the Face Image

Run the face alignment script:

```bash
!python align_images.py raw aligned
```

- Input: `raw/example.jpg`
- Output: aligned image saved as `aligned/example_01.png`
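Conceptually, the alignment step detects 68 facial landmarks with DLib and warps the face to a canonical crop. The following is a minimal sketch of the landmark-detection part, not the repository's exact code; it assumes `shape_predictor_68_face_landmarks.dat` is present in the working directory:

```python
import dlib

# Detect faces and their 68 landmarks; align_images.py derives its
# canonical crop from landmarks like these (illustrative sketch only).
detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")

img = dlib.load_rgb_image("raw/example.jpg")
faces = detector(img, 1)  # upsample once to help find small faces
for face in faces:
    landmarks = predictor(img, face)
    points = [(landmarks.part(i).x, landmarks.part(i).y) for i in range(68)]
    print(f"Face at {face}: {len(points)} landmarks")
```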
### 4. Verify Alignment

List the aligned directory to confirm the output:

```bash
!ls aligned
```
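You can also preview the aligned crop inline to confirm the face was detected and centered:

```python
from IPython.display import Image, display

# Display the aligned crop inline in the notebook
display(Image(filename="aligned/example_01.png"))
```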
### 5. Project the Image into Latent Space

Run the projection script:

```bash
!python projector.py --outdir=out --target=aligned/example_01.png \
    --network=https://nvlabs-fi-cdn.nvidia.com/stylegan2-ada-pytorch/pretrained/ffhq.pkl
```

- Output:
  - Latent space projection results saved in the `out/` directory
  - A video (`proj.mp4`) showing optimization progress
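In NVIDIA's reference implementation, `projector.py` also saves the optimized latent as `out/projected_w.npz`. Assuming this fork behaves the same, you can reload that latent and re-synthesize the image without repeating the optimization. Run this from inside the cloned repo so `dnnlib` and `legacy` are importable; `out/regenerated.png` is just an illustrative output name:

```python
import numpy as np
import PIL.Image
import torch

import dnnlib
import legacy

device = torch.device('cuda')
network_pkl = 'https://nvlabs-fi-cdn.nvidia.com/stylegan2-ada-pytorch/pretrained/ffhq.pkl'
with dnnlib.util.open_url(network_pkl) as f:
    G = legacy.load_network_pkl(f)['G_ema'].to(device)  # trained generator

# Load the projected W+ latent and synthesize the corresponding image
ws = torch.tensor(np.load('out/projected_w.npz')['w'], device=device)
img = G.synthesis(ws, noise_mode='const')
img = (img.permute(0, 2, 3, 1) * 127.5 + 128).clamp(0, 255).to(torch.uint8)
PIL.Image.fromarray(img[0].cpu().numpy(), 'RGB').save('out/regenerated.png')
```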
## Viewing Results

### 1. Inline Video Playback

Use the following to view the progress video inline:

```python
from IPython.display import Video

Video('out/proj.mp4', embed=True)
```
### 2. Download the Video

To download the video file, use:

```python
from IPython.display import FileLink

FileLink('out/proj.mp4')
```

Click the generated link to download `proj.mp4` to your local machine.
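Besides the video, NVIDIA's reference `projector.py` writes the final projection as `out/proj.png` and the preprocessed target as `out/target.png`; assuming this fork does the same, you can compare them side by side:

```python
from IPython.display import display
from PIL import Image

# Paste target and projection onto one canvas for a side-by-side view
# (filenames follow NVIDIA's reference projector.py; adjust if yours differ)
target = Image.open("out/target.png")
proj = Image.open("out/proj.png")
combined = Image.new("RGB", (target.width + proj.width, target.height))
combined.paste(target, (0, 0))
combined.paste(proj, (target.width, 0))
display(combined)
```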
## Adding Gradio for Runtime Image Upload

You can integrate Gradio to let users upload a photo and generate the GAN output (image and video) at runtime. Here is how to modify the pipeline:

### Install Gradio

```bash
!pip install gradio
```
### Update the Code

Add the following Python script to create a Gradio interface:
```python
import gradio as gr
import subprocess
from PIL import Image

def process_image(input_image):
    # Save the uploaded image into the raw directory
    # (convert to RGB first, since JPEG cannot store an alpha channel)
    input_path = "raw/input_image.jpg"
    input_image.convert("RGB").save(input_path)

    # Align the face; check=True surfaces failures instead of continuing silently
    subprocess.run(["python", "align_images.py", "raw", "aligned"], check=True)

    # Project the aligned crop into latent space
    subprocess.run([
        "python", "projector.py", "--outdir=out",
        "--target=aligned/input_image_01.png",
        "--network=https://nvlabs-fi-cdn.nvidia.com/stylegan2-ada-pytorch/pretrained/ffhq.pkl",
    ], check=True)

    # Load the generated image and return it with the progress video path
    output_image_path = "out/proj.png"  # Adjust if necessary
    output_video_path = "out/proj.mp4"
    output_image = Image.open(output_image_path)
    return output_image, output_video_path

# Gradio interface: PIL image in; (image, video) out
demo = gr.Interface(
    fn=process_image,
    inputs=[gr.Image(type="pil", label="Upload an Image")],
    outputs=[
        gr.Image(type="pil", label="Generated Image"),
        gr.Video(label="Projection Video"),
    ],
    title="StyleGAN2-ADA Image Projection",
    description="Upload a face image to generate GAN output and projection video.",
)

demo.launch()
```
### Running the Gradio Interface

Save the above script and run it in your environment. A Gradio web interface will open, allowing users to upload images and view the generated results; note that projection runs a full optimization per image, so output is not instantaneous.
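On hosted environments such as Kaggle, the local Gradio server may not be directly reachable from your browser; Gradio's `share=True` option requests a temporary public link instead:

```python
# Ask Gradio for a temporary public URL (useful on Kaggle/Colab,
# where localhost is not reachable from your browser)
demo.launch(share=True)
```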
## Notes

- Ensure internet access is enabled in your Kaggle notebook so the required models can be downloaded.
- Verify that input paths match your dataset and file structure.
- Outputs are saved in the following structure:
  - `raw/`: Original images
  - `aligned/`: Aligned face images
  - `out/`: Projection results and video
## Acknowledgments

- StyleGAN2-ADA by NVIDIA Labs: [GitHub Repository](https://github.com/NVlabs/stylegan2-ada-pytorch)
- DLib for face alignment: [DLib Library](http://dlib.net/)