1. Generative Adversarial Networks (GANs)
- Concept
- GANs consist of two neural networks, a generator and a discriminator, trained simultaneously in an adversarial process.
- Generator:
- Produces synthetic data resembling the real data.
- Objective: Fool the discriminator into classifying generated data as real.
- Discriminator:
- Distinguishes between real data and generated data.
- Objective: Correctly identify real versus fake data.
- Interaction:
- Both networks compete against each other, improving their performance iteratively.
- The generator learns to create more realistic data, while the discriminator becomes better at detecting fakes (a minimal sketch of the two networks follows this list).
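
A minimal PyTorch sketch of the two networks is below. The MLP layer sizes, the 100-dimensional noise vector, and the flattened 28×28 image shape are illustrative assumptions for this sketch, not details from these notes:

```python
# Minimal generator/discriminator pair. All sizes here (latent dim,
# image shape, hidden widths) are illustrative assumptions.
import torch
import torch.nn as nn

LATENT_DIM = 100          # size of the random noise input (assumed)
IMG_DIM = 28 * 28         # flattened 28x28 grayscale image (assumed)

class Generator(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(LATENT_DIM, 256), nn.ReLU(),
            nn.Linear(256, IMG_DIM), nn.Tanh(),   # outputs in [-1, 1]
        )

    def forward(self, z):
        return self.net(z)

class Discriminator(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(IMG_DIM, 256), nn.LeakyReLU(0.2),
            nn.Linear(256, 1),        # raw logit: real vs. fake
        )

    def forward(self, x):
        return self.net(x)

# The generator maps noise to a synthetic sample; the discriminator
# maps a sample to a real/fake logit.
z = torch.randn(16, LATENT_DIM)
fake = Generator()(z)
logit = Discriminator()(fake)
```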
- Adversarial Training Process
- The generator generates samples from random noise input.
- The discriminator evaluates these samples against real data.
- Loss functions guide each network (a single training step combining both is sketched after this list):
- Generator loss: Trains the generator to maximize the probability that the discriminator classifies generated samples as real.
- Discriminator loss: Trains the discriminator to assign high probability to real data and low probability to generated data.
- Training continues until the generator produces outputs indistinguishable from real data according to the discriminator.
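
The sketch below puts one adversarial training step together, reusing the Generator and Discriminator classes from the sketch above. The Adam settings, batch size, and the standard non-saturating binary cross-entropy losses are common choices assumed here for concreteness:

```python
# One adversarial training step; reuses Generator, Discriminator,
# LATENT_DIM, and IMG_DIM from the earlier sketch.
import torch
import torch.nn as nn

G, D = Generator(), Discriminator()
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

def train_step(real):                       # real: (batch, IMG_DIM)
    batch = real.size(0)
    ones, zeros = torch.ones(batch, 1), torch.zeros(batch, 1)

    # Discriminator update: classify real as 1, generated as 0.
    # detach() stops gradients from flowing into the generator here.
    fake = G(torch.randn(batch, LATENT_DIM)).detach()
    d_loss = bce(D(real), ones) + bce(D(fake), zeros)
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Generator update: fool the discriminator into outputting 1
    # (the common non-saturating generator loss).
    fake = G(torch.randn(batch, LATENT_DIM))
    g_loss = bce(D(fake), ones)
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
    return d_loss.item(), g_loss.item()

# Example with a dummy "real" batch; in practice this comes from a dataset.
d_l, g_l = train_step(torch.randn(16, IMG_DIM).tanh())
```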
- Applications
- Image Synthesis:
- Generating high-quality, realistic images.
- Example: StyleGAN, which generates photorealistic faces of people who do not exist.
- Text-to-Image Generation:
- Generating images from textual descriptions.
- Example: Creating artwork or visual content from natural language descriptions.
- Super-Resolution:
- Enhancing the resolution of low-quality images while preserving details.
- Example: Applications in satellite imagery and medical imaging.
2. Training Challenges
- Stability Issues
- GANs often experience unstable training due to imbalanced progress between the generator and discriminator.
- Mode collapse occurs when the generator produces only a narrow set of outputs, covering a few modes of the data distribution instead of its full diversity.
- A discriminator that overpowers the generator gives it vanishing gradients and halts learning, while a discriminator that is too weak provides uninformative feedback.
- Techniques to Improve GAN Training
- Loss Function Modifications:
- Wasserstein GAN (WGAN): Replaces the standard GAN loss with an approximation of the Wasserstein (earth mover's) distance, yielding smoother gradients and more stable training dynamics.
- Least Squares GAN (LSGAN): Replaces the cross-entropy loss with a least-squares loss, penalizing samples by their distance from the decision boundary and mitigating vanishing gradients.
- Batch Normalization:
- Normalizes layer activations to stabilize gradients and permit higher learning rates.
- Label Smoothing:
- Modifies discriminator labels slightly (e.g., “real” = 0.9 instead of 1.0) to prevent overconfidence.
- Training Tricks:
- Balancing generator and discriminator updates (e.g., training one more frequently than the other when needed).
- Introducing noise or dropout to prevent the discriminator from becoming overly confident (several of these tricks are combined in the sketch below).
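
The sketch below layers three of the tricks above onto the earlier training step: one-sided label smoothing ("real" = 0.9), instance noise on the discriminator's inputs, and extra discriminator updates per generator update. It reuses G, D, the optimizers, and bce from the previous sketch, and all constants are illustrative assumptions:

```python
# Stabilized training step; reuses G, D, opt_g, opt_d, bce, and
# LATENT_DIM from the earlier sketch. Constants are assumptions.
import torch

REAL_LABEL = 0.9        # one-sided label smoothing instead of 1.0
NOISE_STD = 0.05        # small instance noise added to D's inputs
D_STEPS = 2             # train D twice per G step (balancing trick)

def stabilized_step(real):
    batch = real.size(0)
    smooth = torch.full((batch, 1), REAL_LABEL)
    zeros = torch.zeros(batch, 1)

    for _ in range(D_STEPS):
        fake = G(torch.randn(batch, LATENT_DIM)).detach()
        # Instance noise blurs the real/fake boundary so the
        # discriminator cannot become overconfident early on.
        noisy_real = real + NOISE_STD * torch.randn_like(real)
        noisy_fake = fake + NOISE_STD * torch.randn_like(fake)
        d_loss = bce(D(noisy_real), smooth) + bce(D(noisy_fake), zeros)
        opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Generator step is unchanged; its targets are not smoothed.
    fake = G(torch.randn(batch, LATENT_DIM))
    g_loss = bce(D(fake), torch.ones(batch, 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
    return d_loss.item(), g_loss.item()
```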
- Advanced GAN Variants
- Conditional GAN (cGAN):
- Generates data conditioned on specific input labels (e.g., generating images of a specific category; see the conditioning sketch after this list).
- CycleGAN:
- Enables unpaired image-to-image translation (e.g., converting photos to paintings).
- StyleGAN:
- Uses a style-based generator that separates high-level attributes (e.g., pose, identity) from low-level details, giving fine-grained creative control over the generated image's style.
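
As a concrete example of cGAN-style conditioning, the sketch below embeds a class label and concatenates it with the noise vector so the generator can be asked for a specific category. All sizes are illustrative assumptions; in a full cGAN the discriminator is conditioned on the label the same way:

```python
# Minimal cGAN conditioning sketch. Class count, latent dim, and
# image shape are illustrative assumptions.
import torch
import torch.nn as nn

N_CLASSES, LATENT_DIM, IMG_DIM = 10, 100, 28 * 28

class ConditionalGenerator(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(N_CLASSES, N_CLASSES)   # label -> vector
        self.net = nn.Sequential(
            nn.Linear(LATENT_DIM + N_CLASSES, 256), nn.ReLU(),
            nn.Linear(256, IMG_DIM), nn.Tanh(),
        )

    def forward(self, z, labels):
        # Condition generation on the label by concatenating its
        # embedding with the noise vector.
        return self.net(torch.cat([z, self.embed(labels)], dim=1))

# Request a batch of samples, all conditioned on class 3.
g = ConditionalGenerator()
z = torch.randn(8, LATENT_DIM)
samples = g(z, torch.full((8,), 3, dtype=torch.long))
```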