How to Train a LoRA Model for Stable Diffusion

LoRA (Low-Rank Adaptation) models let you train Stable Diffusion on specific styles, characters, or objects without modifying the base model. This process creates a lightweight addon that captures your training data's unique characteristics while preserving the base model's general capabilities.

  1. Prepare your training dataset. Create a folder with 20-50 high-quality images of your subject. Images should be 512x512 pixels, well-lit, and show different angles or variations. Name each file descriptively and avoid blurry, low-resolution, or heavily filtered images. Keep file sizes under 2MB each.
  2. Install Kohya_ss training interface. Download and install Kohya_ss from the official GitHub repository. Run the installation script for your operating system. Launch the web interface by executing 'kohya_gui.bat' on Windows or 'gui.sh' on Linux/Mac. The interface will open at localhost:7860 in your browser.
  3. Configure training parameters. Set your learning rate to 1e-4 for most subjects. Choose 'AdamW8bit' as your optimizer and set batch size to 1. Configure network rank to 32 and network alpha to 16. Set training epochs between 10-20 depending on your dataset size. Enable 'gradient checkpointing' to reduce memory usage.
  4. Set up image preprocessing. Enable 'Caption Extension' and set it to '.txt'. Turn on 'Shuffle Caption' and 'Cache Latents' for faster training. Set resolution to 512x512 and enable 'Bucket Resolution' for varied aspect ratios. Configure 'Min SNR Gamma' to 5.0 to improve training stability.
  5. Start the training process. Load your base Stable Diffusion model in the 'Source Model' field. Point 'Image Folder' to your dataset directory and 'Output Folder' to where you want the LoRA saved. Click 'Start Training' and monitor the loss values in the console. Training typically takes 30-60 minutes depending on your hardware.
  6. Test and refine your LoRA. Load the trained LoRA file into your Stable Diffusion interface using the LoRA weight between 0.7-1.0. Generate test images using your trigger words and evaluate the results. If outputs are too strong or weak, adjust the LoRA weight accordingly. Retrain with modified parameters if necessary.

Related

  • How to Use AI to Transcribe Meetings
  • How to Use AI to Translate Voice in Real Time
  • How to Generate AI Narration for Audiobooks
  • How to Generate AI Narration for YouTube Videos
  • How to Use Adobe Podcast AI to Clean Audio
  • How to Use Descript to Edit Audio with AI