Stable Diffusion Cheat Sheet
Last updated: April 2026
Quick Facts
Pricing
Open-source and free to run locally. Paid API access via Stability AI starts at $0.002 per image for SD3.
Free Plan
Yes; includes the core model, the ability to run on your own hardware, and access to thousands of free community models.
Rating
4.5/5
Best For
Artists, tinkerers, and developers who want ultimate creative control, privacy, and no generation limits, and don't mind a technical setup.
Key Features
- ✓ Local Generation
I run it on my own PC. This means total privacy, no censorship, and zero per-image costs after the initial hardware investment.
- ✓ Text-to-Image
The core function. I type a description (prompt) and it generates an image. The quality is stunning, but prompt engineering is key.
- ✓ Image-to-Image
I feed it an existing image and a prompt. It transforms the image based on my text, perfect for sketches, edits, or style transfers.
- ✓ Inpainting/Outpainting
Inpainting lets me erase part of an image and have the AI redraw it seamlessly. Outpainting expands the canvas beyond the original borders.
- ✓ Model Ecosystem (Checkpoints)
The community creates specialized models. I have separate ones for photorealism, anime, fantasy art, and 3D renders. This is its killer feature.
- ✓ LoRAs & Embeddings
These are small add-ons to models. I use LoRAs to apply specific character faces or art styles without downloading a whole new giant model file.
- ✓ ControlNet
A game-changer. It lets me use edge maps, depth maps, or poses to rigidly control the composition, making the AI follow my sketches precisely.
- ✓ Upscaling
Built-in and external upscalers (like ESRGAN) can take a 512x512 image and blow it up to 4K or higher while adding believable detail.
- ✓ Negative Prompting
I tell it what NOT to draw (e.g., 'deformed hands, blurry, extra fingers'). This is crucial for fixing common AI artifacts.
- ✓ Customizable Samplers & Steps
I can choose different algorithms (samplers like DPM++) for speed or quality and adjust steps for more detail refinement.
- ✓ Automatic1111 WebUI
The most popular interface. It bundles all these features into a (somewhat chaotic) browser-based dashboard. It's essential for practical use.
- ✓ API for Developers
Stability AI offers a robust paid API. I've used it to integrate image generation directly into custom applications and workflows.
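For a sense of how these knobs (prompt, negative prompt, steps, guidance) fit together in code, here is a minimal sketch using Hugging Face's `diffusers` library. It assumes `diffusers`, `torch`, a CUDA GPU, and network access to download the checkpoint; `runwayml/stable-diffusion-v1-5` is one commonly used SD 1.5 checkpoint, and the default values below are illustrative, not canonical.

```python
def build_generation_kwargs(prompt,
                            negative_prompt="deformed, blurry, bad anatomy",
                            steps=25, guidance=7.5, width=512, height=512):
    """Collect the main generation knobs into the keyword arguments a
    diffusers Stable Diffusion pipeline expects."""
    return {
        "prompt": prompt,
        "negative_prompt": negative_prompt,
        "num_inference_steps": steps,   # more steps = more refinement, slower
        "guidance_scale": guidance,     # how strictly to follow the prompt
        "width": width,
        "height": height,
    }

def generate_image(prompt, **overrides):
    """Load the pipeline and generate one image (needs a GPU and a one-time
    model download; imports kept local so the helper above stays lightweight)."""
    import torch
    from diffusers import StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
    ).to("cuda")
    return pipe(**build_generation_kwargs(prompt, **overrides)).images[0]
```

Usage would look like `generate_image("a knight, digital painting, intricate armor, masterpiece").save("knight.png")`.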
Tips & Tricks
- Start prompts with the subject, then style, then quality terms (e.g., 'a knight, digital painting, intricate armor, masterpiece').
- Use specific artists' names in your prompt (like 'by Greg Rutkowski') to steer the style instantly and powerfully.
- For photorealism, add technical camera terms: 'shot on a Canon EOS R5, 85mm, f/1.2, shallow depth of field.'
- Use the negative prompt 'deformed, blurry, bad anatomy, bad hands, three hands, three legs, bad arms' to immediately improve quality.
- Use a low 'denoising strength' (0.3-0.5) in img2img to tweak an image without completely changing it.
- Download the 'EasyNegative' embedding and add it to your negative prompt; it's a community-trained catch-all for common flaws.
- For consistent characters, generate a good face, then use it as an img2img source with a low denoising strength for new poses.
- If you lack VRAM, use the '--medvram' or '--lowvram' command-line arguments when launching Automatic1111.
- Experiment with different samplers: Euler a is fast, DPM++ 2M Karras is my go-to for quality, and DDIM is good for img2img.
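The subject-then-style-then-quality ordering from the first tip can be captured in a tiny helper. This is an illustrative sketch (the function name and structure are my own, not part of any Stable Diffusion tool):

```python
def build_prompt(subject, style_terms=(), quality_terms=()):
    """Assemble a prompt in the recommended order: subject first,
    then style terms, then quality terms, comma-separated."""
    parts = [subject, *style_terms, *quality_terms]
    return ", ".join(p.strip() for p in parts if p and p.strip())

prompt = build_prompt(
    "a knight",
    style_terms=["digital painting", "intricate armor"],
    quality_terms=["masterpiece"],
)
# prompt == "a knight, digital painting, intricate armor, masterpiece"
```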
Common Commands
python launch.py --autolaunch --medvram
Launches the Automatic1111 WebUI in your browser, optimized for medium-VRAM GPUs (8GB).
Prompt: (keyword:1.3)
Uses prompt weighting. This increases the importance of 'keyword' by 30% in the final image.
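To make the weighting syntax concrete, here is a toy parser for the explicit '(text:weight)' form. The real Automatic1111 parser also handles nesting and the '(word)' / '[word]' shorthand; this sketch (names and regex are mine) covers only the simple case:

```python
import re

# Matches the explicit "(text:weight)" form of A1111 prompt weighting.
WEIGHT_RE = re.compile(r"\(([^():]+):([0-9.]+)\)")

def parse_weights(prompt):
    """Return (token, weight) pairs; unweighted text gets weight 1.0."""
    result = []
    pos = 0
    for m in WEIGHT_RE.finditer(prompt):
        plain = prompt[pos:m.start()].strip(" ,")
        if plain:
            result.append((plain, 1.0))
        result.append((m.group(1), float(m.group(2))))
        pos = m.end()
    tail = prompt[pos:].strip(" ,")
    if tail:
        result.append((tail, 1.0))
    return result

parse_weights("a knight, (intricate armor:1.3), masterpiece")
# -> [("a knight", 1.0), ("intricate armor", 1.3), ("masterpiece", 1.0)]
```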
Limitations
- Struggles with coherent text, complex anatomy (especially hands), and precise object counts (e.g., 'three cats').
- Requires a powerful GPU with significant VRAM (8GB+ recommended) for comfortable local use and faster generation.
- The learning curve is steep; you must learn prompting, model management, and UI navigation, and none of it is beginner-friendly.
- Output quality and style are heavily dependent on the specific model (checkpoint) you have loaded.
- As an open-source project, there's no official support; you rely on community forums and GitHub issues for help.