Vilius P.
AI is changing the world, and no one is untouched by it. You can sit and hope it's a fad that will pass, or you can start learning how to use it. By choosing to learn Artificial Intelligence tools, you will gain helpful skills and stay ahead of those who chose to sit around and wait.
In this Stable Diffusion tutorial, I will teach you the basics of creating a prompt for image generation. If this is the first time you are hearing about this technology, you can read a short blog post about it. Most AI image generators follow the same principles of prompt creation, so if you want to use Midjourney or DALL-E, the information in this guide will still be helpful.
All images in this guide were generated using the A.I.C. mobile app for iOS and the Stable Diffusion XL model. You can download it now and get free credits to test what you have learned, or you can use any other tool that supports Stable Diffusion. This tutorial is not specific to one tool; it covers the fundamentals of prompt creation for image-generation AI.
Prompt Basics
The prompt is the text where you explain to the AI what result you want from it. It looks effortless, but it's not. I don't know about you, but I sometimes have trouble communicating my thoughts even to real people. So you need to know some rules for expressing your thoughts correctly in text form to get the best possible result from Stable Diffusion.
Don’t be polite
In this tutorial, I don't want to teach you to be rude, but leave politeness out of prompt creation. Don't use words like Please or Thank you in a prompt. Avoid any word that does not describe your desired image. If you are using the Stable Diffusion XL model, your sentences can be grammatically complete; this model is advanced enough to understand full sentences.
Stable Diffusion models
I have mentioned the Stable Diffusion XL model a few times in this guide. You need to know that the model is the switchable part of the AI where the magic is stored. In all seriousness, models can be trained on different data sets to excel at specific types of tasks. In Stable Diffusion, the model determines which art styles it can replicate best. For example, the Anything model is trained on anime imagery and can create fantastic anime images, but it will fail to make a photorealistic image of a person; for that, it's better to use the Realistic Vision model. And then there is Stable Diffusion XL, a general model that is good at every style without excelling at any specific one. Midjourney and DALL-E are also general-purpose AI technologies, without the option of using custom models.
How to structure a prompt
Prompt base (Raw prompt)
The raw prompt, or as I prefer to call it, the prompt base, is the main content of your image. Let's say we want to generate an image of a cat in a cardboard box. Our prompt may look like this:
cat in a cardboard box
You can see we got a cat in a cardboard box; the AI gave us what we asked for. But I didn't want multiple cats in a cartoonish style. My prompt was only the prompt base: enough to get an image of what I asked for, but not enough to get what I wanted. By adding modifiers, we can shape the result the way we want. Let's try again, this time using modifiers:
cat in a cardboard box, 1cat, realistic, studio light, box full view, high angle shot
We got a more acceptable result by adding a few words (modifiers). With this prompt, it may take more than one attempt to get something you like, but the result will be better than with the prompt base alone. Keep reading this guide, and your success rate will increase dramatically.
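This base-plus-modifiers structure is easy to express as a tiny helper. The function and its name are my own illustration, not part of any Stable Diffusion tool:

```python
def build_prompt(base, modifiers=()):
    """Join a prompt base with optional comma-separated modifiers."""
    return ", ".join([base, *modifiers])

prompt = build_prompt(
    "cat in a cardboard box",
    ["1cat", "realistic", "studio light", "box full view", "high angle shot"],
)
print(prompt)
# cat in a cardboard box, 1cat, realistic, studio light, box full view, high angle shot
```

Keeping the base separate from the modifiers makes it easy to swap modifier sets while iterating on the same subject.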
Style keywords
Art is always created in some kind of style, but only a person can create art and, with it, define a new style. AI doesn't have that capability. If not told otherwise, it will choose a random style, and it will show originality only within the prompt's instructions.
Let's say I want to create an image of a cat playing with yarn, but in a concept digital art style. The first prompt will look like this: cat playing with yarn, concept digital art.
I think Stable Diffusion XL does a great job of giving half-decent results. But if we really want concept digital art, we need to give the prompt more detail: concept art cat playing with yarn . digital artwork, illustrative, painterly, matte painting, highly detailed
By giving more art-style modifiers, we can get better results. The hard part is knowing which keywords define the art style you want. I don't have any secret here: you will need to play around and see what works and what doesn't. This is where a prompt creator's creativity starts to shine. You can read these 118 styles for the Stable Diffusion XL model to see how others build certain styles. It can be a good foundation for your own prompt style.
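One way to build that foundation is to keep the keyword sets you discover as named presets. Everything below is illustrative: the "concept art" keywords come from the example above, while the "photoreal" preset and both function names are placeholders of my own:

```python
# Hypothetical style presets; extend this with keyword sets you find work well.
STYLE_PRESETS = {
    "concept art": "digital artwork, illustrative, painterly, matte painting, highly detailed",
    "photoreal": "photograph, sharp focus, natural skin texture, 85mm lens",
}

def apply_style(base, style):
    """Append a preset's keywords to the base prompt, if the style is known."""
    keywords = STYLE_PRESETS.get(style)
    return f"{base} . {keywords}" if keywords else base

print(apply_style("cat playing with yarn", "concept art"))
```

The ` . ` separator mirrors the prompt format used in the example above; any separator your tool accepts will do.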
Tip: when you experiment with new styles on a person, use a celebrity as the subject. Stable Diffusion is trained on public data, so famous people are drawn best without needing complex prompts.
Camera and Lighting
I want to emphasize these two modifiers. In my opinion, used correctly, they can turn a mediocre prompt into an amazing one. By controlling the camera angle and lighting, you control emotion. It's hard to succeed on the first try, but with word weights and negative prompts, it gets easier.
Here are camera keywords you should know:
eye level, low angle shot, high angle shot, drone shot, over the shoulder, dolly shot, crane shot, tracking shot, aerial view, steadicam shot
Here are lighting keywords you should know:
daylight, moonlight, natural light, front light, backlight, soft light, hard light, moody light, dynamic light, rim lighting, sunlight, volumetric lighting, radiant light rays, global illumination, translucent, luminescence, hard shadows, god rays, diffuse lighting, bioluminescent, studio lighting, cinematic lighting, dark, soft box lighting, glowing light, long exposure
These two topics deserve separate articles, which I will write in the future.
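A quick way to explore these keywords is to randomize one camera and one lighting modifier per attempt. The helper below is a throwaway sketch built from the lists above (the lighting list is abbreviated, and the function name is my own):

```python
import random

CAMERA = ["eye level", "low angle shot", "high angle shot", "drone shot",
          "over the shoulder", "dolly shot", "crane shot", "tracking shot",
          "aerial view", "steadicam shot"]
LIGHTING = ["daylight", "moonlight", "backlight", "soft light", "moody light",
            "rim lighting", "volumetric lighting", "god rays",
            "studio lighting", "cinematic lighting", "long exposure"]

def vary_prompt(base, seed=None):
    """Append one random camera and one random lighting keyword to the base."""
    rng = random.Random(seed)  # pass a seed to make the picks reproducible
    return f"{base}, {rng.choice(CAMERA)}, {rng.choice(LIGHTING)}"

print(vary_prompt("cat in a cardboard box", seed=42))
```

Generating a handful of seeded variations and comparing the results is a cheap way to build intuition for what each keyword does.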
Word Position
Stable Diffusion doesn't prioritize words by their order, but you should know that it processes prompts in chunks. The result can differ depending on where one chunk ends and the next begins.
1boy, green eyes, close-up, short blue hair, white shirt, gold ring, red tie
1boy, green eyes, short blue hair, close-up, white shirt, gold ring, red tie
The same words in a different order give different results. You will only notice this mistake after the generation is over, but it's easy to fix by reordering modifiers.
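The chunking effect can be demonstrated with a toy splitter. Real implementations chunk by CLIP tokens (roughly 75 per chunk), not by comma-separated terms, so the tiny chunk size here only illustrates how reordering can push a modifier across a chunk boundary:

```python
def chunk_terms(prompt, chunk_size=3):
    """Split a comma-separated prompt into fixed-size groups of terms."""
    terms = [t.strip() for t in prompt.split(",")]
    return [terms[i:i + chunk_size] for i in range(0, len(terms), chunk_size)]

a = chunk_terms("1boy, green eyes, close-up, short blue hair, white shirt, gold ring, red tie")
b = chunk_terms("1boy, green eyes, short blue hair, close-up, white shirt, gold ring, red tie")
# Swapping "close-up" and "short blue hair" moves each into a different
# chunk, which is why the two prompts can render differently.
print(a[0], b[0])
```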
Controlling word weight
Giving weight to words is one of the more advanced methods of controlling your AI generator's results. By wrapping a word in parentheses (), we make it more important than other words. By wrapping it in square brackets [], we make it less important. () adds weight; [] subtracts weight.
And it's not a one-time effect. Every additional wrap increases or decreases the weight by a factor of 1.05.
(((Blue))) will have a weight of 1.157625 (1 × 1.05 × 1.05 × 1.05); [[[Blue]]] will have a weight of 0.8638375985 (1 / 1.05 / 1.05 / 1.05).
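The arithmetic can be verified with a small parser. This follows the 1.05 factor described in this article; tools differ (AUTOMATIC1111, for instance, has its own weighting syntax and factor), so treat both the function and the numbers as illustrative:

```python
def word_weight(token, factor=1.05):
    """Return (word, weight): each () wrap multiplies by factor, each [] divides."""
    weight = 1.0
    while token.startswith("(") and token.endswith(")"):
        token, weight = token[1:-1], weight * factor
    while token.startswith("[") and token.endswith("]"):
        token, weight = token[1:-1], weight / factor
    return token, weight

print(word_weight("(((Blue)))"))  # weight ≈ 1.157625
print(word_weight("[[[Blue]]]"))  # weight ≈ 0.8638375985
```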
Negative Prompt
Negative prompts are one of the most powerful tools in your pocket for fixing unwanted side effects of AI image generation, for example, too many fingers on one hand or the wrong art style bleeding into the image. In a negative prompt, you say what you don't want to see in your image.
You can use it to remove undesirable content like a deformed face. This way, you can clean up your image and ensure the final result has no artifacts.
Another use is the conceptual opposite. If you want to generate an anime-style image, you can put words opposite to the anime style in the negative prompt, for example, realistic, 3d. Doing that, you will get a conceptually cleaner result.
Both approaches are valid, and you will see them in examples. But remember that a negative prompt is only the opposite of your prompt: you are not telling the AI what to fix, but what to avoid. AI image-generation tool creators are working hard to solve the hand problem, so I don't think putting deformed hands in the negative prompt will give perfect results. It will improve the situation, but you will still need to fine-tune details until the image is amazing.
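The two uses, artifact cleanup and conceptual opposites, can be combined in one template helper. All names and keyword sets here are my own illustration, not a standard API:

```python
# Common artifact fixes plus hypothetical per-style conceptual opposites.
ARTIFACT_FIXES = "deformed face, extra fingers, blurry"
STYLE_OPPOSITES = {
    "anime": "realistic, 3d",
    "realistic": "anime, cartoon, illustration",
}

def build_negative_prompt(style=None, extra=""):
    """Combine artifact fixes, the style's conceptual opposite, and extras."""
    parts = [ARTIFACT_FIXES]
    if style in STYLE_OPPOSITES:
        parts.append(STYLE_OPPOSITES[style])
    if extra:
        parts.append(extra)
    return ", ".join(parts)

print(build_negative_prompt(style="anime"))
```

Most Stable Diffusion frontends take this string as a second input alongside the main prompt, so a template like this keeps your cleanup keywords consistent across generations.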
If you want to learn more about negative prompts, read this article: Best negative prompts for Stable Diffusion. You will find more information on how to use negative prompts, plus templates for common use cases.
Conclusions
My goal was to introduce you to creating good-quality prompts and give you the essential knowledge for it. If you are interested in learning more, you can find further guides on specific use cases of AI image generation on my blog. This Stable Diffusion guide will help you create better prompts and better images. Don't forget: the secret to a perfect image is trying and experimenting.
You can try what you learned by downloading A.I.C. and using free credits: