Table of Contents for What is Stable Diffusion and How Does it Work...:
- What is Stable Diffusion?
- Step-by-step instructions for Stable Diffusion
- Advantages and disadvantages of the AI image generator Stable Diffusion
- Copyrights of AI-generated content
- Alternatives to Stable Diffusion?
- Stable Diffusion vs. Midjourney
- Conclusion
- FAQ - 3D Artist vs CGI Studio
What is Stable Diffusion?
Stable Diffusion is an AI image generator that generates digital images based on prompts, i.e. instructions in text form. The application was developed by Stability AI, a London-based start-up that has been in existence since 2020.
Runway ML, EleutherAI, the German company LAION and a research group from LMU Munich contributed to the company's AI image generator. The first version of the tool was released in August 2022.
It is open source software. This means that users can build on the existing code and develop it further. The system is based on deep learning, i.e. a deep neural network with many layers that can recognize and "learn" complex patterns and relationships in data sets.
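The tool combines language understanding and image generation: a text encoder interprets the text prompts that users enter, and the model generates a matching image. It does not look up finished elements in an image database. Everything it produces is derived from patterns it learned from its training images.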
The AI was trained with an extremely large number of images, each paired with a text description, using a latent diffusion model. During training, noise is gradually added to an image and the model learns to reverse that step. To generate an image, the model starts from random noise and removes it step by step, guided by the prompt, until a recognizable picture emerges.
The training images, which number in the hundreds of millions, came from LAION datasets such as LAION-Aesthetics. The AI can only draw on what it learned from these existing sources to generate "new" images.
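The diffusion idea can be illustrated with a toy sketch in plain Python. It is not Stable Diffusion's actual implementation: the five "pixels", the linear noise schedule and its values (`beta_start`, `beta_end`, 1000 steps) are illustrative assumptions. The real model works on a compressed latent representation and uses a trained neural network to predict the noise. Here the true noise stands in for that prediction, simply to show that knowing the noise is enough to recover the picture.

```python
import math
import random


def make_alpha_bars(num_steps, beta_start=1e-4, beta_end=0.02):
    """Cumulative products of (1 - beta_t) for a linear noise schedule."""
    alpha_bars = []
    running = 1.0
    for t in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * t / max(num_steps - 1, 1)
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def add_noise(x0, noise, alpha_bar):
    """Forward step: blend the clean signal with Gaussian noise."""
    a = math.sqrt(alpha_bar)
    b = math.sqrt(1.0 - alpha_bar)
    return [a * x + b * n for x, n in zip(x0, noise)]


def remove_noise(xt, predicted_noise, alpha_bar):
    """Reverse step: undo the blend, given a prediction of the added noise."""
    a = math.sqrt(alpha_bar)
    b = math.sqrt(1.0 - alpha_bar)
    return [(x - b * n) / a for x, n in zip(xt, predicted_noise)]


rng = random.Random(0)
pixels = [0.0, 0.25, 0.5, 0.75, 1.0]  # tiny stand-in for an image
alpha_bars = make_alpha_bars(num_steps=1000)
noise = [rng.gauss(0.0, 1.0) for _ in pixels]

# After the last step almost nothing of the original signal is left.
noisy = add_noise(pixels, noise, alpha_bars[-1])

# With a perfect noise "prediction" the original values come back.
restored = remove_noise(noisy, noise, alpha_bars[-1])
```

In a real system the quality of the result depends on how well the network predicts the noise for a given prompt, which is what the training on captioned images is for.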
Step-by-step instructions for Stable Diffusion
Stable Diffusion can be accessed in several ways. Option 1: Open Stability AI's website and click on the "Dream Studio" tool. Option 2: Use the model via the Hugging Face Hub platform. Option 3: Download the software to your own device. The following steps describe Option 1.
Step 1:
Open the Stability AI website.
Step 2:
Scroll down until you see the "Dream Studio" button. Click on it.
Step 3:
On the page that opens, look for the "Get started" button (may also be marked as "Try me now" or "Try for free"). Click on it.
Step 4:
Register with your e-mail address in the form that should now open.
Step 5:
You will receive a confirmation email. Use the link in the email to access the Dream Studio front-end application.
Step 6:
You will see another input screen. Enter your prompt, i.e. the text command, in the text field shown.
Important to know: The quality of the prompt directly affects the quality of the result. The more precisely you phrase your prompt, the more accurate the output. Because not everyone is a gifted prompt engineer, Stability AI has published a prompt guide.
You will achieve the best results with English prompts. The tool can also work with German instructions, but it was trained on far more English-language material. The prompts should be as detailed as possible. Keywords are better understood than fully formulated sentences.
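The keyword advice can be sketched as a small helper that assembles a comma-separated prompt. The function name `build_prompt`, the `max_keywords` limit and the example terms are hypothetical and only illustrate the idea. They are not part of any Stable Diffusion tool.

```python
def build_prompt(subject, keywords, max_keywords=12):
    """Join a subject and descriptive keywords into a comma-separated prompt."""
    seen = set()
    parts = []
    for term in [subject, *keywords]:
        cleaned = " ".join(term.split())  # collapse stray whitespace
        key = cleaned.lower()
        if cleaned and key not in seen:   # skip empty and duplicate terms
            seen.add(key)
            parts.append(cleaned)
    # Keep the subject plus at most max_keywords descriptive terms.
    return ", ".join(parts[: max_keywords + 1])


prompt = build_prompt(
    "oak dining table",
    ["scandinavian interior", "soft morning light", "35mm photo", "soft morning light"],
)
print(prompt)
# oak dining table, scandinavian interior, soft morning light, 35mm photo
```

Putting the subject first and the style, lighting and medium terms after it keeps the prompt easy to vary one keyword at a time.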
Once you have entered your prompt, the tool provides you with four image variants. You can continue working with any of these variants.
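For Option 2 or Option 3, a comparable workflow can be sketched with the Hugging Face `diffusers` library. This is a minimal sketch under assumptions, not the procedure Dream Studio uses: the checkpoint name `CompVis/stable-diffusion-v1-4` is just one publicly available model, the helper `generate_variants` is hypothetical, and the first call downloads several gigabytes of weights and realistically needs a GPU.

```python
def generate_variants(prompt, num_variants=4, model_id="CompVis/stable-diffusion-v1-4"):
    """Generate several image variants for one prompt with a local pipeline."""
    if not prompt or not prompt.strip():
        raise ValueError("prompt must not be empty")
    if num_variants < 1:
        raise ValueError("num_variants must be at least 1")

    # Imported lazily so the input checks above work without the heavy dependencies.
    import torch
    from diffusers import StableDiffusionPipeline

    device = "cuda" if torch.cuda.is_available() else "cpu"
    dtype = torch.float16 if device == "cuda" else torch.float32

    # Downloads the model weights from the Hugging Face Hub on first use.
    pipe = StableDiffusionPipeline.from_pretrained(model_id, torch_dtype=dtype)
    pipe = pipe.to(device)

    # One prompt, several variants, mirroring the four images shown in the web tool.
    result = pipe(prompt, num_images_per_prompt=num_variants)
    return result.images  # list of PIL images


# Example call (not executed here because it downloads the model):
# for i, image in enumerate(generate_variants("oak dining table, soft morning light")):
#     image.save(f"variant_{i}.png")
```

Running the model locally trades the convenience of the web interface for more control over the checkpoint, the number of variants and the hardware used.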
Advantages and disadvantages of the AI image generator Stable Diffusion
First of all, it sounds relatively easy to generate usable images with this tool. And it is. You should be reasonably fluent in English and be able to describe what you expect from the tool. This way you can generate image material in sufficient resolution for free and in a manageable amount of time.
But this is where the problems begin: The images are usable and the resolution is good, but the material is not outstanding and the resolution is not excellent. The more specific you want your results to be, the more time-consuming it becomes to generate the material. At a certain point, the time required is no longer manageable.
And then there is still the problem that Stable Diffusion can only work with what it learned from the image material in the LAION training data. It is therefore not possible to create something completely new.
The biggest advantages are that the tool is free to use and intuitive.
Advantages at a glance:
- Simple operation
- Good resolution (for most purposes)
- Available free of charge
Disadvantages at a glance:
- Can be time-consuming
- Partially faulty outputs
- The resolution is not high enough for some purposes
- Legal concerns
- Can only create images based on existing material
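The resolution point can be made concrete with a short calculation. The sketch below assumes an output of 512 x 512 pixels, the default size of the first Stable Diffusion releases, and 300 dpi as a common guideline for print. Both figures are assumptions for illustration, and the helper `print_size_cm` is hypothetical.

```python
def print_size_cm(width_px, height_px, dpi=300):
    """Largest print size in centimetres at the given dots per inch."""
    cm_per_inch = 2.54
    return (
        round(width_px / dpi * cm_per_inch, 1),
        round(height_px / dpi * cm_per_inch, 1),
    )


print(print_size_cm(512, 512))    # (4.3, 4.3)
print(print_size_cm(1024, 1024))  # (8.7, 8.7)
```

An image that looks fine on a screen can therefore be only a few centimetres wide in print, which is why the resolution is sufficient for most web uses but not for large-format material.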
Copyrights of AI-generated content
What about copyrights and rights of use? First of all, the legislation varies in the different countries where the tool is available. There is no uniform regulation.
And then it is generally disputed who owns the rights to AI-generated content. There are good arguments that the copyrights belong to those who programmed the AI. After all, the content could not be created without these people.
However, it is just as logical to assume that the copyrights belong to those who made the AI create precisely this content by entering customized prompts. This question has therefore not been conclusively clarified. It is also unclear who can be held liable in the event of problematic content.
Given this, it is completely understandable that companies are very hesitant to use AI-generated content. After all, the rights to use artistic and creative content can only be granted by those who hold the copyright. And that, as mentioned above, is not clear. In any case, the applicable terms and conditions should be thoroughly reviewed before content is used to any extent.
Alternatives to Stable Diffusion?
There are indeed some AI image generators that you can try out as an alternative. Artbreeder is one of them, and DeepAI and DALL-E are other possibilities. Craiyon, NightCafe and Visionist are also more or less suitable for generating image material. However, Midjourney is probably the best-known AI image generator.
Stable Diffusion vs. Midjourney
The first noticeable point is: Stable Diffusion is free to use, and its resolution holds up against Midjourney (and is higher than DALL-E's). The speed and the implementation of the prompts are satisfactory, and the image quality is comparable.
What is striking, however, is that you have direct access to the input screen and the results of the Stability AI tool via Dream Studio. Midjourney is currently (summer 2023) still used via Discord. Discord must be installed, you need a user account, and the service is often overloaded. You then wait a very long time for your prompts to be processed, even for relatively simple tasks, which is annoying.
The second point is privacy. With Midjourney, the generated image content does not belong to you. Midjourney reserves the right to show your generated material as an example in the gallery. This means that the images are accessible to all interested parties, who can also continue to work with them. If you want to generate more than just a handful of images and use them commercially, you will need a subscription. Privacy also costs money.
Conclusion
Generating images using AI has become much easier in recent years. The technology is making enormous progress. In fact, the development of tools is ahead of the formation of opinion in society - we simply don't know today how we should deal with this image material legally and morally.
The training material is not curated, which is why the output may also include offensive material. You should not expect unique image material that is tailored to your application.
You can't even expect flawless images, because horses with five legs and similar mistakes happen all the time. You shouldn't expect diversity in terms of skin color, nationalities, languages, etc. either - this is where algorithmic bias comes into play.
If the result is still sufficient for you, there is no reason not to use Stable Diffusion or a comparable tool.
AI image generators will not disappear again, but will find and maintain their place in the creative industries. It is therefore time to take a closer look at them - from a technical, ethical, user and legal perspective.
However, if you want to create completely new images, for example product images for your marketing, Stable Diffusion is not the right choice. In this case, we can help: Our CGI agency Danthree Studio can create product visualizations and animations of home & living items, interiors and furniture that are completely unique and legally compliant. Contact us for a free initial consultation!