Stable Diffusion is a machine learning model for generating photorealistic digital images from natural language descriptions. The model can also be used for other tasks, such as generating an enhanced image from a sketch and a text description. It can run on most consumer hardware with even a mid-range graphics card.
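The generative process underlying the model is diffusion: an image is gradually corrupted with Gaussian noise, and a trained network learns to reverse that corruption step by step. The following toy sketch in pure Python (illustrative values only, not the model's actual noise schedule or network) shows the forward noising process and why later timesteps are noise-dominated:

```python
# Toy illustration of the forward diffusion process behind models like
# Stable Diffusion. The schedule below is a made-up linear example;
# the real model uses different values and operates on image latents.
import math
import random

T = 10                                            # toy number of diffusion steps
betas = [0.05 * (t + 1) / T for t in range(T)]    # illustrative linear noise schedule

# alpha_bar_t = product of (1 - beta_s) for s <= t: the surviving signal power.
alpha_bar = []
prod = 1.0
for b in betas:
    prod *= 1.0 - b
    alpha_bar.append(prod)

def noisy_sample(x0: float, t: int) -> float:
    """Sample x_t ~ q(x_t | x_0): scaled original signal plus Gaussian noise."""
    eps = random.gauss(0.0, 1.0)
    return math.sqrt(alpha_bar[t]) * x0 + math.sqrt(1.0 - alpha_bar[t]) * eps

# The signal fraction sqrt(alpha_bar_t) shrinks monotonically with t, so
# late steps are mostly noise. Generation trains a network to predict the
# added noise and run this corruption in reverse, starting from pure noise.
signal = [math.sqrt(a) for a in alpha_bar]
assert all(signal[i] > signal[i + 1] for i in range(T - 1))
```

In the real model this loop runs over image latents guided by a text embedding rather than over a single scalar, but the schedule-driven noising and denoising structure is the same idea.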
Unlike competing models such as DALL-E and Midjourney, the source code for Stable Diffusion is public. Despite the publication of its source code, Stable Diffusion is not free software: its license, the CreativeML Open RAIL-M License, prohibits certain use cases, which contradicts a basic principle of the Free Software Foundation.
Stable Diffusion claims no rights to the generated images and grants users the right to use all images produced with the model, provided that the image content is not illegal or harmful to people. This freedom to use the images has led to controversy over the ethics of ownership, as Stable Diffusion and other generative models are trained on copyrighted images without the owners' consent.
Since visual styles and compositions are not subject to copyright, users of Stable Diffusion who generate images resembling existing artworks are often considered not to infringe the copyright of visually similar works. However, people depicted in generated images may be protected by privacy rights if their likeness is used, and intellectual property such as recognizable brand logos remains protected. Nevertheless, visual artists have expressed concern that the widespread use of image synthesis software such as Stable Diffusion could cause human artists, as well as photographers, models, cinematographers, and actors, to gradually lose their commercial viability to AI-based competitors.
Compared with other commercial products based on generative AI, Stable Diffusion is significantly more permissive about the kinds of content users are allowed to create, including violent or sexually explicit images.
ControlNet for Stable Diffusion is a neural network architecture that adds conditioning controls to pretrained large diffusion models, supporting additional input conditions beyond the text prompt. It accepts inputs such as edge maps detected from a source image, image depth maps, and human poses to steer image creation.
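An edge map of the kind ControlNet consumes is simply an image marking intensity boundaries. The following toy function (pure Python, an illustration only; ControlNet's actual preprocessors use detectors such as Canny) builds a crude binary edge map by thresholding horizontal and vertical intensity differences:

```python
# Toy edge-map extraction, illustrating the kind of conditioning image
# ControlNet accepts alongside a text prompt. This is NOT ControlNet's
# real preprocessor, just a minimal gradient-threshold sketch.
def edge_map(img, threshold=0.5):
    """img: 2-D list of grayscale floats in [0, 1] -> binary edge map."""
    h, w = len(img), len(img[0])
    edges = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Forward differences; zero at the right/bottom borders.
            gx = img[y][x + 1] - img[y][x] if x + 1 < w else 0.0
            gy = img[y + 1][x] - img[y][x] if y + 1 < h else 0.0
            if (gx * gx + gy * gy) ** 0.5 > threshold:
                edges[y][x] = 1
    return edges

# A 4x4 image whose right half is bright: edges appear at the boundary.
img = [[0.0, 0.0, 1.0, 1.0] for _ in range(4)]
result = edge_map(img)
```

In a full ControlNet workflow, such an edge map would be passed to the diffusion model together with the prompt, constraining the generated image to follow the detected outlines.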