fridahoeft

#13 Using Stable Diffusion to generate letters

In experiment #13, the state-of-the-art AI tool Stable Diffusion is tested by a type designer. Stable Diffusion can create realistic images, but can the AI tool also generate letters and type?

In experiment #13, the latent diffusion model Stable Diffusion is explored for its usefulness to type designers. For this, the notebook stable_diffusion.ipynb from Hugging Face is executed via Google Colab, which also allows some parameters to be adjusted. The higher the value of num_inference_steps, the more detailed the results. A higher guidance_scale forces the model to match the prompt more closely, but the generated images are also less diverse. The following overview shows the effect of num_inference_steps on the prompt "A black alphabet on a white background". The best results are achieved with values around 50, so this value is used for this experiment. The guidance_scale gives the most accurate results between 5 and 10; above this range, the images start to become blurred and break down into individual colour pixels. For this experiment, the guidance_scale is set to 8.

The model was trained on the large-scale multi-modal dataset LAION-5B, which contains 5.85 billion CLIP-filtered image-text pairs. The text-to-image model generates images at a size of 512x512 pixels.
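The notebook itself is not reproduced here, but the setup can be sketched with the Hugging Face diffusers library, using the parameter values chosen above. This is a minimal sketch, assuming the CompVis/stable-diffusion-v1-4 checkpoint (the post does not name a specific one) and a GPU runtime such as Google Colab:

```python
# Minimal sketch: text-to-image generation with Stable Diffusion via diffusers.
# The checkpoint name is an assumption; the post does not specify one.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4",  # assumed model checkpoint
    torch_dtype=torch.float16,
)
pipe = pipe.to("cuda")  # requires a GPU runtime, e.g. Google Colab

image = pipe(
    "A black alphabet on a white background",
    num_inference_steps=50,  # value chosen in this experiment: more steps, more detail
    guidance_scale=8,        # value chosen in this experiment: closer prompt adherence
).images[0]

image.save("alphabet.png")  # the model outputs 512x512-pixel images
```

To reproduce the overview of different step counts, the same call can simply be repeated with num_inference_steps varied while the prompt and guidance_scale are held fixed.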

The results of this experiment are remarkably clean, almost like vector graphics, mostly free of artefacts, and often textured. But the generated images rarely match the prompt: although letters were generated, they differed from those specified. The probability of a match is 11 %. When an attempt was made to generate an image from the prompt "A letter A", only a black image was output, along with an NSFW-content warning.
