Nicholas Reece, the Melbourne lord mayor, recently faced online criticism after sharing AI-generated images of proposed parks with bizarre errors like lifeless bodies and extra limbs. These images were part of his re-election campaign pledge to build 28 new parks in Melbourne. Source: https://x.com/Nicholas_Reece/status/1837683224937746795
How?
AI generates images using models specifically designed for this task, such as Generative Adversarial Networks (GANs) or variations of transformer models adapted for images. Here’s a brief overview of these technologies:
- Generative Adversarial Networks (GANs): This involves two neural networks, the generator and the discriminator, which work against each other. The generator creates images from random noise, aiming to produce images that look as realistic as possible. The discriminator evaluates these images against a dataset of real images, trying to distinguish real from generated images. Over time, the generator learns to make images that are increasingly difficult for the discriminator to differentiate from real images (a minimal code sketch of this loop appears below).
- Transformers for Images: Models like DALL-E use a variation of the transformer architecture, originally known for its effectiveness at processing sequences in natural language processing. For image generation, these models are trained on a large dataset of images and their descriptions. They learn the relationship between textual descriptions and visual elements, allowing them to generate new images from textual prompts provided by users.
These AI models learn from vast amounts of data to understand patterns, textures, and relationships in images, which enables them to generate new images that are coherent and often quite realistic.
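To make the GAN idea concrete, here is a minimal sketch of the adversarial training loop in PyTorch. It is illustrative only: the networks are toy-sized, and the batch of "real" images is a random placeholder standing in for an actual image dataset, so nothing here would produce usable pictures.

```python
import torch
import torch.nn as nn

LATENT_DIM, IMG_DIM = 64, 28 * 28   # noise vector size and flattened image size (illustrative)

# Generator: maps random noise to a (flattened) fake image.
generator = nn.Sequential(
    nn.Linear(LATENT_DIM, 256), nn.ReLU(),
    nn.Linear(256, IMG_DIM), nn.Tanh(),
)

# Discriminator: maps an image to the probability that it is real.
discriminator = nn.Sequential(
    nn.Linear(IMG_DIM, 256), nn.LeakyReLU(0.2),
    nn.Linear(256, 1), nn.Sigmoid(),
)

opt_g = torch.optim.Adam(generator.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(discriminator.parameters(), lr=2e-4)
loss_fn = nn.BCELoss()

for step in range(200):
    real = torch.rand(32, IMG_DIM) * 2 - 1   # placeholder "real" batch; a real run would load a dataset
    noise = torch.randn(32, LATENT_DIM)
    fake = generator(noise)

    # 1) Train the discriminator to tell real images from generated ones.
    d_loss = (loss_fn(discriminator(real), torch.ones(32, 1)) +
              loss_fn(discriminator(fake.detach()), torch.zeros(32, 1)))
    opt_d.zero_grad()
    d_loss.backward()
    opt_d.step()

    # 2) Train the generator to fool the discriminator (its fakes labelled as "real").
    g_loss = loss_fn(discriminator(fake), torch.ones(32, 1))
    opt_g.zero_grad()
    g_loss.backward()
    opt_g.step()
```

The key design point is the pair of opposing objectives: the discriminator is rewarded for separating real from fake, while the generator is rewarded for making the discriminator label its outputs as real.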
Where does the data come from?
Machines learn to generate images through a process called machine learning, which involves training a model on a large dataset of images. Here’s an explanation of how this happens:
- Data Collection: The first step is gathering a large dataset of images. These images often come with labels or descriptions that help the machine understand their content (a small example of such paired data is sketched after this list).
- Model Selection: Depending on the task, different models can be used. For image generation, models like Generative Adversarial Networks (GANs) or transformer-based models (like DALL-E) are commonly used.
- Training: During training, the model learns to recognize patterns, textures, and relationships in the data. For GANs, this involves the generator trying to create images that are indistinguishable from real images, while the discriminator tries to distinguish between the two. Over time, the generator improves based on the feedback from the discriminator.
- Evaluation and Tuning: The model’s performance is continuously evaluated, and adjustments are made to improve its accuracy and ability to generate realistic images.
- Generation: Once trained, the model can generate new images based on the input it receives, which could be a noise pattern, a textual description, or some other form of input depending on the model’s design (see the text-to-image sketch below).
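For a sense of what "images with labels or descriptions" looks like in practice, here is a small sketch of a paired image-caption dataset in PyTorch. The folder layout (a .jpg image alongside a .txt caption with the same name) is an assumption made for illustration; real training sets pair millions of images with text.

```python
from pathlib import Path
from PIL import Image
from torch.utils.data import Dataset
from torchvision import transforms

class ImageCaptionDataset(Dataset):
    """Yields (image_tensor, caption) pairs from a folder of matching .jpg/.txt files."""

    def __init__(self, root: str):
        self.image_paths = sorted(Path(root).glob("*.jpg"))
        self.to_tensor = transforms.Compose([
            transforms.Resize((256, 256)),
            transforms.ToTensor(),          # pixel values scaled to [0, 1]
        ])

    def __len__(self):
        return len(self.image_paths)

    def __getitem__(self, idx):
        img_path = self.image_paths[idx]
        image = self.to_tensor(Image.open(img_path).convert("RGB"))
        # e.g. parks/0001.txt containing "a small pocket park with benches and trees"
        caption = img_path.with_suffix(".txt").read_text().strip()
        return image, caption
```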
Through these steps, machines learn the complex task of image generation, creating new images that can be surprisingly realistic and detailed.
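And here is what the generation step can look like from the user's side, sketched with the open-source Hugging Face diffusers library and a publicly available Stable Diffusion checkpoint. The model name and the prompt are illustrative assumptions (the article does not say which tool produced the campaign images), and Stable Diffusion is a diffusion model rather than a GAN or a DALL-E-style transformer, but the prompt-in, image-out workflow is the same.

```python
import torch
from diffusers import StableDiffusionPipeline

# The model identifier is an assumed, publicly available checkpoint; any
# compatible text-to-image model could be substituted.
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    torch_dtype=torch.float16,
).to("cuda")                                # assumes a CUDA-capable GPU

result = pipe(
    "a small inner-city pocket park with trees, benches and people, photorealistic",
    num_inference_steps=30,                 # more denoising steps: slower, usually cleaner
    guidance_scale=7.5,                     # how strongly the image should follow the prompt
)
result.images[0].save("generated_park.png")
```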
Why extra human limbs and lifeless bodies?
AI-generated images can sometimes display unusual or unrealistic elements, such as extra limbs or lifeless-looking figures, due to the limitations and quirks of the training process and of the model’s understanding of the data it has been trained on. Here are a few reasons why this might happen:
- Training Data: If the dataset used to train the AI model contains a wide variety of images, including those with unusual compositions or abstract art, the model might learn to replicate these styles. It interprets and generates images based on the patterns and features it has seen during training, which can lead to unexpected results.
- Model Interpretation: AI models, particularly those based on neural networks, do not “understand” images in the way humans do. They process images as arrays of data and learn to manipulate these arrays to produce outputs that statistically resemble their training data (the short sketch after this list shows what such an array looks like). This can lead to a literal or out-of-context interpretation of elements within an image.
- Complexity of Human Forms: Human bodies are complex, and rendering them accurately requires a detailed grasp of human anatomy. A model may not have seen enough consistent, detailed examples of specific parts, such as hands and limbs, which can lead to bizarre or unrealistic renderings.
- Limitations of the Model: The architecture and capacity of the model can also limit its ability to generate coherent images. Simpler models or those trained with insufficient data might struggle to produce realistic images, especially for complex scenes or subjects.
These factors contribute to the sometimes surreal and unexpected outputs from AI image generation models. As technology and training techniques improve, the accuracy and realism of AI-generated images are likely to increase.
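The "arrays of data" point above is easy to see directly: to a model, an image is nothing more than a grid of numbers, with no built-in notion of arms, legs, or people. A tiny sketch (the filename is a placeholder; any RGB image will do):

```python
import numpy as np
from PIL import Image

img = np.asarray(Image.open("park_render.png").convert("RGB"))
print(img.shape)   # e.g. (768, 1024, 3): height, width, colour channels
print(img[0, 0])   # the top-left pixel is just three numbers, e.g. [142 180 121]
```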
Concerns around AI-generated images
The use of AI in generating promotional material raises several concerns, including:
- Accuracy and Misrepresentation: AI-generated images or content might not accurately represent the reality or the intended message, leading to potential miscommunication or misleading information.
- Ethical Concerns: There are ethical implications, such as the authenticity of the content produced and the potential for AI to create deceptive or manipulative materials.
- Impact on Creativity and Jobs: The increased use of AI tools can potentially displace human creatives, affecting jobs in graphic design, advertising, and related fields.
- Public Perception and Trust: The use of AI-generated content might affect public trust, especially if the content is perceived as unnatural or eerie, as seen in the case of Nicholas Reece’s campaign.
- Regulatory and Legal Issues: There may be a lack of clear regulations governing the use of AI in creating promotional materials, leading to challenges in accountability and standards.
These concerns highlight the need for careful consideration and potentially the development of guidelines and standards for the use of AI in promotional activities.
In summary
AI technologies such as Generative Adversarial Networks (GANs) and transformer-based models like DALL-E offer an innovative way to produce images for promotional materials. Trained on extensive datasets, these models learn to replicate patterns, textures, and relationships in images, enabling them to turn textual descriptions into new, often realistic visuals. That capability makes AI a cost-effective, efficient, and creative addition to design workflows and a potentially valuable tool for marketing and promotional activities. Despite challenges and errors, such as the unrealistic elements seen in this case, ongoing improvements in the underlying technology are likely to make these tools more accurate and reliable across a range of applications.
Reference
Taylor, J. (2024, September 23). Melbourne lord mayor ridiculed for ‘AI fail’ images with extra human limbs and lifeless bodies. The Guardian. [Online] Available from: https://www.theguardian.com/australia-news/2024/sep/23/melbourne-lord-mayor-ridiculed-for-ai-fail-images-with-extra-human-limbs-and-lifeless-bodies

