DALL-E 2 Limitations and Risks

As impressive as DALL-E 2 is, it has some shortcomings and risks:

First off, it is worse at binding attributes to objects than comparable models. For example, when asked to depict a red cube on top of a blue cube, it often confuses which cube should be red and which blue. Furthermore, it is not yet good at rendering coherent text in images: when asked to create an image of a sign that says "Deep Learning", the lettering it produced was not readable. The authors also observed that it has a hard time producing fine detail in complex scenes. For example, when generating an image of Times Square, the billboard screens lack any readable or recognizable detail.

Apart from these shortcomings, it exhibits biases commonly seen in models trained on data scraped from the internet: gender bias, skewed representations of professions, and a tendency to depict predominantly Western locations. And finally, there is of course the risk of the model being used to create fake images with malicious intent. Fortunately, the OpenAI team is taking extensive precautions to mitigate these risks.

Keywords

  • DALL-E 2
  • Shortcomings
  • Risks
  • Binding Attributes
  • Object Recognition
  • Coherent Text
  • Details in Complex Settings
  • Biases
  • Malicious Intent
  • OpenAI

FAQ

Q: What is one shortcoming of DALL-E 2 in terms of object attributes? A: DALL-E 2 is worse at binding attributes to objects compared to other models, often confusing which attributes should be applied to which objects.

Q: How does DALL-E 2 perform when creating textual content in images? A: DALL-E 2 is not yet good at creating coherent text in images, often generating text that is not readable or understandable.

Q: What are some of the observed biases in DALL-E 2? A: The model exhibits biases commonly seen in data sourced from the internet, such as gender bias, skewed representations of professions, and a tendency to depict predominantly Western locations.

Q: Are there any risks associated with DALL-E 2? A: Yes, there is a risk that the model could be used to create fake images with malicious intent. However, OpenAI is taking numerous precautions to mitigate these risks.

Q: How does DALL-E 2 handle complex settings in image generation? A: DALL-E 2 has a hard time producing details in complex settings. For example, when generating an image of Times Square, the screens do not contain readable or understandable details.