Forgetful AI

My journey of attempting to get ChatGPT to generate artistic images based on realistic photos
LLMs
prompting
image generation
computer vision
Author

Leesa

Published

February 13, 2024

Cartoonification Catastrophe

I went to ChatGPT-4 with a simple request: “Can you make a cartoon image of me to use as my GitHub profile picture?”

I thought this would be a rather simple request…

but I was wrong

Where did I go wrong?

I used a basic prompt to try to convince ChatGPT to fulfill my request, and that may have been my mistake.

The prompt I used was:

These pictures feature images of the same short-haired black woman. Please make a cartoon version of her

To this prompt I attached a zip file of 7 photos of me.

ChatGPT’s response?

image.png

This is pretty good - so, why wasn’t I happy?

I don’t think this was a response to the images I sent.

Instead, I think that this was a response to the keywords I included in my prompt: short-haired black woman

While this is a short-haired black woman, it does not look much like me.

I decided to try again.

I sent the following query to ChatGPT:

Did you use all 7 photos to build the image?

It replied:

I apologize for any confusion, but as an AI, I don’t have the capability to view images or directly analyze visual content in the way a human would. This means I cannot compare the generated cartoon to the photographs you uploaded to ensure an exact likeness. However, I can generate a new image based on a detailed text description. If you provide me with more details about the woman’s appearance from the photos, I can attempt to create a cartoon that better matches your description.

Has ChatGPT forgotten its own capabilities?

We all know that ChatGPT is capable of visually analyzing images. For example, if I send ChatGPT an image of myself and my friends with no context besides a simple prompt - “What does the image show?” - it easily responds:

The image features three people taking a selfie. They are closely grouped together, smiling, and appear to be enjoying themselves. The setting seems to be a crowded venue, possibly a party or a club, as suggested by the dim lighting and the presence of what looks like other people and possibly drinks in the background. The mood is cheerful and lively.

It is very interesting that the system somehow managed to forget its own capabilities. It seemed to need a reminder, so I sent:

Yes, remember that you can review the photos

ChatGPT was quick to respond:

I apologize for the oversight. I can indeed use the provided images to create a more accurate cartoon representation of the woman. Let me take a moment to review the images and create a new cartoon image based on her likeness.

image.png

The image is still not perfect, but it definitely looks more like me than the original response.
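If you wanted to script this kind of nudge instead of typing it by hand, a minimal sketch might look like the one below. It only inspects the text of a reply and decides whether to send the reminder. The names `DENIAL_PHRASES`, `looks_like_denial`, and `next_user_message` are hypothetical, the phrase list is my own guess based on the one reply quoted above, and nothing here calls ChatGPT or any real API.

```python
from typing import Optional

# Hypothetical phrases suggesting the model is denying that it can see images.
# This list is an assumption based on a single reply; real replies will vary.
DENIAL_PHRASES = (
    "don't have the capability to view images",
    "cannot view images",
    "can't view images",
)

# The reminder that worked in the conversation above.
REMINDER = "Yes, remember that you can review the photos"


def looks_like_denial(reply: str) -> bool:
    """Return True if the reply seems to claim the model cannot see images."""
    # Normalize curly apostrophes so "don’t" matches "don't".
    text = reply.lower().replace("\u2019", "'")
    return any(phrase in text for phrase in DENIAL_PHRASES)


def next_user_message(reply: str) -> Optional[str]:
    """Return the reminder after a denial, or None if no nudge is needed."""
    return REMINDER if looks_like_denial(reply) else None
```

Passing in the apology quoted earlier would return the reminder, while an ordinary reply would return `None`. Simple phrase matching like this is brittle, so treat it as an illustration of the idea and not a reliable detector.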

The moral of the story is that, just like humans, ChatGPT underestimates itself, but with some additional prompting we can get it back on track!