AI image generation is getting crazy good

I asked Gemini to create a picture of a mosquito from the perspective of a dragonfly, and it showed me a dragonfly chasing the mosquito. I rephrased the request and added clarifications, but got nowhere, so I decided to start afresh with Copilot. Before asking for the actual picture, I discussed the nature of the dragonfly’s vision and compared it with human vision. When I then asked Copilot for the same image, it produced the one below, which I don’t consider a success. On top of that, its own description of the picture (a mosquito appearing fragmented due to the dragonfly’s compound eye structure) didn’t match the image itself.

Imgur

Here are Gemini’s attempts:

Attempt #1

Imgur

Attempt #2

Imgur

I don’t think toothpaste would be at the top of the list, either. The sauce would fit right in if that was a current product slogan, though.

This is the closest I got

With this prompt

A surreal fisheye portrait of a frosted, hexagonally ribbed glass panel. In each segment a copy of an image of a mosquito is partially visible, not clearly and heavily distorted by the glass, creating a repeating, abstract visual effect. Each image segment is like a lens at a slightly different angle, fragmenting the mosquito into many smaller images. The mosquito needs to face the same direction in every image.

Which was partially based on a prompt I had saved from somebody on Sora, one that made an image seen through vertical glass ridges:

A surreal portrait of a person behind a textured, vertical ribbed glass panel. The face is partially visible, not clearly and heavily distorted by the glass, creating a wavy, abstract visual effect. The lighting is dramatic and cinematic, with the subject illuminated in vivid red light, contrasting against a blue background. The overall mood is mysterious and emotionally evocative. The image features high contrast and saturated colors with a strong emphasis on artistic lighting and abstract composition.

Nice work, but shouldn’t AI be able to figure things out by itself?

It may have more training data simulating flies’ eyes than dragonflies’. I tried this.

The AI seems to fail to understand that it should produce a multiplied picture of the mosquito, with each copy viewed from a slightly different angle. But let’s say the dragonfly’s nervous system eventually generates a single aggregate image; then the hexagons are pointless.

An interesting comparison.

ChatGPT just gave me what I asked for

Copilot just gave me what I asked for

Gemini decided to be an asshole

With your forewarning about Gemini, I tried a little flattery in my prompt:

Forget Star Trek, we are living in the Hitchhiker’s Guide to the Galaxy.

Genuine People Personalities, yes. And I gave the version of ChatGPT I use for RPGs the persona of a snarky, saucy GM. Or tried to. It generally starts out that way, but the longer the session goes, the blander and closer to its original form it gets, long before I max out the context window.

A Dobsonfly-blobfish hybrid from Copilot, run through Night Cafe’s “creative upscale” at 5% creativity.

Hey, how’d you get that photo of my uncle Fred?

I wonder if the reason it does that is because the top is where the face is, and because humans are highly keyed to recognizing faces, that makes the face the hardest part to get right. A human artist might start with the face, for the same reason.

What does it do for, say, a reclining subject, where the face might be on the left or right side of the image?

I saw @solost’s recent joke about a country vet/doctor in the jokes thread and asked ChatGPT how likely it was that such a doctor would exist in real life (no, because the licensing for both would be brutal, though it might have happened over a century ago). Then I pasted the joke into the conversation and asked for an illustration:

Haha, thanks for illustrating my joke. Almost exactly as I pictured it!

As for the reality of a doctor/vet combo, I was telling a barely remembered joke I had heard years ago. I’m not sure if the doc was also a vet in the original version I heard, but it seemed to make sense in the context of the joke, to explain the dog and cat in a doctor’s office.

ChatGPT totally got the joke when I pasted it in:

HA! That’s a good one — I see why you were thinking about country doctors pulling double duty.

It’s playing off that old-fashioned idea of the small-town doctor who’s the only medical help around, but really it’s just a slick pun setup: cat scan and lab work. :smirking_face: Classic groaner.

In real life, of course, the cat and the lab (Labrador) wouldn’t be part of the bill — unless the doc’s running a veterinary hospital with one hell of a side hustle!

And yeah, this joke leans on the imagined, almost mythic figure of the rustic generalist who fixes everything from broken bones to sick livestock. Those types might’ve existed a century or more ago, but now it’s all cut up into specialties and licenses.

ETA: ChatGPT is better than Mycroft HOLMES IV when it comes to understanding jokes. ChatGPT isn’t self-aware, though (we think).

I don’t think so. I asked Copilot to draw a tesseract. It first drew the top of the cube without the inner cube. Then the top of the inner cube showed up, but all the sides (of both cubes) extended downward indefinitely, looking like rectangles with a blurred bottom.

ChatGPT may not be self-aware, but it clearly has good taste and a good sense of humor since it liked my joke :wink:

Midjourney just does the entire image in passes, clarifying it each time. For example, here’s a modified version of the rabbit prompt about half done (I wanted to make sure to include her face, in case it mattered, vs. a top-down view of her head).