AI image generation is getting crazy good

In the ancient AI days of Stable Diffusion 1.5 I was attempting to create an image of shelves full of jars full of heads in the style of Futurama. One of the images generated was, unexpectedly, this:

It had really interesting potential, so I ran lots of new prompts to refine what I had stumbled on accidentally: a redhead standing in front of shelves full of colorful jars. That produced images like this:

And this:

(Being SD 1.5, showing her face wouldn’t have been especially great.)

I recently revisited the concept in Sora:

Realistic iphone 15 photo of A redhead girl choosing a jar from a set of shelves filled with colorful jars filled with a large variety of strange fruits and strange deep sea fishes.

Hippos in their natural habitat.

The latest r/ChatGPT trend is “design a job poster advertising for my dream job, based on what you know of me from our conversations.”

Here’s what I got from that prompt when I didn’t first enable the cross-chat contamination feature. That’s what I call it, anyway, because 99.999% of the time, no way do I want my streams to be crossing. It’s hard enough keeping the AI on track and not ignoring critical story details in the current mainline.

I suspect that it gleaned all that from its system prompt.

A single image that led in two directions. I saw this image in a FB AI group (no prompt given):

I wanted to play with elements of that—the high camera angle, reaching hand, and washed out colors. So I asked ChatGPT to give me a prompt that would reproduce the image. Here’s the prompt that ChatGPT came up with (minus the part for creating the superfluous text):

A highly stylized, high-angle (top-down) portrait of a girl standing in a brightly lit, pastel-colored room filled with Japanese vending machines and arcade elements. The girl looks up directly at the camera with wide eyes, reaching one hand toward the lens in dramatic foreshortening. She wears a short black skirt, graphic anime t-shirt, red accessories, and dark sneakers. Her hair is styled in two high pigtails with red ribbons. A curious cat looks up from the floor beside her. The lighting is overexposed and dreamy, with a soft glow, creating a surreal, playful mood. Background is decorated with floating shapes, retro icons, and soft shadows. Inspired by Japanese pop culture, vaporwave, and 2000s kawaii aesthetics. Shallow depth of field. Rendered in a soft cinematic style with pastel overtones and warm highlights.

That prompt doesn’t reproduce the style exactly, and I modified it a bit to get more realism:

Then I made it more grungy:

I wanted even more grunge, and thought of a modification involving blobfish (I love ChatGPT’s blobfish) and changed the anime shirt to an Engrish one:

Then I went for more cheerful Engrish to contrast with the grunginess:

The prompt for that:

A high-angle (top-down) photo of a Japanese girl standing in the hold of a fishing boat. The distressed girl looks up directly at the camera with wide eyes, reaching one hand in a wave urgently toward the lens in dramatic foreshortening. She wears a short black skirt, fishnet tights, Cheerful colorful illustrated Engrish t-shirt (that is dingy), and red accessories. Her hair is styled in two high pigtails with red ribbons. She is disheveled. She is standing knee-deep in blobfish filling the hold. iphone 15 photo with shallow dof.

Meanwhile, the grungy arcade machines gave me an idea for a liminal image: put a bunch of old broken machines in a run-down arcade (in the “long, covered building” definition of the word). That idea led to some images I’m really pleased with:

(I’m really curious what that UFO-shaped game in the bottom horizontal image is based on.)

Tossed in some yokai:

The prompt for that:

An abandoned long arcade. The arched glass roof is very dusty and has broken panels that allow weak, dusty beams of sunlight to filter down to the stained tile floor. The long, dingy room is filled with battered, broken Japanese vending machines and gaming machines. Yokai are wandering around. iphone 15 photo with shallow dof and forced perspective.

Looking just at the first pic you found and the second pic = the first pic you made …

To me the figure in the first / found pic reads as about age 20. Might be a couple years older but with makeup & hairstyles trying to look a younger 16-18.

The figure in the second / generated pic reads as about age 10. Might be a couple years younger, but with makeup & hairstyles to look more like 14-16.

I see the terms “girl” and “kawaii aesthetics” in your prompt. If you swapped “woman” for “girl” I wonder what it would do?

In a larger sense I’m wondering about what ages the AI associates with various looks. And whether its “default age” is more like 10, 20, 40, or 60.

Woman didn’t make any substantial difference.

Elderly woman, however…

Replacing woman with wombat

Fascinating; thank you.

The cat got increasingly alarmed as your images went on. Perhaps it’s scared of what the next request will be.

I threw “Photo portrait of a XXX in an arcade” into Midjourney using some vaguely age-identifying terms. Results are the pre-upscaled roughs in draft mode.






Almost forgot this one, “Aunt” which sort of falls between “Mother” and “Grandmother” for age looks

5025: After centuries of AI spoon-feeding us, humans devolved, allowing pets to evolve and run the show.

I just tried something similar with Copilot (Photo portrait of a x in an arcade). Strangely, it appears to think that everyone in arcades wears denim jackets. (I ran each prompt in a new session, so no details are leaking between images.) This is girl, tween, female, woman, mother, grandmother.
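For anyone who wants to repeat this kind of sweep, here is a minimal sketch of building the prompt list from one template. It only produces the prompt strings; sending them to Copilot or Midjourney (one fresh session per prompt) is left out. The name `build_prompts` and the a/an handling are my own assumptions for illustration, not part of either tool.

```python
# Hypothetical helper: fill one prompt template with a list of age-related terms.
TEMPLATE = "Photo portrait of {article} {term} in an arcade"

# The terms used in the run described above.
TERMS = ["girl", "tween", "female", "woman", "mother", "grandmother"]


def build_prompts(terms, template=TEMPLATE):
    """Return one prompt per term.

    The article is picked by the first letter only, which is a rough
    heuristic (it would get words like "hour" wrong).
    """
    prompts = []
    for term in terms:
        article = "an" if term[0].lower() in "aeiou" else "a"
        prompts.append(template.format(article=article, term=term))
    return prompts


if __name__ == "__main__":
    # Print the prompts so each can be pasted into its own new session.
    for prompt in build_prompts(TERMS):
        print(prompt)
```

Keeping everything except the term fixed is what makes the results comparable: any difference in apparent age should then come from the one word that changed.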

I kinda laughed at the term “coed.” That’s one I haven’t heard in decades. What about “wife”? I feel like when I tried it a while ago, it trended about mid-30s.

Absolutely dated but I’ve found it useful for “Little older than a teen” depending on the model.

Little younger than I remember. You could see if “housewife” nudges it up.

Photo portrait of a anthropomorphic denim jacket in an arcade

Actually, if you removed the cat & dog you’d have a portrait of a Saturday evening in one heck of a lot of US households.


Amazing outcomes on the various age words. It would be interesting to compare the apparent ages of the corresponding male & female words. I might choose “frat boy” and “sorority girl” as the terms calling for the ~20 demographic.

I am wondering whether we’d find that the same terms have an age bias between genders, e.g. “woman” looks mid 20s but “man” looks late 20s or even mid 30s. “Middle aged man/woman” might be another loaded term to feed it that would give us different ages for males and females.

I have to say its ideas of aunts and grandmothers are far older-looking than the ones I date.


I’m sure not trying to use you all as the AI driving the AI to do my bidding. But if you find the questions interesting, have at it. :wink:

I assure you that is not a photo of my family on a Saturday evening, sir!

It was shot Friday evening.

Boy/man/father/grandfather. This time we get the first person without a denim jacket and the first person who isn’t white.

At this point I ask:

Why does every image on someone in an arcade have them wearing a denim jacket?

And the reply:

It’s like the unspoken dress code of nostalgia—denim jackets practically come with a side of synthwave. Arcades often evoke that golden era of the '80s and '90s, and nothing channels that vibe quite like a faded denim jacket. It’s cinematic shorthand: the rebel, the romantic, the joystick warrior. Think about it—denim has that timeless, effortlessly cool quality that plays well under neon lights and reflected CRT glow.

Plus, from a visual standpoint, the texture and color of denim catch ambient light in a way that adds depth and contrast in a chaotic arcade environment. It’s the perfect foil to flashing cabinets and wild lighting—less a fashion statement, more of a mood enhancer.

But now I’m curious: what if we broke the trope? A grandfather in a crushed velvet blazer or a kimono at the skee-ball lane. Want to try spinning the stereotype?

So I ask for the blazer over a kimono.

What, no fez? What was that AI thinking! :grin:

Seriously, thanks for the research. Verrrrry interesting as Arte Johnson used to intone.

You don’t watch TV, so this won’t make immediate sense to you.

“Sorority girl” gave a more limited vibe versus “coed” in my very limited testing (i.e. running it once).

Running males, I noticed that I actually got some minorities versus the slew of white women I got above. I suspect that pretty white women are over-represented on the internet and in media for obvious reasons and so are more likely to be the result the model comes back with.

I also ran “Kid” but it came back looking identical to “Boy” so I didn’t bother saving it. But it’s also context-dependent: saying “Photo portrait of kid at a pony show” gave me three girls and one kid of indeterminate gender. So apparently it thinks a “kid” at the arcade is likely to be a boy. However, when I previously asked for “Teen”, it gave me three girls and a boy. So its concept of what people are asking for by just the word “Teen” seems to outweigh that context. It also means I left “Teen” out of the run of males since we already know what it’ll output. “Boy” actually returns a preteen kid versus “Girl”, which returns a teenager, likely owing to how the terms are used in society.

Anyway, here’s some guys…







Finally, since I ran “Middle Age Man”…