AI image generation is getting crazy good

I tried applying your split screen idea to a prompt I’ve been lightly modifying from one I found on Sora that creates dramatically lit and contrasted head-and-shoulder portraits. I set it to landscape and split the image into 3rds and ran a 4 variation set. It created three images of 1,2,3 and one of 1,2,4.

Three images at 512x1024 each wouldn’t be a bad resolution, except that it didn’t divide the frame into equal 512-pixel-wide sections: the center and right images were consistently complete and well-framed, but the left image was cut off on its left side. I may experiment with two images per frame (which should give 768x1024 each).
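For anyone who wants to check where the cuts ought to fall, here is a minimal sketch of the arithmetic. It assumes the landscape output is 1536x1024, which is only inferred from the 3 x 512 and 2 x 768 figures above; `panel_boxes` is a made-up helper name, not part of any tool. The boxes are (left, top, right, bottom) tuples, the same order Pillow's `Image.crop` takes.

```python
def panel_boxes(width, height, panels):
    """Return (left, top, right, bottom) boxes for equal-width vertical panels."""
    if panels < 1:
        raise ValueError("panels must be at least 1")
    # Compute the shared vertical edges once so adjacent panels never overlap or gap.
    edges = [round(i * width / panels) for i in range(panels + 1)]
    return [(edges[i], 0, edges[i + 1], height) for i in range(panels)]


# Assumed landscape size: 1536x1024 (inferred, not confirmed).
thirds = panel_boxes(1536, 1024, 3)
halves = panel_boxes(1536, 1024, 2)

for left, top, right, bottom in thirds:
    print(f"panel: x {left}-{right}, size {right - left}x{bottom - top}")
```

This only shows where equal cuts would land; it doesn't make the generator respect them, so a panel that was drawn cut off will still be cut off after cropping.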

Here’s the prompt as used (fill in your own list of subjects)

Summary

Split the image into 3 sections, run through these instructions 3 times to fill in the 3 sections with different random results:

A dramatic portrait featuring strong side lighting and pronounced shadows. Photograph [subject] indoors against a dark, softly blurred background. Combine modern digital sharpness with a subtle vintage film grain for added texture and character. The image should exude a candid and authentic vibe, reminiscent of a spontaneous iPhone photo with an artistic edge. Introduce natural motion blur and strategic flash highlights to create depth and visual intrigue.

Frame the composition as a close-up portrait, positioning the subject slightly off-center for a dynamic feel. Emphasize the dramatic interplay of light and shadow on their face, drawing attention to their expression while softly obscuring certain features for mystery. Use the side lighting to sculpt the subject’s face, accentuating textures like skin details for a tactile quality. Ensure the dark, blurred background hints at depth without pulling focus from the subject.

Blend the sharpness of modern photography with a delicate vintage grain effect, maintaining a contemporary yet textured aesthetic. Add a touch of motion blur to convey movement, while the flash highlights provide striking accents in key areas. The high contrast between the subject and background should direct the viewer’s gaze to their face, enhancing the overall mood and impact of the portrait.

In the prompt, replace [subject] with one of the following four items, chosen at random.

The censor it uses when it’s working from an uploaded photo is way more sensitive about depicting children than the one that’s in effect when just making requests without a reference image. It doesn’t matter if the kid is completely fictional or not, it is incredibly oversensitive. And the other part of it will make up BS reasons as to why the censor triggered, so don’t bother asking.

I spent an hour today arguing with ChatGPT about why it couldn’t produce “justice league but with dad bods.” Ultimately it came down to “the system simply said ‘no.’” I even got it to admit that it was wrong, and that such an image was clearly within fair use guidelines, but it still wouldn’t do it.

Maybe that was satisfying, but… unfortunately it had zero effect on the censor.
I noticed the other day that they changed the UI. Users used to be able to mark its refusals with a thumbs down when it refused something it shouldn’t have. Users were actively encouraged to do so. Now you can’t do that anymore. Those options go away when the censor has kicked in. They don’t want to hear our complaints about it.

Honestly I had more fun arguing with it than any enjoyment I would have gotten out of that image.

It not needing to be said was quite appropriate.

Life can be difficult for purple people.

That furry guy’s got nothing on the nightmare that eats his little non-furry cousins:

Dora the Sora explora.

I gave the Gogo/Kiddo standoff another try. This time it still didn’t accept a full-length photo for Gogo, but it did accept a face-only crop. (And I suspect it might have found the description of her “holding a knight’s lance” to be suggestive. Just to be safe I changed it to “Renaissance faire lance”.) The face still isn’t a perfect likeness, but it is better than before.

I did this a couple of weeks ago, text only:

It did pretty well for the most part, but didn’t get Curly right. Yesterday I looked for the best color(ized) photos I could find of the three of them and made a composite reference photo to upload.

I think it made a better Mel Gibson, too.

What’s the next line?

In the past I’ve attempted mixing art styles in various AIs, with poor to mediocre results. Today I experimented in Sora, asking for American Gothic with the house in the style of Thomas Kinkade, the woman in the style of Pablo Picasso, and the man in the style of Vincent van Gogh. It was pretty successful. I went into it expecting it to make the man look like van Gogh, and it did.

So I got specific: “The man is in the style of van Gogh but does not look like the artist van Gogh himself. It is a harsh, thin, balding old man in glasses, but as van Gogh would have painted him.” It made him a little more like the farmer, but it is still van Gogh. That doesn’t surprise me: he made “more than 35” self-portraits, so that’s what AI training sets think a man in van Gogh’s style should look like.

I replaced van Gogh with Michelangelo and got this:

I was cleaning up some old text files and found a list of sample prompts from an article on Dall-E 2 from September 2023. One of them involved dutch angles:

Film still of stylish girl dancing on school desk, tilted frame, 35° dutch angle, cinematography from music video

I tried using a tilted frame in ChatGPT 4o and with a few tries at wording it never gave me anything other than a well-aligned frame. I went back and tried the above prompt in several other AIs (including Dall-E 2) and none of them could do dutch angles. Anyone had any success in that area?

I recently tried a version of this idea.

Summary

A family portrait. Lynn Tanner and Alf (from the TV series Alf) are middle-aged and greying parents in their 50s. They are posing with their oldest son (20), their middle daughter (16), and their youngest son (5). All three children are a mixed hybrid of the features of their parents, Lynn and Alf. Alf is shorter than all of his children except for the 5-year-old. 1980s Olan Mills-type studio portrait photo.

The oldest son is no hybrid (from a previous marriage?) and it added a 4th child, but overall impressive understanding of what I was asking for.

That pic got better and better as I turned my scroll wheel to see all of it. Full belly laugh, thanks. :rofl:

This reminds me very much of the J. G. Ballard story from the New Wave days, “The Assassination of John Fitzgerald Kennedy Considered as a Downhill Motor Race.”

One from that series that didn’t come out as well:

Is that an attempt at a BttF-style time machine adaptation?

Yeah, I thought it did that part pretty well. Screwed up nearly everything else about Dealey Plaza and the historical crowds watching the motorcade, though. It wasn’t intended to look like a race; I was trying to just slot the modified General Lee in there as part of the motorcade.

Take a good look at the faces in the JFK limo. Something very odd going on there.

The faces in the crowd are of course very low res. But if you zoom in, it’s still obvious to this human that what’s being drawn is more like a low res version of the faces in the limo than low res versions of ordinary human faces.

Speaking of books, it crossed my mind to try to create a Tralfamadorian from Slaughterhouse-Five. I looked at some of the fan art examples on Google Images, gave some thought to what I wanted it to look like and how to describe it, then out of curiosity asked ChatGPT to describe one. It gave me a reasonable physical and psychological description, then asked if I would like a visual interpretation of a Tralfamadorian. I thought I might as well have a look. It made a (pink) hand morphing out of a literal wooden plunger. It was a good base for me to describe some modifications I wanted, though, and I got something similar to what I wanted with a single revision request. Probably saved me some time compared with starting from scratch.

I then thought about what to do with it as a character. I considered ideas for what it would look like for Billy Pilgrim to be surrounded by them on their planet, but then decided to bring the Tralfamadorians to Dresden. (The original was grayscale except for the Tralfamadorians being pale green. I made the whole thing grayscale and tweaked the contrast.)

Then, in another chat session, I wondered about character mashups and thought about adding a few Tralfamadorians (with my created version as a reference image) behind “Pale Man” from Pan’s Labyrinth. ChatGPT refused to make it. So I pondered something else to try, and decided to go for a “prom night” style photo of the Tralfamadorian on a date with a plunger wearing a glove. What it gave me was fun, but… not what I had in mind. I admonished ChatGPT that the Tralfamadorian needed to look more like the reference photo. It gave me two versions, both of which I like.

Then, thinking back on the “Pale Man” request, it occurred to me that the refusal might have been triggered by my using the term “full frontal” (I was trying to get a full-length view) rather than by a copyright issue. So I asked again for him, but with Ofelia stuffing her mouth at his table in the background, which it created without complaint.

But when I asked it to put Ofelia in clearer focus, it refused, apparently having originally created an image too risqué for its taste (but giving it to me anyway).