AI image generation is getting crazy good

It does a pretty good job with Skylab if you provide a reference photo, but doesn’t do a good job of docking Apollo to it. (To be honest I’m not sure how that should have looked myself.)


Apollo Creed docked at Skylab. It should be the character Apollo Creed from Rocky, except he is like a spacecraft docked to a port of the Skylab space station orbiting Earth. photorealistic dslr photo with shallow dof in a style suitable for a classic Time magazine cover photo (but without cover text).

I gave it a reference image this time. It got several other details right that it hadn’t before, but it still couldn’t get the orientation of the docking adapter/telescope mount right, with its four solar panels spread out like a helicopter’s blades. I give up.

Now it got the solar panels and ATM right for you. Ugh.

FWIW, I used this reference photo.

And I used this one. Not sure why yours was better for its understanding.

It looked clearer to me. I just now picked this image of Apollo to upload to Sora.

And this prompt


Apollo docked at Skylab. It should be the Apollo Command and Service module docked by the nosecone to a port on the side of the Skylab space station orbiting Earth. photorealistic dslr photo with shallow dof in a style suitable for a classic Time magazine cover photo (but without cover text).

The attached photos are examples of Skylab and the Apollo Command and Service Module.

But I’m out of free generations for now. You might want to try it (with those two photos).

Took me a moment to see how Harpo and Teller fit together. (Took me another moment to add the mima.)

Long ago in an Olan Mills far, far away…

(That was from a four image set. The other image was very similar to this one but two of them “failed”. I wonder in what way they differed so as to trigger censorship.)

Whereas I immediately recognized Teller, and then my mind automatically extrapolated that the person next to him must therefore be Penn, despite looking nothing at all like him. Once you identified him as Harpo, the pattern fell neatly together.

Though the portal shouldn’t be floating in midair.

It landed on the invisible box Marcel Marceau was trapped in.

Just screwing around making fake album covers. Decent likeness of both. I’d recognize them without the title. I should have asked it to give it an album name involving mustaches.


I think this pairing makes a better team-up.

I thought of doing a Cylon and Garfunkel like the one that appeared on Futurama, but decided that Garfunkel isn’t a recognizable enough figure.

Today’s project: I really admire the realistic, atmospheric film and lighting styles I see in some AI photos (often in places that don’t automatically share the prompt) and want to experiment more with those. I wanted a scene in heavy rain with misty air. Had to come up with a subject, so I imagined a girl sitting, bored, sheltering from the rain. With a capybara. Googled around until I found a reference image of roughly how I was seeing her seated. Had to decide what she was sitting on; considered wooden pallets, then went with a bench under a bus shelter.

(Reference image)

So I go to Sora, describe the scene in detail and provide the reference image as well to illustrate the pose I wanted. Ran into a fairly unexpected problem: Sora/ChatGPT just doesn’t seem to be able to imagine her with her foot lifted up and resting on the bench beside her, and just hangs her foot in mid-air. Had some otherwise pretty nice images spoiled by the stupid floating foot. After a few tries with prompt permutations (including dumping the reference and going text only) I decided that particular detail wasn’t a hill I cared enough to die on, so I tried to think of something else for her foot to rest on and went with a backpack. And it doesn’t want to put her foot on the backpack either. (Apparently the capybara wants to sit there, though.) Kind of surprising that out of all the details it can get more or less right, that is the one that stumps it. I eventually dumped the backpack, too.

Got a number of nice images, but never got the foggy look I was originally looking for no matter how I described it until I finally tried changing the film from “modern smartphone” to “1980s film stock with a disposable camera”, resulting in the bottom right photo, which is the closest to my original idea.

The final prompt for that one:


A pretty, slender teen girl. She has shoulder-length brown hair that lies limp with water (a strand or two fall down around her forehead or face). She wears a worn tan M-65 field jacket (open to expose a white cropped tank top underneath), fashionably-frayed faded black jeans and bulky white sneakers. She is sitting on a long art deco park bench under a long, simple, somewhat old, slope-roofed bus-stop style shelter. Her elbow is resting on her bent knee and her hand is cradling her head in obvious boredom. The other arm is resting on her other leg, her hand hanging relaxed. Close by her side on the bench a serene but wet capybara is sitting upright. It is raining heavily, and though she is protected under the shelter rivulets of water run from its roof. The air is misty with heavy rain and fog but in the background you can see the colorful lights of a tall futuristic city of giant skyscrapers at night across most of the horizon. Candid, completely realistic flash photograph, the subjects slightly obscured by fog as the minute suspended water droplets reflect the camera’s flash. With shallow dof and forced perspective taken on 1980s film stock with a disposable camera. It is a profile taken with a wide-angle lens from a medium distance, giving the girl, capybara, and bench a sense of lonely isolation in the rainy night. 9:16 portrait.

Bonus art styles

BTW, I wondered when this would eventually happen, but Night Cafe has apparently added ChatGPT 4o for paying customers. They have three versions, which they call GPT low, GPT medium, and GPT high. (It won’t even show me how much the images cost without a paid subscription.)

I’d like to see Garfunkel rock out.

I suddenly realized how to make the batch/script prompt idea discussed a couple of weeks ago work! It came to me to try a variation on the “five random words” prompt.

Here’s the original prompt, reworked:


“Candid waist-up iPhone capture of a young woman with tousled mid-length hair, paparazzi-style framing with tilted composition and flash glare, amateur aesthetic featuring pixelated texture and uneven focus.”

Randomly choose one of these six phrases to add to the above prompt, completing it:

“Nighttime cafe window reflection with rainy glass distortion”

“Crowded metro station with motion-blurred commuters behind”

“Overgrown alleyway lit by flickering neon signs”

“Sunny park bench with blown-out sunlight through tree leaves”

“Concert exit chaos with streaked car headlights”

“Laundromat interior with fluorescent tube lighting casting green hues”

And the results from a four image set

It did the laundromat twice. Not sure how, or if, you can adjust it to avoid repeats within an image set (since each image in the set is a separate run of the prompt).
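For anyone scripting this outside the generator itself, here is a rough sketch of how the no-repeats part could work. It assumes you build the finished prompts yourself in Python and submit each one as its own generation; `build_prompts` and the constant names are made up for illustration, and nothing here talks to Sora or ChatGPT. The idea is that `random.sample` draws without replacement, whereas an independent random pick per image (which is effectively what the in-prompt "randomly choose" does) can land on the same phrase twice.

```python
import random

# Base prompt and setting phrases copied from the post above.
BASE_PROMPT = (
    "Candid waist-up iPhone capture of a young woman with tousled "
    "mid-length hair, paparazzi-style framing with tilted composition and "
    "flash glare, amateur aesthetic featuring pixelated texture and uneven focus."
)

SETTINGS = [
    "Nighttime cafe window reflection with rainy glass distortion",
    "Crowded metro station with motion-blurred commuters behind",
    "Overgrown alleyway lit by flickering neon signs",
    "Sunny park bench with blown-out sunlight through tree leaves",
    "Concert exit chaos with streaked car headlights",
    "Laundromat interior with fluorescent tube lighting casting green hues",
]


def build_prompts(count, rng=None):
    """Return `count` finished prompts, each with a different setting phrase.

    Hypothetical helper: it only builds strings, it does not call any
    image-generation service.
    """
    rng = rng or random.Random()
    if count > len(SETTINGS):
        raise ValueError("not enough setting phrases to avoid repeats")
    # sample() picks without replacement, so no phrase appears twice.
    chosen = rng.sample(SETTINGS, count)
    return [f"{BASE_PROMPT} {setting}." for setting in chosen]


if __name__ == "__main__":
    for prompt in build_prompts(4):
        print(prompt)
        print()
```

Each printed prompt would then be pasted in (or submitted) as a separate single-image run. That does not fix the in-prompt version, since the separate runs in one image set still have no way of knowing what the others picked.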

“Baltimore Orioles uniform based on the Old Bay packaging”
