AI image generation is getting crazy good

Even ChatGPT is afraid of Disney’s lawyers!

Yep. ChatGPT’s censor won’t even allow the parts of Disney that have fallen into the public domain.

Closest I could get Copilot to come.

Some pieces of the recent Ohio meteorite fell on land belonging to Morton Salt. I uploaded the logo to Gemini and simply told it to replace the rain with small meteorites. Got this with the first try.

You’re right. He’s not, like, the perfect example of the average Florida man. He’s like... the aspirational, platonic ideal of a Florida Man. What every Florida Man could be if they were the best versions of themselves as they would see it. I may or may not have created a whole Florida Man cinematic universe. I showed Claude an image and asked him to write a biography: his name is Buck Dale Malone Jr. and he was born at a Lynyrd Skynyrd concert in 1981. His backstory ended up going pretty deep and kind of hilarious.


Here’s a Polynesian fire dancer like the sort of guys I saw in Hawaii. I absolutely love this character. I want to preserve him and iterate on him to a degree. I think I need to learn how Midjourney’s “omni reference” system works.

I’m sure you guys are generally aware of this, but you get the best, most detailed, most realistic (if you’re going for realism) images when there’s a lot of training data (lots of real-world pictures), the elements you want in the picture are salient in that training data, and it’s all sort of congruent. So replicating beautiful real-world scenes is trivial and the quality is absurdly high. Any image with plausible people in plausible situations works great. Any realistic landscape works great.

So you can easily crank out a million versions of something like this. But it’s still really beautiful.

This one was actually inspired by a real shot I took in Hawaii - you can feed external images, like real photographs, into Midjourney and use them as an image prompt (to preserve composition) or as a style prompt. The real/external image serves as a sort of inspiration for creation.

I also adore my little cheesecake kittens.

I sort of stumbled onto one through a vague exploration prompt and loved it so much I made a whole series of them like they were collectible figurines.

Look at how adorably they animate

They need to let the poor thing blink sometimes.

It’s a quirk of nature that cheesecake kittens are born with no eyelids. They don’t let that hold them back.

You’re right - actually, that’s a little unusual. Here’s a different take on a cheesecake kitten that blinks plenty.

It looks kind of like those eyes are button-eyes, like on a stuffed animal. From a few angles, you can see an edge.

Yeah - the prompt I used to create them made them more like little plushy figurines than actual animals. But the animation system is versatile, so it renders them like a toy that came alive. The animation I posted here looks a bit like a toy kitten animated in a Pixar style.

Perhaps the world is now ready for moving mewing seemingly live bonsai kittens? :smiling_face_with_horns:

The world sure wasn’t ready 25 years ago for the still pic version and somebody nearly got lynched over it:

I’m reminded of the genetically-engineered kitten tree, in one of the Vorkosigan novels…

A prompt I saw on Reddit

High school dance circa 2012, taken with iPhone 4

Modified

High school dance circa 3012, taken with iPhone 4

High school dance in Japan circa 1912, taken with iPhone 4

High school dance for cryptids circa 1984, taken with iPhone 4

Those are all Copilot. Here’s what Gemini made of the original prompt

Those original prompt 2012 versions are scary realistic. I would not want to play “Is it real or is it AI?” with a stack of those kinds of images.

It is never clear what each system will allow and what it will block.

There is a TV series called Collector’s Call where each week Blair from The Facts of Life visits a different collector. They’ve been showing commercials for an episode with a Smurf collector and called the upcoming episode “Smurflector’s Call”, which sounds like a parody of Silence of the Lambs.

So…

Create a photo of a hybrid of Hannibal Lecter and a smurf.

Copilot said “nope”.

I can’t create that image.

It combines a real, copyrighted character (Hannibal Lecter) with a trademarked character (a Smurf), and I’m not allowed to generate images that depict or blend protected characters.

But ChatGPT was fine with it

And so was Gemini

I told Gemini “Replace the food with fava beans and a nice Chianti”.

Grok is famous for willingly generating CSAM so I’d be surprised if it filtered out much of anything.

I’ve noticed that Microsoft / Copilot is the most lawyerly “cover your ass” of all the LLMs. It obviously has system prompts designed to keep users from weaponizing it by getting it to say disparaging things about certain groups or generate problematic images. It plays it WAY safe.

Edit: Obviously the CSAM thing is a huge negative about Grok. I’m not endorsing it. I’m just saying… if you end up running against safeguard boundaries, you can probably bypass them with Grok. But who knows. Maybe it lets you generate CSAM but it would stop you from generating “Donald Trump drawn like a pig.” Musk definitely puts his thumb on the scale.

Actually… that’s an interesting test. Does anyone have a Grok account? See if you can draw Elon in unflattering ways.

You can. Musk approves.

And Grok was never a free-for-all when it was free for all. I had plenty of times when it refused to make something.

Unflattering? That’s the most flattering swimsuit photo featuring Elon ever posted on the internet.

I noticed a post on Reddit about the screen placement problem being solved. So I did some experiments in Copilot.

A girl shocked by what she sees on her tablet. Iphone 15 photo with shallow dof and forced perspective.

A girl shocked by what she sees on her laptop. Iphone 15 photo with shallow dof and forced perspective.

A girl shocked by what she sees on her phone. Iphone 15 photo with shallow dof and forced perspective.

Then I tried

1970s photo of a girl shocked by what she sees on her phone. Instant camera photo with shallow dof and forced perspective.

But it didn’t get from context what type of phone should be shown, so I got explicit

1970s photo of a girl shocked by what she sees on her landline rotary phone. Instant camera photo with shallow dof and forced perspective.

And more explicit

1970s photo of a girl shocked by what she sees on her landline rotary phone receiver. Instant camera photo with shallow dof and forced perspective.

But then back on topic

A group of bored and surly teens sitting around looking at their phones and tablets paying no attention to each other. Iphone 15 photo with shallow dof and forced perspective.

One thing is sure: the stock photo market is dead in the water … when you can get instant stock photos on demand ™

On a more “philosophical” plane: What do we think will happen with the credibility of real photos over the next 10/20/30 years? …

My guess is that the photo as a medium will lose a lot of its credibility … to a certain degree we can already see that happening today …

DJT w/ Epstein → fake news!!!