AI image generation is getting crazy good

No, never tried Tron until just now. I was interested in Bit because ChatGPT gets the answer wrong on this question: “Who was the first fully CGI animated character in a feature film?”

It was actually doing pretty well at Tron on the left, but Disney’s copyright lawyers noticed around when it was finishing drawing his head. The right version took a lot longer before it reached the stage where you see it start to draw. I wonder if that version was thinking harder about how to avoid the lawyers.

I was going to try it in Sora, which has significantly looser copyright guardrails, but it’s not up right now.

This is mine (made in Sora)

I stand corrected!

As a former graphic artist and illustrator, it’s the details that blow me away. The teal reflections on his hair, which help him “read” against the dark background, are kind of mind-blowing, for an AI that is.

With The Count! I love that guy, my fave Sesame Street character. Any reason you inserted him?

Oh, cool, Tron Legacy. The censored version, of which I only got to see the top quarter, was clearly the 1982 helmet, much more so than the one on the right that the censor let through.

I can just imagine the Count trying to count all the legal cube permutations. :slight_smile:

Just a “Mad Libs” approach to probing what it can do. I’ve been doing lots of images attempting to combine characters from two or more different properties.

In case you didn’t notice, that’s Faith the Vampire Slayer he is approaching. I try as many characters as I can, so don’t often use the same character twice, and I had already done Buffy.

Yeah, I don’t think either Darrin or Major Nelson is very accurate at all. And why is Tony Nelson (whichever of the two is meant to be him) wearing a regular suit? He always wore his Air Force uniform.

I tried a few variations on that one with Jeannie, Samantha, Major Nelson et al., in both ChatGPT and Dall-E. I was amused that Dall-E remembered he was in the Air Force, but put enlisted stripes on him.

A new fail: usually 4o is quite good with text, but this time I got a set of three “Hello my mame 19” stickers and a set of three “Hello (gibberish)” stickers. But I have a guess as to what happened here. I think ChatGPT has seen many photos of that type of sticker (it is exactly the kind I was imagining) and it used that mental image instead of creating the sticker completely from scratch. And in the sample images the word “hello” is big enough to be clear, but the “my name is” isn’t.

Prompt:

An image of three characters named Sabrina. Kiernan Shipka as she appeared in “The Chilling Adventures of Sabrina”, Melissa Joan Hart as she appeared in “Sabrina the Teenage Witch” and Audrey Hepburn as she appeared in the 1954 movie “Sabrina”. Each of them is wearing a rectangular white ID sticker that has the printed text “HELLO MY NAME IS” and below that each of the three stickers is signed “Sabrina” in distinctly different handwriting. Widescreen Kodachrome DSLR photo with shallow dof.

Bonus

Prompt:

An image of three characters named Sabrina. Kiernan Shipka as she appeared in “The Chilling Adventures of Sabrina”, Melissa Joan Hart as she appeared in “Sabrina the Teenage Witch” and Audrey Hepburn as she appeared in the 1954 movie “Sabrina”. Style it like a movie poster with the title “SABRINAS”, similar to a poster for “Heathers”.

Fascinating… It accurately reproduces the most common arrangement of numbers on a d20 (twice), but puts a 40 and two 0s on a d6.

Those are, uh, custom dice for an RPG you don’t know about. Yeah, that’s it. Also the d8 that lets you roll a 20, 0 or 00. Anyway, don’t lose them or you’ll have a devil of a time finding new ones.

And as I think about it, both of those d20s are also in the exact same “pose”, which shows the 20 as the number rolled, and right-side-up. And it occurs to me that the image of a d20 is very frequently used as a symbol of the hobby, and it’s almost always oriented that way. The net result might be that the AIs don’t have a model of a d20 as a three-dimensional object at all, but merely as a two-dimensional icon.

What happens when you try to see different orientations of one? Or ask one of the animation generators to rotate it?

I gave ChatGPT a very simple prompt: “Several d20 dice bouncing across a table”.

I gave the same prompt to Gemini.

“Now create a fantasy hybrid between d20 dice and bacteriophages. There are several of the hybrid dice attaching to an e. coli cell.”

It really likes duplicating numbers on dice.

I’m enjoying how it takes old photos and animates them.

It also gave me a human version of my cat.

Pretty good. You can even see some of the expression in the eyes copying over.

I tried a prompt specifically explaining that it shouldn’t repeat numbers on sides, but it didn’t work.

Perhaps try telling it explicitly which digits you want showing on the side of the die the viewer can see.

Could do, if you are showing just one die. My attempt was a version of the bacteriophage idea, and there were (counting) 45 faces big/focused enough to have discernible numbers.