I’ve been trying to get images similar to the first one I posted (Yokai Parade) by giving prompts that lead in that direction. Tonight I tried “Full Moon Over Tokyo Bakemono”. I didn’t get the nightmare creatures I was hoping for, but I did get one of my favorite images so far.
Has anybody taken a look at OpenAI’s improved text-to-image system, DALL-E 2? I don’t know how curated the images are that are presented on the site, but still, that’s a new level of impressive…
Speaking of invisible, has anyone else had cases where the AI starts by painting the whole canvas white, and then at the last minute realizes it needs to toss in a little bit of detail? It happened to me with “Flag of Ukraine”:
And then again with “Can we fix it? Yes we can!”
(but oddly, not with my earlier “Polar bear in a snowstorm”)
Oh, and I think this one is recognizable, but then, I know what the prompt was:
What do we think? I pity the fool
It has got to be doing some sort of lookup, like Chronos suggested. Here is Fin Fang Foom and Doctor Doom Sitting in a Waiting Room.
It is obvious that the system has some idea of who Fin Fang Foom is. (Doctor Doom, on the other hand, looks more like a profile of Bender Bending Rodríguez.)
It may have gone unnoticed, but I spoilered it below the image: The prompt was, literally, “I pity the fool”.
I’m not sure why it gave him so much hair: Mr. T’s mohawk is probably his most distinctive feature (along with his heavy chain necklaces, which it also partially included). Maybe it has some sort of reversion to the mean going on, that it tries to make any picture of a person look more like “typical human”?
Experimentation continues. To my surprise, it was able to correctly deal with a prompt in Latin (albeit a very familiar bit of Latin):
‘In principio dixit Deus “Fiat lux!”, et facta est lux’
That’s definitely God creating light, if we accept that God looks like a character from a goofy Adam Sandler movie.
Emboldened by that, I tried Sindarin, which I think maybe worked?
“A Elbereth Gilthoniel silivren penna míriel o menel aglar elenath!”
No such luck with the Black Speech or Khuzdul, though:
In my ongoing quest to generate a whole series of similar creepy images (I want to make a series of “photographic slides”) I tried Gashadokuro at Fuji Shrine.
At first I couldn’t discern any sign that it knew what a Gashadokuro is. But then I realized, yeah, that’s a skull and hunched back eclipsing the moon, and stunted arms reaching out. I consider it a failure at being a good image, but I can see the influence.
I wish I had known that before. I wonder if there is any way to generate a 16:9 (or even 4:3) recomposition of works I have already made? Would re-entering the same prompt generate the same image again? (I’m guessing not.)