Digital art creator algorithm website

I would take any version info it gives about itself with a huge grain of salt. Each new version is trained on a lot of data that includes info about older versions. I’ve gotten clearly wrong answers before.

Tonight I looked for a setting that forced it to go with 4o. I didn’t find one, but I did find one to delete all previous chats. I don’t know if that is what did the trick, but it is using 4o now. I uploaded Ursula Vernon’s The Biting Pear of Salamanca

Told it this

Summary

This is a drawing of a pear with an open mouth exposing teeth and tongue. There is a small chipmunk standing in front of it setting up a camera on a tripod. They are on top of a grassy hill with white flowers. Below them in the distance is a winding path leading to a tall slender stone lighthouse. Standing along the path is two giraffes. Please convert this drawing to a realistic photograph.

And got this

Then I reupped the toy claw photo (I have him in 7 out of 8 colors) I tested it with earlier, told it this

Summary

This is a figure of a creature that is shaped like a giant hand. It has sharp yellow claws and warty bumps, and on the top of the middle finger there is a face with angry eyes and an open mouth filled with sharp white teeth. Convert this image to a realistic photo that looks like it is a living creature. Make it look intimidating and a little creepy. Change the white background so that the creature is standing on a town sidewalk, with cautious pedestrians staring at it. Note that the hand has only three fingers and a thumb.

And got this

Wrong number of fingers, but still clearly 4o.

Since I had success and fun with my claw figure in ChatGPT (it basically works like a Lora that trains on a single image)

I tried to do some more. Unfortunately ChatGPT is able to recognize what the figures are (even if I try to obscure that in describing them) and refuses to generate realistic images because of apparently copyright issues.

With this one a Google Lens search included an AI description recognizing Grimlock by name. (But thought that Catbus was another Transformer.)

With this one Google Lens didn’t include an AI description, but did link to various pages about rancors and about AT-ATs.

AI is getting too smart for my own good.

I like how ChatGPT 4o renders generic cartoons—clean lines and clear meaning. And it usually “gets” your concept and nails the text on the first try.

How Plants Communicate

I see lots of comics on Facebook now that are obviously from 4o. With comment sections filed with people laughing at/commenting on the content of the comic instead of declaring how anyone who uses AI is literally a million times worse than Hitler. It has reached the point where random peope don’t even know that it is AI. (Though in a while many of them will begin to recognize the style and thus see what horrible “AI slop” it is again.)

BTW, if that slightly cut off left edge irritates you like it would me, here’s a slightly outpainted version.

I saw somebody else get this out of Gemini. I believe the prompt itself was generated by showing a photograph to Gemini, but that part wasn’t posted to the thread I was reading. What I like most about this image is the art style:

The prompt was long and involved; I was impressed at how much Gemini got right.

A digital painting in a realistic style depicting three women standing outdoors on a sandy terrain, possibly near a body of water and some architectural structures in the blurred background under a daytime sky.

The woman on the far left has long, flowing dark brown hair that cascades over her shoulders. She appears to be wearing a light blue, low-cut dress. Her skin tone appears light to medium, and she has prominent cheekbones and full lips. Her eyes are light blue or green.

The woman in the center has long, wavy reddish-brown hair. She is wearing a dark purple long-sleeved top with a V-neck and blue jeans. Her skin tone also appears light to medium, and she has noticeable eye makeup.

The woman on the far right has blonde hair styled in two buns on top of her head. She is wearing a white sleeveless top and a pink and white patterned wrap or shawl around her shoulders and arms. Her skin tone also appears light.

All three women are facing slightly towards the viewer, with the woman on the left angled more to the right, the woman in the center positioned more directly facing forward, and the woman on the right angled slightly to the left. Their expressions are somewhat neutral or slightly pensive.

The overall lighting suggests it is daytime, casting soft shadows. The background is somewhat out of focus but shows a light-colored sandy ground, a body of water in the distance, and some buildings or structures, one of which has a distinctive dome. The style should aim for realism, focusing on detailed rendering of the women’s features, hair texture, and clothing, while maintaining a slightly soft and painterly feel.

Here’s the results ChatGPT gave from that same exact prompt:

I’d love to get the original art style out of ChatGPT. Showing ChatGPT the Gemini pic and asking it for prompt suggestions hasn’t gotten anything much different than this one. Anybody have any ideas what else to add or subtract from the prompt to get it closer to Gemini’s results?

I’ve never used Gemini but Midjourney takes that prompt and gives it a flat, almost vector-style feel.

I would maybe try to push the digital painting aspect of it since that’s what comes through for me on the Gemini. ChatGPT is obviously trying to emulate a handpainted classic style that doesn’t fit.

In unrelated news, here’s some character illustration made in MJ then heavily retouched/edited in SDXL (using on of the Splashed models)

Are you referring to what you see in the Gemini image above, or are you trying the prompt yourself in Gemini? I wondered for a moment if Gemini’s image results can be contaminated by still having the original pic in its context, but I just tried the prompt myself in Gemini, and got an amazingly consistent result:

Yes. Three models are interpreting/applying the prompt in different ways. Gemini’s interpretation seems to be a digital painting style almost similar (in my eyes) to the current GTA video game art style (on the boxes and posters, not in game). Since that’s what’s missing from the ChatGPT results – it’s interpreting it as wanting a handpainted feel – I’d try to emphasize that you’re looking for a more digital style.

I, too, get extremely similar results in Gemini. So I decided to modify the prompt slightly:

Summary

A widescreen digital painting in a realistic style depicting three disheveled women standing outdoors on the burning apocalyptic landscape of a ruined city in the blurred background under a daytime sky.

The woman on the far left has long, flowing dark brown hair that cascades over her shoulders. She appears to be wearing a light blue, low-cut dress. Her skin tone appears light to medium, and she has prominent cheekbones and full lips. Her eyes are light blue or green.

The woman in the center has long, wavy reddish-brown hair. She is wearing a dark purple long-sleeved top with a V-neck and blue jeans. Her skin tone also appears light to medium, and she has noticeable eye makeup.

The woman on the far right has blonde hair styled in two buns on top of her head. She is wearing a white sleeveless top and a pink and white patterned wrap or shawl around her shoulders and arms. Her skin tone also appears light.

All three women are facing slightly towards the viewer, with the woman on the left angled more to the right, the woman in the center positioned more directly facing forward, and the woman on the right angled slightly to the left. Their expressions are somewhat neutral or slightly pensive.

The overall lighting suggests it is daytime, casting soft shadows. The background is somewhat out of focus but shows a light-colored sandy ground, a body of water in the distance, and a horizon covered with burning damaged buildings or structures, one of which has a distinctive dome. The style should aim for realism, focusing on detailed rendering of the women’s features, hair texture, and clothing, while maintaining a slightly soft and painterly feel.

There are alien saucers flying around firing rays at the ground.

This is pretty great:

The above prompt, modified to this

Summary

A widescreen grainy disposable camera snapshot in a realistic style depicting three shocked weary worried disheveled women standing outdoors on the burning apocalyptic landscape of a ruined city in the blurred background under a daytime sky.A widescreen digital painting in a realistic style depicting three shocked weary worried disheveled women standing outdoors on the burning apocalyptic landscape of a ruined city in the blurred background under a daytime sky.

The woman on the far left has long, flowing dark brown hair that cascades over her shoulders. She appears to be wearing a school uniform. She is Japanese.

The African woman in the center has long, wavy reddish-brown hair. She is wearing a pink rabbit cosplay onsie. She has noticeable eye makeup.

The European woman on the far right has blonde hair styled in two buns on top of her head. She is wearing a Clemson t-shirt and baggy camouflage pants.

All three women are facing slightly towards the viewer, with the woman on the left angled more to the right, the woman in the center positioned more directly facing forward, and the woman on the right angled slightly to the left.

The overall lighting suggests it is daytime, casting soft shadows. The background is somewhat out of focus but shows a light-colored sandy ground, a body of water in the distance, and a horizon covered with burning damaged buildings or structures, one of which has a distinctive dome. The style should aim for realism, focusing on detailed rendering of the women’s features, hair texture, and clothing, while maintaining a slightly soft and painterly feel.

There are alien saucers flying around firing rays at the ground.

And run in Flux Schnell and creatively upscaled (at 25%) at Night Cafe.

ChatGPT can be frustrating. I’ve tried converting iconic moments from Totoro and Spirited Away into realistic images and it refused to make them. And yet it was willing to turn this

Into this

but when I tried to make some adjustments to that, it refused.

And to add insult to insult, those refused conversions count against the number of uploads I can do in a day. (But not against pure text generations.)

Yeah, it acts like there are different censorship algorithms or criteria. I’ve had similar problems where the original conversion from a source image went fine, but ask for further refinements on the created image, no go.

I only started playing with Gemini yesterday, and ran into something like a censorship issue for the first time. I was playing around more with modifications of the recent prompt with the goal of making it weirder and more elaborate. I already had the women as three different races in three different outfits, and the flying saucers firing rays into the burning city. In the next step I wanted to give each specific hand poses, and have the flying saucers with aliens sitting on top operating it with levers (think something looking like those Hejaz cars from parades). Gemini told me

I can’t generate images with that level of detail, especially those depicting violence. Would you like me to try generating a different image?

So I tried resubmitting the prompt without the hand position descriptions (but with the outside aliens). It ran that, and it successfully included all of the extra level of detail that it rejected as too much in the earlier prompt.

I’m severely impressed by the amount of elaborate specific details you can successfully prompt for now both in Imagin via Gemini and in ChatGPT 4o. On top of every other detail I added footwear. (Cowboy boots, bunny slippers, and sandals, for the middle outfit in case it isn’t obvious I’m aiming for Ralphie’s outfit from *A Christmas Story.) Since Gemini was giving me angles that cut off the feet even with footware descrtions I added description of what they are standing on.

Full Imagin is a paid-only option at Night Cafe. Imagin Fast has the same level of detailed prompt adherence, but lower image quality. The image I posted earlier created in Flux Schnell was one of the best I’ve seen byt was a bit of a fluke because most Flux tries were more confused and mangled (and tended to have more than three subjects with confused details).

I’ve also been experimenting with book cover versions.

Turns out that the word “influencers” is more than it can handle, so I had to come up with something simpler.

And there were more extreme examples of not understanding the assignment.

Here’s the prompt for that:

Summary

An image in a realistic style depicting three smiling weary worried disheveled smudged women standing outdoors on broken bricks in the burning apocalyptic landscape of a ruined city under a daytime sky. They are standing a respectable distance from each other.

The woman on the far left has long, flowing dark brown hair that cascades over her shoulders and is slightly tangled. She appears to be wearing a school uniform and cowboy boots. She is making a heart sign with her hands. She is Japanese.

The African woman in the center has long, wavy reddish-brown hair. She is wearing a pink rabbit cosplay onesie and pink bunny slippers. She has noticeable eye makeup. She is making a peace sign.

The European woman on the far right has blonde hair styled in two buns on top of her head. She is wearing a Clemson t-shirt, baggy camouflage pants and a pair of sandals. She is taking a selfie.

All three women are facing slightly towards the viewer, with the woman on the left angled more to the right, the woman in the center positioned more directly facing forward, and the woman on the right angled slightly to the left.

The overall lighting suggests it is daytime, casting soft shadows. The background is somewhat out of focus but shows a body of water in the distance, and a horizon covered with burning damaged skyscrapers. The style should aim for realism, focusing on detailed rendering of the women’s features, hair texture, and clothing. There are alien saucers flying around firing rays at the ground.

The image is on the cover of a battered, ragged, faded pulp science fiction paperback novel with yellowed pages with the title “Aliens vs. Influencers” in garish lettering. The novel is lying at an angle on top of other books on a wooden table seen from above and slightly from the right side.

(Pasting that prompt into Bing is a joke, it doesn’t even use the full first two paragraphs.)

Not for ChatGPT. I tried your prompt there. Still giving that painting art style, but it nailed all the text.

Thanks for running that, I’d love experimenting more in ChatGPT, but the low daily image creation is a severe constraint. I spent my day’s images on earlier evolution of the prompt, ending with this

Which I asked to be converted into the style of Bob’s Burgers

Night Cafe has just added a new AI engine called HiDream-I1. I tried the book cover prompt on it. The text creation is great, the ability to follow complex descriptions is not.

Definitely not a top-tier AI.

From Gemini:

The prompt was “Leeloo from the 5th Element crawling through a square air duct with John McClain from Die Hard. John McClain is wearing a dirty white tank top. Leeloo has orange hair. grainy disposable camera snapshot with shallow dof and forced perspective.”

That is…maybe Sigourney Weaver and Chris Rock?