Digital art creator algorithm website

In this case, those aren’t remotely like the kind of leather art I meant.

Nightcafe got it right, not DALL-E.

I played with the Dall-E Mini a little and found it faring worse than the NightCafe AI (which I know isn’t actually by them) in understanding some stuff I wanted. One of the first NC images I ever tried was “A Toilet Made of Meat” which gave me:

…whereas Dall-E gave me:

Which is a better-looking toilet but lacks the “made of meat” concept.

Likewise, the execution for “16th Century Pinball Machine” from NC was sort of a mess…

…but it understood what I wanted better than Dall-E, which just gave me a straight-up pinball machine:

(I know Dall-E gives multiple returns but am actually picking its best result)

I won’t bother to repost NC’s “Wrapping Paper for Mummies” but thought it was a lot more evocative than Dall-E’s offering:

But, amusingly, both failed to understand what I wanted from “Man-of-War Firing A Broadside” and gave me old-timey paintings of a firing soldier:

With my very limited playing around, Dall-E seems better at creating realistic images of things but worse at understanding concepts. I could be wrong, though, since my sample size is just a couple of attempts.

Going to call NightCafe the winner as well for Eldritch Cheerleader Car-Wash.



(the other Dall-E results were just cars with varying degrees of fur but no cheerleaders, eldritch or otherwise)

On the other hand, Dall-E nailed “Bird on a Rock”, creating a much more real looking bird sitting on a rock than NightCafe could ever spit out.

I am especially happy with this Yokai Parade image. I took this photo of people in India

and cropped, stretched, and blurred it to make this start image.
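(For anyone who wants to replicate that prep step, it’s only a few lines of Pillow; the file name, crop box, stretch size, and blur radius below are placeholders, not my exact values:)

```python
# Sketch: turn a photo into a wide, soft start image.
# All numbers here are placeholder values, not the exact ones used.
from PIL import Image, ImageFilter

img = Image.open("crowd_photo.jpg")            # hypothetical file name
img = img.crop((0, 300, img.width, 900))       # keep a horizontal band
img = img.resize((4096, 1024))                 # stretch to panorama proportions
img = img.filter(ImageFilter.GaussianBlur(8))  # blur away the fine detail
img.save("start_image.png")
```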

I then ran it with increased accuracy and the highest available resolution (which they call medium for some reason) at the cost of 6 credits and upsampled it to the max for another 3 credits, and got this.

Too bad you can’t embed a live panorama here like you can with Facebook, but here it is as a video clip (made by me, not the lousy video option that they have).

I just tried the mini version (was it ever linked here? I googled for it) and it output almost entirely useless stuff for my tries.

Yeah, I was using the mini as well. Tried to be clear about that at first but then just called it Dall-E. Maybe the full version is better, but the mini wasn’t impressive at understanding things. It did do a good job of just drawing simple concepts (“A chair”, “A tree on a hill”, etc.).

This is what I meant by DALL-E mini:

To be clear, it is a “mini” model created only to show how DALL-E operates, with a simplified architecture and trained on two orders of magnitude fewer images than DALL-E 2; I’m more confident the latter understands different styles of “leather art”.

It is not surprising that the mini model simply does not grasp many things. It is more for understanding how the system works, not for final production use.

At current 2022 prices, it would cost you about $175–$575 to train a DALL-E Mini using cloud computing. An open reimplementation of the full DALL-E 2 architecture is also on GitHub, but, extrapolating from two orders of magnitude more training data and a far larger model, someone has to spend hundreds of thousands to millions of dollars to train one. I’m sure people are, because “closed betas” for various things are popping up like mushrooms, but that does not help us.

I saw one Dall-E review where the reviewer explains which art styles and prompts it can do and which it cannot:

Works well:

  • Photorealistic content of anything that has a lot of stock images online, including clothing design, close-ups of cute animals, close-ups of food, and jewelry
  • Pop-culture references (Disney, Tolkien, anything it recognizes) in arbitrary art styles
  • Art-style transfer, especially styles that are a little forgiving in the details
  • Creative digital art, though you have to figure out the right prompts and it takes some trial and error

Bad:

  • Scenes with multiple characters
  • Specific foreground and background (e.g., “Two dogs dressed like Roman soldiers on a pirate ship looking at New York City through a spyglass” does not work)
  • Anything it does not “know” (e.g., “chair” works; “Otto bicycle”: nope)
  • Objects used in non-standard ways, or anything completely deviating from the training images (e.g., he could not get a woman whose “eyes are full of stars”)
  • Text

Here
https://github.com/coffeenut/DalleLinks
is a whole bunch of scraped, therefore presumably unfiltered, Dall-E links, so we can see the failures (and there are quite a few) as well as the successes. Unfortunately, there are no yokai or elves or leather art for comparison, though (as far as eldritch monstrosities go) they did have Cthulhu standing at the DMV:

I think this gets back to what I said about DALL-E effectively making collages. Give a kid a stack of magazines, a pair of scissors, and a glue-stick, and they can make all sorts of pictures of one thing next to another thing, or a thing in a particular setting, or the like. But those tools simply don’t enable transformations of the individual clips. You can’t use magazine clippings to turn a toilet into a pile of meat, and so neither can DALL-E. You can find a picture of a cheerleader and put her next to something eldritch, but you can’t make the cheerleader herself eldritch.

Both systems know who Urkel is.

I understand why robots want to kill all humans.

I only clicked around the results list posted earlier, but I did see a few examples of merging ideas seamlessly, like “Steampunk Earbuds”, along with a lot of collage-style images as you mentioned. It should go without saying that I don’t have any dog in this fight (and I doubt any of us do) and just want whatever AI can best give me an Eldritch Cheerleader Carwash. Right now, the AI that Nightcafe uses gives more “accurate” interpretations of melded ideas than DALL-E Mini does, but if the full version of DALL-E ultimately does better, then sign me up. The Mini version isn’t a very good substitute right now, in my limited experience.

Oh, and speaking of men-of-war, I give you “Attack ships on fire at the Tanhauser Gate” (and yes, I know that’s not the exact quote):

Well, that’s definitely an attack ship on fire. At least, the one in the foreground is; I’m not sure what the thing hovering behind it is. And I can definitely accept that that’s a Gate in the upper left.

Tannhäuser, oil on canvas by CompVis Latent Diffusion demo


I love how he’s wearing modern clothing…

Early on I put in Roy’s entire dying speech from “I’ve seen” to “time to die”. I got this.

Decided to try generating a terrestrial landscape, then alien landscapes, in the styles of two different artists.

In the style of Danny Flynn (after one evolution of the one above):

In the style of Mark Garlick (after trying Danny Flynn off the original image):

You might want to try triptychs. I made a frame and stuck the noise from saved seed images in the frames, like this:

Or use your own source of noise if you can generate it. But the noise is critical; large areas of solid color are dead zones (which is why the frame works). Adjust them however you want to hint in a specific direction. (In this case, I had “natural” red and green noise images and wanted a blue one to go with them, so I adjusted the hue on a different noise image. Different colors would probably have been better for the subject, but I just threw this together with a set I already had on hand.)
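If you want to roll your own frame-and-noise start images, a sketch like this (Pillow plus NumPy) will generate one; the canvas size, frame width, panel count, and tints are arbitrary choices of mine, not anything NightCafe prescribes:

```python
# Sketch: a triptych start image of a solid frame around tinted noise panels.
# Sizes and colors are arbitrary; adjust the tints to hint a direction.
import numpy as np
from PIL import Image

W, H = 1536, 512   # overall canvas
FRAME = 24         # frame width; solid color acts as a "dead zone"
PANELS = 3

canvas = Image.new("RGB", (W, H), (20, 20, 20))  # dark solid frame
panel_w = (W - FRAME * (PANELS + 1)) // PANELS
panel_h = H - 2 * FRAME

# Tint each panel's noise toward a different color (red, green, blue here).
tints = [(1.0, 0.3, 0.3), (0.3, 1.0, 0.3), (0.3, 0.3, 1.0)]

rng = np.random.default_rng()
for i, tint in enumerate(tints):
    noise = rng.random((panel_h, panel_w, 3)) * np.array(tint)
    panel = Image.fromarray((noise * 255).astype(np.uint8))
    canvas.paste(panel, (FRAME + i * (panel_w + FRAME), FRAME))

canvas.save("triptych_start.png")
```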

Neat idea! I have some old fractal images that I should be able to noisify.

I had been trying for portrait triptychs—it didn’t occur to me until your post to try something like a travel brochure. (I used that term in the prompt for these—in some of the frames it tries to insert alien text.)

Oddly angled and shaped panels do less well than strictly rectangular ones, but I do love that swirly dome thing in one of them.

Here is a new panorama, using the same start image of that yokai panorama I posted earlier, except with a green strip drawn across the bottom.

Revisiting the topic of AI image upscaling. I took another look at websites, and a site where I had burned my credits in early experimentation had credits ready for me again. It uses the same AI upscaler as Nightcafe. The site gives a free user 10 photos per week (with purchase options), upscaled to 4x in length and width. Source images can be up to 4000x4000 pixels. But it occurred to me: the system would not know or care if you upload a collage/contact sheet of multiple images. For instance, if you wanted to upscale 640x480 images, you could fit 48 of them in that 4000x4000 block. I’ve tested it: yes, it does work, and yes, it does give you a 16,000x16,000, 256 MP jpeg (which you can then cut back into the individual images).
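If anyone wants to automate the contact-sheet trick, here is a rough Pillow sketch; the 640x480 tile size, the 6x8 grid, and the 4x factor come from the numbers above, while the file names are made up:

```python
# Sketch: pack 640x480 tiles into a 4000x4000 contact sheet, then cut the
# upscaled (4x) result back into individual images. File names are hypothetical.
from PIL import Image

TILE_W, TILE_H = 640, 480
COLS, ROWS = 6, 8      # 6*640 = 3840 and 8*480 = 3840, both under 4000
SCALE = 4              # the upscaler's fixed factor

def pack(paths, out="sheet.png"):
    sheet = Image.new("RGB", (4000, 4000))
    for i, p in enumerate(paths[:COLS * ROWS]):
        x, y = (i % COLS) * TILE_W, (i // COLS) * TILE_H
        sheet.paste(Image.open(p).resize((TILE_W, TILE_H)), (x, y))
    sheet.save(out)

def unpack(upscaled_path, count):
    big = Image.open(upscaled_path)          # 16000x16000 after upscaling
    w, h = TILE_W * SCALE, TILE_H * SCALE
    for i in range(count):
        x, y = (i % COLS) * w, (i // COLS) * h
        big.crop((x, y, x + w, y + h)).save(f"upscaled_{i:02d}.png")
```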