Still getting fairly frequent unrequested nudity in the brief moments I can get Dall-E 3 to work. This is another image of a human-snail combination. On top of the nudity, I had tried to put it on a deathcap mushroom and apparently Dall-E interpreted that as wanting it dead. One of the creepiest AI images I’ve generated. (Outpainted in Stable Diffusion; all that was added was water on the left and right.)
SDXL 1.0 is now available for non-subscribers at Night Cafe. Renders are 1 credit each. (SDXL 0.9 beta renders are still 3 credits each for some reason.)
So I got A1111 running last night with SDXL. Didn’t have much time to play with it but it definitely works. Jophiel was correct that I can generate batches of 512px images very quickly. 1024 takes a bit longer than the commercial sites but isn’t onerous.
Very curious to see what I can do with it, especially once I start adding loras.
I’m not sure if you’re using the base SDXL model or a variant off Civitai, but the base model works best if you run the results through a separate “refiner” model. There are variants on Civitai that don’t require a refiner (which is how I usually go).
There’s a client, ComfyUI, that people say renders faster than A1111, but the UI operates like an old Radio Shack electronics kit, with nodes and connections that you wire up yourself. A few friends of mine are really into that, but I was overwhelmed within a short time and went back to the sweet spot between “it just works” and “flexible user options”. I just tried it now and it’s slower, if anything, but I also barely know what I’m doing with it. I’ll just stay with A1111.
Interesting! Do you recommend a particular refiner model?
I’m also curious about building loras. If I’m understanding, I could feed it a bunch of selfies and, after quite a long time spent training, self-insert myself into images with a fair amount of fidelity?
I don’t think there are a lot of options. There’s the SDXL refiner, and some people recommend using the older beta 0.9 version of the SDXL refiner if you can locate it; apparently it works better. Many of the SDXL models on Civitai say they don’t need/benefit from a refiner, and I tend to just use those.
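For reference, this is roughly what that base-plus-refiner handoff looks like if you script it with the diffusers library instead of clicking through A1111. The model IDs are the official Stability AI checkpoints on Hugging Face; the step count and the 80/20 denoising split are just common defaults, not gospel:

```python
# Sketch of the SDXL base + refiner workflow via the diffusers library.
# The base model handles roughly the first 80% of denoising, then hands
# its latents to the refiner, which finishes the last 20%.
import torch
from diffusers import StableDiffusionXLPipeline, StableDiffusionXLImg2ImgPipeline

base = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16,
    variant="fp16",
).to("cuda")

refiner = StableDiffusionXLImg2ImgPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-refiner-1.0",
    text_encoder_2=base.text_encoder_2,  # share components to save VRAM
    vae=base.vae,
    torch_dtype=torch.float16,
    variant="fp16",
).to("cuda")

prompt = "terrifying smiling human snail hybrid"

# Stop the base model partway through and output raw latents.
latents = base(
    prompt=prompt,
    num_inference_steps=40,
    denoising_end=0.8,
    output_type="latent",
).images

# The refiner picks up from the same point and finishes the image.
image = refiner(
    prompt=prompt,
    num_inference_steps=40,
    denoising_start=0.8,
    image=latents,
).images[0]
image.save("snail.png")
```

The takeaway is just that the base model stops mid-denoise and the refiner finishes from the same latents, which is presumably why base-only output from the stock model tends to look softer.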
That’s correct. Unfortunately, I haven’t played with making Loras in a while (pre-SDXL) and my “real people/things” attempts weren’t great (though I didn’t try hard either; it was more just messing around), so I’m probably not a fount of wisdom on that. I know it’s possible because there’s a metric ton of celebrity Loras on Civitai. The program I used was called Kohya, and searching for “Kohya Lora” in Google will point you in a direction, but I’m a little helpless past that. I’m sure there are a bajillion tutorial videos.
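I can’t vouch for the training side anymore, but once Kohya (or whatever) has produced a trained LoRA file, using it outside A1111 looks something like this in diffusers. The filename and the “ohwx person” trigger token are placeholders; both depend entirely on how the LoRA was trained:

```python
# Hedged sketch of applying an already-trained LoRA with diffusers.
# "my_selfie_lora.safetensors" and the "ohwx person" trigger token are
# placeholders -- they depend on your own training run.
import torch
from diffusers import StableDiffusionXLPipeline

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16,
    variant="fp16",
).to("cuda")

# Load the LoRA weights on top of the base model.
pipe.load_lora_weights("./loras", weight_name="my_selfie_lora.safetensors")

# The trigger word used during training activates the learned concept;
# the exact token comes from your training captions/config.
image = pipe(
    "photo of ohwx person riding a giant snail",
    num_inference_steps=30,
).images[0]
image.save("self_insert.png")
```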
I continue to be blown away by the quality and creativity of Dall-E 3 images, if only I could actually make them. It has been 20 hours since the site last worked for me. Outages usually occur just as I start experimenting with a new prompt idea. My final prompts last night were for “terrifying totoro with a wide open mouth filled with lost souls” and a couple of variations:
(Compare that to SDXL output for the same prompt):
Here is “terrifying totoro with a wide open mouth filled with lost souls and standing outside in the rain on the side of a road at night with trees in the background”:
Bing is suddenly very heavily censoring Dall-E 3. For instance, with the creepy creatures I have been making I have been using “terrifying” in the prompt. That word is now blocked. So are “scary”, “frightening”, and “creepy”.
Some great stuff from SDXL, and a great reminder of how interchangeable the various AI models are not, and how much your impression of the quality of an AI depends on whichever personal rabbit hole you fall into. There are types of prompts that Dall-E 3 does vastly better than SDXL, and vice-versa. And prompts done better by Bing’s first Dall-E version than by Dall-E 3 (sadly, unlike the continued availability of older Stable Diffusii, that Dall-E option is probably gone forever).
I mentioned before how early in experimenting with Dall-E 3 I happened to try for a human snail hybrid and discovered how great DE3 was with the concept. These are results for “Terrifying smiling human snail hybrid by Junji Ito and Mark Ryden”.
Here is the same prompt with Dall-E 2:
With Stable Diffusion 1.5:
And with Stable Diffusion XL:
DE3 clearly blows the others out of the water on this prompt. I might be able to get something closer if I worked at it, especially on the artist styles (if I remove Junji Ito and Mark Ryden from DE3 prompts, I get flat drawing-style outputs), but probably never anything that good.
Sadly, I may never get anything that good out of DE3 again either if they continue to block all prompt words related to negative emotions. (Or even positive emotions; I had a couple of prompts blocked apparently for using the word “smiling”.)
I think a large part of it is whether you have an idea in your head or you want the AI to do your thinking for you (no judgment intended; this can be a fun way to see what happens). I think I could get most images to work (to varying degrees) by just banging at the prompts until the output matched what I wanted. That would involve positive and negative prompting, adding prompt weights, etc. On the other hand, various models and systems (Dall-E, all the SD systems, Midjourney, etc.) handle short natural-language prompts differently.
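To make that concrete, here’s a sketch of the positive/negative prompt split using diffusers; A1111 exposes the same knobs as UI fields, plus inline weight syntax like (word:1.3) that diffusers itself doesn’t parse:

```python
# Sketch of positive + negative prompting via diffusers. The negative
# prompt steers the sampler *away* from the listed concepts.
import torch
from diffusers import StableDiffusionXLPipeline

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16,
    variant="fp16",
).to("cuda")

image = pipe(
    prompt="photorealistic terrifying smiling human snail hybrid",
    negative_prompt="artwork, illustration, drawing, cartoon, flat",
    guidance_scale=8.0,        # higher = sticks closer to the prompt
    num_inference_steps=30,
).images[0]
image.save("snail_photoreal.png")
```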
If I know that I want a weird snail-centaur thing, then I’m fully confident I could find an SDXL model that would let me make one within a few rounds of trial and error. If I don’t know what I want beyond “let’s see what this does”, though, basic SDXL might not show enough creativity for me to springboard off of (less so if I’m using a photorealism-oriented model). That said, the SDXL Faetastic model seemed to do okay with your prompt:
If I wanted to use one of those as a base image for making a photorealistic man-snail-monster, I’d be able to do so. Or I could have just specified that I wanted a realistic man-snail-monster to start with. Adding “photorealism” and negative prompts for artwork, illustration, etc. got me this rather unsettling friend:
For general “shit-prompting” where I’m just entering dumb things and seeing what happens, I prefer Midjourney. If I have a specific thing in mind that I want to make, I value the ability to run a bunch of (uncensored) prompts, models, Loras, img2img, inpainting, etc. from SD to hammer out what I want. Of course, that’s not an option anyway if you’re not able to run it locally.
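Since img2img came up: the “use one of those as a base image” workflow from above looks roughly like this in diffusers (the input filename is a placeholder for whatever render you start from):

```python
# Sketch of the img2img workflow: start from a stylized render and push
# it toward photorealism. "faetastic_snail.png" is a placeholder for
# whatever base image you exported.
import torch
from diffusers import StableDiffusionXLImg2ImgPipeline
from diffusers.utils import load_image

pipe = StableDiffusionXLImg2ImgPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16,
    variant="fp16",
).to("cuda")

init_image = load_image("faetastic_snail.png").resize((1024, 1024))

image = pipe(
    prompt="photorealistic human snail hybrid monster",
    negative_prompt="artwork, illustration",
    image=init_image,
    strength=0.6,  # 0 = keep the input as-is, 1 = ignore it entirely
    num_inference_steps=30,
).images[0]
image.save("snail_img2img.png")
```

The strength value is the knob to fiddle with: low values keep the composition of your base image, high values let the prompt take over.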