Digital art creator algorithm website

With SD, I was able to get some reasonable college tries with just “Elephant_rhino_hybrid, (rhino)” using the SD v1.5 model. It’s probably helpful to specify which version/model of SD one is using, since I could train a model on nothing but elephants and rhinos, or I could run my prompt through a model made for generating anime waifus, but neither would really prove much.


(Not saying those are amazing, but I think they’d make a good starting point for refinement.)
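Tangent for the locally-inclined: when you run SD through code rather than a website, the model choice is explicit, which is exactly why it’s worth reporting. Here’s a rough sketch using the Hugging Face diffusers library (runwayml/stable-diffusion-v1-5 is one public copy of the v1.5 weights; swap in whatever checkpoint you actually want to test, and this assumes a CUDA GPU):

    # Minimal text-to-image sketch with diffusers; the checkpoint name is what
    # decides what the model "knows" (base v1.5, an anime-tuned model, etc.).
    import torch
    from diffusers import StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
    ).to("cuda")

    image = pipe("Elephant_rhino_hybrid, (rhino)", num_inference_steps=30).images[0]
    image.save("elephant_rhino.png")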

Played around again with generating a krasue (as covered far up-thread). This time I used a trimmed and rotated version of the drawing from the Wikipedia entry and img2img, and used various prompts including “woman’s head”, “dangling organs”, and lists of various organs. Some images moved in the direction of the desired results, at least. (I got several versions of the “weird person with head hat” theme.)

The NSFW censorship on NightCafe really needs work. I often get images blurred that seem (as far as I can tell through the blurring) to be innocuous results from innocuous prompts. But today a fairly innocuous prompt (aiming at horror, nothing else) produced this. (Linked instead of embedded because it is seriously N. S. F. W.)

Here’s an article on ControlNet, which, if you haven’t been reading the SD reddit page, is a very powerful tool for Stable Diffusion for people who have the hardware to run a local copy. It’s just miles beyond what you can do with current web services. Be sure to at least skim the 33-page PDF linked at the bottom of the article.
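If you do have the hardware and want to try ControlNet outside the web UI, the diffusers library supports it too. A rough sketch (the reference image path is just a placeholder, the canny-edge ControlNet checkpoint named here is one of several published ones, and this assumes a recent diffusers plus opencv-python installed):

    # Sketch of ControlNet-guided generation: the edge map steers the composition,
    # the prompt supplies the content/style.
    import cv2
    import numpy as np
    import torch
    from PIL import Image
    from diffusers import ControlNetModel, StableDiffusionControlNetPipeline

    # Hypothetical input: any photo or sketch whose layout you want to keep.
    source = np.array(Image.open("pose_reference.png").convert("RGB"))
    edges = cv2.Canny(source, 100, 200)
    edges = Image.fromarray(np.stack([edges] * 3, axis=-1))

    controlnet = ControlNetModel.from_pretrained(
        "lllyasviel/sd-controlnet-canny", torch_dtype=torch.float16
    )
    pipe = StableDiffusionControlNetPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5", controlnet=controlnet, torch_dtype=torch.float16
    ).to("cuda")

    image = pipe("a wizard in an ornate library, oil painting", image=edges).images[0]
    image.save("controlnet_out.png")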

I sort of get the feeling that a proper NSFW filter for one of these art generators would have to have been implemented at the source-images level, not at the output level. Like, in the initial tagging, make sure every NSFW image was tagged NSFW, and then give that tag a very high weight. As-is, I get the impression that the possibility of NSFW art (somehow) didn’t even occur to the creators, until after the AIs were up and running.
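Just to make that concrete, the filtering would have to happen before training rather than on outputs. Something like this toy sketch (the record format and the "nsfw" tag are entirely made up for illustration; this is not how the real SD training pipeline is organized):

    # Hypothetical sketch: drop NSFW-tagged images from the training set
    # so they never reach the model, instead of blurring finished outputs.
    def filter_training_set(records, allow_nsfw=False):
        kept = []
        for rec in records:  # rec: {"image": ..., "tags": ["cat", "nsfw", ...]}
            if "nsfw" in rec["tags"] and not allow_nsfw:
                continue
            kept.append(rec)
        return kept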

Up-thread, I had a few experiments I did with the most minimal prompts I could think of, and they all seem to have ended up NSFW. For instance, a prompt of just “.”, a single period, resulted in a giant vulva. And a prompt of “” (open and close quote marks themselves as the prompt) was an abstract but very breast-ish thing. All I can figure is that the training data contained a lot of images of breasts and vulvae, so when the AI has no other guidance, it just falls back on what’s common.

For those on Playground AI - which I’m liking more and more (more than NightCafe) - they have five new built-in filters to play with. They produce some pretty cool results. There are Pop Art (very Lichtenstein), Fashion Magazine (Perfume II), Dark Comic (graphic novel), MixPunk (steam, solar, diesel, you name it), and Kid’s Storybook (watercolor, scrawly drawing - which makes for interesting results when using it to create a variation on a source).

Also, a question for the floor … I’ve been looking into running Stable Diffusion locally. I think I’ve found some decent sites walking me through installation, but if anyone has any tips, I’m all ears.

I seem to get different results at Playground than at NightCafe with the same settings. Playground seems to do a much better job of infilling and img2img, for example, for some reason. Possibly because it is working at 768x512 instead of 576x384 for landscape?

As for a local setup, do you read the reddit group? Tons of resources there. I’d love to be able to use ControlNet; “plain” SD seems crippled in comparison.

I would suggest just cloning one of the forks on GitHub, e.g.

It does not really matter exactly which one; once cloned, it automatically sets itself up with a convenient web UI to all the features (e.g. there are instructions there for getting ControlNet working).

Thanks. Looks promising. How do I clone a fork? The site isn’t that intuitive (at least not for me).

Dark Comic and Kid’s Storybook are pretty interesting for original creations. And KS is great for img2img. I used a photo of my cat yawning with the prompt “cat yawning”.

Not especially kid-friendly, though. (The two rightmost images are from Woolitize.)

You need Python anyway to run all the scripts; you do not have to, but I suggest installing Miniconda/Anaconda, which makes it easy to install the exact Python version it wants. Once that is set up, I believe you should be able to run conda install -c anaconda git from an Anaconda prompt window. You also need the model files and various dependencies listed here. There is also a guide:

In any case, after git is installed you can git clone https://github.com/AUTOMATIC1111/stable-diffusion-webui.git like it says.
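Putting the whole thing together, the local install is basically a handful of commands. (The exact Python version the repo wants changes over time, so check its README; the environment name is arbitrary, and you’ll want your downloaded model .ckpt/.safetensors files in the models/Stable-diffusion folder before launching.)

    # From an Anaconda prompt: isolated environment plus git
    conda create -n sdwebui python=3.10
    conda activate sdwebui
    conda install -c anaconda git

    # Grab the web UI and launch it (first run downloads the dependencies)
    git clone https://github.com/AUTOMATIC1111/stable-diffusion-webui.git
    cd stable-diffusion-webui
    webui-user.bat      # on Linux/Mac, ./webui.sh instead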

It is insane that nobody has created a simple, one-click Windows installer like they have for the Mac.

In the world of Stable Diffusion, there is now (well, has been for a month or two, I think?) something called a Low Rank Adaptation, or “Lora”. It’s basically a small training injection you can use on top of another model. It can also be trained quickly (depending on your GPU) and on a small number of images (though more can mean better results).
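(For the diffusers/Python crowd rather than web-UI users: recent versions of the library can stack a Lora on top of a base checkpoint in a couple of lines. The file name below is a placeholder for whatever Lora you trained or downloaded.)

    # Sketch: apply a small LoRA on top of a base SD checkpoint (recent diffusers).
    import torch
    from diffusers import StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
    ).to("cuda")

    # Hypothetical file: a LoRA trained on a handful of grell images.
    pipe.load_lora_weights("./loras", weight_name="grell.safetensors")

    image = pipe(
        "a wizard fighting a grell, fantasy painting",
        cross_attention_kwargs={"scale": 0.8},  # how strongly the LoRA is applied
    ).images[0]
    image.save("wizard_vs_grell.png")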

As an example, the SD model I was using had no idea what the D&D monster Grell is. Asking for images of a wizard fighting a grell provided this:

Didn’t even try. But training a Lora on only seven images and asking for a wizard fighting a grell gave me:

One thing I noticed was that seven images of grells against plain backgrounds don’t exactly help teach it a sense of scale :smiley:

A couple more grell images in the fantasy style, and then, changing models, a “Grell ordering a martini in a singles bar” and “An anime game about a grell finding romance” to pull our monster out of the fantasy painting style:




Anyway, thought it was pretty cool and the rate this stuff is advancing is amazing.

(Side note: this was more of a proof of concept, so I made no attempt to clean up or inpaint any of the images. All of them could be a lot better with some work and re-rendering, but I just wanted to see if it would draw me some grells and understand “a grell” as part of a prompt.)

Lora is one of the things that I envy local-run Stable Diffusion for.

I have absolutely no real need whatsoever for making images of Gizmo (from Gremlins) or Alf (from Alf), but early in my AI testing I tried for various pop-culture creatures that came to mind. Min-dall-e and Craiyon actually do a pretty good job of making both of them (but not together), but Stable Diffusion has only the very vaguest idea of what they are, tending to make something vaguely doggish, puppetish, or dog-puppetish. I’ve wondered what was different about the training data that allowed those AIs to do a respectable job while Stable Diffusion has nothing. So if I had Lora access, I’d probably train it for Alf and Gizmo. I could probably use some of the Craiyon images as training data. Stable Diffusion shouldn’t be beaten by min-Dall-e or Craiyon, dammit!

I just noticed that one of the middle left Alfs looks like it has stuffing leaking out of it. A training set including damaged Alf dolls?

Anyone have any tips to get actual English words to generate, as per the latest challenge?

OK. Right after I posted, I had some success: basically by starting with an image with a superimposed word, lowering the noise weight, and then trying a bunch of times. Glad there is a single-image free option now.
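For anyone doing this locally instead of on NightCafe, the same trick is easy to sketch with diffusers img2img, where the strength parameter plays the role of the noise weight (the word, font placement, and paths here are just placeholders):

    # Start from an image that already contains the word, then run img2img
    # with a LOW strength so the lettering survives the diffusion.
    import torch
    from PIL import Image, ImageDraw
    from diffusers import StableDiffusionImg2ImgPipeline

    base = Image.new("RGB", (512, 512), "white")
    ImageDraw.Draw(base).text((180, 240), "HELLO", fill="black")  # default font is small

    pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
    ).to("cuda")

    image = pipe(
        prompt="a neon sign reading HELLO, night street, photograph",
        image=base,
        strength=0.45,  # lower = keeps more of the original lettering
    ).images[0]
    image.save("hello_sign.png")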

The results of an experiment aimed at creating fictional players for a baseball text sim:

A comic/meme was recently posted on reddit where (if you don’t want to click) a woman is practically swooning over the idea that the man in front of her on the subway with a sketchpad might be drawing her. But in the final panel, we see he is writing a prompt based on her. A number of people have tried the prompt on reddit.

(The prompt is “Imagine red hair woman, mid-30s, great smile, purple sweater, brown jeans, cross-hatching technique, pencil sketch, 4k”.)

I tried the straight prompt; it produced results, some better than others. But of course I went on to tweak it, varying the description of the woman, then making various additions to the style. In the process, I found a set of artist prompts that tends to produce very nice results in SD 1.5 (but not so much in SD 2.1). You need to try “Dan Witz, Mark Ryden, Margaret Keane, Pino Daeni” as a group in a prompt or twenty.

(I intended to post some samples, and hopefully I will later, but Imgur is having “technical difficulties” for now.)

Imgur still isn’t working, so I’m trying out a different hosting site.

So I’ll skip posting examples of the plain reddit prompt and simple variations. But here are variations after adding “Dan Witz, Mark Ryden, Margaret Keane, Pino Daeni” to the prompt. I started out with simple changes to the physical description, including aging her up and down. The sweater became a Christmas sweater (and then a Cthulhu Christmas sweater) to add a new level of detail. Trying for cat sweaters led to cat people.

I went further afield. The Olan Mills moose-ish insert in the Santa shot happened all on its own without being in the prompt.

Then I pretty much abandoned the original prompt entirely and made famous people riding things. (The last is George Lucas on the shark.)

This final example was reasonably close to the original meme prompt (Imagine japanese girl, mid-teens, great smile, christmas sweater, jeans, 4k, medium shot, Dan Witz, Mark Ryden, Margaret Keane, Pino Daeni). I did some infill tweaks and expanded the top and bottom. Her body proportions are really distorted, but I think she turned out pretty cute.