AI image generation is getting crazy good

I get that - especially coming from Kobold and A1111. I actually started with Kobold, using it to help with dialogue for a text RPG I was working on, then got frustrated very quickly. I suspect my previous experience with node/graph-based programs like Blender, Unreal Engine, TouchDesigner and Max/MSP made it easier for me to pick up. I think of it kind of like Legos - each block does stuff, and you draw the little lines to make the data go where you want, when you want it to, and then use each block’s tools to modify to taste. One of the cool things about this tech is how everyone can apply (or not apply) it in whatever way works for them.

I use my LLMs a lot, to help me with communication, organizing my thoughts and interpreting other people. I’ve trained them via system prompts and local memory to understand and mirror my communication style, based on my language use and a couple of other parameters. I’m an associative thinker, and my LLMs help a lot in dealing with the linear world I’m stuck in.

Ahh - it’s the ‘kill Preview’ step I was missing - thanks! Let’s see if this works…

“Whispers of Beauty”
LLM-generated haiku
AI animation of AI-generated stills
Haiku into ACE-step for audio gen
Assembled by this meatbag using DaVinci Resolve

“The Question”
Same prompts, with the same seeds, rendered through two different models, as a test: How much consistency can be achieved?
Same process as above, but audio is all me playing with some of my toys. ACE-step wasn’t available at the time.
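(For anyone curious, a same-seed comparison like that boils down to reusing one fixed seed while swapping the checkpoint. Here’s a minimal sketch using the diffusers library; the model IDs and prompt are placeholders, not the ones used for the video.)

```python
# Minimal sketch of a same-seed, two-model comparison (diffusers library).
# Model IDs and the prompt are placeholders.
import torch
from diffusers import StableDiffusionPipeline

prompt = "a quiet pond at dawn, mist rising, soft light"
seed = 12345  # the same starting noise is used for every model

for model_id in ["stable-diffusion-v1-5/stable-diffusion-v1-5",
                 "stabilityai/stable-diffusion-2-1"]:
    pipe = StableDiffusionPipeline.from_pretrained(
        model_id, torch_dtype=torch.float16
    ).to("cuda")
    # Re-seeding per model (at a fixed resolution) keeps the initial latent
    # noise identical, so differences in output come from the model weights.
    generator = torch.Generator(device="cuda").manual_seed(seed)
    image = pipe(prompt, generator=generator, width=512, height=512,
                 num_inference_steps=30).images[0]
    image.save(f"{model_id.split('/')[-1]}_{seed}.png")
```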

This one looks more like an early 1960s death ray. I swear that I’ve seen it on Antiques Roadshow.

So I made one - not yours, but whatever Copilot came up with:

Huh? Can you give me an example of how it is used in the wild? I like tracking linguistic drift, and this is not one I’ve encountered yet.

Way back ages ago (in AI time), I made a prompt on Midjourney including “Style of Precious Moments by Enesco” which hit the top of the hot list and got a lot of people using the style. Closest I’ve ever come to going viral! Anyway, here are some equipped with 1920s-style death rays.

Yes. (And I first saw the title of that one as a recommended thread at the bottom of this thread!)

I did a Lego airplane on a treadmill in Bing/DALL-E 3 back in the earlier Lego AI images thread. It got the concept pretty much right.

By George, it seems he’s got it. Yaay!

A Google for “the current meta is” will give you a lot of them, but here’s one example for weapon loadouts for a game called Warzone.

This meaning evolved from an older meaning of meta-optimization, where what’s optimal depends on what everyone else is using. I first encountered this concept in Magic: The Gathering. If, for instance, the most powerful current decks use very few or no large creatures, then cards which protect against large creatures are of little value. But if a new set is released with really good large-creature cards, then the new most popular decks will start to be large-creature decks, and so one might say that “the current meta is to include cards that protect against large creatures”.

In the current usage, though, if there’s some option that’s always the best against everything, without anything that particularly counters it, it still gets called “the meta”.

Been messing around with NMKD Stable Diffusion GUI. It doesn’t have as many options as Automatic1111, but it’s totally offline and simple to install. Results vary wildly based on model and settings, but since it’s all free, a lot of trial and error costs you only time. The models usually come with settings recommendations in the description, but it still takes a lot of tries to get what you want out of it. Some models are better about understanding full English descriptions, while others work better with a detailed list of keywords and very short phrases.

Still working out how things like LoRAs and tags work; on some models they seem essential, on others they seem to be completely ignored. Also they work differently in SD1.5 vs SDXL.
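For what it’s worth, here is roughly what that LoRA step looks like when done in code rather than through a GUI; this is just a sketch with the diffusers library and placeholder file names, not how NMKD handles it internally. It also hints at why a LoRA can seem ignored: one trained for SD 1.5 won’t apply to an SDXL checkpoint (the layer shapes don’t match), and many also need their trigger words in the prompt.

```python
# Sketch of applying a LoRA with the diffusers library.
# File names are placeholders; NMKD's internals may differ.
import torch
from diffusers import StableDiffusionXLPipeline  # SD 1.5 checkpoints use StableDiffusionPipeline instead

pipe = StableDiffusionXLPipeline.from_single_file(
    "models/some_sdxl_checkpoint.safetensors", torch_dtype=torch.float16
).to("cuda")

# A LoRA trained against the other architecture (SD 1.5 vs. SDXL) won't apply
# here, which is one way a LoRA ends up "ignored" or throwing errors.
pipe.load_lora_weights("loras/some_style_lora.safetensors")

# Many LoRAs also expect their trigger word(s) in the prompt; without them the
# effect can be weak even though the weights loaded fine.
image = pipe(
    "a lighthouse at dusk, trigger_word, highly detailed, dramatic lighting",
    num_inference_steps=30,
    guidance_scale=6.0,
).images[0]
image.save("lora_test.png")
```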

But getting some good results. Nice to not have the program applying arbitrary content restrictions that ultimately boil down to “because daddy said ‘No.’”

Main drawback seems to be the complete lack of any feedback whatsoever. ChatGPT and similar tools will push back on a nonsensical, impossibly vague, or illogical request, but Stable Diffusion will spit something out no matter what, even if you just mash the keyboard and then click on “Generate.”

[Four images embedded via Imgur]

Damn, if we could combine image generation with 3D printing, I’d buy one of those.

Love that first image. One of my favorite FB AI groups is Liminal AI, which is (unsurprisingly) about liminal images. (I keep meaning to put some work into those, but so far I’ve only lightly experimented with it.)

There are programs that’ll turn an image into a 3D model or STL. I’ve never had reason to use any so I don’t know if it’s more “Wow, that was easy! Tech is great!” or closer to “Ok, now to spend two days fixing all the parts this jacked up…”

Via plugins one can generate a model and textures in Blender and/or ComfyUI; HunYuan has a 2D->3D model I’ve used to make .obj files. I haven’t checked into it myself, but I assume those could be imported into Fusion, or whatever slicing software is being used. I haven’t messed with 3D printing myself, but it’s on my list of tech to test.
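In case it’s useful, here’s a minimal sketch of getting one of those generated .obj files into an STL that slicing software will accept, using the Python trimesh library; the file names are made up, and generated meshes usually need more repair than this.

```python
# Minimal sketch: convert a generated .obj to an STL for slicing software.
# File names are placeholders; real generated meshes often need more cleanup.
import trimesh

mesh = trimesh.load("generated_model.obj", force="mesh")

# Generated meshes are frequently not watertight, which many slicers dislike;
# these basic repairs help but won't fix every problem.
mesh.remove_unreferenced_vertices()
trimesh.repair.fill_holes(mesh)
trimesh.repair.fix_normals(mesh)

print(f"watertight: {mesh.is_watertight}, faces: {len(mesh.faces)}")
mesh.export("generated_model.stl")
```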

There have been many creatures I’ve generated over the years that I would love to own as a physical model. A lot of them other people would probably like, too (there’s definitely a market for creepy/weird). I’ve mostly been using the ChatGPT renderer lately, and while understanding the prompt is one of its great strengths, it also makes the output less weird, random, and surprising. With other AIs you can put in the same ambiguous prompt and get wildly different outputs each time, some of which are awesome. Bing/DE3 is great (when the fluctuating censorship level is down) for making figures with prompts involving the word “homunculus”. No other AI makes these, including ChatGPT. For instance, here’s a certain prompt run through Sora just now (after stripping references to artists, which it rejected).

All fairly close to the prompt and all pretty similar to each other. But this is an example of what Bing did with the prompt. Far from what I asked for, but far superior to it.

Here’s a couple of galleries with 70 (out of hundreds) of images created with various (but a lot fewer than 70) “homunculus” prompts.

https://imgur.com/a/UwnqZJ3

https://imgur.com/a/PGY17y9

ETA: the Imgur gallery links aren’t working for some reason. I’ll pick just a few images and post them directly.

Who, deciding proper pronunciation.

ChatGPT Prompt

Make an image that’s a creative interpretation of this subject: “A Perfectly Reasonable Amount of Schadenfreude about Things Happening to Trump & His Enablers” Do not include that literal text as a caption.

Second try with that prompt, result not as fun.

https://i.gyazo.com/e78a9cc37a693cf99e006a6664778123.png

DCnDC’s post inspired me to play around a little more with my local models. I ran some old Midjourney prompts through SDXL using the SplashedMix model and a couple of others and was pretty happy with the results. Not quite as good, but a good example of getting 80% of the results for 0% of the cost. Mainly I ran stuff playing around with color, artistic effects, depth of field and things like that. Also, all of these could be improved with some extra passes, selective inpainting, etc., but I just wanted to see the raw results.
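(As a rough illustration of what one of those extra passes can look like: below is a sketch of a low-denoise img2img refinement pass using the diffusers library. The checkpoint path and file names are placeholders, not the exact SplashedMix setup used for these images.)

```python
# Sketch of an "extra pass": re-running a first-generation image through
# img2img at low denoise to clean it up. Paths are placeholders.
import torch
from diffusers import StableDiffusionXLImg2ImgPipeline
from diffusers.utils import load_image

pipe = StableDiffusionXLImg2ImgPipeline.from_single_file(
    "models/some_sdxl_checkpoint.safetensors", torch_dtype=torch.float16
).to("cuda")

init_image = load_image("raw_result.png")

# strength controls how much gets reworked: low values (0.2-0.4) mostly
# refine detail, high values change the whole image.
refined = pipe(
    prompt="same prompt as the first pass, high detail",
    image=init_image,
    strength=0.3,
    num_inference_steps=30,
).images[0]
refined.save("refined_result.png")
```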






That werewolf’s human legs kind of crack me up. I could maybe get some digitigrade legs if I tried to inpaint over them but, out of the box, it decided that the wolfman gets man legs.

Nice. The second-to-last one, with the flying saucer/mushroom, would have made an excellent cover for a 1950s SF magazine like Imagination. (Definitely not Astounding.)