Digital art creator algorithm website

I’ve no doubt that a substantial amount of energy is used in AI applications. It’s in everyone’s interest to lower the cost of AI generation, whether from an ecological/resource standpoint or, for the services, in money spent on power and GPU cycles. In terms of AI art, better and more efficient models will help. Techniques like Stable Diffusion’s “One Step” models mean much less time and power spent per render. Models and interfaces that better comprehend the user’s intent mean fewer cycles wasted on dozens of failures and retries before a “correct” image. LoRA modules give more specialized renders without training a whole new model for specific tasks, and inpainting lets you correct part of an image rather than generating whole new ones.
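
Here’s a rough sketch of what two of those savings look like in code, using the Hugging Face diffusers library. The model ID and LoRA filename are just illustrative placeholders, not recommendations:

```python
# Minimal sketch with Hugging Face diffusers: a one-step render using
# SDXL-Turbo, then a LoRA adapter loaded on top for a specialized style.
import torch
from diffusers import AutoPipelineForText2Image

pipe = AutoPipelineForText2Image.from_pretrained(
    "stabilityai/sdxl-turbo", torch_dtype=torch.float16
).to("cuda")

# One denoising step instead of the usual 20-50, so a fraction of the GPU time.
image = pipe(
    "Godzilla and King Kong operating a takoyaki stand",
    num_inference_steps=1,
    guidance_scale=0.0,  # turbo-style models are trained to skip CFG
).images[0]
image.save("one_step.png")

# A LoRA specializes the base model without retraining it from scratch.
# "my-style-lora.safetensors" is a hypothetical local file.
pipe.load_lora_weights("my-style-lora.safetensors")
styled = pipe("the same stand, watercolor style",
              num_inference_steps=1, guidance_scale=0.0).images[0]
styled.save("one_step_lora.png")
```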

Pretty much everyone wants to build lower-cost, lighter-consumption models, because high-cost models benefit no one aside from the power companies and Nvidia.

A100s, per the study. I think the H100 and H200 are the new go-to cards, but I don’t have any real idea what the ratio of A100s to other cards is in your typical AI art GPU farm.

Also, there is a huge gain to be had by building models that can run on edge devices, phones, etc.

We’re getting there. We’re rapidly learning how to make smaller models smarter; Q-learning is a big performance booster. And soon we’re going to see dedicated silicon for the specific types of processing that currently bottleneck LLMs, and performance will take another step jump.

Google’s Gemini architecture is interesting. They have three models: Gemini Ultra, Gemini Pro, and Gemini Nano. Pro inference is cheap enough to power a free public AI. Ultra, as a guess, will be the one with commercial APIs, plugins, extensions, and such.

Nano is interesting. It can run on a Pixel phone.

There are now execution environments for running LLMs on Apple M2 processors. A MacBook can run a pretty large LLM of roughly GPT-3.5 quality at reasonable token-generation speeds. The M2’s unified memory architecture works well for AI workloads.
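
For anyone curious, here’s roughly what that looks like with llama-cpp-python, one such environment; the GGUF filename is a placeholder for whatever quantized model you’ve downloaded:

```python
# Sketch of running a local LLM on Apple Silicon via llama-cpp-python,
# which uses Metal for GPU acceleration on M-series chips.
from llama_cpp import Llama

llm = Llama(
    model_path="mistral-7b-instruct.Q4_K_M.gguf",  # placeholder model file
    n_gpu_layers=-1,  # offload all layers; unified memory makes this cheap
    n_ctx=4096,
)

out = llm("Q: Why does unified memory help local LLM inference? A:",
          max_tokens=128)
print(out["choices"][0]["text"])
```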

I was looking at this page about renting A10 time vs. A100. I found it interesting that in the example the A100 generates only slightly more than twice the images per second of an A10 but runs at twice the wattage (300 watts vs. 150 watts). So no meaningful improvement in efficiency per watt.
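
Back-of-the-envelope, that works out to nearly identical images per joule. The throughput numbers below are placeholders in roughly the ratio the rental page showed:

```python
# Images-per-kilojoule comparison; throughputs are illustrative, in roughly
# the "slightly more than 2x" ratio described above.
cards = {"A10": (1.0, 150), "A100": (2.1, 300)}  # (images/sec, watts)

for name, (imgs_per_sec, watts) in cards.items():
    print(f"{name}: {imgs_per_sec / watts * 1000:.2f} images per kilojoule")
# A10:  6.67 images per kilojoule
# A100: 7.00 images per kilojoule  -> only ~5% more efficient per joule
```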

A new AI has joined the game: Meta AI. I’ve tried around 15 or 20 prompts so far. Image quality tends to be around Dall-E 3 level, but understanding of complex prompts isn’t quite as good; in a prompt with two subjects (like Godzilla and King Kong) you often get one of the subjects twice. It also seems to have a weaker knowledge than Dall-E 3 of what various subjects look like. Political figures and modern celebrities appear to be blocked, along with at least some types of horror images (no zombies allowed), and apparently some modern artist names are blocked as well.

Here are grids of the results of three sample prompts. In each grid the top row is Bing/Dall-E 3, the middle row SDXL and the bottom row the new Meta AI.

Bob Ross playing golf with Yoda on a rooftop in Manhattan.

Godzilla and King Kong operating a takoyaki stand at a bon festival.

A tired Bert and Ernie from Sesame Street eating breakfast at a scratched formica table in a dimly lit dilapidated room.

A recent AI image idea. Bing/DE3 handled it well.

Meta AI was good with photorealism, but not great at matching the prompt.

I have a theory that the trained models on the public sites do everything they can to make Donald Trump look cool, no matter what prompts you put in. But they’re getting better. Maybe I just need a better imagination.

Another failed copyright attempt.

A few times lately I’ve had odd star-like things randomly crop up in Bing/DE3 images. For instance, here they are on anthropomorphic cigars wielding lightsabers.

They looked like something that probably actually existed, but I couldn’t get positive results with Google Lens.

So today I was trying to produce a whey in a manger, and it seems almost impossible to render a set of images without some of them turning up.

This time Google Lens was able to identify them as star anise. Now the question is why they are being associated with cigars, or whey.

Sometimes a cigar is just a cigar.

We’ve occasionally done song lyrics…

(Not the exact lyrics put in, but the idea of the lyrics.)

Inspired by a current thread in FQ, I’ve been trying to create a device for detecting temporal phase displacement.

I had a big score on today’s Night Cafe contest - Elf Portraits. Top 5% and, if I counted correctly, 15th place.

“A Rough Looking Christmas Elf”

He looks like one of Badass Santa’s elves.

Stanford researchers say they’ve found thousands of instances of child sexual abuse material (CSAM) in the LAION-5B dataset that underlies much of AI-generated art these days. It’s a sticky situation: CSAM gets scraped because it’s all over media like Facebook, Twitter, and Instagram, as well as assorted easily accessible websites (i.e., not hidden “dark web” stuff). LAION notes that it can’t legally search the dataset for it, so it relies on CLIP-generated text tags of what’s in an image plus image hash codes from CSAM-prevention databases to filter it out. Stanford had to outsource their own verification to an organization in Canada, where there’s an exemption on searching for CSAM if it’s for research purposes.
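
The hash-matching half of that filtering is conceptually simple. Here’s a toy sketch of the idea using the imagehash library’s perceptual hashes; the real prevention databases use specialized hash schemes (e.g., PhotoDNA) and curated lists, and the blocklist entry below is made up:

```python
# Toy illustration of blocklist filtering: hash each image and compare
# against known-bad hashes. NOT how production CSAM filters actually work;
# those use dedicated hash databases maintained by prevention organizations.
from PIL import Image
import imagehash

# Hypothetical blocklist of 64-bit perceptual hashes, stored as hex strings.
KNOWN_BAD = {imagehash.hex_to_hash(h) for h in ["fd873e0550f0e1f1"]}

def is_flagged(path: str, max_distance: int = 4) -> bool:
    """Flag an image whose pHash is within max_distance bits of a bad hash."""
    h = imagehash.phash(Image.open(path))
    return any(h - bad <= max_distance for bad in KNOWN_BAD)

print(is_flagged("candidate.jpg"))
```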

LAION has temporarily taken down its datasets while it confirms and responds to the study. This doesn’t directly affect existing AI art models unless someone running Stable Diffusion gets pressured to take theirs down; unlikely, since those services have their own NSFW filters. But it would slow down new model research and training for any organization that doesn’t already have its own copy.

Was playing with it (the Midjourney v6 alpha) last night with my usual Discord crew, and it’s mostly better, though sometimes more “different” than objectively improved. The couple of days after an alpha build drops are always amusing. Last time (v5) it did realistic celebrities at first, then they tuned it away from that (which makes the comparisons in the article amusing), and it’s always super easy to get nudity, sometimes whether you want it or not. Then they tighten the controls and it’s time to behave again.

(Not that the two go together; rather, some terms like “mermaid” almost guarantee nudity until they clamp it back down.)

Midjourney v6 did a pretty good job with a Korat cat in pursuit. #2 looks like my cat Benny’s doppelganger. :black_cat:

I’m a little fuzzy on the theology, but I think the story goes a little something like this.

I actually got Bing/DE3 to make those pretty realistic ones. I used “John McClane from Die Hard wearing a dirty white tank top” in the prompt, which gave a fair resemblance. Then at the very end of the prompt I tacked on “bruce moonlighting willis”. That got past the celebrity filter and delivered great results.

I love that! I knew Die Hard was a Christmas movie.