Nice. But you can download the image without the watermark in the corner.
Stable Diffusion video model released.
I downloaded the tensor weights for Stable Video Diffusion the other day, but I use ComfyUI as it's the only front end I've found that totally sucks, and they don't seem to have added support yet. Hopefully soon.
Image-generation adjacent: a lawsuit against LLMs has been largely tossed out.
Bold choice to use the only front end that totally sucks
I’m a glutton for punishment, apparently. But seriously, ComfyUI is great for running local models. And it looks like support for Stable Video Diffusion just came out:
I’m visiting the parents, though, so I won’t be able to try it for a few days.
Weird. The popcorn, pretzels, and jellybeans look great, the bird’s only problem is being implausibly still, and the dog is pretty good aside from being out of focus, but what the heck is going on with that toast? I’d think that any model that can do the other things well would also be able to handle toast, or at least do a better job of it than that.
I got better toast in other runs, but many images had a dog/table merger going on.
Hugging Face has a usable version of Stable Diffusion image to video now.
I made one video last night, but spent more than 12 minutes in a queue waiting for it to run. Here’s the result:
For comparison, here's what I got with RunwayML Gen2 when it first released:
(Here’s the source image, made with Bing/DE3):
Stability AI released an early test of SDXL Turbo, a "one step" model. "Steps" here means how many denoising passes the model makes when it renders an image, not how many tasks you need to complete to use it. A photorealistic image in Stable Diffusion usually takes around 30-50 steps.
As a result, it was completing two images per second on my RTX 3080Ti when I tested it. That's hecka fast. It does have some major limitations, though: it doesn't really do photorealism, photoreal human faces are a disaster, it's meant to run at 512x512, and adding more steps immediately blows the image out. It really is meant for exactly one step, with CFG set to 1 as well. Still, it's a really cool hint at how the tech is progressing. I tried a bunch of stuff and had some "great" results when you remember: two images per second.
(Also, results are from a single run using A1111. I believe Comfy might do it better, since I wasn't doing the second Refiner pass, and for various other boring technical reasons.)
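If anyone wants to poke at it outside a front end, here's a minimal sketch using the Hugging Face diffusers library rather than the A1111 run described above. The checkpoint name is the public stabilityai/sdxl-turbo release; the prompt is just an example, and guidance_scale=0.0 is my stand-in for the CFG 1 setting, so treat the settings as a starting point rather than gospel.

```python
# Minimal sketch, not the A1111 setup from the post: SDXL Turbo via the
# Hugging Face diffusers library. The prompt is just an example.
import torch
from diffusers import AutoPipelineForText2Image

pipe = AutoPipelineForText2Image.from_pretrained(
    "stabilityai/sdxl-turbo",       # public checkpoint
    torch_dtype=torch.float16,
    variant="fp16",
).to("cuda")

# One denoising step at 512x512, guidance effectively off
# (roughly the same role CFG 1 plays in A1111).
image = pipe(
    prompt="a photo of a raccoon astronaut, studio lighting",
    num_inference_steps=1,
    guidance_scale=0.0,
    width=512,
    height=512,
).images[0]
image.save("turbo_test.png")
```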
On Clipdrop:
OK, those chesscrapers are just plain cool. A human artist probably wouldn’t have had too much difficulty making that, either… given the concept. But it’s a great concept.
@Darren_Garrison, I think the possum is probably happier in the RunwayML version. It's still getting a creepy hand stuck to its nape, but its head isn't dematerializing.
Since I've set up a new YouTube account, here are the more coherent clips that I generated with my limited free Runway Gen2 seconds a while back.
Being able to run 800 images in just under 5 minutes (4:58) is addicting.
Was worried for my SSD, but those 800 images only take up 280 MB of space at 512x512.
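For what it's worth, those figures work out like this (just restating the numbers above):

```python
# Quick restatement of the batch numbers, nothing more.
images = 800
seconds = 4 * 60 + 58        # 4:58
total_mb = 280

print(f"{images / seconds:.2f} images per second")             # ~2.68/s
print(f"{total_mb * 1024 / images:.0f} KB per 512x512 image")  # ~358 KB each
```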
I’ve set my output to a RAM drive, and then only move the good ones to the permanent disk.
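If it helps, something like this little helper is all I mean by "move the good ones"; the paths are placeholders for wherever your RAM drive and permanent disk actually live.

```python
# Hypothetical helper for the RAM-drive workflow above; paths are assumptions.
import shutil
from pathlib import Path

RAM_OUTPUT = Path("R:/sd-output")   # where the UI writes its images (assumed)
ARCHIVE = Path("D:/sd-keepers")     # permanent disk for the good ones (assumed)

def keep(filename: str) -> None:
    """Move one keeper off the RAM drive before it gets wiped."""
    ARCHIVE.mkdir(parents=True, exist_ok=True)
    shutil.move(str(RAM_OUTPUT / filename), str(ARCHIVE / filename))

keep("00042-1234567890.png")  # example filename, purely illustrative
```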
Another day, another new toy.
Wanted a Mandalorian on a DeLorean. For some reason Bing/DE3 keeps thinking that concept should include a cat or small dog and some brown leather bags.
I’ve been playing around a lot with Stable Video Diffusion. It is a long way from perfected, but what is already there is impressive.
It creates video clips from images without a prompt, so it has to recognize not only the objects in the image but also what type of motion makes sense. A few videos are failures, with little or even no motion. A few are very simple pans across a static image (essentially the "Ken Burns effect"), but with new data created for the edges as the "camera" moves.
The zooms in or out are more sophisticated, moving the elements of the image around relative to each other with an awareness of parallax and of the fact that the image contains distinct objects layered in a three-dimensional environment, with the software very accurately determining the edges of individual objects. Some videos involve rotating "inside" the image, understanding the three-dimensional nature of the objects and creating new "structure" on them as they rotate.
With some images the software recognizes specific types of objects and tries to give them the appropriate type of movement. It can be environmental like billowing clouds and crashing waves or mechanical like turning wheels. It also can often recognize humans and animals and attempt to move their limbs and faces in appropriate ways.
Stable Video Diffusion does not do any of these things perfectly, and it makes major mistakes. But this is an experimental early release of the first version of the software, and it could (like still-image generation) improve rapidly.
SVD creates a set of 24 images, allocated as 6 frames per second for 4 seconds. It seems to me like they could make it an indefinite duration and not RAM-limited, basing each new frame off a number of previous frames, but currently it seems to keep all frames in memory at once and reference all of them: some of the video clips form near-seamless loops, and some lose detail mid-video only to regain it before the end.
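For anyone who wants to try it outside ComfyUI, here's a minimal sketch of the same image-to-video step using the Hugging Face diffusers pipeline. The checkpoint name and settings are my assumptions based on the public release and won't exactly match a ComfyUI workflow; the frame count and fps just mirror the 24-frame, roughly 6 fps clips described above.

```python
# Minimal Stable Video Diffusion sketch via Hugging Face diffusers,
# not the ComfyUI workflow described in the thread.
import torch
from diffusers import StableVideoDiffusionPipeline
from diffusers.utils import load_image, export_to_video

# Public image-to-video checkpoint; assumed here, not taken from the thread.
pipe = StableVideoDiffusionPipeline.from_pretrained(
    "stabilityai/stable-video-diffusion-img2vid-xt",
    torch_dtype=torch.float16,
    variant="fp16",
)
pipe.enable_model_cpu_offload()  # all frames are denoised together, so VRAM is the limit

# Any still made with Dall-E / Stable Diffusion; the model expects roughly 1024x576.
image = load_image("source_still.png").resize((1024, 576))

# ~4 seconds of motion: the post describes 24 frames at 6 fps; this checkpoint
# defaults to 25 frames, so the numbers here land in the same ballpark.
frames = pipe(image, num_frames=25, decode_chunk_size=8).frames[0]
export_to_video(frames, "clip.mp4", fps=6)
```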
This is a compilation of 75 of the 4-second clips, all generated from images I made using Dall-E or Stable Diffusion.
I think one of those was a small Wookiee. Also, it really likes making a Back to the Future DeLorean with the extra wires and stuff on the outside.