Digital art creator algorithm website

Response from NovelAI’s anime art generator from that prompt:

From NovelAI’s anthro furry generator:

Oh no! I’ve been beaten to it!

Woohoo!!! Top 10% for my Aliens entry. I’m going to stop spamming the thread with my ‘fails’ now.

…And start spamming it with my successes! Whooooooo!!! (just kidding)

I like to draw and paint with Clip Studio Paint, a program that can be bought outright (as opposed to Adobe’s Photoshop, which you can’t own anymore, and the trouble with Pantone and licenses you may have heard about, but I digress… as usual). After about ten years and many free improvements, they have reached version 1.12.11 and are about to release version 2, for which I will have to pay. I’m OK with that: they are not really expensive, and the program so far is powerful and well designed. With version 2, though, they briefly succumbed to temptation: let’s do what is possible, not what is good! And so they thought of adding what we (well, mostly you so far) are discussing in this thread: AI image generation. They announced that on Nov. 29.
Today, three days later, they recanted. The user community was very put off and expressed “anxiety and concern”. Going into some detail, Clip Studio Paint writes:

Here are some concerns that we have taken to heart:

Current image generation AI exploits other artists' intellectual property and is unusable
This feature will hinder rather than help artists in their creative activities
Using artist’s work that is not opted-in to a data set is ethically unacceptable
The fear that this will make Clip Studio Paint artwork synonymous with AI-generated work
There are existing features that need to be prioritized over image generation AI features
Having something unknown in the app I use daily potentially infringing on legal or moral rights is unacceptable
Clip Studio Paint should be an app that takes responsibility for a safe, creative environment

I believe this is relevant to the discussion here.
BTW: Coooool pictures you have shown so far. Which, I am afraid, is part of the problem.

There is no problem. While the systems being considered in this thread are certainly capable of not-particularly-creative output (e.g., “in the style of Picasso”), the resulting output no more violates Picasso’s intellectual property than composing a song in 12-bar blues violates W. C. Handy’s. Styles and ideas are not copyrightable, nor should they be, though it is not a bad idea to document your sources, like Handy did.

No idea why your paint program does not include the AI plugins that are included in other programs or separately available; could be that they were threatened by a patent troll.

I am not sure whether I overestimate or you underestimate the magnitude of this AI innovation. This other thread may convince you that it is you who should think again:

But perhaps you have not made your argument clear enough to me? Maybe I (and the worried people who wrote to Clip Studio Paint) are being Luddites?

It is possible I am underestimating or misunderstanding something, but, for the purposes of the users’ complaint, I doubt it matters whether it is an AI or a human artist producing arguably derivative, yet not plagiarized, paintings (or novels) after having seen others’ work on the Internet (the “dataset”). You may rightly call me a talentless hack, but if I want to write “in the style of Cormac McCarthy” or paint (from @Folly’s prompt) like Artgerm and Greg Rutkowski, what is anyone going to do about it? That holds as long as I am not representing the result as other than it is; in other words, no forgery or plagiarism.

Have you heard of Jack Reacher? There is a series of novels featuring one John Puller…

I am not sure about that: the AI has taken Picasso’s pictures and has analyzed and stored them without asking for permission. It has taken many other pictures as well and digitized them. That is not the same as a human plagiarizing, being inspired by or wanting to parody Picasso (or whomever). But it is hard to explain the difference, and I am sure lawyers will debate this for years to come, and in the end the solution will be political, not technical (in the sense that stating that Facebook and Twitter do not have editorial responsibility for the things the users post is a political, not a technical decision).
One way or another, we are not going to resolve this conundrum here. I just wanted to add a data point and express my insecurity.

No, sorry, I have not.

This is not correct, and that is easy to see without having to know what specific technique was used: the AI model (a few gigabytes) is far too small to have possibly stored the enormous number of pictures it looked at during its training.
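The size argument can be checked with back-of-the-envelope arithmetic. In the sketch below, the 4 GB model size, the 2 billion training images and the 100 KB per stored picture are all assumed round figures chosen for illustration; none of them comes from this thread beyond “a few gigabytes”.

```python
# Rough storage budget per training image.
# All three figures are ASSUMPTIONS for illustration, not measured values.
MODEL_BYTES = 4 * 1024**3          # assumed model size: 4 GiB ("a few gigabytes")
TRAINING_IMAGES = 2_000_000_000    # assumed training set size: 2 billion pictures
SMALL_JPEG_BYTES = 100 * 1024      # assumed size of one modest compressed photo

# How many bytes of model there are for each picture seen in training.
bytes_per_image = MODEL_BYTES / TRAINING_IMAGES

# How many whole pictures would fit if the model were nothing but stored JPEGs.
images_that_would_fit = MODEL_BYTES // SMALL_JPEG_BYTES

print(f"Bytes of model per training image: {bytes_per_image:.2f}")
print(f"Whole JPEGs that would fit in the model: {images_that_would_fit:,}")
print(f"Share of the training set: {images_that_would_fit / TRAINING_IMAGES:.6%}")
```

Under these assumptions the model has roughly two bytes per training picture, so it cannot be holding copies of all of them. The arithmetic says nothing either way about whether a small handful could be retained.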

But it has used them for training, and in classical AI fashion, not even the programmers know what the algorithm is really doing: it is a black box. The pictures may not be in the program but they have made the program the way it is.
Imagine making a second copy of the algorithm, trained with exactly the same data set, except for Picasso’s pictures. Of those the algorithm gets not one. Then ask the algorithm to draw a bull in Picasso’s style. What will it do?
What is the difference between both algorithms? Something about Picasso, it seems to me. That must have some relevance for copyright issues.

Absolutely; I have heard from some artists who are absolutely dismayed and angry that their data was used that way.

However, my naive understanding is that “you can’t copyright style”:

By this logic an art student who studied Picasso paintings is also violating copyright.

The art student has used them for training, and in classical human consciousness fashion, not even the parents or teachers know what the student is really thinking: their mind is a black box.
The paintings in the textbook may not be in front of the student while they paint, but they have made the student the way they are.

Imagine a second student, taught from exactly the same art curriculum but with all references to Picasso removed. Then ask the student to draw a bull in Picasso’s style. What will they do?

What is the difference between the two students? Knowledge of and influence by Picasso, it seems to me.
That must have some relevance for copyright issues.

Yes, I thought we would veer into questions related to vitalism and the difference between the human mind and a computer. I don’t know the answer, if there is one at all. For me a mechanical device is doing something qualitatively different from a human, but the argument of the black box undermines that difference somehow. The only prediction I dare make is that this issue will make many lawyers rich and some creators poor and angry.

Yeah. The more advanced AI gets at replicating specific human functions, and the more we learn about the inner workings of the human brain, the muddier the waters become.

Not just in the sense of “can machines be creative?”, but also “what is creativity, anyway?” It might not prove to be such a special gift, after all.

But it’s still possible that some of the input works, that the AI considered especially representative of a set of objects, ended up stored in the database more or less in their entirety, while other input works sufficiently similar to those were mostly or entirely discarded. If, for instance, the training set included a thousand photographs of a piece of broccoli, and the program essentially kept one or two of them in a compressed form and discarded the rest, then whenever you asked it for a photorealistic picture of broccoli, you’d get that same single piece showing up repeatedly. Which, from what I’ve seen so far, appears to be what happens.

I don’t believe that’s how these programs work. They don’t keep full images as part of the model in any way. Rather, the AI was shown a bunch of pictures of broccoli, and told “this is broccoli”, and it came up with some list of the properties of what a picture of “broccoli” is. And when you ask it for broccoli, it creates an image that scores highly in all of these “broccoli” properties.

If it always displays a very similar image, it is because that image, in the eyes of the AI, checks off all of the properties of “broccoli”.

But that’s also how image compression works. If I show you a JPEG photograph of your mother, and you say “That’s my mother!”, and I were to then reply “No it’s not; it’s just an image that has similar Fourier phases and amplitudes as an image of your mother”, then I would simply be wrong. It is an image of your mother, and the Fourier business is just a way of storing that image (or at least, the important parts of it) efficiently.
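For anyone who wants to see the compression point concretely, here is a minimal sketch. JPEG uses a discrete cosine transform, a close relative of the Fourier transform, so the toy below applies a plain 1-D DCT to one made-up row of pixel brightness, keeps only the strongest few coefficients, and rebuilds the row from them. The function names, the sample row and the choice of 8 coefficients are all illustrative assumptions, and real JPEG adds 2-D blocks, quantization and entropy coding on top.

```python
import math

def dct(signal):
    """Unnormalized DCT-II: turn samples into cosine-wave amplitudes."""
    n = len(signal)
    return [
        sum(x * math.cos(math.pi / n * (i + 0.5) * k) for i, x in enumerate(signal))
        for k in range(n)
    ]

def idct(coeffs):
    """Inverse of dct() above (a scaled DCT-III): amplitudes back to samples."""
    n = len(coeffs)
    return [
        coeffs[0] / n
        + (2 / n) * sum(coeffs[k] * math.cos(math.pi / n * (i + 0.5) * k)
                        for k in range(1, n))
        for i in range(n)
    ]

def compress(signal, keep):
    """Zero out all but the `keep` largest-magnitude coefficients."""
    coeffs = dct(signal)
    strongest = set(
        sorted(range(len(coeffs)), key=lambda k: abs(coeffs[k]), reverse=True)[:keep]
    )
    return [c if k in strongest else 0.0 for k, c in enumerate(coeffs)]

# A made-up, smoothly varying row of 32 pixel brightness values.
row = [round(100 + 50 * math.sin(i / 5)) for i in range(32)]

kept = compress(row, keep=8)      # store 8 numbers instead of 32
restored = idct(kept)             # approximate row rebuilt from those 8
max_error = max(abs(a - b) for a, b in zip(row, restored))
print(f"Largest brightness error after keeping 8 of 32 coefficients: {max_error:.2f}")
```

The stored numbers look nothing like pixels, yet the row comes back closely enough to be recognizably the same row, which is the sense in which a compressed file still “is” the picture.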

I don’t think that is correct. There’s no image storing at all, and no ability to recreate a specific image.

Obviously we don’t know exactly what’s going on in these models, but neural net processing is more about coming up with tests for fitness and then tweaking the network until it optimizes. So it looks at a million pictures of broccoli, then you tell it to draw broccoli. The neural net produces random glurge, then tests it for ‘broccoliness’. The test fails, so it tweaks its parameters and generates another image. This time there’s a bit of broccoliness, so it changes its parameters again and tries again. It keeps doing this until its image maximizes the tests that were set up by the output of the natural language interface.
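The generate, test, tweak loop described above can be sketched as a toy hill climb. This is only an illustration of that loop, not how a real diffusion model is implemented: it nudges pixel values directly instead of network parameters, and the `broccoliness` score (a target average brightness and a target amount of variation) is a made-up stand-in for whatever properties a trained model would actually test.

```python
import random

def broccoliness(pixels):
    """Hypothetical fitness test: scores properties of the image, not a stored picture."""
    mean = sum(pixels) / len(pixels)
    spread = sum((p - mean) ** 2 for p in pixels) / len(pixels)
    # Highest (closest to zero) when mean is about 0.6 and spread about 0.05.
    return -((mean - 0.6) ** 2) - ((spread - 0.05) ** 2)

def refine(steps=2000, size=64, seed=0):
    """Start from random noise and keep only the tweaks that raise the score."""
    rng = random.Random(seed)
    pixels = [rng.random() for _ in range(size)]      # the "random glurge"
    start_score = best = broccoliness(pixels)
    for _ in range(steps):
        i = rng.randrange(size)
        old = pixels[i]
        # Nudge one pixel a little, staying inside the valid 0..1 range.
        pixels[i] = min(1.0, max(0.0, old + rng.uniform(-0.1, 0.1)))
        score = broccoliness(pixels)
        if score > best:
            best = score          # the tweak helped: keep it
        else:
            pixels[i] = old       # the tweak did not help: undo it
    return pixels, start_score, best

pixels, start_score, final_score = refine()
print(f"Score before: {start_score:.6f}, after: {final_score:.6f}")
```

Nothing in the loop holds a reference picture; the only thing steering the result is the score, which is the point being made about tests for ‘broccoliness’.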

I asked GPT3.5 “Explain how diffusion models are used by AI to make pictures.” This is what it said:

Prompt: Do diffusion models store copies of the images they use for training?

However, it does agree that copyright violation may be taking place:

Prompt: Does training an AI with other artist’s images violate their copyright?

The right being violated isn’t due to a strict copying issue, but rather a violation of the artist’s moral rights, which allow them to restrict how their copyrighted images are used. So it’s not necessarily true that an image ‘in the style of Rutkowski’ is a violation of copyright, but the use of Rutkowski’s images for training an AI may be a copyright violation, even if no output remotely similar is generated.

But it’s a gray area. I would guess that using those images for research or scholarship is ‘fair use’, but opening that data up commercially could be a copyright violation.

Very pertinent questions and very interesting answer. Well done!