Have we tested the creativity of AI, and would these examples be good tests?

Tangentially inspired by the pit thread regarding AI-generated responses in FQ. Has anyone tested the capacity of AI to think up new scientific ideas? Let’s say we let the AI train only on material that was available at the time Newton developed his theory of gravity and (around the same time as Leibniz) calculus. Would an AI be able to come up with Newtonian mechanics and calculus? What about if we only give it material up to around the time Darwin came up with his theory of evolution, or Einstein and relativity? Would an AI be able to come up with those theories on its own? Let’s assume we pose to the AI the same questions that Newton, Darwin, and Einstein were trying to answer when they developed their theories.

Don’t know how on topic it is, but I remember reading years ago about an AI that learned Newton’s laws of motion from scratch. Digging up the article, it was from 2009, so AI is going to be far more advanced now.

Learning based entirely on theory?
If an AI is asked to create beyond what it has been taught as proven fact, then it has to build on unproven extrapolations. Scientists go forward from proven facts by experimenting on them, and they directly observe the success or failure of those forward extrapolations. An AI can only “imagine” the successes or failures: it can test a set of possible outcomes against the interactions it has learned, but to be correct it must know all the possible interaction outcomes. The system must be 100% known for the answer to be 100% correct. Even if the system is not fully known, the AI may still produce correct results by brute force, but we have to test those results in the real world.

An AI trying to formulate beyond its learned facts will have to base its further investigation on what it has formulated to be newly discovered facts. I suppose branching investigations of these supposed new facts, followed to their various ends, could yield multiple results of value, which would then have to be tested.

As I understand it, machine learning algorithms can model chemistry potentially more effectively than a brute-force simulation of the actual behaviours and properties of atoms. The results would still need to be verified in the real world, but AI can possibly eliminate a lot of laborious testing.

I think conceivably, a large language model could be trained to formulate interesting novel hypotheses that would form inputs for a process using the scientific method. You could also generate novel hypotheses with a set of dice, but a huge majority of them would be complete nonsense; an approach using an LLM might still be doing something like that nonsense internally, somewhere, but the output would be filtered to be more sensible.
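To make that “generate a lot, filter out the nonsense” idea concrete, here’s a toy sketch in Python. Everything in it is made up for illustration: llm_propose and plausibility_score are hypothetical stand-ins for whatever model and screening step you’d actually use, not real APIs.

```python
import random

# Toy sketch of "generate many hypotheses, then filter out the nonsense".
# llm_propose() and plausibility_score() are hypothetical stand-ins, not real APIs.

def llm_propose(prompt: str) -> str:
    # Stand-in for an LLM call; here it just recombines fragments at random,
    # which is basically the "set of dice" version of the same idea.
    subjects = ["protein folding", "coral bleaching", "battery degradation"]
    relations = ["is driven by", "is unrelated to", "can be predicted from"]
    factors = ["trace metal concentrations", "ambient temperature cycles", "local acidity"]
    return f"{random.choice(subjects)} {random.choice(relations)} {random.choice(factors)}"

def plausibility_score(hypothesis: str) -> float:
    # Stand-in for the filter: a critic model, a literature check, a human reviewer...
    return random.random()

def generate_hypotheses(n_candidates: int = 100, threshold: float = 0.9) -> list[str]:
    candidates = [llm_propose("suggest a testable hypothesis") for _ in range(n_candidates)]
    # Only the candidates that survive the filter become inputs to the actual
    # scientific method (design an experiment, test, revise).
    return [h for h in candidates if plausibility_score(h) >= threshold]

if __name__ == "__main__":
    for hypothesis in generate_hypotheses():
        print(hypothesis)
```

The dice version and the LLM version differ only in how good the raw candidates are before the filter ever sees them.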

I asked ChatGPT if it could generate new ideas; I think it probably could, but it seems to lack the confidence to try - probably because of how it is configured rather than what it is actually capable of doing.

I don’t have the ability to generate truly novel ideas in the sense of creating something that has never been conceived by humans before. My responses are based on patterns and information present in the data on which I was trained. While I can combine and rephrase information in creative ways, the concepts and ideas I generate are ultimately derived from the data I’ve been exposed to.

True novelty often involves a level of creativity, intuition, and a deep understanding of the world that goes beyond what a machine learning model can achieve. I can certainly assist in brainstorming, providing information, or generating creative variations based on existing knowledge, but the extent of my novelty is limited to the scope of the data on which I was trained.

I argued: But surely if you are trained on human thoughts and concepts, you might discern the existence of a gap in those concepts, and be able to conjecture something undiscovered that occupies it.

While I can recognize patterns and relationships within the data on which I was trained, including human thoughts and concepts, my ability to generate ideas relies on extrapolating from that existing knowledge. I don’t have the capability to conjecture completely undiscovered concepts or fill gaps that may exist in human knowledge.

Generating truly novel ideas often involves a level of creativity, insight, and intuition that extends beyond what a machine learning model can currently achieve. While I can certainly assist in synthesizing information, offering different perspectives, or helping to connect seemingly disparate concepts, the generation of entirely new, groundbreaking ideas is currently a human capability that machines like me do not possess.

If there are specific topics or areas of interest, I can certainly try to provide information, insights, or alternative viewpoints based on existing knowledge up to my last training cut-off in January 2022.

I don’t think it’s that it can’t. I think it doesn’t even want to try.

I see two problems here: First, training an AI of the current sort requires a tremendous amount of data. Nowadays, thanks to the Internet, we have that tremendous amount of data. If we restrict ourselves to data that existed before 1905, though, then there’s far, far less data available. And it would be basically impossible to filter modern data well enough that you could be sure that there weren’t any references to relativity left in it.

Second, suppose that you did do the experiment. If the AI succeeded, then that would certainly be a huge triumph… but what if it failed? All you’d have proven is that the AI isn’t as innovative as Einstein was, and that doesn’t really tell you much: the vast majority of people aren’t as innovative as Einstein.

And incidentally, the examples of Newton’s and Einstein’s theories require extensive math, which is a relative weakness of current AIs. They might plausibly do a better job writing a great song than coming up with a mathematical scientific theory.

There was a paper recently that used the fact that LLMs “hallucinate” combined with a validator in order to come up with new algorithms. I think this is the paper: Slashdot
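If I understand the approach, it’s roughly a “propose, validate, keep the survivors” loop. Here’s a toy Python sketch of that shape only, not the paper’s actual method; model_propose and evaluate are made-up stand-ins for the language model and the automatic validator.

```python
import random

# Toy sketch of the "hallucinate + validate" loop, not the paper's actual code.
# model_propose() and evaluate() are made-up stand-ins.

def model_propose(best_so_far: str) -> str:
    # Stand-in for an LLM asked to riff on the current best candidate;
    # most of what it emits will be junk, and that's fine.
    return best_so_far + random.choice("abc")

def evaluate(candidate: str) -> float:
    # Stand-in for the validator: run/check the candidate and return an
    # objective score. In this toy, candidates with more "a"s score higher.
    return float(candidate.count("a"))

def search(rounds: int = 20, samples_per_round: int = 50) -> str:
    best, best_score = "", float("-inf")
    for _ in range(rounds):
        for candidate in (model_propose(best) for _ in range(samples_per_round)):
            score = evaluate(candidate)
            if score > best_score:
                best, best_score = candidate, score
    return best

if __name__ == "__main__":
    print(search())
```

The model is allowed to hallucinate freely; the validator is what makes any of it usable.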

I wouldn’t call that creativity though.

I think ‘creativity’ might be one of those things like free will, where we believe it’s some unique property of ourselves, but if we try to nail it down, it either cannot exist, or else it turns out to be disappointingly ordinary.

Randomness plus selection is a valid technique for inspiring creativity in humans, so I don’t see why machines couldn’t also use it. Heck, randomness plus selection is how humans see clouds that look like bunnies.

I asked ChatGPT 3.5 to write a poem like Jabberwocky by Lewis Carroll. The result just reuses some of the same nonsense words (vorpal, tulgey, etc.) and lacks the coherence of the original.

Telling it to remove the nonsense words that come from the original poem, it gave me the same ‘story’ without any nonsense words.

Then, telling it to use new nonsense words, the result was closer to the feel of Jabberwocky, but the nonsense words were less imaginative and nonsensical than Carroll’s version. Nothing as weird as ‘manxome foe’ or ‘mome raths outgrabe’; its most imaginative words are ‘flibber-wilds’ and ‘flum-dales’.

I’d ask it to generate a page like ‘Finnegans Wake’ but I don’t want any future AI overlords holding a grudge.