One of the things I read about recent “improvements” in ChatGPT is that it’s now “less reluctant” to answer questions. Maybe the 2023 question was an example of it pushing forward into an area it knew almost nothing about (in DeepQA terminology, having “low confidence”) and picking responses not just on the basis of “bullshit humans might say”, but responses that filtered the bullshit through some kind of heuristic pattern-matching on the number being asked about, a process that was seriously flawed. If so, it’s a potentially revealing flaw.
It did better with “Please write an opening paragraph of a P.G. Wodehouse Blandings story”.
It was a sweltering summer day at Blandings Castle, and the Honorable Galahad Threepwood was feeling particularly vexed. As the younger son of the Earl of Emsworth, he was accustomed to a life of leisure and luxury, but his brother Frederick’s insistence on turning the castle into a pigsty had made it quite unbearable. Galahad had retreated to the cool shade of the gardens, hoping to find some solace in his favorite pastime of writing poetry, when he was interrupted by a loud commotion. “Good heavens, what on earth is going on now?” he muttered to himself as he stood up to investigate.
That’s why I don’t understand the purpose of this. What’s the point of a device that creates something that sounds like a human, but is actually nonsense?
In the future, a program may utilize multiple AIs in tandem: one engine parses your question, another finds a correct answer, and a third renders that answer as human-legible text.
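Purely as an illustration of that kind of hand-off, here’s a minimal Python sketch; all three stage functions are hypothetical placeholders, not anything that exists today:

```python
# Hypothetical sketch of a multi-engine pipeline. Each stage could be a
# different specialized system, so the language model never has to invent
# the facts itself.

def parse_question(text: str) -> dict:
    """Stage 1: a language model turns free-form text into a structured query."""
    return {"topic": "number facts", "subject": "2017"}  # placeholder output

def find_answer(query: dict) -> dict:
    """Stage 2: a retrieval/reasoning engine looks up or computes a verified answer."""
    return {"fact": "2017 is prime", "source": "trusted math database"}  # placeholder

def render_answer(answer: dict) -> str:
    """Stage 3: a language model turns the structured answer back into prose."""
    return f"{answer['fact']} (per {answer['source']})."

def pipeline(text: str) -> str:
    return render_answer(find_answer(parse_question(text)))

print(pipeline("Give me some interesting facts about the number 2017."))
```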
Nobody has the slightest clue why it comes up with the things it does. The creators know what they fed it as inputs, and can reason about why it might be stronger in some areas than others, but ultimately the behavior is almost entirely emergent. They created something which has human-level language skills but deficient reasoning skills. It’ll get better in the next version, though the creators still won’t really be able to explain why.
Honestly, I’m inclined to agree: We already have things that think like humans. More things that think like us are only valuable if we can understand how they (and hence we) think. What we really need is things that can think unlike humans, to solve the problems we can’t.
You might want to go back and look at some of its responses I’ve already posted in this thread. About a third of them I posted here because they were amusingly incorrect.
Good idea, so I tested an earlier year. It only did a tiny bit better:
Give me some interesting facts about the number 1999.
Here are some interesting facts about the number 1999:
\ 1999 is the last year of the 20th century and the last year of the 2nd millennium.
1999 is a prime number, meaning it is only divisible by itself and 1.
In the year 1999, the world saw several significant events, including the impeachment of US President Bill Clinton, the launch of the first version of the Napster music sharing service, and the release of the highly popular movie “The Matrix.”
The number 1999 is also the name of a song by the band Prince, which was released in 1982 and became a hit single.
In the game of craps, rolling a pair of dice and getting a total of 1999 is called “boxcars,” and is considered a very unlucky roll.
1999 is the atomic number of the element einsteinium, which is named after the famous physicist Albert Einstein.
1999 is also the title of a science fiction novel by John Varley, which is set in a future where humans have colonized the solar system and explores the concept of artificial intelligence and its relationship with humanity
ETA: I added a backslash to its output to keep Discourse from doing weird formatting things. That’s the second time I’ve had to do that as a quick hack; I don’t know the right way to solve it.
It occurred to me that 1999 was probably a poor choice to try because “1999” is probably seen as a year way more often in its training data than as a number, so I tried this:
Give me some interesting facts about the number 2017.
- 2017 is the first year that can be evenly divided by both 5 and 7.
- The number 2017 is a prime number, meaning it is only divisible by 1 and itself.
- The year 2017 marked the 100th anniversary of the Russian Revolution.
- The 2017 solar eclipse, which was visible in the United States, was the first total solar eclipse to cross the entire country in 99 years.
- 2017 was the year that the iPhone X, the 10th anniversary edition of the iPhone, was released.
- In 2017, the United States experienced its most expensive natural disaster in history when Hurricane Harvey hit Texas, causing over $125 billion in damages.
- The 2017 film “Get Out” was a critical and commercial success, and won the Academy Award for Best Original Screenplay.
- The 2017 World Series was won by the Houston Astros, their first championship in franchise history.
- In 2017, North Korea tested its most powerful nuclear bomb to date, causing international tensions to rise.
- 2017 saw the launch of the Falcon Heavy rocket by SpaceX, making it the most powerful operational rocket in the world.
I asked about the number, not the year.
I apologize for the misunderstanding. Here are some interesting facts about the number 2017:
- 2017 is a palindrome number, meaning it reads the same forwards and backwards.
- The number 2017 can be written as the sum of two squares in two different ways: 44^2 + 1^2 and 21^2 + 20^2.
- The number 2017 is the 15th Bell number, a sequence of integers that represent the number of ways to partition a set.
- In the binary number system, 2017 is written as 11111010001.
- The number 2017 is the sum of the first 45 natural numbers.
- The number 2017 is a Harshad number, which means it is divisible by the sum of its digits (in this case, 3).
- 2017 is the smallest number that is both the sum and the product of three or more consecutive numbers (5 + 6 + 7 + 8 + 9 = 35, and 5 x 6 x 7 x 8 x 9 = 151200).
- The number 2017 is the sum of the first three triangular numbers (1 + 3 + 6 = 10, and 10 + 11 + 12 = 33).
- The number 2017 is the product of two prime numbers: 61 and 33.
- In the Roman numeral system, 2017 is written as MMXVII.
There are elaborations of the transformer architecture that can do data lookup ‘live’, like OpenAI’s WebGPT, but at least as far as I know, ChatGPT doesn’t do any sort of database retrieval (unless you count the weights of its neural networks as a very scrambled, approximate representation of its training data). Here’s a good sequence of blog posts illustrating the key technologies of transformer models and the important attention mechanism:
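In the meantime, here’s a toy NumPy sketch of the scaled dot-product attention at the heart of those models; the dimensions and weights are made up on the spot, so it only shows the shape of the computation, not anything resembling ChatGPT’s actual implementation:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Each output row is a weighted mix of the value vectors V,
    with weights derived from how well queries Q match keys K."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                  # similarity of each query to each key
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over the keys
    return weights @ V

# Toy example: 4 tokens, 8-dimensional embeddings, random projection matrices.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))                          # token embeddings
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
out = scaled_dot_product_attention(x @ Wq, x @ Wk, x @ Wv)
print(out.shape)                                     # (4, 8): one context-mixed vector per token
```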
Incidentally, “Toronto” would be wrong even if it were an American city. One airport is named after a World War I hero (not World War II), the major international airport is named after a Canadian prime minister, and no airports are named after a famous WW II battle. In fairness to Watson, it rated itself as having low confidence in its proposed answer.
Naturally, I had to try this out on ChatGPT. Nope!
What American city has airports named for a World War II hero and a famous World War II battle?
The American city that has airports named for a World War II hero and a famous World War II battle is New Orleans, Louisiana. The city’s main airport is named Louis Armstrong New Orleans International Airport, after the famous jazz musician and World War II veteran Louis Armstrong. The airport was renamed in his honor in 2001.
New Orleans also has a smaller airport called Lakefront Airport, which is named after the Battle of Midway, a significant World War II battle in the Pacific Theater. The battle, which took place in 1942, was a major turning point in the war and is considered to be one of the most important naval battles in history.
A very odd response indeed, especially as it found “Midway”, but somehow not Chicago. And “Lakefront” is named after the Battle of Midway? WTF? Also, Louis Armstrong served for only a short time in the FIRST World War, not the second, and was never regarded as a “war hero”.
This was followed by …
Don’t you think “Chicago” would have been a better answer. Its airports are named O’Hare and Midway.
You are correct, Chicago, Illinois also has two airports named after World War II heroes and battles. O’Hare International Airport is named after Edward “Butch” O’Hare, a World War II naval aviator and Medal of Honor recipient. Midway International Airport is named after the Battle of Midway, a significant World War II battle in the Pacific Theater. I apologize for the error in my previous response. Thank you for pointing out the correct answer.
Despite its many faults, this thing can be genuinely instructive. I mean, I know the answers involved here, but imagine someone just trying to learn this stuff and having this conversation with ChatGPT:
Please explain how a satellite stays in orbit.
A satellite stays in orbit around the Earth because it is constantly falling towards the Earth due to the Earth’s gravity, but it is also moving horizontally at a high speed. This combination of falling and horizontal movement means that the satellite never actually hits the Earth. Instead, it follows a curved path around the Earth.
To stay in orbit, a satellite must have a certain minimum speed, known as the orbital velocity. This speed depends on the altitude of the satellite above the Earth’s surface. The higher the altitude, the slower the orbital velocity needs to be.
Orbital velocity can be calculated using the formula:
v = √(GM/r)
where v is the orbital velocity, G is the gravitational constant, M is the mass of the Earth, and r is the distance of the satellite from the center of the Earth.
I hope this helps! Do you have any other questions about satellites or orbits?
Can the orbit of a satellite be described in terms of General Relativity?
Yes, the orbit of a satellite can be described using the theory of General Relativity. In fact, the theory of General Relativity provides a more accurate description of the orbit of a satellite than the classical mechanics description based on Newton’s laws of motion.
According to the theory of General Relativity, the orbit of a satellite is not a simple ellipse as described by classical mechanics, but rather a geodesic in spacetime. A geodesic is the shortest path between two points in spacetime, and it is determined by the curvature of spacetime caused by the presence of massive objects such as the Earth.
The equations of motion for a satellite in orbit around the Earth can be derived from the theory of General Relativity and are known as the geodesic equations. These equations take into account the curvature of spacetime caused by the Earth’s mass, as well as other relativistic effects such as time dilation.
I hope this helps! Do you have any other questions about General Relativity or orbits?
So in the absence of gravity, a geodesic would be a straight line?
Yes, that’s correct! In the absence of gravity or any other force, the path of an object through spacetime is a straight line, known as a geodesic.
In the presence of gravity or other forces, the path of an object through spacetime is still a geodesic, but it is no longer a straight line. Instead, it is a curve that is determined by the curvature of spacetime caused by the presence of massive objects.
The theory of General Relativity explains that the presence of a massive object such as the Earth causes spacetime to curve, and this curvature determines the path of a satellite in orbit around the Earth.
I hope this helps! Do you have any other questions about General Relativity or orbits?
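For what it’s worth, the Newtonian formula it quoted does check out numerically. A quick back-of-the-envelope calculation, using standard approximate values for G, Earth’s mass and radius, and an ISS-like altitude of 400 km:

```python
import math

G = 6.674e-11          # gravitational constant, m^3 kg^-1 s^-2
M_EARTH = 5.972e24     # mass of the Earth, kg
R_EARTH = 6.371e6      # mean radius of the Earth, m

def orbital_velocity(altitude_m: float) -> float:
    """Circular orbital speed v = sqrt(G*M/r), with r measured from Earth's center."""
    r = R_EARTH + altitude_m
    return math.sqrt(G * M_EARTH / r)

# ISS altitude of ~400 km gives roughly 7.7 km/s, matching the commonly quoted figure.
print(f"{orbital_velocity(400e3) / 1000:.2f} km/s")
```

It also confirms the claim that a higher altitude means a lower orbital speed, since r sits in the denominator under the square root.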
As to the purpose of the current iteration of GPT, I think what I saw in a YouTube video is that it’s going to be used as the base engine on which people can build modules trained on different data sets for various fields. The chat function is just a way to take natural-language input from a human and return something that seems like the right answer, in a form a human can understand and accept. The majority of the training of this AI went into getting the language processing down, which for the most part it does perfectly. I was able to ask it a series of somewhat nonsensical questions, sometimes switching languages multiple times within the prompt, and it seemed to understand them perfectly, in any sense that they could be understood (though I couldn’t really critique its use of French, which seemed reasonable).
It has a major problem answering questions whose answers lie outside its training data; it can only make a best guess based on what is in that data. Future AIs trained on more specific data sets will solve the issue of the AI simply not knowing the answer. The bigger problem I see is that this AI fails to recognize when it doesn’t know the answer, which could be pretty bad once it’s trained on more technical information and someone relies on its answer.
The AI does need to be taken to another level in terms of being discriminating about the answers it gives when it doesn’t know the answer for sure. For Final Jeopardy, it makes sense to give an obviously wrong answer when you don’t have any other answer that’s possibly correct either, since you have to give something as an answer and there’s no cost to being wrong. Presumably, though, it wouldn’t chime in on a regular clue, because it doesn’t have enough faith in the best answer available to it. ChatGPT in its current state can give nuanced answers on difficult topics, often including the advice to “talk to an expert in the field”, but it quite often gives no disclaimer when presenting general-knowledge information that is just plain wrong.
I think it also needs to generate its responses multiple times, compare them to each other, and look for where they disagree. Even on purely factual questions, like the track list of an album, it can sometimes give two different answers one after the other. If the goal is to mimic how humans sometimes make mistakes, that’s fine, but it definitely needs more work if it’s going to be used to provide general-purpose information reliably.
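A crude sketch of that kind of self-check, where generate() is a hypothetical stand-in for one sampled response from the model (here it just returns canned answers so the example runs):

```python
import random
from collections import Counter

def generate(prompt: str) -> str:
    """Hypothetical stand-in for one sampled model response."""
    return random.choice(["1999 is prime", "1999 is prime", "1999 is composite"])

def self_consistent_answer(prompt: str, n_samples: int = 5) -> str | None:
    """Sample the model several times and only return an answer that a clear
    majority of the samples agree on; otherwise admit uncertainty."""
    answers = Counter(generate(prompt) for _ in range(n_samples))
    best, count = answers.most_common(1)[0]
    return best if count > n_samples // 2 else None

print(self_consistent_answer("Is 1999 prime?"))
```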
However, when you introduce terms to it, it assumes that the terms have some meaning. I came up with a number of two-word phrases that have no established meaning but contain one or two words that likely reference a specific topic:
Carmichael Variables → it tells me about Carmichael numbers, so that’s at least close.
Stochastic Postulates → it tells me about the axiomatic development of probability, which is close, but acts as though “stochastic postulates” is a term that’s used, when it’s not.
Quantum discriminant → It identifies this as what is actually the overlap of two quantum states. I didn’t find a single source on Google that suggests the term is used this way (or means anything else for that matter).
Goat Wax → it tells me that this is a wax that goats secrete. This type of wax is apparently so secret that Google has never heard of it, but this AI was able to tell me how it was produced and what its applications were.
Two-word phrases that are far enough away from any actual meaning do stump it (correctly):
Hensley’s ramifications
Navin’s Prides
Glowacks’s lemma
planter hearsay
Busy Tide
Stanford Pilgrim
Here’s something I hadn’t expected it to answer, but it did, producing utter nonsense in perfectly grammatical, plausible-sounding English. But what kind of world would we live in if this principle were true?
The inchoate loan doctrine is a legal principle that applies to debt contracts. It holds that if a person agrees to lend money to another person, but the loan is not completed because the borrower fails to fulfill their end of the agreement, the lender can still recover the amount of the loan from the borrower.
For example, if Person A agrees to lend $1000 to Person B, but Person B fails to pay back the loan as agreed, Person A can still sue Person B for the full amount of the loan, even if the loan was never actually paid out. The inchoate loan doctrine allows lenders to recover the value of the loan that was promised, even if the loan was never actually made. This doctrine is based on the idea that a debt contract is a binding legal agreement, and that both parties are obligated to fulfill their obligations under the contract.
There is such a thing as the inchoate lien doctrine, but it’s not even close to the same thing, though it also involves debt.
To be fair, unconscious ignorance is also a problem with humans.
I’m very curious what the future will hold for subscription libraries that train these NLP models on their content. I suspect (without any expertise myself) that internal checks could mitigate a lot of this. Something like the following (see the sketch after this list):
- All output must include citations from the data set.
- All output must be checked against those citations.
- Output which does not match citations must be rejected and regenerated.
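A very rough sketch of what that loop might look like; generate_with_citations and fetch_passage are hypothetical stand-ins for whatever tooling such a library would actually provide, and the word-overlap check is far too naive for real use:

```python
def generate_with_citations(prompt: str) -> tuple[str, list[str]]:
    """Hypothetical: the model returns an answer plus the passage IDs it cites."""
    return "2017 is a prime number.", ["math-handbook-p.412"]

def fetch_passage(passage_id: str) -> str:
    """Hypothetical: look the cited passage up in the licensed data set."""
    return "2017 is a prime number; its neighbor 2016 is highly composite."

def supported(answer: str, passage: str) -> bool:
    """Crude word-overlap check; a real system would need something far stronger."""
    words = set(answer.lower().split())
    return len(words & set(passage.lower().split())) / max(len(words), 1) > 0.5

def answer_with_checks(prompt: str, max_tries: int = 3) -> str:
    for _ in range(max_tries):
        answer, citations = generate_with_citations(prompt)
        if citations and all(supported(answer, fetch_passage(c)) for c in citations):
            return answer                      # every citation actually backs the text
    return "No sufficiently supported answer found."

print(answer_with_checks("Tell me about the number 2017."))
```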
Current models don’t check their work. They just predict the most likely outcomes token by token and then show you what popped out.
Even then, there’s nothing stopping these models from drawing incorrect conclusions or spitting out verbal nonsense. Asking any predictive model to generate conclusions is probably a very stupid idea to begin with, since those conclusions are just going to be based on what “most people on the internet” think.
Microsoft is apparently in talks to give them $10 billion for a 49% stake in OpenAI.
Penny Arcade’s take. [warning: profanity]
That’s not how Large Language Models work. It’s not doing ‘lookups’ into source data. Once the model is trained, there is no source, just a large neural net. ChatGPT simply uses its model to start printing words. After each word, the AI decides what the next best word should be. It is very much like a human who has read a ton of books: you can quote facts and understand things based on knowledge gleaned from many sources, but almost none of it is necessarily directly linked to specific material.
If you know that the U.S. won WWII, can you cite your source for that? Probably not, since you probably learned it very young and heard it in many places. ChatGPT is pulling its words from a neural net trained on terabytes of data, but it’s not just repeating stuff from its sources. What it tells you may be the result of a thousand different books that it ingested.
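That word-by-word process looks roughly like the loop below. The toy lookup table stands in for the entire trained network (and, unlike a real model, conditions only on the last token rather than the whole context), but the shape of the loop is the point: it just keeps sampling a next token, with no lookup into source documents anywhere.

```python
import random

# A toy "model": given the last token, probabilities for the next one.
TOY_MODEL = {
    "<start>":   {"The": 0.9, "A": 0.1},
    "The":       {"satellite": 0.7, "answer": 0.3},
    "A":         {"satellite": 1.0},
    "satellite": {"orbits": 1.0},
    "answer":    {"is": 1.0},
    "orbits":    {"<end>": 1.0},
    "is":        {"<end>": 1.0},
}

def generate_reply(max_tokens: int = 20) -> list[str]:
    tokens = ["<start>"]
    for _ in range(max_tokens):
        probs = TOY_MODEL[tokens[-1]]
        # Sample the next token in proportion to its probability.
        choices, weights = zip(*probs.items())
        token = random.choices(choices, weights=weights, k=1)[0]
        if token == "<end>":
            break
        tokens.append(token)
    return tokens[1:]

print(" ".join(generate_reply()))
```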
As I understand it, no one really knows how it works in detail or what capabilities it might have. Much of it is emergent with scale. It’s not predictable, and no one really understands what happens when you scale the model. New abilities appear at certain scales, but some abilities do not get better, and some may get worse.
For example, ChatGPT knows a huge number of programming languages, even some obscure ones, and also human languages. But for some reason it does not understand Morse code, and it’s not clear why; it’s a lot simpler than some other things it knows. Maybe GPT-4 will have the scale that causes Morse code to be understood, but maybe not.
In short, it’s all quite fuzzy. My guess as to what will be needed to make GPT more accurate is a second AI model, trained on the human responses to ChatGPT’s output. That’s probably the reason for the huge open beta. Perhaps the next version will compile its ‘thoughts’, but before spitting them out, check them against its own ‘judgment’ model to see whether it thinks they are wrong.
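That draft-then-judge idea might look something like this in outline; draft_model and judgment_model are purely hypothetical components with canned behavior so the sketch runs:

```python
import random

def draft_model(prompt: str) -> str:
    """Hypothetical: the main language model produces a candidate reply."""
    return random.choice(["2017 is prime.", "2017 is the 15th Bell number."])

def judgment_model(prompt: str, reply: str) -> float:
    """Hypothetical: a second model, trained on human feedback, scores the reply."""
    return 0.95 if reply == "2017 is prime." else 0.2

def answer(prompt: str, threshold: float = 0.8, max_tries: int = 3) -> str:
    for _ in range(max_tries):
        reply = draft_model(prompt)
        if judgment_model(prompt, reply) >= threshold:
            return reply                     # judged likely to be acceptable
    return "I'm not confident enough to answer that."

print(answer("Tell me about 2017."))
```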
It’s just likely that Morse code wasn’t in the training set.