Wut??? I’m talking here about skills in solving logic problems, such as the types of questions posed in IQ tests to assess basic intelligence. You’re focused on coding skills, a niche specialty only tenuously related to general intelligence. Maybe I read too much into the statement “It lacks the critical thinking skills of a human and can only address problems it has previously encountered” if you meant it specifically with regard to coding skills. But my point was that it’s definitely not true in the general case, and, again, coding skills are just a niche specialty in the overall realm of intelligence. In that context, the claims that GPT “lacks the critical thinking skills of a human” and “can only address problems it has previously encountered” have repeatedly been shown to be just flat-out wrong.
I can dig up some of the specific examples I was alluding to earlier if you like, but here are a couple of things I posted before* that may be even more persuasive.
To those who claim that ChatGPT and its ilk don’t actually “understand” anything and are therefore useless, my challenge is to explain how, without understanding anything, GPT has so far achieved the following (and much, much more; this is a cut-and-paste from something I posted earlier):
- It solves logic problems, including problems explicitly designed to test intelligence, as discussed in the long thread in CS.
- GPT-4 scored in the 90th percentile on the Uniform Bar Exam.
- It aced all sections of the SAT, which among other things tests reading comprehension, math, and logic skills, and it scored far higher across the board than the average human.
- It did acceptably well on the GRE (Graduate Record Examinations), particularly on the verbal and quantitative sections.
- It earned a nearly perfect score on the USA Biology Olympiad Semifinal Exam, a prestigious national science competition.
- It easily passed the Advanced Placement (AP) examinations.
- It passed the Wharton MBA exam on operations management, which requires the student to make operational decisions from an analysis of business case studies.
- On the US Medical Licensing Examination (USMLE), which medical school graduates take prior to starting their residency, GPT-4’s performance was described as “at or near the passing threshold for all three exams without any specialized training or reinforcement. Additionally, ChatGPT demonstrated a high level of concordance and insight in its explanations.”
The converse question that might be posed by its detractors is: if GPT is so smart, how come it makes some really stupid mistakes, including sometimes failing to understand a very simple concept that even a child would grasp? The answer, in my view, is simply that it’s not human. We all have cognitive shortcomings and limitations, and we all sometimes misunderstand a question or problem statement; because an AI’s cognitive model is different, its shortcomings will be different. I strenuously object to the view that because GPT failed to properly understand or solve some problem that seems trivially simple to us, it therefore doesn’t really “understand” anything at all. The fact that it can generally score higher than the vast majority of humans on tests explicitly designed to evaluate knowledge and intelligence seems to me to totally demolish that line of argument, which some philosophers have been harping on ever since Hubert Dreyfus claimed that no computer would ever be able to play chess beyond a child’s beginner level.
And to those who claim that GPT could only do this well because all the questions and answers were in its database: no, they were not, unless OpenAI is blatantly lying to us. Again, from one of my previous posts:
OpenAI made a point of the fact that GPT had never been exposed to any of the tests used in its performance scoring. That it may have seen similar tests is irrelevant; it boggles the mind how a putative “token prediction” machine could use that information to improve its performance on completely different questions. Humans can benefit from seeing prior test materials because doing so familiarizes them with the methodology and lets them practice and hone their skills, but OpenAI was very explicit that GPT received no such specialized training.
* Those posts were from a larger discussion in this thread.