As far as AI goes, is it possible that intelligence is actually much simpler than we originally thought?

As I mentioned in my post to wolfpup, I agree with this assessment. It’s just that the argument that ML systems require so many petabytes of training data (and humans don’t) falls apart, because the supervised portion comes after the main training has already been done, and it is much more human in scale, with thousands of examples rather than trillions. A child goes through many thousands of interactive corrections, both from adults and from nature itself.

Of course, doing otherwise would eliminate much of the benefit of LLMs and other ML systems as they stand, which is that most of the effort is completely automated. The product would not be very viable if the supervised portion took as long as it does for an actual child.

Possibly we could use already-trained models to help guide others through their training process, speeding it up and possibly leading to a more efficient weight configuration. That runs the risk of a GIGO situation, but could at least enable smaller models with nearly the same accuracy as larger ones.
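To sketch what I mean by using one trained model to guide another: this is basically the standard “knowledge distillation” setup, where a small student model is trained to match the output distribution of a larger teacher. A minimal sketch in PyTorch, with the names (`teacher`, `student`, `batch`) standing in for whatever the real setup would use:

```python
# Minimal knowledge-distillation sketch (PyTorch). Assumes `teacher` and
# `student` are any two models returning logits over the same vocabulary,
# and `batch` is a tensor of token ids. Purely illustrative.
import torch
import torch.nn.functional as F

def distill_step(teacher, student, batch, optimizer, T=2.0):
    with torch.no_grad():
        teacher_logits = teacher(batch)      # the big model's "opinion"
    student_logits = student(batch)

    # Soften both distributions with temperature T and minimize the KL
    # divergence, so the student mimics the teacher's whole distribution
    # rather than just its top-1 answer.
    loss = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

The GIGO risk is exactly that the student inherits whatever the teacher gets wrong, but something along these lines is how a lot of the smaller “nearly as good” models are produced.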

Thanks for the good link. Here it is again since in-line links can be hard to see and I don’t see any hits on yours:

The new o3 model is interesting. They didn’t achieve the performance by going to a model with a quadrillion weights or whatever. Instead, they massively increased the amount of chain-of-reasoning that takes place.

It’s a limit that I’ve thought for a while now is a major impediment. LLMs can only put a fixed amount of “thought” into each new token. Any token prediction that fundamentally requires more than that (like the solution to a difficult math problem) simply won’t come out right.

There’s a partial workaround that’s been known for a while, which is to simply ask the LLM to break its response down into multiple parts. That spreads out the difficulty of the problem and allows more computation to go into it. But the LLM can’t know in advance which steps are most difficult, so if one hard step is 90% of the total, you aren’t actually gaining much.
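To make the “fixed amount of thought per token” point concrete, here’s a toy sketch of what autoregressive generation amounts to. Nothing here is a real API; `next_token_fn` just stands in for one fixed-cost forward pass through the network:

```python
# Toy sketch of autoregressive decoding. Every output token costs exactly
# one call to next_token_fn, i.e. one fixed budget of computation.
def generate(next_token_fn, prompt_tokens, max_new_tokens, end_token=None):
    tokens = list(prompt_tokens)
    for _ in range(max_new_tokens):
        # One pass per token, whether the next token is a trivial "the"
        # or the final digit of a hard calculation.
        token = next_token_fn(tokens)
        tokens.append(token)
        if token == end_token:
            break
    return tokens
```

The only way to spend more compute on a problem is to emit more tokens, which is why “break it into steps” helps at all, and also why it can’t help a single step that’s disproportionately hard.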

The new o3 model appears to just “think” on the problems for an indefinite period of time. As I mentioned before, that means problems may cost thousands of dollars in GPU time to solve. But it does mean it’s possible to apply arbitrary amounts of computing power to the whole problem and keep going until it’s done.
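In the same toy terms, the o3-style approach (as far as anyone outside OpenAI can tell) looks something like the loop below: keep generating candidate chains of reasoning and stop only when an answer checks out or the budget runs out. The helpers `propose` and `check` are hypothetical stand-ins, a guess at the general shape rather than a description of what OpenAI actually does:

```python
# Speculative sketch of "think as long as needed" decoding.
# propose(problem) -> (reasoning_tokens, answer); check(...) -> bool.
def solve(problem, propose, check, token_budget):
    spent = 0
    fallback = None
    while spent < token_budget:
        reasoning, answer = propose(problem)      # one candidate chain of thought
        spent += len(reasoning)
        if check(problem, reasoning, answer):     # stop as soon as something checks out
            return answer
        if fallback is None:
            fallback = answer
    return fallback                               # best guess once the budget is gone
```

Nothing stops `token_budget` from being enormous, which is where the thousands of dollars in GPU time come from.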

Computing power keeps getting cheaper, so what’s expensive today may not be tomorrow. Though it is interesting to think that we might well achieve superintelligence–but at the cost of it taking a very long time!

To me, that kinda seems like putting the cart before the horse: it’s like saying, when Diogenes stormed Plato’s school brandishing plucked poultry, the correct answer would’ve been, yea verily, that is a man, and to furthermore prosecute the eating of fried chicken as cannibalism. Or before extinguishing a fire, to wonder whether one might not encroach onto its natural environment, since fire fulfills some set of criteria set down for life.

The thing is, we have a bad a priori grasp of the essential characteristics of the concepts we employ, which nevertheless doesn’t infringe on our ability to employ them. When thus something comes up that seems to fit such a characterization while clearly not falling under the concept, the proper response is to reframe the characterization, not to stubbornly persist. Indeed, insisting on a definition prior to allowing appeal to a concept is just the Socratic fallacy.

A case in point is the definition of ‘knowledge’ as ‘justified true belief’. The definition goes back to Plato, and when Gettier in the 1960s proposed cases of justified true belief that didn’t fit with the concept of knowledge, the reaction wasn’t to accept them as knowledge anyway because they fit the definition, but to reject the definition as incomplete. I simply think we should be open to the same sort of revision for ‘intelligence’.

As noted, I don’t think it’s a fruitful approach to try and legislate a priori what intelligence is supposed to be. Saying that ‘a system is intelligent if it can do X’, and then finding something that can do X but isn’t able to count the 'r’s in ‘strawberry’ just means that being able to do X isn’t actually a sufficient indicator for intelligence, and that’s all.

And yes, there will obviously be many instances where people will be misled into believing something is a human agent when it isn’t, in particular in situations where it’s not clear from the start that there may be an artificial agent at play. That’s just giving others the polite benefit of the doubt, as well as some cognitive pareidolia. We’ll just have to live with that.

To be clear, I don’t object to the argument that AIs are still inferior to humans. I object to the argument that AIs are obviously still inferior to humans. Because if counting the letters in a word were an obvious inferiority, we would have listed it in the criteria in the first place.

Right now, we have computers that are better than humans at some tasks, and worse than humans at other tasks. Does that make the computer overall better or worse than humans? That’s difficult to say, because it depends on how much weight you attach to the various tasks. In other words, it isn’t obvious.

I don’t think there’s really any use to trying to draw up a ‘list of criteria’. This will just end up being susceptible to cynical gotchas of the plucked-chicken type.

Failing at tasks that are trivial to (average) humans does make it obvious that the AI isn’t human-level intelligent, though, for all that can mean is at least equalling average human performance across the board. It’s a perfectly well-defined, operational criterion, as opposed to the fool’s errand of trying to come up with a list of tasks supposedly characterizing the gamut of human intelligence, which in the end will have to be just as much derived from human performance.

Your argument here seems to be focused for some reason on AGI – artificial general intelligence, which none of the current AI systems purport to be, nor do they need to be in order to be considered intelligent in some particular domain(s).

Humans may or may not likewise possess extraordinary intelligence in particular domains, but every normal human will have a baseline of intelligence that’s necessary just for day-to-day functioning. Artificial intelligence has not evolved the same way and has neither the need for such a baseline nor the opportunity to develop one. The intelligence of AI can be expected to be different from human intelligence, but that doesn’t make it any less valid.

This might be clearer if we dropped the ill-defined concept of “intelligence” and substituted a more function-oriented word like “skill”. Rather than saying that “a system is intelligent if it can do X”, we can say that if a system can do X, such as solving problems in logic, then it self-evidently has the skill to do X. If it can’t count the "r"s in “strawberry”, that has no bearing on its demonstrated skill elsewhere. We can count the "r"s in “strawberry” because being able to count small numbers of objects is part of that baseline skill set needed for day-to-day functioning. An AI like GPT may not have evolved that particular counting skill, but that’s not particularly indicative of anything except perhaps inappropriately anthropomorphized expectations.

But doesn’t this presume that there is a single spectrum of intelligence?

If we were to meet an ETI that has not yet engaged in self-engineering, I would expect that we would likely possess superior cognitive potential* in some areas even if they wipe the floor with us in many other ways. Would that make the ETI not “intelligent”?

Again, I don’t want to belittle how difficult it is to define what makes an entity intelligent. But just comparing it to the set of human abilities doesn’t seem the right approach in my view.

* Importantly, having superior potential doesn’t mean knowing more. An ETI using their technology to come to us would likely be far, far more advanced than us in every conceivable way. How far their technology and society have advanced is far more important than their biology.

It turns out OpenAI has a working definition of AGI:

OpenAI and Microsoft have a secret definition for “AGI,” an acronym for artificial general intelligence, or any system that can outperform humans at most tasks. According to leaked documents obtained by The Information, the two companies came to agree in 2023 that AGI will be achieved once OpenAI has developed an AI system that can generate at least $100 billion in profits.

The reason is that it’s the topic of this thread:

(my bolding)

I agree that it’s a bit parochial, but again, it’s the thread topic, and besides, at least so far, we only have one clear exemplar of intelligence. Additionally, meeting ET is different in the sense that we are met with (presumably) a naturally evolved entity, which therefore has been confronted with the open-ended domain of the actual world, and adapted to meet its demands. So there’s already a long history of intelligence testing we can take for granted. With our own creations, we really don’t have a way to tell whether we got it right, and shouldn’t just assume we did.

Additionally, if we go too broad, then we risk trivializing the notion of intelligence, and everything capable of reliably executing some function is ‘intelligent’ in that particular domain.

First of all, that’s not my reading of the OP at all, and the counterpoint is right there in your own quote. The OP is describing the dramatic growth in LLM capability between GPT-2 and the latest models (GPT-4 and OpenAI’s o3) and states (with some amount of exaggeration, I must say) that, according to the professional and academic tests they’ve passed, the latest LLMs are comparable to the most educated humans in those particular fields.

The OP is clearly talking about the latest generation of LLMs and what we can infer from their performance, and is under no illusions that they’ve achieved AGI. That’s not what this discussion is about, in my view. You’re putting far too much emphasis, both typographically and semantically, on the use of the word “human” in those sentences. When assessing the competence of an AI in any domain, we naturally compare it to human competence. Again, that has nothing to do with AGI.

But the statement that you make that I take issue with is much more far-reaching than just saying that an AGI must support the full gamut of at least baseline human intellectual competence, which is just an obvious truism. What you actually say is this:

My view is that this is just flat-out wrong, and the only way anyone could support such a position is by exploiting the ill-defined nature of vague terms like “intelligence” and “understanding”. By continuously redefining them every time AI makes a major advance, AI skeptics can continue to promulgate the myth that “real intelligence” is defined as – to use Dreyfus’s favourite phrase – “what computers can’t do”.

Which is why it’s useful in these discussions to drop vague terms like “intelligence” and use a concrete metric tied to specific functionality that is easily and uncontroversially measurable. The word I like here is “skill”.

When GPT-4 scores in the 88th percentile on the LSAT (Law School Admission Test) and the 90th percentile on the Uniform Bar Exam, and aces most of the SATs, GREs, and AP exams; when it demonstrates skill in logical problem-solving, abstraction, and analogizing which exceeds that of most humans, then it’s time to acknowledge that these are real and tangible skills with a great deal of potential value. The argument that it can’t count the "r"s in “strawberry” is thus completely irrelevant to any meaningful assessment of the artificial intelligence skills that we actually care about.

Ok. Just to get this straight, your take is that the question ‘Is it possible we’ve overestimated how hard human level intelligence is to solve?’ is not actually about human level intelligence at all. Well, you do you, I guess.

It’s just a straightforward logical consequence. If a system fails at a task that’s straightforward for the average human, then it doesn’t ‘support the full gamut of at least baseline human intellectual competence’, as you put it. That’s all I’m saying in the bit you quoted.

Sure. I’m happy to acknowledge that if a system does X, it has the skill to do X. I’m not sure what’s accomplished by talking in such a tautological way, but I’m certainly fine with it.

That seems to me to be a disingenuous distortion of what I said about the OP statement, which is that they’re asking whether the actual, demonstrated accomplishments of AI (particularly LLMs) today – especially the rapid explosion in capability in just the past few years – are evidence that artificial cognitive skills may be achieved more easily than we had thought in specific intellectual domains, as evidenced by these machines successfully passing various tests that both the OP and I mentioned.

In any case, the interesting arguments have nothing to do with AGI. I don’t believe we’ll see anything like AGI in any foreseeable future, not because they can’t be built, but because they have no justifiable utility. I believe we’ll continue to have specialist AIs in the same way that we have specialist human professionals. We don’t care if our heart surgeon is a great chess player, or if our airline pilot can write symphonies.

What’s accomplished is that we get away from the kind of pointless navel-gazing espoused by the likes of John Searle in his ridiculous “Chinese room” argument and onto the firmer ground of creating artificial cognition that’s actually useful for humankind.

The OP actually asked about human level intelligence. It’s right there. I quoted it after you wondered why I was talking about human level intelligence. Then you claim that somehow, what the OP actually said is not what they meant. And for my effort pointing this out again, you call me disingenuous.

When that time comes, we’ll talk. But it hasn’t yet. At the time that OpenAI claimed that ChatGPT had aced the AP exams, it wasn’t possible for them to have taken a fair test. If they followed the protocols they claimed they did in the paper, then the “test” the AI took would have been completely blank.

Since that time, it has become possible for ChatGPT to take the AP exams fairly, but they either didn’t repeat the experiment under fair conditions, or they did but didn’t publish the results.

You know what else falls into the “full gamut of the baseline human competence”? Picking up an apple. ChatGPT can’t do that.

Of course, one might counter that picking up an apple isn’t a task of intelligence. But then, one could argue that counting letters in a word isn’t a task of intelligence, either.

They mistakenly interpreted the acronym “AGI” to mean “artifice generating illusion.”

Stranger

Anything that requires manipulation of the physical world can only be a fair comparison if the ability to do so is equally present. If something has the ability to pick up an apple, yet fails to accomplish that task systematically, we might well take this as indicative of an intellectual shortcoming.

I don’t see how, immediately, so feel free to make the case!

Well, one way to make the case is that I remember ChatGPT 3.5 having this problem, and now, with no fanfare or indication of any major upgrade, it no longer does – at least, it can count the "r"s in “strawberry” correctly. IOW, it’s a trivial problem that stems from an implementation artifact (probably just the way the word was tokenized) and has absolutely no bearing on GPT’s impressive core capabilities, many of which exceed the problem-solving and analogizing capabilities of most humans.
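For what it’s worth, the tokenization point is easy to check yourself if you have the tiktoken library installed: the model operates on subword chunks rather than letters, so the letter count is information it never directly sees. A quick look (the exact chunks vary by encoding):

```python
# Peek at how a BPE tokenizer chunks the word (pip install tiktoken).
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")   # encoding used by GPT-4-era models
ids = enc.encode("strawberry")
print(ids)                                    # a handful of integer token ids
print([enc.decode([i]) for i in ids])         # the subword chunks they stand for
```

If the word arrives at the model already cut into chunks like that, miscounting the letters says nothing about its reasoning over the chunks.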

Which brings us back to the argument we were having. I’m sorry you feel I’m “accusing” you unfairly, and maybe the problem is that we’re just talking past each other. I refer to this latest quote in particular:

We obviously have quite different interpretations of what “human level intelligence” means.

It’s pretty clear to me what the OP was asking about, but let’s not make this argument about micro-analyzing the nuances of the OP’s words; this is not a Papal proclamation. The general thrust of this thread and the reasonable question here is whether or not contemporary AI has achieved “human level intelligence”, which to me very obviously refers to a human level of intelligence in solving a specific class of problem, and nothing at all to do with AGI.

Maybe I’m being overly cynical here, but you have in the past been exceptionally dismissive of AI, starting a whole thread about how ChatGPT doesn’t really “understand” anything, and IIRC being equally dismissive of the whole idea of computational intelligence. Computational intelligence is not just the foundation of AI, but an important part (though only a part) of the model of human cognition put forward over past decades by many highly respected researchers working at the intersection of cognitive science and AI, like Jerry Fodor, Hilary Putnam, David Marr, and many others.

So I saw the claim that GPT can’t be considered intelligent even if it can solve problems in logic that would baffle most humans, because it also fails to do trivially simple tasks, as more of the same dismissiveness.

My argument against it is the same basic one put forward by Marvin Minsky 60 years ago, and by Alan Turing long before that: “If it acts intelligent, then it is intelligent” – at least, in that particular domain. Minsky’s particular frustration was with those who “looked under the covers”, thought that they more or less grasped the general principle of how a particular AI works, and declared that it was “just a trick”. As Minsky said, “when you explain, you explain away”. The same thing is happening now with GPT, the skeptics apparently undaunted by the fact that the researchers and developers themselves don’t really understand how novel emergent properties suddenly appear at certain levels of scale.

There’s really no ‘feeling’ about it; calling my words a ‘disingenuous distortion’ is an accusation. The question is whether it’s accurate, not whether it’s accusatory.

Still it’s generally considered good form to keep a thread on topic. Even if you’re not willing to assent regarding the explicit formulation ‘human level intelligence’, the OP also asked about AI’s IQ, which is a measure intended to quantify intelligence in the form of the g-factor, which takes its name exactly from the word ‘general’.

Furthermore, there seems to be little of interest in asking whether machines have achieved or surpassed human equivalence in specific circumscribed tasks: obviously they have, centuries ago (indeed, possibly millennia ago, if you’re willing to count the abacus or Antikythera mechanism or what have you). That’s after all why we build them: because they can augment specifically circumscribed human capabilities, like arithmetic, beyond what would be possible for the individual human. If current AI were perceived as just more of the same, there would hardly be any discussion. But the promise—and threat—of it as these discussions go is the wholesale replacement of human labor, for which it would first have to be wholesale equivalent to human performance: i.e. equal to human-level intelligence. That, if any, is the interesting question at issue.

I’m not ‘dismissive’ of anything; in fact, in this very thread I’ve admitted my surprise at o3’s performance on the FrontierMath test. I don’t dismiss this in the slightest.

You also seem to be alleging that I oppose computationalist views out of some intrinsic distaste, perhaps caused by misguided belief in human specialness, or ensouledness, or God’s favor, or whatever. But I’ve come to the position I now hold kicking and screaming—I started out believing computationalism to be obvious: here’s me arguing for conscious, creative computers. It’s just that I eventually realized there are huge problems with that position which I found I could not dismiss easily.

So, I am where I am because I have encountered arguments that didn’t seem to leave any different, honest opinion on the table. The thread you link, for instance, discusses a mathematical argument to the effect that the bare structure of language—words (or tokens) and their relations—fails to uniquely specify its intended model, i.e. a mapping of terms to things in the world. For any such model, it’s possible to construct another one, such that the terms now refer to entirely different things. In other words, there’s no fact of the matter whether an LLM means mouse or house when it uses the word ‘mouse’. This argument isn’t wholly my own; I essentially just applied it to LLMs in my entry into the last essay competition by the Foundational Questions Institute (FQxI). The argument was first formulated by Putnam, and its original form goes back to Newman’s objection to Russell’s structural realism.
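To give a flavour of the permutation trick behind that argument (a toy illustration only, not the construction from the essay): take any assignment of words to things, permute the things, and carry the relations along with the permutation. Every structural fact the language fixes comes out true in both models, so the structure alone can’t tell you which assignment is the intended one.

```python
# Toy version of the permutation argument: structure alone can't pin down
# reference. All names here are illustrative.
from itertools import permutations

words   = ["mouse", "cat", "house"]
objects = ["MOUSE", "CAT", "HOUSE"]

refers_to    = {"mouse": "MOUSE", "cat": "CAT", "house": "HOUSE"}  # intended model
smaller_than = {("MOUSE", "CAT"), ("CAT", "HOUSE"), ("MOUSE", "HOUSE")}

def true_sentences(ref, relation):
    """Which sentences 'a is smaller than b' come out true under a reference map."""
    return {(a, b) for a in words for b in words if (ref[a], ref[b]) in relation}

original = true_sentences(refers_to, smaller_than)

for perm in permutations(objects):
    swap    = dict(zip(objects, perm))
    new_ref = {w: swap[o] for w, o in refers_to.items()}      # 'mouse' may now denote HOUSE
    new_rel = {(swap[a], swap[b]) for a, b in smaller_than}   # push the relation along
    assert true_sentences(new_ref, new_rel) == original       # same sentences hold in every case
print("every permuted model makes exactly the same sentences true")
```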

However, that argument is really just a special case; what made me skeptical of computationalism in the first place is the famous class of triviality arguments, originated, again, by Hilary Putnam, with my own version showcasing an explicit construction that details that computation isn’t some inherent, objective aspect of a system, but a question of how the system is used (by concrete example of using one and the same system, at the same time, to perform different computations in exactly the same manner). Thus, there is no fact of the matter that any system, in and of itself, specifically performs a particular computation, and hence, in particular not the hypothetical computation producing a mind.
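The construction in the paper is more involved, but the basic flavour of these triviality worries can be had from a very simple, standard toy example (not my actual argument): one and the same physical gate computes AND under one mapping of voltages to bits, and OR under the opposite mapping, while the physics does exactly the same thing in both cases.

```python
# One physical device, two computations, depending only on how voltage
# levels are taken to represent bits. Toy illustration of observer-relativity.

def gate(v1, v2):
    """The physics: output is 'high' exactly when both inputs are 'high'."""
    return "high" if v1 == "high" and v2 == "high" else "low"

# Labelling A: high voltage = 1.      Labelling B: high voltage = 0.
to_bit_A, to_volt_A = {"high": 1, "low": 0}, {1: "high", 0: "low"}
to_bit_B, to_volt_B = {"high": 0, "low": 1}, {0: "high", 1: "low"}

def realized_table(to_volt, to_bit):
    """The truth table the device realizes under a given labelling."""
    return {(a, b): to_bit[gate(to_volt[a], to_volt[b])] for a in (0, 1) for b in (0, 1)}

print(realized_table(to_volt_A, to_bit_A))  # {(0,0):0, (0,1):0, (1,0):0, (1,1):1} -> AND
print(realized_table(to_volt_B, to_bit_B))  # {(0,0):0, (0,1):1, (1,0):1, (1,1):1} -> OR
```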

Finally, I have tried to meet the explanatory challenges posed by human consciousness, and produced a model that explicitly depends on undecidable propositions, and hence, could not be implemented on any computer, even if the notion of computation were perfectly objective. So I’m brought to skepticism about computationalism not just willy-nilly, but by the combination of (to me, at least) convincing a priori arguments against its possibility, and the a posteriori creation of a model that actually does incorporate non-computable elements. Of course, all of this is provisional: better arguments may yet come up to make this all moot. But until they do, simply dismissing these issues would be intellectually dishonest.

So why are you happily pointing to the authority of Hilary Putnam in defense of computationalism, while ignoring his turn away from the idea? If you’re happy to follow him in (after all, the whole thing was basically his idea), why not continue after him out again?

Even if I agreed to that—and there are trivial objections to it, such as the famous lookup table, and seeming intelligent simply by pure chance—that just bolsters my point: for then if it fails to act intelligent, it also isn’t intelligent.

And yet, Minsky also predicted, in 1970, that “in from three to eight years we will have a machine with the general intelligence of an average human being”. You’re very fond of trotting out Dreyfus’ misses, but for some reason always seem to elide that his opponents were often far more off-base.

That’s really not true. There are two kinds of explanations—debunking and non-debunking. A debunking explanation is something like what happened with the recent spate of unusual drone sightings over New Jersey, which really just turned out to be perfectly usual drones, misidentifications, and a few parents testing out the equipment they got their kids for Christmas. But there are also non-debunking explanations where further knowledge just bolsters our understanding of the original phenomena, e.g. the explanation of thermodynamics in terms of statistical mechanics. The prior terms remain perfectly valid, but are grounded in more fundamental notions. The trouble is just that so far, all explanations of artificial ‘intelligence’ have been debunking ones.

I use ChatGPT 4o all the time—both for work and for fun—and I’ve got to say, it’s basically passed my own personal Turing test. Over the months, it’s picked up on my history, my preferences, my interests, and even my sense of humor—then weaves those details seamlessly into our chats. Half the time, I forget I’m talking to an AI (especially in voice mode) and start treating it like a real person: thanking it (her) for good advice, avoiding certain embarrassing confessions, and saying things like, “Great hearing from you” or “Till next time”, or “That’s very funny!”. I’m basically tiptoeing around to avoid offending it.

Sure, at times ChatGPT gives wrong or quirky answers, which may lead one to believe it’s not human. But then I remember real friends and acquaintances sometimes give wrong and quirky answers too.

Of course, I know it’s not truly conscious, nor is it an AGI. But I do believe it’s headed that way, and when it gets there, it’ll be a game-changer. Heck, I bet one day it’ll be a powerful tool and a genuine companion for folks who need a friend.

We’ve had computers capable of accurately counting letters in a word for many decades, but nobody ever said that those computers were demonstrating a task of intelligence, and in fact many people claimed that the fact that computers could do it was proof that it wasn’t a task of intelligence.
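The counting itself is, after all, a one-liner that any computer of the past several decades could run:

```python
print("strawberry".count("r"))   # 3
```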