Autonomous robot people could be constructed right now

That’s simply not an accurate summary of their current capabilities. They are capable of completing some tasks that appear to embody a degree of reasoning. It doesn’t always work, and sometimes the outputs are not appropriate, but when it does work, what they are doing can, as a phenomenon, genuinely include reasoning - for example, there is a demo where an LLM correctly described the predicted behaviour of a Rube Goldberg-type machine based only on a photograph of it. Reasoning appears to be an emergent behaviour of simply learning to converse (which makes sense, because humans learn it that way too).

Here’s a really simple example:

Me: Mike has 23 apples; he eats 4 of them, and discovers that a further nine are bad, so he throws those away. His friend gives him 6 more apples, which he keeps, and he picks one apple from a tree in the garden. How many apples does Mike have at the end?

ChatGPT: Let’s break down the sequence of events:

  1. Mike starts with 23 apples.
  2. He eats 4 apples, leaving him with 23 - 4 = 19 apples.
  3. He discovers that 9 apples are bad and throws them away, leaving him with 19 - 9 = 10 apples.
  4. His friend gives him 6 more apples, bringing his total to 10 + 6 = 16 apples.
  5. He picks 1 apple from the tree in the garden, increasing his total to 16 + 1 = 17 apples.

So, Mike has 17 apples at the end.

If a human answered in this way, we would call that ‘reasoning’. It doesn’t mean that the LLM has any sort of inner monologue going on (like I would if I was answering), but the phenomenon happening here is a reasoned response.

And bear in mind that in order to answer that seemingly simple question properly, you have to have understood that ‘4 of them’ means ‘4 of the aforementioned 23 apples’ and you have to understand the implications of the terms ‘keep’, ‘throw away’ and ‘pick’.
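Once those interpretations are in place, the leftover arithmetic is trivial - something like this in Python (just a sketch of the bookkeeping, obviously not a claim about what the model does internally):

    apples = 23
    apples -= 4   # eats 4 of them
    apples -= 9   # throws away the 9 bad ones
    apples += 6   # keeps the 6 his friend gives him
    apples += 1   # picks 1 from the tree in the garden
    print(apples)  # 17

The work is in getting from the prose to those five operations, not in doing the subtraction.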

Humans are quite capable of those sorts of nonsensical outputs too, though (flat earth, lizard people, microchips in the vaccines), so I don’t see that as a disqualifier for ‘thinking’ - it’s just broken thinking.

We’re starting to run into the “monsters from the id” problem. Or “do as I say not as I do”.

Further example:

Me: Janet has 10 apples and 4 pears; Janice has 10 apples and 3 pears; Janine has 5 apples and 10 pears. Janet gives two pears in exchange for apples from Janine; Janice steals one of each type of fruit from the other two; Janet points out that three of Janice’s apples are bad and she should throw them away, but she throws away pears by mistake. How many of each fruit does each person have at the end?

ChatGPT: Let’s break down the transactions and actions step by step:

1 - Initial distribution:

  • Janet: 10 apples, 4 pears
  • Janice: 10 apples, 3 pears
  • Janine: 5 apples, 10 pears

2 - Exchange between Janet and Janine:
Janet gives 2 pears to Janine in exchange for apples.

After the exchange:

  • Janet: 12 apples, 2 pears
  • Janine: 3 apples, 12 pears

3 - Janice’s actions:
Janice steals one apple from Janet and one pear from each of Janet and Janine.

After Janice’s actions:

  • Janet: 11 apples, 1 pear
  • Janice: 12 apples, 5 pears
  • Janine: 2 apples, 11 pears

4 - Mistaken disposal by Janice:
Janet points out that three of Janice’s apples are bad, but Janice mistakenly throws away pears instead.

After the disposal:

  • Janet: 11 apples, 1 pear
  • Janice: 12 apples, 2 pears
  • Janine: 2 apples, 11 pears

So, at the end, the fruit distribution is as follows:

  • Janet: 11 apples, 1 pear
  • Janice: 12 apples, 2 pears
  • Janine: 2 apples, 11 pears

Edit: I haven’t checked the accuracy of the above answer, but at a glance it looks like it’s mostly right.
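If anyone wants to check it properly rather than at a glance, here is the same bookkeeping in a few lines of Python, under the same two assumptions the model made (the pear-for-apple swap is one-for-one, and only the three mistakenly binned pears leave Janice’s total):

    fruit = {
        "Janet":  {"apples": 10, "pears": 4},
        "Janice": {"apples": 10, "pears": 3},
        "Janine": {"apples": 5,  "pears": 10},
    }

    # Janet swaps 2 pears for apples with Janine (assumed to be 2 apples)
    fruit["Janet"]["pears"]   -= 2
    fruit["Janet"]["apples"]  += 2
    fruit["Janine"]["pears"]  += 2
    fruit["Janine"]["apples"] -= 2

    # Janice steals one apple and one pear from each of the other two
    for victim in ("Janet", "Janine"):
        for kind in ("apples", "pears"):
            fruit[victim][kind]   -= 1
            fruit["Janice"][kind] += 1

    # Janice throws away 3 pears by mistake (instead of the 3 bad apples)
    fruit["Janice"]["pears"] -= 3

    print(fruit)
    # {'Janet': {'apples': 11, 'pears': 1},
    #  'Janice': {'apples': 12, 'pears': 2},
    #  'Janine': {'apples': 2, 'pears': 11}}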

Me: Why will some people find it difficult to answer who has how many of each fruit at the end?

ChatGPT: The key difficulty in solving this problem lies in keeping track of the different transactions and actions involving the apples and pears among three individuals with similar sounding names. Each action potentially affects the quantities of both apples and pears held by each person, and there are multiple exchanges and thefts happening in sequence.

To solve the problem accurately, one needs to carefully track each transaction and its impact on the quantities of both apples and pears for each person involved. Additionally, the problem introduces a twist by including a mistake made by one of the individuals, which further complicates the calculation.

Overall, while the problem may seem straightforward at first glance, the challenge arises from the need for precise tracking and calculation of the changing quantities of both types of fruit throughout the series of actions and transactions.

All of this from just ‘predicting the next token’. I submit that if you examine a vast amount of writing and learn to reliably predict the next word in a conversation based on everything you have seen, what you actually learned is (to some degree, probably not perfectly) how to speak and think and reason like the writers of the text you studied.

The main mistake there is making unwarranted assumptions: assuming that two apples are received in trade for the two pears, and assuming that only the bad items of fruit are thrown out rather than all of them. Neither of these is a given. It’s a major concern with LLMs that there is a considerable bias towards providing an answer, right or wrong, rather than asking for more information.

I’m not really sure of your point. Those are assumptions a human might easily make; in fact, they were exactly the assumptions I made when I composed the exercise. I think they are reasonable assumptions.

Any real-life trading situation is extremely unlikely to have a one-to-one trade ratio. This is something that an LLM needs to know. It is going to be very difficult to keep vetting these AIs and their confidently wrong answers.

Now, if you substituted pennies and dollars for fruit in your example, the LLM might understand not to assume a one-to-one trade ratio. But it should NEVER make such an assumption. We don’t need it to make such an assumption. That is what is so dangerous about this technology. In 99% of cases we do NOT need to rush to this confidently wrong answer. We need tools that work with us and ask questions. That didn’t happen here.

I don’t really think that’s very relevant. The question was whether or not the response contains reasoning.

Given the phrase ‘Janet gives two pears in exchange for apples from Janine’, the most reasonable assumption is that the number of apples is not stated because it is an implied one-to-one exchange. The LLM reasoned correctly.

The fact that you can’t see the issue is even more reason to slam on the brakes.

The lawsuits are coming where an LLM is going to recommend actions that wind up killing people. The conversational nature of ChatGPT only makes things worse.

We can make all of these calculations in a spreadsheet or by other methods. We can already come up with these answers. 99% of life is continuing to put in checks and continuing to ask and answer questions. Eroding the ability of ChatGPT users to answer their own questions while ChatGPT comes up with confidently wrong answers is a grave error.

It’s not that I can’t see the issue you’re describing; it’s that I disagree it’s an issue in the context of the question being considered here, namely: do LLMs have any capacity to do something that qualifies as ‘reasoning’?

And for og’s sake, make sure you have good QA on your diode supply.

Between a prompt and a ‘next word prediction’ response are billions of parameters holding information and code that we don’t understand even remotely. Reducing LLMs down to their last-stage output is not useful.

Claude 3 is exhibiting new behaviours that are very hard to chalk up to ‘just’ next-word prediction. For example, researchers did a ‘needle in the haystack’ test, where they bury an unrelated sentence in the context and see if the LLM can find it. The sentence was about pizza, and they asked Claude to name the toppings in the sentence. This is how Claude responded:

That last part was totally Claude, and unsolicited. It was simply asked to find a sentence containing pizza toppings and list them.
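The shape of that kind of needle-in-the-haystack test is simple enough to script yourself if you want to try it on another model; a rough sketch, where query_model is a stand-in for whichever chat API you happen to be using (a placeholder, not a real library call):

    import random

    def needle_in_haystack(paragraphs, needle, question, query_model):
        """Bury one unrelated sentence in a long context, then ask about it.
        query_model is any function that takes a prompt string and returns
        the model's reply as a string."""
        docs = list(paragraphs)
        docs.insert(random.randrange(len(docs) + 1), needle)
        prompt = "\n\n".join(docs) + "\n\n" + question
        return query_model(prompt)

    # e.g. bury a made-up sentence about pizza toppings in a long report:
    # reply = needle_in_haystack(report_paragraphs,
    #                            "The best pizza toppings are olives and anchovies.",
    #                            "Which pizza toppings are mentioned in the text above?",
    #                            query_model)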

I am trying to find another example I saw - a person gave Claude a very complex PhD thesis containing novel concepts that were not part of its training set, and Claude analyzed it extremely well.

Yeah, it’s a bit like characterising human behaviour as ‘just chemicals being exchanged between synapses’

Yeah, maybe. By the time the first human astronauts get there, Mars might be populated by millions of autonomous LLM-driven robots, all busily interacting with each other. These autonomous machines might have built up an independent culture based loosely on human input, and be just as alien as any human culture encountered by Columbus or Cook on their voyages.

Perhaps the Martian robot culture would decide to kill all the black astronauts as a preventative measure.

All this, and we would still have no guarantee that autonomous robots have true internal experience and consciousness of any kind. Maybe the galaxy is already populated with alien robot explorers who are philosophical zombies, and who outlasted their creators (who died long ago, perhaps out of guilt).

Or perhaps we will make this happen ourselves, uncomfortably soon.

You can have both. One of my favourite fictional robots is Ginger, created by Pat Mills for 2000AD. He was a taxi robot, humanoid but with a chair on his shoulders, and would carry human fare-paying passengers on his back by running at speed through the streets and alleyways of future Earth.

Interestingly, the storyline shows the first taxi-robots appearing just about now.

I would say that if we’re going to send walking robots to Mars, more than two legs is probably better for stability. Perhaps some kind of tripod.

So we’ve joked about killing sprees etc., but there is a flipside to autonomy, which is accountability.

Lots of people have recently been referencing the IBM quote: “A computer cannot be held accountable, therefore a computer should never make a management decision.”

Note that this isn’t about the quality of the decision making. The computer might make very good decisions. But decisions have consequences, and sometimes decision makers need to be held accountable for consequences.

So, here:

Somebody is out a piece of garden furniture and whatever it is going to cost to fix that gate. No-one would argue that the life of a hedgehog wasn’t worth a bust up gate and a broken bench, but natural justice demands that the owner be made whole. So who pays?

Is the robot’s owner on the hook for whatever damage the robot does in pursuit of its autonomous goals? How about whoever presented the LLM as an adequate operating system? Or… does the robot have a legal personality? Does it now have to find a way to earn money - can it enter into contracts? - in order to make restitution?

If it didn’t want to do that, could it be taken to court?

Interesting questions; the robot person I am loosely describing in this thread would be, as nearly as possible, a free entity (not a slave or servant, so not owned), so I imagine that would require some legal notion of personhood.

Yeah, I think being a free entity pretty much requires legal personhood. But that of course will raise other questions. Can the robot own property? Start a business? Employ other robots? Employ humans? If it takes on a job, can it join or form a union?

Does it have the right to defend itself from harm? It’s easy enough to imagine some bored teens drawing a cock on its face, or pushing it over repeatedly, or just smashing it up. Could it use force to defend itself? Could it call the police and expect to see the perpetrators arrested for assault?

Can it lobby for robot rights? Can it vote? Can it stand for election?

Corporations have legal personhood, and can do some but not all of these things. But corporations don’t walk around in a body interacting with people, so they don’t need to do some of these things.

Personally, I feel that in the interests of it being accountable for its actions, it would have to have the ability to own property, if only so it could be sued for damages. But that probably means it can be economically active in the same ways I am. I draw the line at voting. I’m on the fence about its right to defend itself.

It might be that for a certain period of time, it would require a responsible legal guardian, before reaching some milestone where it is fully accountable and independent.