The point is that the first junction is stated to have a bizarre ratio of 125:1 major to minor accidents - and this is left unexplained. In real life, the first task in designing a better junction would be to understand why the first configuration has this strange property. A natural part of that process might be to find another equivalent junction design (with the same strange ratio) to see what they had in common.
But that wasn’t the problem that was asked.
As I laid out above, the problem was asked twice. The first time was nonsensical, since it implied that major and minor accidents have equal significance. The second time (after presenting data that showed a bizarre ratio of major to minor accidents without explanation) used the ambiguous term “equivalent” that we are discussing here, so you are assuming your conclusion in saying that this is not what was asked.
This “problem” says a lot more about the people who think it’s some great indicator of intelligence than anything else.
Indeed. In order to impress some posters here with your intelligence, apparently you must:
Overlook the fact that the first statement of the question nonsensically implies that major and minor accidents have equal significance, contrary to what a junction designer’s objective would be in real life.
Shrug off the fact that the questioner has just presented data showing a bizarre ratio of major to minor accidents, without comment let alone explanation.
Guess the correct meaning of the ambiguous word “equivalent” when the question is restated in a different way. To do this, you must apply common sense only selectively: you must assume what a sensible junction designer’s objective would be, contrary to the way the question was first asked; but you must not apply common sense to investigating the strange and unexplained ratio of major to minor accidents in the data, which you must just blindly accept.
I think that “display similar behaviour” or “have the same proportions” is a reasonable interpretation of “equivalent”.
One in three Yale students asked the question interpreted it that way. That shows that it is an interpretation that many intelligent people come to.
You have your plausible use right there. It is more than plausible that people would interpret it that way. Many highly intelligent people DID attach that meaning.
Nah. The point of the video was initially to introduce the idea of “cognitive reflection problems” and to conflate that very specific thing in a very confused way with cognitive biases in general.
The work originates with Shane Frederick’s “cognitive reflection test”:
The short version:
Allegedly measures response inhibition, suppressing the first “intuitive” response that comes to mind.
The claim is questionable for the three-item test; the garbled illustration he presented is more an example of how presenting a problem in a confusing, misleading way can confuse and mislead; and conflating response inhibition with cognitive biases, and even with the ways our heuristics can mislead us, is a faulty thought process.
I don’t think anyone has said it is a great indicator of intelligence. Very intelligent people are often prone to the same cognitive misfirings as the rest of us mere mortals.
Everyone can benefit from knowing about our predisposition to jump to “obvious” answers that fit with our presuppositions.
Leaping to conclusions without thinking deeper about questions is a very human trait regardless of intelligence and anyone can learn ways to help avoid that trap.
I’m not sure what the problem is, except that “8” is not the answer to the question being asked. If Layout A had 2000-major and 16-minor, and B had 1000-major and 8-minor, we could reasonably say that Layout A is exactly twice as bad as Layout B, but that’s not the question, which was to find a value for B(minor) that would make B equivalent to A.
For that, you’d need some formula for damage-equivalence, e.g. a “major” accident is deemed to be 100 times as destructive as a “minor” accident (using some calculation based on lives lost, medical costs incurred, automotive repair costs incurred, road delays incurred due to blocked traffic, etc.)
Thus, Layout A produces (2000 x 100) + (16 x 1) = 200,016 total “damage units.”
To calculate the necessary number of minor accidents for B (call it Y minor accidents) to make B overall the equivalent of A, we find:
200,016 = (1000 x 100) + (Y x 1)
…leading us to Y = 100,016 minor accidents.
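If it helps to see how heavily the answer depends on that assumed weighting, here’s a minimal sketch (the function name and the 100:1 weighting are my own illustrative assumptions, not anything given in the video); change the severity weight and the “equivalent” answer changes with it:

```python
# Minimal sketch: the 100:1 severity weighting is an illustrative assumption,
# not a figure given in the video.
def equivalent_minor_accidents(a_major, a_minor, b_major, severity_weight=100):
    """Minor accidents layout B needs so its total 'damage units' match layout A's,
    where one major accident counts as `severity_weight` minor ones."""
    a_damage = a_major * severity_weight + a_minor
    return a_damage - b_major * severity_weight

print(equivalent_minor_accidents(2000, 16, 1000))       # 100016
print(equivalent_minor_accidents(2000, 16, 1000, 10))   # 10016 with a 10:1 weighting
```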
And now I’m going to read the thread (and watch the rest of the video), with full confidence that somebody came to the same conclusion but phrased it more elegantly.
Yes, pretty early in the thread this was already the general consensus.
The question as stated is not answerable with any certainty due to ambiguity, lack of clarification of terms, missing information etc.
Which is, of course, a key point of the question. People answer with certainty when they shouldn’t, when further and more careful interpretation should reasonably suggest that this isn’t as straightforward and certain as the simple sums might initially suggest.
Just putting my 2 cents in: the summary in the OP is unfortunately (unintentionally, I’m sure) misleading; the video is much clearer that it is indeed not about two junctions but about choosing between two proposed layouts of the same junction.
However, I still believe that the question is oddly phrased, as he says “what goes here to make these two schemes equivalent”. A realistic way of putting it would be: “now the committee finds that the two schemes are equivalent. Given that, can you tell me what number should go here?” The ‘make’ has the odd suggestion that you deliberately try to ‘make’ the scheme have a certain number of accidents (i.e. improve the scheme, which would imply reducing the number of accidents), instead of simply solving a puzzle to determine what the number is. It is very counter-intuitive to ‘make’ a traffic scheme by increasing the number of minor accidents.
For me that is a reason why I had a hard time grasping what was intended with ‘equivalent’.
Exactly. But I came up with a guesstimate and it was about 1500.
I think one of the mistakes being made here is thinking that small-N tests of college students tell us anything at all.
It’s quite easy to get dramatic results from experiments with 20 or 30 largely homogeneous young adults. As the Replication Crisis has shown, it’s a lot harder to confirm in larger studies that those results are meaningful.
If I had to predict one thing about the kind of people who get into Yale, it’s that they are very experienced test-takers and they have a well-developed set of habits which they employ - evidently with great success! - when they are given a word problem with lots of numbers.
A guy walks into a classroom and presents them with a little story, oral or written (a crucial difference which the video does not engage with!), containing the word “equivalent” and a 2x2 grid that looks like:
2000:16
1000:?
Just looking at that grid, your brain said “8”. Yale students, experienced test-takers who think with some justice that they are clever and who believe that speedy answers are better than slow ones, take one look at it and write down 8. Or at least a quarter of them do.
Obviously, that’s bad. They should have read/listened to the question more carefully. But the fact that, if you put in the work, you can con a bunch of smart-arse kids is not in itself that amazing.
What results would you get if you tried this with bright 14-year-olds? Or 40-year-olds? What about if you tried it with career engineers? What if you took a random sample of the population? What if you present it orally and invite clarification questions? What if you allow group discussion? What if it’s in the middle of a longer test full of straightforward questions?
What exactly have we learnt here?
The lesson we’re asked to take is that you should always read and re-read the question, sense-check your answers, make sure you understand the problem etc. It’s a good lesson! But it’s interesting that there isn’t anything to learn about how to ask questions. How do we avoid ambiguity or sloppy wording? How do we make sure people know exactly what they’re being asked for? How do we overcome this rush to answer in people who are under time pressure, or stressed, or otherwise less likely to double-check their answers?
It’s kind of interesting to show that people make these mistakes, but it would be a lot more interesting to think about systemic approaches to avoiding them.
Frederick’s original survey on this subject was much larger than that (over 3,000 subjects) and the results are in line with the results quoted for this similar question in the video.
All good questions, I’m sure we’d see a variety of significant and insignificant differences. Certainly some of the later points you make are excellent ways of guarding against us leaping to an assumed answer.
Your mention of “group discussion” is an excellent tool that I use often when dealing with complex, messy and ambiguous problem areas. Simply going round the table right at the start and asking people what each of them think we are talking about or what they think we are trying to achieve flushes out so much confusion and ambiguity. I highly recommend it but it is alarming how often it is not done and you find yourself having to confront that confusion much further down the line.
However, seeing as the video is only ten minutes and merely skims the top of a much wider subject I don’t think we can be overly critical of the ground it covers.
Yes, I agree with all of the above.
Again, all good thoughts but we can’t criticise a duck for not being a chicken. The video is merely an introduction to a wider area of interest. I don’t think it badges itself as a diagnosis of and complete solution to this problem.
But the presenter does indeed want to ask and answer exactly the sort of in-depth questions you raise. He doesn’t do it in this short video but here he is speaking on just that sort of subject in an interview. He also covers that sort of approach in many of his podcasts that I linked to earlier in the thread.
Also, for those who are interested I have reached out to the makers of the video and Tim Harford and asked more about the intentional or accidental phrasing and ambiguity within the question both here and in its original use by Frederick. If I get anything back I’ll let you all know.
Sure, but no-one’s arguing that the clearly worded items in the Cognitive Reflection Test don’t show a real phenomenon. The argument is that compared to them, the question in the video is ambiguous and sloppily worded in a way that detracts from its ability to demonstrate the same thing.
To put it another way, given that we have an excellent robust paper demonstrating people’s tendency to rush to a quick, wrong answer, what additional learning do we get from the road layout Q? If it’s just confirmation of Frederick’s paper then it’s worthless because of its small, non-random sample. If it’s meant to highlight a new facet of poor cognitive reflection then, aside from the continuing issues with the sample, what is that new facet?
I mean, I can definitely see the case for an experiment in which an authority figure asks a deliberately badly worded question to see if people identify the lack of relevant info or challenge ambiguity/sloppy wording. But that’s not what this is, or what it claims to be. It claims to give people all the info and a clear question that they ignore in a rush to give the “obvious” answer. But it’s just not achieving that.
Firstly, I’m not sure that we can say that this question was applied to either a small or non-random group. I don’t see anything in the video about the overall numbers involved. As for the sample group, if the intent of asking the question was to see how highly educated students approach it, the sample may be perfectly representative. If it were to see how the population in general approaches it, clearly it would be severely lacking. But I don’t think we know what the intent was. If anyone does, let me know.
I see the road question as much more analogous to the way some real-world problems are framed and misinterpreted. Problems that are more complex than the simplistic (and by now, very well known) questions in the original study. And as it has a lovely hyperbolic “most annoying” label it is a good introduction to the subject to get people’s attention.
To be fair, in the video they do discuss some of those deficiencies and I have reached out to see if that was part of the intent.
Re-watching, it’s very interesting that there were a range of wrong answers. Some said 8, but others said 32 (presumably thinking if you halve one value you then double the other) and some said 1016 (keeping total accidents the same).
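For what it’s worth, here’s a rough sketch of the arithmetic that seems to lie behind each of those answers (the interpretations are my guesses at the respondents’ reasoning, and the variable names are mine):

```python
# Each wrong answer appears to follow from a different reading of "equivalent"
a_major, a_minor, b_major = 2000, 16, 1000

same_ratio = a_minor * b_major // a_major     # 8: keep the major:minor proportion
halve_double = a_minor * a_major // b_major   # 32: halve one value, double the other
same_total = (a_major + a_minor) - b_major    # 1016: keep the total accident count

print(same_ratio, halve_double, same_total)   # 8 32 1016
```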
Compare with the bat and ball question, where I’m pretty sure you get one very common wrong answer (bat costs $1) but not many others. The lily pad question mainly gets a wrong answer of 24 days, and some in the 30s as people try to plot out a curve mentally. This question seems to provoke different wrong answers based on very different interpretations of the problem. Which suggests that it’s partly about leaping to an answer but also partly about interpreting the vague question.
If you presented people with that grid with its tempting empty box and said “What number would make B as bad an option as A?” you might still get lots of 1016 answers, but you wouldn’t get many or any 8s or 32s.
In the video at least, it is very much being presented as telling us something about the general population.
Are you talking about providing people with only that info or giving them further and more carefully worded context from the question?
Because I can see lots of people going with “8” if they are thinking you want them to make both options equally bad in terms of the relative rather than absolute numbers.
Otherwise all the same info and wording that people were given in the original. But instead of “What would make these equivalent?”, “What would make B as bad as A?”.