I enabled “think”, “deep think”, etc. for the models that had it. I didn’t want to bias them by altering the prompt DMC gave, though; I pasted it as-is. You can see the ChatGPT chain of thought if you click the link.
ETA: part of the fun isn’t nudging the model to get the right answer, but seeing if it can get it without knowing how to nudge, or what direction to nudge it in.
Asking it to be thorough is not altering the query. Asking it to show its work is not altering the query.
ETA: None of the above is nudging anything. It is asking the AI to do its best and show its work (be more verbose in its response).
ETA2: The second question is nudging, fer sure, but we show the initial response as-is. Then we see if we can use that to get to a better answer, and the AI will change. I think everyone gets that…I have shown all first responses as given, right or wrong.
If you’re small enough and the goat is big enough you can not only feed the goat at no cost off scrub growing wild on your land, but also ride it into town. So once you have the goat you can sell the car. And stop paying for gasoline.
Just one problem with that, and why overfitting gave the wrong answer. I never mentioned a “goat”, nor the host opening any door for me, which is why my question was not the Monty Hall Problem.
In a list of 10 items, how many commas would there be if I did not use Oxford commas?
Here is the reasoning I was given:
9 commas.
Explanation:
Rule: When listing three or more items in a series, you use commas to separate each item.
No Oxford Comma: Omitting the Oxford comma means you do not place a comma before the conjunction (usually “and” or “or”) that precedes the last item in the list.
Example:
If your list was: item1, item2, item3, item4, item5, item6, item7, item8, item9 and item10.
You would place a comma after each item except the last one before the conjunction, resulting in 9 commas.
I love that it spells out the list but doesn’t simply count the commas. This is one of the many types of “thinking” that separate computers from humans. We’d simply double-check our work if we took the time to write it out like that. "R"s in strawberry and all that.
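For what it’s worth, actually building the list and counting settles it. A quick Python sketch (mine, not the model’s):

```python
# Build a 10-item list without the Oxford comma and count the commas directly.
items = [f"item{i}" for i in range(1, 11)]
sentence = ", ".join(items[:-1]) + " and " + items[-1]
print(sentence)             # item1, item2, ..., item9 and item10
print(sentence.count(","))  # 8
```

Joining ten items this way puts commas only between the first nine, so an n-item list has n − 2 structural commas: 8 here, not 9.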
ChatGPT o3 got Peter Cannon with just your original prompt:
It was Peter Cannon, Thunderbolt—Charlton Comics’ white-gi-clad, Tibetan-trained hero.
In story after story (starting with Thunderbolt #1 in 1966 and popping up many times afterward) Peter would trek back to Tibet’s mystic Cave of the Peaks (billed on the splash pages as “Where All Things Exist!”) to meditate, consult ancient scrolls, or scoop up a handy batch of miracle herbs for his ailing pal Tabu.
This would have 0 commas (unless the items themselves contain commas).
What if items contain internal commas?
For example: “Paris, France” counts as one item but contains a comma. The question asks about commas in the list structure, not within items, so I’ll assume we’re counting structural commas only.
What if it’s a different type of list?
Semicolon-separated list (for complex items): 0 commas in the list structure
Parenthetical list: (1) item (2) item… → 0 commas
Given that you asked specifically about “not using Oxford commas,” this strongly suggests we’re talking about a standard comma-separated list in a sentence, where the Oxford comma question is relevant.
Therefore, my answer is 8 commas: one between each pair of adjacent items, except between the 9th and 10th items, where “and” or “or” would appear instead.
I asked Claude AI this query and, since it’s been done already, added an instruction to parse carefully:
Let me identify what’s explicitly stated:
There’s a game show
There’s a new car behind ONE door
There are “terrible prizes” behind the OTHER doors (plural)
I pick door number 1
The host “offers door 2”
Critical ambiguity: What does “offers door 2” mean?
This is NOT the standard Monty Hall problem statement. In the classic version:
You pick a door
The host OPENS a losing door (revealing a bad prize)
The host offers you the chance to SWITCH to the remaining unopened door
But here, the host simply “offers door 2”, which could mean:
The host is offering to let me switch to door 2
The host is offering to open/reveal door 2
The host is suggesting door 2 as an alternative
Also unclear:
How many total doors are there? (The plural “other doors” suggests at least 3 total)
Does the host know where the car is?
What are the rules governing the host’s behavior?
Has any door been opened/revealed yet?
Without these crucial details, I cannot determine if switching is advantageous.
In the classic Monty Hall problem with 3 doors, switching doubles your odds (from 1/3 to 2/3) because the host’s action of revealing a losing door provides information. But your scenario lacks the key element of a door being revealed.
Possible interpretations:
If the host is just offering a switch with no door revealed: Without additional information, switching from door 1 to door 2 provides no advantage; both have equal probability of containing the car.
If this is meant to be Monty Hall but poorly stated: Then yes, switching would be advantageous.
If the host’s offer itself contains information (e.g., the host has a bias or strategy): Then we’d need to know the host’s behavior rules.
The answer depends entirely on details not provided in your question. The most literal reading suggests no advantage to switching, since no information has been revealed to update the probabilities.
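As a sanity check on the 1/3-vs-2/3 figure cited above for the classic version, here is a small Monte Carlo simulation, assuming the standard rules (the host knows where the car is and always opens a losing door you didn’t pick):

```python
import random

def play(switch: bool) -> bool:
    """One round of the classic 3-door game: host knowingly opens a losing door."""
    doors = [0, 1, 2]
    car = random.choice(doors)
    pick = random.choice(doors)
    # Host opens a door that is neither your pick nor the car.
    opened = random.choice([d for d in doors if d != pick and d != car])
    if switch:
        pick = next(d for d in doors if d != pick and d != opened)
    return pick == car

trials = 100_000
for switch in (False, True):
    wins = sum(play(switch) for _ in range(trials))
    print(f"switch={switch}: win rate {wins / trials:.3f}")
# switch=False: ~0.333, switch=True: ~0.667
```

Staying wins about a third of the time and switching about two thirds, matching the textbook result. The ambiguous “offers door 2” scenario can’t be simulated without first pinning down the host’s rules, which is exactly Claude’s point.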
The difference is that 0.9̄ has ALL 9s recurring forever, while 0.9999998̄ (the seven-digit block 9999998 recurring) has a “deficient” digit (8) that appears infinitely often in the decimal expansion, preventing it from reaching 1.
The exact value is 9999998/9999999, whose decimal expansion is 0.999999899999989999998… (the block 9999998 repeating forever).
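A repeating block B of length k equals B/(10^k − 1), which is easy to verify numerically; a quick check of both the fraction and its first few digits:

```python
from fractions import Fraction

# The repeating block 9999998 (length 7) as a fraction: B / (10^7 - 1).
x = Fraction(9999998, 10**7 - 1)
print(x)  # 9999998/9999999

# First 28 digits of the decimal expansion, by long division.
r = x.numerator % x.denominator
digits = []
for _ in range(28):
    r *= 10
    digits.append(str(r // x.denominator))
    r %= x.denominator
print("0." + "".join(digits))  # 0.9999998999999899999989999998
```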
It looks like the newer versions of many of the engines are starting to get better about overfitting. I tried several others that also got it right. So far only DeepMind failed this one.
I wonder if, in an effort to ease CPU loads, the developers have the AI take some shortcuts, the same as we do when we see what looks like the Monty Hall Problem but really isn’t. Once told to look at it a bit more carefully, the AI nails it. But without that extra instruction to be careful, the AI will slack off a bit (as most people do too).
An interesting insight. I am learning how to make better queries because of this.
One of the common ways of reducing overfitting in an LLM does involve taking a shortcut, but not when finding the answer. The shortcut is in the training portion and is called early stopping: you halt training once performance on held-out data stops improving.
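A minimal sketch of the idea, with placeholder helpers (train_one_epoch, evaluate, save_checkpoint, and the model/data objects are hypothetical, not any particular library’s API):

```python
# Early stopping: quit training once validation loss hasn't improved
# for `patience` consecutive epochs, keeping the best weights seen so far.
best_val, patience, bad_epochs = float("inf"), 3, 0

for epoch in range(100):
    train_one_epoch(model, train_data)    # hypothetical helper
    val_loss = evaluate(model, val_data)  # hypothetical helper
    if val_loss < best_val:
        best_val, bad_epochs = val_loss, 0
        save_checkpoint(model)            # hypothetical helper
    else:
        bad_epochs += 1
        if bad_epochs >= patience:
            break  # stop before the model starts memorizing the training set
```

The point is exactly what the post says: the corner being cut is in training, not in answering your query.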
At this point, I’m wondering if the answers to common queries are simply being cached.
IOW, when it identifies this as (or close to) the Monty Hall problem, it simply spits out the canned answer (either Monty Hall or not-quite Monty Hall) that was iteratively honed to perfection by many earlier queries, then cached as “the answer” based on sufficient positive user feedback.
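Purely to illustrate that speculation, here’s a toy version of such a cache, keyed on a normalized prompt (run_model is a hypothetical stand-in for the expensive model call; no vendor is known to actually do this):

```python
import re

cache: dict[str, str] = {}

def normalize(prompt: str) -> str:
    """Collapse whitespace and case so near-identical queries share a key."""
    return re.sub(r"\s+", " ", prompt.strip().lower())

def answer(prompt: str) -> str:
    key = normalize(prompt)
    if key not in cache:
        cache[key] = run_model(prompt)  # hypothetical expensive model call
    return cache[key]  # near-duplicate queries get the canned answer
```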
Oopsie. Yes, 8^3 = 512, not 256 as I thought. Sorry. But that is the rule: the sum of the cubes of the digits. So the sequence continues after 1458 as
702, 351, 153, 153,…
I chose 15 as a starting point at random. Wherever you start, you have to end at a cycle (which might be of length 1, that is, a fixed point).
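A short sketch of the iteration, which reproduces the whole run from 15 down to the fixed point 153:

```python
def cube_digit_sum(n: int) -> int:
    """Sum of the cubes of n's decimal digits."""
    return sum(int(d) ** 3 for d in str(n))

def trajectory(start: int) -> list[int]:
    """Iterate until a value repeats, i.e., until we enter a cycle."""
    seen, seq, n = set(), [], start
    while n not in seen:
        seen.add(n)
        seq.append(n)
        n = cube_digit_sum(n)
    seq.append(n)  # the first repeated value closes the cycle
    return seq

print(trajectory(15))
# [15, 126, 225, 141, 66, 432, 99, 1458, 702, 351, 153, 153]
```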