Veritasium had a good video on some of the background story.
There was an amusing side effect of that problem, which I’ll spoiler since it partially gives things away:
Due to the error, they ended up rescoring the test as if the problem didn’t exist. Unfortunately, this meant that some people lost scholarships, etc. by falling below a scoring threshold! For example, say there were 100 questions and a student had just barely reached the 90% needed by some agency, with 90/100 correct. After the bad question was removed, they had only 89/99 correct, or about 89.9%, which isn’t quite enough.
They should have scored it as if everyone got the question right. That way nobody’s percentage could go down.
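To make the arithmetic concrete, here's a rough Python sketch comparing the two rescoring policies. The function names and the 100-question / 90% figures are just the hypothetical numbers from this post, not anything from the real test or its actual scoring rules.

```python
from fractions import Fraction

def drop_question(correct_others: int, total: int) -> Fraction:
    """Score with the flawed question removed from the test entirely."""
    return Fraction(correct_others, total - 1)

def credit_everyone(correct_others: int, total: int) -> Fraction:
    """Score with the flawed question counted as correct for every student."""
    return Fraction(correct_others + 1, total)

# Hypothetical numbers: 100 questions, 90% cutoff, and a student with
# 89 right answers on the 99 sound questions.
threshold = Fraction(90, 100)
total = 100
correct_others = 89

dropped = drop_question(correct_others, total)      # 89/99, about 89.9%
credited = credit_everyone(correct_others, total)   # 90/100, exactly 90%

print("dropped: ", float(dropped), "passes" if dropped >= threshold else "fails")
print("credited:", float(credited), "passes" if credited >= threshold else "fails")
```

Exact fractions are used so the comparison against the cutoff isn't affected by floating-point rounding.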