I’d like to thank everyone for their input (well, almost everyone…). I’ve learned a bit about soccer, but mostly I’ve changed my understanding of how an answer must be framed. It seems like if we develop a model and then try to show that the model yields a lot of error or noise, the model will always be assailable as inaccurate.
Perhaps the only way to state something unequivocal about soccer scoring is to review the history, analyze the distribution of actual scores for particular teams over time, and then compare this to a corresponding analysis of another sport. Then we could say something certain like “soccer scores are more or less noisy than baseball scores”. But even so, a critic could claim that because soccer has more injuries or whatever, this still is not a valid comparison.
One friend of mine put it this way, “I don’t think there’s any real argument against the central issue: that low scoringness constitutes coarser granularity of sampling, which should be less desirable than higher scoringness if you expect scoringness to somehow estimate some goodness metric for soccer playing.” My thesis in a nutshell: soccer scoring is not a reliable metric for goodness in soccer playing.
We discussed what one of the earlier posters had said about my estimate of the number of “scoring drives” being too low. I asserted that how you define a “scoring drive” doesn’t really matter, and I left it open. Whatever granularity you choose, you’ll then have to go to the real-world stats to assign a probability of scoring during each interval. The finer the granularity, the smaller that per-interval probability must be, so that the expected number of goals per game still matches reality.
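To make the granularity point concrete, here is a rough Python sketch (mine, not anything from the earlier posts). It assumes each interval is an independent chance to score at most one goal, and it holds the expected goals per game fixed at 2 for the stronger team and 0.25 for the weaker one while the number of intervals changes. The function names `goal_pmf` and `p_win` are made up for illustration.

```python
def goal_pmf(n, p):
    """Binomial distribution of goals: n independent intervals, chance p each.

    Built iteratively from P(0 goals) so large n does not overflow.
    Assumes 0 <= p < 1.
    """
    pmf = [(1 - p) ** n]
    for k in range(1, n + 1):
        pmf.append(pmf[-1] * (n - k + 1) / k * p / (1 - p))
    return pmf


def p_win(n, p_a, p_b):
    """Exact probability that team A scores strictly more goals than team B."""
    pmf_a = goal_pmf(n, p_a)
    pmf_b = goal_pmf(n, p_b)
    total = 0.0
    b_below = 0.0  # running P(B scores fewer than k goals)
    for k in range(n + 1):
        total += pmf_a[k] * b_below
        b_below += pmf_b[k]
    return total


if __name__ == "__main__":
    # Assumed expected goals per game; only the slicing of the game changes.
    strong_goals, weak_goals = 2.0, 0.25
    for n in (10, 100, 1000, 10000):
        prob = p_win(n, strong_goals / n, weak_goals / n)
        print(f"{n:>6} intervals: P(stronger team wins) = {prob:.3f}")
```

If the assumptions hold, the printed probabilities should stay close to one another once the intervals get reasonably fine, which is the sense in which the definition of a “scoring drive” doesn’t matter much.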
I pointed out to him that even in the model the respondent suggested (i.e. 100 “scoring drives”, with Canada having a 0.25% chance to score on each and Brazil a 2% chance), we still have a team that is eight times better than its opponent yet expects only about 2 goals to the opponent’s 0.25, so it barely wins, by a goal or two at the buzzer! If anything, this supports my thesis rather than refuting it.
My friend wrote a little MATLAB program to simulate this more quasi-continuous model. Results:
Brazil = 2%, Canada = .25% ==> P(Brazil wins) = 80% after 100 intervals
Brazil = 2%, Canada = 1% ==> P(Brazil wins) = 60% after 100 intervals
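For anyone who wants to check these numbers, here is a minimal Monte Carlo sketch in Python. It is not my friend’s MATLAB program, just my own stand-in under the same assumptions: 100 independent intervals per game, at most one goal per team per interval, and the per-interval probabilities listed above. The names `simulate_match` and `estimate_outcomes`, the seed, and the number of simulated games are all arbitrary choices of mine.

```python
import random


def simulate_match(rng, n_intervals, p_a, p_b):
    """Play one game: each team gets one independent scoring chance per interval."""
    goals_a = sum(rng.random() < p_a for _ in range(n_intervals))
    goals_b = sum(rng.random() < p_b for _ in range(n_intervals))
    return goals_a, goals_b


def estimate_outcomes(p_a, p_b, n_intervals=100, n_matches=10000, seed=0):
    """Return estimated (win, draw, loss) fractions for team A."""
    rng = random.Random(seed)  # fixed seed so reruns give the same estimate
    wins = draws = losses = 0
    for _ in range(n_matches):
        a, b = simulate_match(rng, n_intervals, p_a, p_b)
        if a > b:
            wins += 1
        elif a == b:
            draws += 1
        else:
            losses += 1
    return wins / n_matches, draws / n_matches, losses / n_matches


if __name__ == "__main__":
    for p_brazil, p_canada in [(0.02, 0.0025), (0.02, 0.01)]:
        win, draw, loss = estimate_outcomes(p_brazil, p_canada)
        print(f"Brazil {p_brazil:.2%}, Canada {p_canada:.2%}: "
              f"win {win:.2f}, draw {draw:.2f}, loss {loss:.2f}")
```

Note that the sketch counts draws separately, so whatever is left over after “Brazil wins” is split between ties and outright Canada wins.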
I think my original assertion stands: a team has to be enormously better than another to have a reliable chance of winning. I don’t dispute that we have some “superpower” teams. For teams to win reliably, they would have to be superpowers. But because of the randomness in the scoring, a group of teams with a wide disparity in actual ability to put the ball in the goal will still manage to form a league in which each has a shot at winning. That’s really all I’m saying. If dominant teams emerge, it’s not because they’re somewhat better than the others; it’s because they’re vastly better than the others.
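One way to put a number on “enormously better” is to ask how large the scoring ratio has to be before the stronger team wins some target fraction of games. The Python sketch below does that under the same toy model (100 independent intervals, at most one goal per team per interval). The weaker team’s 1% per-interval chance is taken from the second simulation above; the 90% target and the helper name `min_ratio_for` are my own arbitrary choices, not anything from the thread.

```python
def goal_pmf(n, p):
    """Binomial distribution of goals over n intervals (assumes 0 <= p < 1)."""
    pmf = [(1 - p) ** n]
    for k in range(1, n + 1):
        pmf.append(pmf[-1] * (n - k + 1) / k * p / (1 - p))
    return pmf


def p_win(n, p_a, p_b):
    """Exact probability that team A outscores team B (ties do not count)."""
    pmf_a = goal_pmf(n, p_a)
    pmf_b = goal_pmf(n, p_b)
    total = 0.0
    b_below = 0.0  # running P(B scores fewer than k goals)
    for k in range(n + 1):
        total += pmf_a[k] * b_below
        b_below += pmf_b[k]
    return total


def min_ratio_for(target, p_weak=0.01, n=100, max_ratio=50):
    """Smallest whole-number scoring ratio at which the stronger team
    wins at least `target` of the time, or None if none is found."""
    for ratio in range(1, max_ratio + 1):
        if p_win(n, ratio * p_weak, p_weak) >= target:
            return ratio
    return None


if __name__ == "__main__":
    for ratio in (1, 2, 4, 8):
        print(f"{ratio}x better: P(win) = {p_win(100, ratio * 0.01, 0.01):.2f}")
    print("ratio needed for a 90% win rate:", min_ratio_for(0.90))
```

The exact figures depend entirely on the assumed probabilities, so treat the output as an illustration of the shape of the argument rather than a claim about real teams.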
Well, in all honesty, I was hoping the Big Man would step in and give us the final, compelling answer. It seems all we can do is teem and hope…