Statistical validity of soccer scoring

A soccer team’s attacks are not independent events. A team will play differently leading 2-0 than losing 2-0.

And a scoring chance isn’t due to the attackers’ skills alone, but to the complex interplay between the attackers’ skills and the defenders’ skills.

Senegal’s defeat of France was not as much of an upset as you might think. Senegal were runners-up in the African Championship, and almost all of the Senegalese squad play in the top division of the French soccer league.

There’s only one objective way to judge the better soccer team, and that’s goal scoring. If the US beats Germany 1-0, then the US was the better team that day, no matter what anyone says. It doesn’t mean that Germany won’t slaughter them the next time the two teams play, but for those ninety minutes Germany was outplayed.

Soccer does not reward scoring opportunities. If it did, Mexico would’ve killed us, and we’d have beaten Poland.

Michael Davies, special columnist for ESPN, is keeping a diary at the tournament and remarks on how little respect US soccer gets from the traditional powers (South America and Europe). After the Mexico win he was told by a British journalist that the US were lucky to have beaten Portugal…Portugal “deserved to win.”

EXACTLY!

If scoring were a matter of luck, like a coin toss, then the top world leagues would not see the results that they do. Rather, what we see are leagues where a couple of teams dominate, for example Argentina = River Plate or Boca Juniors, Spain = Real Madrid or Barcelona, England = Manchester or Liverpool, etc. This does not arise out of sheer luck, but from teams being able to acquire players with a high degree of playing skill for large amounts of $$$, whereas “poorer” teams have to make do with less skilled players and hope to G-d they won’t go down to a lower division.

True!

Let me add that a weaker team may score a goal for the following three reasons:

  1. It played better. It scored one goal and then clogged its own half with defenders by using a 4-4-2 or even a 5-4-1. It happens. Even the New York Yankees lose! :wink:

  2. Penalty. Spain nearly lost against Ireland because of a stupid move by Hierro that gave away a penalty. Stronger teams can “give” a goal to a weaker team this way.

  3. Human error. It occurs in any sport. Example: an Italian player against Korea. Italy leads 1-0. The Italian player tries to stop a ball in his own penalty area, is not able to control it properly, and inadvertently plops the ball in front of a Korean player, who proceeds to kick it into the goal. Game goes 1-1. Of course, if the Korean player had lacked the skill and the expertise, he could have kicked the ball nowhere near the goal.

Let me reiterate: yes, there is always a degree of luck involved in any sport, but it is the great players with skill who win the games. Soccer is no different.

xicanorex

I think that if you were to devise some sort of wargame that would more accurately reflect each team’s chances, you would have to apply a goodly number of factors before finally applying a random factor generator such as a die.

You would need to take into account the individual quality of each player, then how well they interact with each other, and then include things such as the tactical ability of the manager, the leadership of the team captain, and whether the ground is neutral or which team is at home or away.

The number of scoring attempts is not as relevant as it might appear. You would need to assess the quality of each scoring chance and the quality of the person making the attempt. Thus one team with a superb midfield might generate plenty of chances while its strikers might not be good enough to score, and vice versa.

It is no accident that the best players command the highest wages, and so end up playing for the wealthiest clubs in the world, and you will not be surprised to note that these same clubs win just about every trophy there is, especially in regional competitions such as the UEFA Champions League.

You could do an analysis of the wage bill for each team; it would reflect the actual outcome pretty closely.

It certainly is different.

World Cup: a few pool games (the pool is randomly seeded), and 4 more games to win it all. If you start off at least good enough to get out of the pool, you only need 4 games.
(Analogy: New England Patriots. Obviously at least a competent team, had 3 arguably lucky games to win the Super Bowl.)

NBA basketball: 82 regular-season games determine playoff seeding. In any game, there are ~100 possessions per team. In the playoffs, each round is a multiple-game format, so luck is lessened. You must survive 4 rounds of multiple games. Vastly more difficult to beat the good teams.

muttrox, your (or your friend’s) analysis is correct, but the conclusion of your next post is flawed (that the second team wins 83% of the time). Just above that, you said

This is 83%, but it’s the cases of a win or draw. The OP refers to wins alone, which is 49% (confirmed by the numbers you provided). Perhaps it was a bit slippery of the OP to rule out draws there, since one could just as easily say that B is more than twice as likely to win as A (49:17).

Whether this actually applies to soccer scoring may be a moot point, though in my opinion (and from the looks of this thread) it probably doesn’t.

Good catch – it’s my fault for having phrased the problem badly to my friend. I believe your 49:17 would be the correct figure then, or 74%.
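For anyone who wants to check the 74%, here is the arithmetic as a few lines of Python. It is only a sketch: the variable names are made up for illustration, and it assumes the two figures quoted above (B wins 49%, B wins or draws 83%) describe the whole picture.

```python
# Figures quoted in the thread, as percentages.
p_b_win = 49            # team B wins outright
p_b_win_or_draw = 83    # team B wins or draws

p_draw = p_b_win_or_draw - p_b_win    # what is left for a draw
p_a_win = 100 - p_b_win_or_draw       # what is left for team A

# Share of decisive (non-drawn) games that go to B, i.e. the 49:17 ratio.
b_share_of_decisive = p_b_win / (p_b_win + p_a_win)

print(f"draw: {p_draw}%, A wins: {p_a_win}%")
print(f"B takes {b_share_of_decisive:.0%} of the decisive games")
```

So the 74% is B's share of the games that do not end level, not B's chance of winning any given game.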

I also agree the coin analogy has serious problems, but it was part of the OP, and seemed to drive the poster’s thinking, so I felt obliged to speak to it.

This is definitely the thread of the day IMO.

How about if we examine a sport that has the most scoring chances, i.e. basketball.

In an individual basketball game, a team may have anywhere from 80-100 possessions. Does this high number of scoring chances make it impossible for a less talented team to beat a more talented team? Not really. The best NBA teams will usually end up with a better winning percentage than the best MLB teams, but the poor teams still have a decent shot at winning a game.

This year’s Lakers team lost to several last place teams. They even lost to the Bulls TWICE, including the matchup in Los Angeles.

Even the most talented basketball team may just have a day when it can’t shoot well, just as the most talented soccer team may have trouble getting the ball in the net. Also, the best basketball team may have a hard time playing on a poor team’s home court (and in the World Cup ask Portugal and Italy about playing Korea in Korea). And, basketball is also subject to highly subjective officiating that can greatly affect the outcome of the game (ask Italy about playing Croatia or Korea).

However, the best basketball team (or at least one of the top few) almost always wins the championship.

Actually, I was going to add that point.

I meant to draw a parallel between basketball and soccer’s penchant for one-shot upsets, yet overall dominance by a few teams.

But note the difference in what it takes to win. In soccer, you can get a lucky pool seed. Or you can have one lucky game and make it out of pool play. Then 4 games… certainly a lot of luck is required, but it’s certainly conceivable for an underdog to make it pretty far. Like the USA.

In basketball, the regular season does help your seeding. And each round of the playoffs is multiple games, so it’s much harder to be that lucky for so long.

So I hypothesize (you can argue the evidence forever) that there will be far fewer upsets near the end of the basketball playoffs than near the end of the World Cup… that is, in who’s left standing.
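A rough way to put numbers on that hypothesis is sketched below in Python. Everything in it is an assumption made for illustration: the favourite wins any single game with probability 0.6, games are independent, there are no draws, and the favourite faces an equally weak opponent in every round. The function names are hypothetical.

```python
from math import comb

def series_win_prob(p: float, games: int) -> float:
    """Chance of winning a majority of `games` independent games (games is odd).

    Counting all `games` as played gives the same answer as stopping
    once one side has clinched the series.
    """
    need = games // 2 + 1
    return sum(comb(games, k) * p**k * (1 - p) ** (games - k)
               for k in range(need, games + 1))

def title_prob(p: float, rounds: int, games_per_round: int) -> float:
    """Chance of surviving `rounds` consecutive rounds of the given length."""
    return series_win_prob(p, games_per_round) ** rounds

p = 0.6  # assumed single-game win probability for the better team

knockout = title_prob(p, rounds=4, games_per_round=1)   # World Cup style
best_of_7 = title_prob(p, rounds=4, games_per_round=7)  # NBA style

print(f"four single games:   {knockout:.3f}")
print(f"four best-of-sevens: {best_of_7:.3f}")
```

Under these assumptions the better team is noticeably more likely to survive four best-of-seven rounds than four one-off games, which is the point about luck being lessened by the multiple-game format.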

Yes, compared with the NBA playoffs, the World Cup makes it a bit easier for an underdog team to win it all.

However, there still hasn’t been a real underdog team that has won a World Cup, with the possible exception of Uruguay in 1950.

What we’re debating now is whether or not a single elimination tournament is the best way to determine a champion, not whether or not the ultimate outcome of a soccer match is more the result of chance or skill.

I’d like to thank everyone for their input (well, almost everyone…). I’ve learned a bit about soccer, but mostly I’ve changed my understanding of how an answer must be framed. It seems like if we develop a model and then try to show that the model yields a lot of error or noise, the model will always be assailable as inaccurate.

Perhaps the only way to unequivocally state something about soccer scoring is to review the history and analyze the distribution of actual scores for particular teams over time and then to compare this to a corresponding analysis of another sport. Then we could say something certain like “soccer scores are more or less noisy than baseball scores”. But even so, a critic could claim that because soccer has more injuries or whatever, this still is not a valid comparison.

One friend of mine put it this way, “I don’t think there’s any real argument against the central issue: that low scoringness constitutes coarser granularity of sampling, which should be less desirable than higher scoringness if you expect scoringness to somehow estimate some goodness metric for soccer playing.” My thesis in a nutshell - soccer scoring is not a reliable metric for goodness in soccer playing.
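The granularity point can be illustrated with a small Python sketch. Note the assumption: this is not the model used elsewhere in the thread. Here each team's goal count is treated as an independent Poisson variable, the better team always averages twice as many goals as the other, and only the overall scoring level changes. The function names and the specific averages are made up for illustration.

```python
from math import exp, factorial

def poisson_pmf(k: int, lam: float) -> float:
    """Probability of exactly k goals when the average is lam."""
    return exp(-lam) * lam**k / factorial(k)

def win_prob(lam_better: float, lam_worse: float, max_goals: int = 100) -> float:
    """P(better team strictly outscores the other), goal counts independent."""
    total = 0.0
    worse_below_k = 0.0  # running P(worse team scores fewer than k)
    for k in range(max_goals + 1):
        total += poisson_pmf(k, lam_better) * worse_below_k
        worse_below_k += poisson_pmf(k, lam_worse)
    return total

# Same 2:1 ratio in scoring ability, different amounts of scoring per game.
for scale in (1, 2, 5, 10):
    better, worse = 2.0 * scale, 1.0 * scale
    print(f"averages {better:4.0f} vs {worse:4.0f}: "
          f"P(better team wins) = {win_prob(better, worse):.2f}")
```

With the ratio held fixed, the better team's chance of winning outright rises as the scoring level rises, which is what “coarser granularity of sampling” would predict for a low-scoring game.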

We discussed what one of the earlier posters had said about my estimate of the number of “scoring drives” being too low. I asserted that however you define “scoring drive” doesn’t really matter, and I left it open. Regardless of what granularity you use to define “scoring drive”, you’ll then have to hit the real-world stats to correlate a percentage chance to score during that interval. The finer the granularity on “scoring drive”, the correspondingly smaller the probability of scoring in each interval must be.

I pointed out to him that even in the model the respondent suggested (i.e. 100 “scoring drives” with Canada having a .25% chance to score and Brazil having a 2% chance to score) we still have a team that is eight times better than another team barely winning by only one point at the buzzer! If anything, this supports my thesis rather than refutes it.

My friend did a little MATLAB program to simulate this more quasi-continuous model. Results:

Brazil = 2%, Canada = .25% ==> P(Brazil wins) = 80% after 100 intervals
Brazil = 2%, Canada = 1% ==> P(Brazil wins) = 60% after 100 intervals
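The MATLAB program itself is not shown, so here is a stand-in sketch in Python for anyone who wants to reproduce the numbers. It is not the friend's code: instead of simulating, it computes the probabilities exactly, assuming each of the 100 intervals is an independent trial in which a team scores at most once. The function names are hypothetical.

```python
from math import comb

def binom_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k goals in n independent intervals."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def outcome_probs(p_a: float, p_b: float, n: int = 100):
    """Return (P(A wins), P(draw), P(B wins)) after n intervals."""
    pmf_a = [binom_pmf(k, n, p_a) for k in range(n + 1)]
    pmf_b = [binom_pmf(k, n, p_b) for k in range(n + 1)]
    a_wins = draw = 0.0
    b_below_k = 0.0  # running P(B scores fewer than k)
    for k in range(n + 1):
        a_wins += pmf_a[k] * b_below_k
        draw += pmf_a[k] * pmf_b[k]
        b_below_k += pmf_b[k]
    return a_wins, draw, 1.0 - a_wins - draw

for canada in (0.0025, 0.01):
    brazil_wins, draw, canada_wins = outcome_probs(0.02, canada)
    print(f"Brazil 2%, Canada {canada:.2%}: "
          f"Brazil wins {brazil_wins:.2f}, draw {draw:.2f}, Canada wins {canada_wins:.2f}")
```

Under these assumptions the exact figures should come out close to the 80% and 60% quoted above, and the output also shows how the remainder splits between draws and Canada wins.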

I think my original assertion that a team has to be enormously better than another team to have a reliable chance at winning stands. I don’t dispute that we have some “superpower” teams. In order to have teams that win reliably, they would have to be superpowers, but because of the randomness in the scoring, a group of teams that really have a wide disparity in actual ability to put the ball in the goal will still manage to form a league in which each has a shot at winning. That’s really all I’m saying. If dominant teams emerge, it’s not because they’re somewhat better than the others, it’s because they’re greatly better than the others.

Well, in all honesty, I was hoping the Big Man would step in and give us the final, compelling answer. It seems all we can do is teem and hope…

I think that baseball is something of a special case, because so much depends on the starting pitcher. A relatively poor team might have one really good pitcher, and stand a good chance of winning in the one game out of 4 or 5 where he is pitching. That would lead to higher variance in baseball outcomes.

This entire thread is based on the false assumption that score is used to determine which team or individual is better.

Although in most cases the assumption is true, there are games in every sport where one team completely outplays another, yet loses.

Or because while a single game is not a good metric, a whole season is. The granularity of a single game then becomes less relevant, no?

I think your friend has stated the issue very well.

But note that while you can compare various sports to each other, and say which should be better than another at determining who is best (as I’ve been doing with soccer, American football, and basketball), this doesn’t tell you whether or not any of them are reliable metrics at all. There is some threshold of reliability. Maybe all sports easily meet that threshold, maybe none do. Maybe some do, and some don’t. I don’t see any way to know this unless you get a really precise mathematical model of how the games work, which is next to impossible.

So all you can say is: soccer is not as good at determining the better team as sport X. You can’t make the added jump that soccer doesn’t meet some arbitrary, poorly defined threshold of “it’s reliable” or “it isn’t”.

Score is used to determine who wins. In theory that should line up with who is better.

That doesn’t seem very complete to me, but whatever. I don’t think using scoring as a test of “betterness” is crazy in any way, shape, or form.

Almost all player stats in all sports are based on a player’s ability to get the team a point, either through assist measures or direct scoring.

Well, it is not a 100% reliable measurement system, since your model of “flipping coins” is very linear and does not lend itself to a more complex set of real-life events.

And that is why I love soccer. It may seem simple, but it has its complexities in team play.


Well, in ANY sport, at the beginning of the season, ALL teams have a chance of winning. Yet, in real soccer examples based on a single table, such as the English Premier League, it is only the better teams that remain at the top, whereas the teams with lesser ability end up at the bottom of the table.

Sorry, semantics, but it is because they are just better, not greatly better.

XicanoreX

I agree with your points, Eris, as score generally will show the better team. However, there are instances where the team or individual who performs best does not win.

Score is used to determine a winner, not to determine who played the game better.

For lack of any better way to do this, I will give you an example from American football.

Team A drives into Team B’s territory, and scores a field goal. They later do this again. Team A leads 6-0. Team A, late in the game, drives again deep into B’s territory. Team A throws an interception, returned by Team B for a touchdown. Team B now leads 7-6. Team A again drives into team B’s territory with time running out, only to miss the game winning FG.

Team A outgains Team B in total yardage 450-125. Team A has one turnover to Team B’s 3. Team B was completely outplayed in every aspect of the game, yet emerges victorious 7-6.

I fabricated the above scenario, but anyone who watches or participates in team sports can vouch that these situations occur. There is nothing more frustrating than knowing you completely outperformed your opponent, yet when the final bell/buzzer sounds, the score is not in your favor.

I think this is one of the great allures of team sport. Victory seems certain for the vastly superior team, until one play changes the final outcome of the contest.

Basically–the best team does not always win.