NFL Playoffs - Probability Question

Over in this thread there is some discussion of the NFL’s scheduling and division system.

An NFL conference of 16 teams is structured into 4 divisions of 4 teams each. 6 of these 16 teams make the playoffs: the top team from each division, plus the 2 top teams that didn’t win their division. This means that if there happens to be a division with none of the top 6 teams, then some team in that division goes to the playoffs, and one of the top 6 does not.

So the question: what is the probability that there’s a division with none of the top 6 teams?

I’m not asking what the odds are that at the end of the season there’s a division winner with a worse record than a team that misses the playoffs (although that might be interesting too). I’m asking about taking teams ranked 1-16 and splitting them into 4 divisions. That is, if you take the numbers 1 to 16 and randomly split them into 4 piles of 4, what’s the probability that at least one of the piles has no number below 7?

I got as far as changing this to splitting 6 green and 10 blue marbles into 4 piles, and looking for a pile of only blue marbles. However, I don’t see an easy way to count this.
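For what it’s worth, the marble version is small enough to brute-force. The sketch below is only an illustration under the assumption stated in the question (every split into 4 piles of 4 is equally likely); the function name and its parameters are made up for this example. It tries every way of placing the 6 green marbles into 16 slots and counts the placements that leave some pile all blue.

```python
from fractions import Fraction
from itertools import combinations

def prob_bad_division(teams=16, per_division=4, top=6):
    """Exact chance that some division holds none of the `top` best teams.

    Hypothetical helper: slot s belongs to division s // per_division, and
    only the set of slots occupied by the top teams matters.
    """
    divisions = teams // per_division
    hits = total = 0
    for slots in combinations(range(teams), top):
        total += 1
        occupied = {s // per_division for s in slots}
        if len(occupied) < divisions:   # at least one division has no top team
            hits += 1
    return Fraction(hits, total)

p = prob_bad_division()
print(p, f"{float(p):.6f}")
```

There are only 8008 placements to check, so this runs instantly and gives an exact fraction to compare any formula or simulation against.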

If I’m understanding correctly, the top team in each division goes to the playoffs.

Hence, each division sends at least one team to the playoffs. Correct?

Once I understand this a little more I can help you answer the probability question.

You are assuming that each team’s record is independent of its division, right? I don’t know hockey schedules, but I assume you play teams in your own division more frequently, so if they are much better, your record suffers disproportionately because you play them more often (aka “a tough division”). Should this be ignored for this analysis?

Yes, you understand.

Four divisions of four teams.

The team with the best record in each of these, regardless of the actual record, advances to the playoffs.

Excluding the teams that qualify as above, the two teams with the best remaining records also advance to the playoffs.

How’s that?

In order to make this a probability problem, you must describe mathematically how the final win-loss records would come about.

Let’s just say they are random. Then the chance that the “Blue Division” has none of the six top records is the chance of drawing six balls from 16 with none being a “blue” ball:

12/16 times 11/15 times 10/14 times 9/13 times 8/12 times 7/11

I’ll leave it to others to multiply that out. That’s the same chance as for each of the other three divisions, but we can’t simply multiply by four, because we would be counting some chances twice: occasions where two divisions do not have one of the top six teams.
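Multiplying that out is easy to do exactly. A minimal sketch, assuming the same random-draw model as above (the helper name and parameters are invented for illustration):

```python
from fractions import Fraction

def no_blue_in_top(picks=6, teams=16, blue=4):
    """Chance that `picks` draws without replacement all miss the `blue` teams."""
    p = Fraction(1)
    for i in range(picks):
        # i non-blue teams are already gone, so (teams - blue - i) remain
        # out of (teams - i) teams still in the hat.
        p *= Fraction(teams - blue - i, teams - i)
    return p

p_one = no_blue_in_top()
print(p_one, f"{float(p_one):.4f}")   # 3/26, about 0.1154
print(4 * p_one)                      # 6/13: the naive total that double-counts
```

The second line printed is the plain multiply-by-four figure, which is too high for the reason given above.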

The outcomes can’t be random though, because many of the games played are played against someone else within the same conference. If we were teams in the same conference and we played one another, the outcome of that game(s) affects both our final records, so they are not independent of one another etc.

As described by aahala, there are some rather serious dependency issues here that make it really tough to come up with a meaningful number.

Bah, I knew this was going to happen :). I’m not asking about the actual results of a season - I agree that’s too complicated with all the interrelationships of the schedule. I just want to know:

[digression for explanation]
What I’m thinking is that the function of the regular season is basically to sort teams for seeding into the playoffs. If there is no “good” team (top 6) in a division, then this sorting will be wrong, because one of the teams from this bad division makes the playoffs.

When I use “good” in this context, I’m referring to some abstract ordering of the objective quality of the teams. I’m saying that this ordering exists in some sense, and the regular season is an attempt to discover this ordering of the relative strength of teams. A “perfect” schedule and playoff system correctly seeds the objectively best team as #1, second best as #2, etc. The NFL system is not perfect, because when the situation with a “bad” division occurs, a team from this division is placed too high in the sort order.
[/dfe]

That’s a bit easier to compute, but not necessarily relevant. I’ll think about it. The best way to get an actual estimate for this is to count the number of times that it has happened and divide that by the number of football seasons.

Ok. Multiply out what I wrote before then multiply by 4, then subtract

3(8/16 x 7/15 x 6/14 x 5/13 x 4/12 x 3/11)

There is likely a much shorter way to write the solution using factorials, but I don’t know the formula.

I’ve been thinking about this business of playing one another in the conference and random results. I’m not positive, but I believe that if every team played every other (which they don’t) the same number of times, the total results could be considered random.

I see your argument, but I don’t understand where you’re getting those numbers from. This is how I was trying to do it: Looking at one specific division, I get the odds of there being no top-6 team to be:
10/16 * 9/15 * 8/14 * 7/13.
(I think this is 10C4 / 16C4.)

Since there are 4 divisions, we go:
4 * (10/16 * 9/15 * 8/14 * 7/13).

But that double-counts every time that there are two divisions with no top-6 teams, so we have to subtract those. For a specific pair of divisions, the odds of there being no top-6 team in either is:
10/16 * 9/15 * 8/14 * 7/13 * 6/12 * 5/11 * 4/10 * 3/9

There are 6 pairs of divisions, so the total number of double-counts is:
6 * (10/16 * 9/15 * 8/14 * 7/13 * 6/12 * 5/11 * 4/10 * 3/9)

And the final total is
4 * (10/16 * 9/15 * 8/14 * 7/13) - 6 * (10/16 * 9/15 * 8/14 * 7/13 * 6/12 * 5/11 * 4/10 * 3/9)
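To put a number on that total, here is a small sketch that evaluates the two products exactly. The helper `all_bad` is a name invented for this example. Note that three divisions can never all be “bad” at once, since that would need 12 bad teams and there are only 10, so the correction stops at the pair term.

```python
from fractions import Fraction

def all_bad(slots, bad=10, teams=16):
    """Chance that `slots` teams drawn without replacement are all 'bad'."""
    p = Fraction(1)
    for i in range(slots):
        p *= Fraction(bad - i, teams - i)
    return p

one_division = all_bad(4)     # 10/16 * 9/15 * 8/14 * 7/13
two_divisions = all_bad(8)    # ... * 6/12 * 5/11 * 4/10 * 3/9
total = 4 * one_division - 6 * two_divisions

print(one_division, two_divisions)    # 3/26 and 1/286
print(total, f"{float(total):.6f}")   # 63/143, about 0.440559
```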

borschevsky

I must have a mental block. I don’t see where you are getting the 10/16. What does the 10 represent?

My first 12/16 is this. To say the final records are random is mathematically equivalent to saying we can just randomly pick one of the sixteen teams out of the hat and claim it has the best record.

So, we both start with 16 balls. My hat has 12 non-blue and 4 blue. What are the chances the six picks will not include a blue? 12/16 * 11/15 etc.

Now the overlap: there are six equally likely ways we can have TWO divisions with none of the top six.

For this, I have 8/16 * 7/15 etc. and subtract.

Why did I multiply by “3” rather than “6”?

Well, imagine I have measured two boards each as 10 feet but I know there is a two-foot overlap. To get the net distance I don’t subtract 2 feet from each length and add; I subtract one-half the overlap from each and add.

The 10 is the number of “bad” teams. For a division to have all bad teams, the first team can be one of the 10 bad teams out of 16 total; the second team can be one of the 9 remaining bad teams out of the 15 total remaining, etc. for the four teams. So 10/16 * 9/15 * 8/14 * 7/13.

I think you can also look at this by saying that there are (10 * 9 * 8 * 7) / (4 * 3 * 2 * 1) ways to pick a set of 4 bad teams, and (16 * 15 * 14 * 13) / (4 * 3 * 2 * 1) ways to pick a set of any 4 teams. So the odds of getting only bad teams if you choose randomly is:

( (10 * 9 * 8 * 7) / (4 * 3 * 2 * 1) ) / ( (16 * 15 * 14 * 13) / (4 * 3 * 2 * 1) )
= (10 * 9 * 8 * 7) / (16 * 15 * 14 * 13)

Ok, I see what you’re doing here. You’re marking the balls as 4 “in the division” and 12 “not in the division”, and selecting 6 balls to represent the good teams.

I was marking the balls as 6 good teams and 10 bad teams, and then selecting 4 balls for the division.

Duh.:smiley:

Now I see what you did. It’s the same thing (in the first step). My 12 and 11 cancel and voilà! The same numbers as yours, top and bottom.
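The two ways of marking the balls can be checked against each other with binomial coefficients. This is just a sketch; the variable names are made up here, and `math.comb` needs Python 3.8 or later.

```python
from fractions import Fraction
from math import comb  # Python 3.8+

# View A: mark 4 slots as "the division", then draw the 6 top teams.
one_a = Fraction(comb(12, 6), comb(16, 6))
# View B: mark 6 teams as "good", then draw the 4 division members.
one_b = Fraction(comb(10, 4), comb(16, 4))

# Same comparison for two given divisions both missing the top 6.
two_a = Fraction(comb(8, 6), comb(16, 6))
two_b = Fraction(comb(10, 8), comb(16, 8))

print(one_a, one_b)   # both 3/26
print(two_a, two_b)   # both 1/286

# Four single divisions, minus one correction per pair of divisions.
print(f"{float(4 * one_a - 6 * two_a):.6f}")
```

The last line uses 6 as the multiplier because there are six pairs of divisions; that is the combination that matches the 0.440559 quoted further down the thread.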

I decided to run a simulation rather than do an exhaustive search, as I can’t figure out a good way to do the latter. In summary, the probability as calculated above is approximately .440559. After a million trials, the simulated probability comes out to .626306. I’m thinking the calculations may be off.

Just to make sure I didn’t screw up, I’ll publish my code.

football_test.cpp:


#include <cstdio>
#include "FootballList.h"

int main()
{
	// Inclusion-exclusion: 4 * P(one given division is all bad) - 6 * P(two given divisions are all bad).
	long double cProbability = (4.0 * 10.0 * 9.0 * 8.0 * 7.0)/(16.0 * 15.0 * 14.0 * 13.0) - (6.0 * 10.0 * 9.0 * 8.0 * 7.0 * 6.0 * 5.0 * 4.0 * 3.0)/(16.0 * 15.0 * 14.0 * 13.0 * 12.0 * 11.0 * 10.0 * 9.0);
	// %Lg is the printf conversion for long double; plain %g expects a double.
	printf( "The calculated probability is %Lg.\n", cProbability );

	FootballList simulator;
	long double sProbability = simulator.Run();
	printf( "The simulated probability is %Lg.\n", sProbability );
	return 0;
}

FootballList.h:


#pragma once

class FootballItem
{
	public:
		FootballItem( int number ) { m_number = number; m_next = m_previous = 0; m_division = 0; }
		virtual ~FootballItem() {}

		int GetNumber() { return m_number; }
		int GetDivision() { return m_division; }
		FootballItem* GetNext() { return m_next; }
		FootballItem* GetPrevious() { return m_previous; }

		void SetDivision( int division) { m_division = division; }
		void SetNext( FootballItem* next ) { m_next = next; }
		void SetPrevious( FootballItem* previous ) { m_previous = previous; }

	private:
		int m_number;
		int m_division;
		FootballItem* m_next;
		FootballItem* m_previous;
};

class FootballList
{
	public:
		FootballList();
		virtual ~FootballList();

		long double Run();

	private:
		void Clean( FootballItem* delendum );
		void MarkRandomItems();
		bool CheckForSuccess();
		void Reset();

		FootballItem* m_first;
};

FootballList.cpp:


#include "FootballList.h"
#include <stdlib.h>
#include <time.h>

const int total_trials = 1000000;

FootballList::FootballList()
{
	FootballItem* last = m_first = new FootballItem( 1 );
	FootballItem* temp = 0;
	for ( int i = 2; i < 17; ++i )
	{
		temp = new FootballItem( i );
		temp->SetPrevious( last );
		last->SetNext( temp );
		last = temp;
	}
}

FootballList::~FootballList()
{
	Clean( m_first );
}

void FootballList::Clean( FootballItem* delendum )
{
	if ( delendum != 0 )
		Clean( delendum->GetNext() );
	delete delendum;
}

long double FootballList::Run()
{
	srand( time( 0 ) );

	long double successes = 0;
	long double trials = 0;
	while ( (int) trials < total_trials )
	{
		MarkRandomItems();

		if ( CheckForSuccess() )
			successes = successes + 1;
		
		Reset();
		trials = trials + 1;
	}
	
	return successes/trials;
}

void FootballList::MarkRandomItems()
{
	int divisions[4] = { 4, 4, 4, 4 };
	int division = 0;
	FootballItem* temp = m_first;
	while ( temp )
	{
		do { division = ((unsigned int) rand()) % 4; } while ( divisions[division] == 0 );
		divisions[division] = divisions[division] - 1;
		temp->SetDivision( division + 1 );
		temp = temp->GetNext();
	}
}

void FootballList::Reset()
{
	FootballItem* temp = m_first;
	while ( temp )
	{
		temp->SetDivision( 0 );
		temp = temp->GetNext();
	}
}

bool FootballList::CheckForSuccess()
{
	bool divisions[4] = { true, true, true, true };

	FootballItem* temp = m_first;
	while ( temp )
	{
		int number = temp->GetNumber();
		if ( number <= 6 )
		{
			divisions[temp->GetDivision() - 1] = false;
		}
		temp = temp->GetNext();
	}

	return (divisions[0] || divisions[1] || divisions[2] || divisions[3]);
}

The 0.440559 is correct. It can be written as
4 * choose(10, 4) / choose(16, 4) - 6 * choose(10, 8) / choose(16, 8)
where (of course) choose(n, k) = n!/(k!(n-k)!).

Your practice of adding each ball to the unfilled bins with equal probability gives the wrong answer. It’s easy to see that this would give the wrong probabilities with four balls, two “good” and two “bad”, going into two bins of two. Your procedure would give a 1/2 chance of the “good” ones winding up in the same bin (if the two “good” ones came first in the procedure, or last; otherwise it’s 1/4). The probability from distributing them in the desired way is 1/3. Kind of subtle.
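The four-ball example can be worked out exactly by walking every branch of the procedure. The sketch below is an illustration only: `same_bin_prob` is a made-up helper that models “each ball, in order, goes to a bin chosen uniformly among the bins that still have room”, which is how the loop in MarkRandomItems appears to behave. In the C++ listing the good teams (1 to 6) come first in the list, which corresponds to the “GGBB” ordering here.

```python
from fractions import Fraction
from itertools import permutations

def same_bin_prob(order, capacity=2, bins=2):
    """Exact chance both 'G' balls share a bin under the sequential procedure."""
    def walk(i, room, placed):
        if i == len(order):
            good = [b for ball, b in zip(order, placed) if ball == "G"]
            return Fraction(int(good[0] == good[1]))
        open_bins = [b for b in range(bins) if room[b] > 0]
        total = Fraction(0)
        for b in open_bins:              # each open bin is equally likely
            nxt = list(room)
            nxt[b] -= 1
            total += walk(i + 1, nxt, placed + [b]) / len(open_bins)
        return total
    return walk(0, [capacity] * bins, [])

for order in ("GGBB", "BBGG", "GBGB", "GBBG", "BGGB", "BGBG"):
    print(order, same_bin_prob(order))   # 1/2, 1/2, then 1/4 four times

# Uniform deal for comparison: slots 0-1 are one bin, slots 2-3 the other.
deals = list(permutations("GGBB"))
uniform = Fraction(
    sum(d[0] == d[1] == "G" or d[2] == d[3] == "G" for d in deals), len(deals)
)
print("uniform deal:", uniform)          # 1/3
```

Shuffling the whole list and then cutting it into piles of four would be one way to get the uniform deal in a simulation.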

For what it’s worth, my simulation yields:

probability of at least one “bad” division = 0.44064 +/- 0.00016
probability of two “bad” divisions = 0.020967 +/- 0.000045
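The code behind those figures was not posted, so the following is only a guess at one way such a simulation could look: shuffle the ranks 1 to 16 and cut the shuffled list into four piles of four. The function name, trial count, and seed are all arbitrary choices for this sketch.

```python
import random

def simulate(trials=100_000, seed=0):
    """Estimate P(at least one bad division) and P(two bad divisions)."""
    rng = random.Random(seed)
    teams = list(range(1, 17))
    at_least_one = exactly_two = 0
    for _ in range(trials):
        rng.shuffle(teams)               # uniform random order of all 16 ranks
        bad = sum(
            all(rank > 6 for rank in teams[d * 4:(d + 1) * 4])
            for d in range(4)
        )
        at_least_one += bad >= 1
        exactly_two += bad == 2          # three bad divisions cannot happen
    return at_least_one / trials, exactly_two / trials

p_any, p_two = simulate()
print(f"at least one bad division: {p_any:.4f}")
print(f"two bad divisions:         {p_two:.4f}")
```

With this many trials the estimates should land within a few thousandths of the exact values, 63/143 and 3/143.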

Unfortunately, you can’t do that as the NFL started this format in 2002. None of the other big 4 have had this division format so we can’t even look to other sports for comparison.

We can look at the meager historical data we have and see for the most part, it’s iffy. In 2002, the top 6 teams from the NFC did indeed make the playoffs and while there was a division winner tied with a wild card team (SF & NYG were 10-6) no team made the playoffs that should not have. The AFC was iffy as there was a division winner at 9-7 plus 4 teams tied at 9-7 fighting for 2 WC spots. Could that 9-7 division winner have been worse than the other 9-7 teams? Possible, but not likely as they had the tie-breaker with 2 of those teams.

In 2003, the NFC was once again right on with the top 6 making the playoffs and a division winner having the same record as a WC team. Still, everyone made it that should’ve. Once again, the AFC was a little cluttered with a division winner having the same record as a non-playoff team. Could that 10-6 division winner have been worse than the 10-6 non-playoff team? Maybe.

As you no doubt realize, the actual 1-16 rankings are built upon divisional considerations. Thus, the divisions themselves are woven into the fabric of the rankings. That’s not really an issue in your question, I just thought I’d point it out.

The chance that four teams drawn from the 16 all come from the 10 outside the top 6 would be:

(10/16) * (9/15) * (8/14) * (7/13) = (10 * 9 * 8 * 7) / (16 * 15 * 14 * 13) = 5040 / 43680 = 11.5%

That’s an 11.5% chance that the first division you run through the trial will have no team seeded in the top 6. Hmmm, with three other divisions, it gets too confusing for me. So let me tackle this from a different viewpoint.

What is the chance that all four division winners will be in the top 6?

That seems easier to compute. The basic question is, where do the top 6 go? Since all 16 are to be randomly distributed, the top 6 are randomly distributed as well. Thus we can completely ignore the bottom 10 altogether.

It doesn’t matter where the first goes. The second now has 15 possible slots, 3 of which are in the same division as the first. The third has 14, the fourth has 13, etcetera.

So you can simply generate 6 random numbers between 1 and 16. Assume the East is 1-4, the North 5-8, the South 9-12, and the West 13-16. Allow no duplicates, and count up the number of times you get bad divisions.

I wrote a simulation in Excel so that anybody can run it. Follow these steps:

  1. Open Excel
  2. Tools => Macro => Visual Basic Editor
  3. Insert => Module
  4. Copy and paste the code at the bottom of this post to that new, blank module
  5. Go to the Immediate Window (View => Immediate Window)
  6. Type “test” (without the quotes) and hit enter.

The results I get are that all is good in the world 35% of the time, and 65% of the time you get at least one scrub division. (57% you get exactly 1, 8% you get exactly 2.)

Where did I go wrong? Or am I right and the 44% is wrong? :confused:


Option Explicit

Private mintTopSix(1 To 6) As Integer
Private mblnDivision(1 To 4) As Boolean

Public Function Test()
    Dim i As Long
    Dim lngTrial As Long
    Dim lngTrials As Long
    Dim intNext As Integer
    Dim intFailures As Integer
    Dim lngStats(0 To 4) As Long
    
    Randomize
    
    lngTrials = 100000
    For lngTrial = 1 To lngTrials
        For i = 1 To 4
            mblnDivision(i) = False
        Next
        mintTopSix(1) = Int(16 * Rnd + 1)
        For i = 2 To 6
            intNext = 0
            Do While StillLooking(intNext, i)
                intNext = Int(16 * Rnd + 1)
            Loop
            mblnDivision(((intNext - 1) \ 4) + 1) = True
            mintTopSix(i) = intNext
        Next
        intFailures = 0
        For i = 1 To 4
            If Not mblnDivision(i) Then intFailures = intFailures + 1
        Next
        lngStats(intFailures) = lngStats(intFailures) + 1
    Next
    Debug.Print "Total Trials: " & lngTrials
    For i = 0 To 4
        Debug.Print "# of times with " & i & " bad divisions: " & lngStats(i) & " (" & Int(lngStats(i) / lngTrials * 10000) / 100 & "%)"
    Next
End Function

Public Function StillLooking(pintNext As Integer, plngCurrent As Long) As Boolean
    Dim i As Long
    
    If pintNext = 0 Then
        StillLooking = True
    Else
        StillLooking = False
        For i = 1 To plngCurrent - 1
            If pintNext = mintTopSix(i) Then StillLooking = True
        Next
    End If
End Function
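One possible explanation for the 35% / 65% split, offered as a reading of the listing rather than a confirmed diagnosis: the first pick is stored in mintTopSix(1), but mblnDivision is only set to True inside the `For i = 2 To 6` loop, so the first pick’s division never gets flagged. If that reading is right, the macro is effectively flagging five random slots instead of six. The sketch below (function name invented for illustration) enumerates both cases exactly.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

def bad_division_distribution(flagged_picks):
    """Exact distribution of unflagged divisions when `flagged_picks`
    distinct slots out of 16 (four divisions of four) are flagged."""
    counts = Counter()
    for slots in combinations(range(16), flagged_picks):
        covered = {s // 4 for s in slots}
        counts[4 - len(covered)] += 1
    total = sum(counts.values())
    return {bad: Fraction(n, total) for bad, n in sorted(counts.items())}

for k in (5, 6):
    dist = bad_division_distribution(k)
    print(k, {bad: f"{float(p):.1%}" for bad, p in dist.items()})
```

With five flagged picks the split comes out at roughly 35% / 57% / 8% for zero, one, and two bad divisions, which lines up with the macro’s output. With all six flagged it is roughly 56% / 42% / 2%, which agrees with the 44% figure for at least one bad division.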

It seems obvious that it is going to happen sooner rather than later. In 2003 the AFC playoff teams were (*****):

AFC East
*****New England 14-2
Miami 10-6

AFC North
*****Baltimore 10-6
Cincinnati 8-8

AFC South
*****Indianapolis 12-4
*****Tennessee 12-4

AFC West
*****Kansas City 13-3
*****Denver 10-6

Had Baltimore lost one of the games they won (say the 44-41 victory over Seattle), then their 9-7 result would still have won the AFC North but would have left them as the seventh-best team, behind Miami’s 10-6 in the AFC East.

The fact that a team plays 6 games against its own division and 4 against an “opposing” division in the other conference will tend to make matters worse.

Say the AFC North consisted of 4 weak but even teams. If they split their own division games and all lose to the NFC North teams, every team will be 3-7 with only 6 other games left to improve their percentage.