What are examples of authentic Bell Curves

BTW, nice class plan!

Seconding that this is an awesome lesson. Reading up on the Pareto curve, I’m not seeing anyone else claiming that human ability is represented by it, so I’m not convinced by that claim.

Could you do a simple game of skill, such as wastebasket basketball? Ten shots from a moderately difficult distance, everyone tries it and records their score.

I’m curious whether the number of texts in the last 24 hours would qualify, or even be trackable. Maybe the number of distinct text chains that have been active “today and yesterday”? Looking at my phone, I’ve got seven, but I have no idea what that would look like in a high school class.

Thought about it, but I really need a pre-made bell curve because I don’t have time to do enough trials to get a true bell curve. But if I could find that pre-made bell curve, they could certainly take ten shots and calculate their z-scores.
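For what it’s worth, once they have their ten-shot scores, the z-score step itself is trivial to script. A minimal sketch with made-up scores, using the class’s own mean and standard deviation as the reference (you’d swap in the pre-made curve’s parameters if you find one):

```python
import statistics

# Hypothetical data: baskets made out of ten shots, one number per student.
scores = [3, 5, 4, 7, 6, 2, 5, 8, 4, 6, 5, 3]

mean = statistics.mean(scores)
sd = statistics.stdev(scores)   # sample standard deviation

# z-score: how many standard deviations a student sits above or below the mean.
for s in scores:
    print(f"score = {s}   z = {(s - mean) / sd:+.2f}")
```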

I demonstrate the CLT through an online plinko simulation, so they do get exposure to that math, but given my limited time I want the unit to be more about using a bell curve for testing and how bell curves can be misused. I’ll leave the in-depth discussion of how the CLT makes bell curves to their MAT135 college professor.
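If anyone wants an offline version of that sort of demo, here’s a minimal Galton-board (plinko) sketch in Python. It’s not the particular simulation I use, just the same idea: the bin counts come out binomial, which already looks bell-shaped at a dozen rows.

```python
import random
from collections import Counter

# Minimal plinko / Galton-board sketch: each ball bounces left (0) or right (1)
# at each of `rows` pegs, so its final bin is the number of rightward bounces.
# The bin counts follow a binomial distribution, which looks bell-shaped here.
rows, balls = 12, 5000
bins = Counter(sum(random.randint(0, 1) for _ in range(rows)) for _ in range(balls))

for b in range(rows + 1):
    print(f"bin {b:2d}  {'#' * (bins[b] // 25)}")
```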

I was thinking about something like flicking paint at a bullseye and measuring the distance of each spot of paint from the center. Every flick would create a bunch of data points, so it wouldn’t take much time to build a large data set. You use a different color for each flick so it’s easy to count the new spots each time, and when you’re done you have a cool piece of art.

If the kids are genuinely aiming for the center instead of going completely wild and fully randomizing the distribution, I think you’d get a bell curve.
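If anyone wants to preview what that data might look like before spending class time on it, here’s a minimal sketch that simulates the flicks, assuming the aiming error is roughly Gaussian in both x and y (an assumption, not a measurement):

```python
import numpy as np

rng = np.random.default_rng(0)

# Assume each spot lands with independent Gaussian error in x and y
# around the bullseye (an assumption, not a measurement).
n_spots = 500
sigma_cm = 5.0                      # hypothetical spread of the spots, in cm
x = rng.normal(0.0, sigma_cm, n_spots)
y = rng.normal(0.0, sigma_cm, n_spots)

dist = np.hypot(x, y)               # distance of each spot from the centre

# Crude text histogram of the distances the students would measure.
counts, edges = np.histogram(dist, bins=12)
for c, lo, hi in zip(counts, edges[:-1], edges[1:]):
    print(f"{lo:5.1f}-{hi:5.1f} cm  {'#' * (c // 5)}")
```

One wrinkle: under that assumption the x-coordinates (or y-coordinates) alone come out bell-shaped, but the radial distances themselves come out right-skewed, which could itself be a useful talking point about exactly which quantity is normally distributed.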

Use the ‘sputter’ setting on the butter gun we already have in the Physics Lab storeroom.

It is well known in many fields that human performance often falls on a Pareto scale. Not just performance, but the causes of many effects. This is the origin of the 80/20 rule, which is remarkably robust across many activities. For example, as an educator you may have heard that 80% of classroom disruptions are caused by 20% of the students. In organization theory, 80% of the productivity of a company comes from 20% of the employees. And if you look at just those 20%, you find that 80% of their value comes from 20% of that cohort, so 64% of corporate value comes from 4% of the employees, and by the same arithmetic 64% of classroom disruption is likely caused by roughly 4% of students. Classroom disruptions are not normally distributed, and there is no bell curve of ‘disruptedness’.

In software development, the 80/20 rule shows up all over the place. 80% of bugs are created by 20% of the developers, for instance. If you gave students an exercise to plot bugs per programmer, you wouldn’t get anything close to a bell curve.

This isn’t a human thing per se - the same result comes out of any complex environment with causes and effects. For example, Pareto himself noticed that 80% of the peas in a harvest come from 20% of the pods. So the student’s grass seed experiment mentioned above may not lead to a normal distribution; instead, you might find that length has a Pareto distribution.

Pareto also discovered that 80% of the wealth in Italy was held by 20% of the people. Today, 115 years later, in many countries with widely varying laws and tax structures, and after many attempts to ‘fix’ it, 80% of the wealth is still held by roughly 20% of the people in most countries. The 80/20 rule is remarkably robust.

It’s not a hard mathematical number, but more of a convergent point or attractor that leads to the emergence of a Pareto distribution.

When I got my Six Sigma certification, we spent a lot of time on the Pareto Distribution, because it comes up again and again in industry, and learning it can make you a LOT more productive. 80% of sales come from 20% of products. 80% of factory downtime comes from 20% of problems. 80% of management problems come from 20% of the managers. And so it goes. Complex environments with humans in them often wind up with Pareto distributions of causes and effects. Our toolset in Six Sigma was heavily based on finding the ‘vital’ 20% and focusing effort on it, wherever it was.

The first test you should do on any cause/effect data is a test for normal distribution, because very often it’s not.
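For anyone who wants to actually run that first check, here’s a minimal sketch using SciPy’s D’Agostino-Pearson normality test; the data below are made up purely for illustration.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Made-up "cause/effect" data for illustration: a heavily skewed sample
# from the Pareto family versus a genuinely normal one.
skewed = rng.pareto(1.16, 500)        # heavy-tailed Pareto-family draws
normal = rng.normal(50, 10, 500)

for name, data in [("skewed", skewed), ("normal", normal)]:
    stat, p = stats.normaltest(data)  # D'Agostino & Pearson test of normality
    verdict = "reject normality" if p < 0.05 else "no evidence against normality"
    print(f"{name}: p = {p:.4f}  ->  {verdict}")
```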

If you want to talk pure human performance, here’s an example from baseball:

This explains why there are multi-million dollar players. They really do stand out from everyone else in very measurable ways.

GE, Motorola and other companies took this to heart years ago by creating A,B,C categories for employees. ‘A’ employees are the top 10 or 20% - the ‘vital few’ that make or break an organization. ‘B’ employees are the middle tier - important but replaceable, who do a good job but don’t stand out. ‘C’ employees are the bottom 20%, and they create 80% of the problems and little value, and either have to be managed into the ranks of the ‘B’ employees or let go. The prime role of HR in those days was to find and hire ‘A’ talent and keep them happy, fill the ranks with ‘B’ talent, and ease the ‘C’ employees out of the company or get them into a Performance Improvement Plan. All based on Pareto distribution of performance among employees.

Dumb question about the Central Limit Theorem if you don’t mind. (And a slight hijack.)

So let’s say the distribution for a large population is not Gaussian - it’s some weird shape. I grab a random sample. It is also not Gaussian. I calculate the mean of the sample. Lather, rinse, repeat. All of the means (from all of the samples) will have a Gaussian distribution, correct?

Assuming the above is correct… so what? What “good” is the Central Limit Theorem? How is the (Gaussian) distribution of all the means useful for me?

It’s the basis for other statistical procedures, like hypothesis testing. Since you know that the sample means of all possible samples of size n from a particular population follow a bell-shaped curve, you can calculate the probability that the sample mean you got when you collected your particular sample would be as high/low/extreme as it was, if some (null) hypothesis were true. If this probability is low enough, that gives you justification for rejecting the null hypothesis.
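If it helps to see that machinery in action, here’s a minimal simulation sketch (made-up numbers, nobody’s real data): start from a decidedly non-Gaussian population, take lots of samples, and watch the sample means cluster tightly and symmetrically around the population mean. That tight, predictable clustering is exactly what the hypothesis-test probability calculation leans on.

```python
import numpy as np

rng = np.random.default_rng(42)

n, n_samples = 50, 10_000

# Decidedly non-Gaussian population: exponential with mean 10 (right-skewed).
# Draw many samples of size n and record each sample's mean.
means = rng.exponential(scale=10.0, size=(n_samples, n)).mean(axis=1)

print("mean of sample means:", round(means.mean(), 2))   # close to 10
print("SD of sample means:  ", round(means.std(), 2))    # close to 10/sqrt(50), about 1.41

# Crude text histogram of the sample means: roughly symmetric and bell-shaped,
# even though the underlying exponential data are not.
counts, edges = np.histogram(means, bins=15)
for c, lo in zip(counts, edges[:-1]):
    print(f"{lo:5.1f}  {'#' * (c // 40)}")
```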

Here’s something that might be interesting. 2023 batting averages for Major League Baseball players.

This is based on an average of 3.1 plate appearances per game, or roughly 500 per season, so it eliminates outliers like bench players and those whose seasons were limited by injury. Still, it’s a good sample size, and the source material lets you break the universe into subsets (e.g., by league, position, left- or right-handed, etc.).

Looks like a Pareto distribution. Notice that the batting average of the first 25 players dropped from .354 to .278. The next 25 only dropped from .277 to .267. The third from .266 to .258.

Classroom scores tend to be bimodal, with one peak from students who studied and another from students who didn’t.

Only if the actual distribution meets certain requirements (finite variance, most importantly). Not all do. If you try that with something on a Lorentzian (Cauchy) distribution, for instance, you’ll get another Lorentzian distribution.
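Here’s a quick sketch of that failure mode, if anyone wants to see it: averaging Cauchy (Lorentzian) draws never settles down, because the distribution has no finite mean or variance.

```python
import numpy as np

rng = np.random.default_rng(7)

# Average n standard Cauchy (Lorentzian) draws, many times over.  Because the
# Cauchy distribution has no finite mean or variance, the averages are just as
# spread out as single draws: the CLT does not apply.
for n in (1, 10, 1000):
    means = rng.standard_cauchy(size=(10_000, n)).mean(axis=1)
    q1, q3 = np.percentile(means, [25, 75])
    print(f"n = {n:5d}   IQR of the means = {q3 - q1:5.2f}   max |mean| = {np.abs(means).max():,.0f}")
```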

Is that true if the teacher grades on a curve?

I think the ACT is built to match a curve (based on previously administered tests). In other words, they want the test to produce a bell curve (ideally). That way a school can have some reliable way to know which students are ahead of the curve and desirable to have in their school.

I admit I have no experience of this stuff so looking forward to the corrections on this.

I’ve never had a teacher grade me on a classic bell curve, where it’s mostly Cs, equal Bs and Ds, equal As and Fs. I’ve had plenty of teachers grade on “curves”, but they’ve always been these weirdly idiosyncratic constructions that varied wildly from teacher to teacher.

FWIW I asked this question in another thread so as not to hijack this question (I saw a mod earlier in another thread politely telling people to not hijack so trying to be good).

Raw scores tend to be bimodal. A teacher could process them in any way imaginable, some of which would preserve the bimodality and some of which wouldn’t.

The point of the Central Limit Theorem is that even if your basic distribution isn’t gaussian, adding (or averaging) a lot of samples together will, over the long run, tend to produce a gaussian.

Take that example I gave of adding twelve random numbers together. The ideal random number between 0 and 1 will have a flat distribution – every number between 0 and 1 is equally likely.
Now pick two random numbers and add them together. The probability distribution of the sum isn’t flat anymore – it’s triangular, starting with virtually zero probability of 0 or 2, but with the highest probability at 1. A plot of probability goes from zero at 0 to a peak at 1, then abruptly turns and heads back down to zero at 2.
Now take the sum of three random numbers. The plot of the probability is now made up of three curved (quadratic) sections and starts to resemble a gaussian more, with a peak at 1.5. If you keep adding random numbers together and plotting the probability distribution, by the time you get to twelve random numbers it looks pretty smooth and gaussian, with a peak at 6. (That’s why you subtract 6 if you want a distribution peaking at zero.)
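Here’s a minimal sketch of that twelve-uniforms trick, if anyone wants to watch the bell shape appear:

```python
import random

# Sum twelve uniform(0, 1) numbers and subtract 6: the result is approximately
# standard normal (mean 0, standard deviation 1).
def approx_normal():
    return sum(random.random() for _ in range(12)) - 6

samples = [approx_normal() for _ in range(100_000)]

# Crude text histogram in half-unit bins from -4 to +4.
for k in range(-8, 8):
    lo, hi = k / 2, (k + 1) / 2
    count = sum(lo <= s < hi for s in samples)
    print(f"{lo:+4.1f}  {'#' * (count // 400)}")
```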

The probability distribution of the sum of all the pips on a lot of dice behaves the same way, only “quantized” (you can only have integer values). The more dice you throw, the more it resembles a gaussian.

But here’s the interesting part – we did this with random numbers having a uniform distribution. But the Central Limit Theorem says that if you start with any distribution (with finite variance) and add enough of the random numbers together, you’ll get a good approximation of a gaussian. It’s also true if your distributions don’t all have the same width. The result of adding enough items together, even if they don’t start out with a gaussian distribution, will be a gaussian distribution.

That’s why it’s so important: a gaussian is the natural result of combining large numbers of things together.

There are cases where your distribution won’t look like a gaussian. Poisson statistics, or bimodal distributions, will retain their characteristic shapes if you don’t do something like adding lots of random values together, so you deal with them using the appropriate statistics. But the gaussian shows up as the likeliest distribution so much of the time that a huge amount of our statistical treatment assumes that your distribution looks like that.

Well, for values of “80” and “20” that can vary by 50% or more, at least.

The Gini index of income/wealth inequality, for example, shows a much wider range of inequality levels across countries than a simple “80/20” division. Many of which levels have indeed been quite responsive to economic policy choices.

The idea that “the 80/20 rule” is useful across a broad spectrum of human activities, in any more precisely quantitative way than the common-sense general interpretation “a few effects are comparatively big, but most effects are comparatively small” is AFAICT not borne out by any rigorous analysis. Factual cites to the contrary welcome, of course.

I never said it was precise. I said the opposite. In your own quote of me you apparently missed the word ‘roughly’, and you didn’t bother to quote the next sentence:

Do you deny that many human activities and results follow a Pareto distribution?

I don’t think that sentence really makes sense. To claim that an unspecified set of very vaguely described real-world phenomena are “converging” to an 80/20 split, or that the 80/20 number is an “attractor” value for the evolution of all those systems, is to suggest technical properties for the phenomena that AFAICT aren’t supported by analysis.

I think you might be conflating the concepts “Pareto principle”, which as I said is basically the common-sense observation that “in many real-world systems a few effects are comparatively big while most are comparatively small”, and “Pareto distribution”, which is a specifically defined probability distribution function of a random variable characterized by given parameter values.

The particular parameter values in a given Pareto distribution don’t necessarily produce the 80/20 split that people are usually talking about when they refer to the Pareto principle.

I agree that many human activities and results conform to the above general observation “a few effects are comparatively big while most are comparatively small”, but I think it’s a pretty elementary fact that doesn’t tell us much about the quantitative sizes of the effects in any given system.

I also agree that many human activities and results follow some kind of Pareto distribution in the actual mathematical sense, with specific values given for its scale and shape parameters. But those parameters, as I said, may lead to a wide variety of results, not necessarily the 80/20 split that seems to be generally intended by the more common buzzword use of “Pareto”.
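To put a number on that: for a Pareto distribution with shape parameter alpha > 1, the share of the total held by the top fraction p of the population works out to p^(1 - 1/alpha). The famous 80/20 split corresponds to alpha of about 1.16; other perfectly good Pareto shapes give very different splits, as this little sketch shows.

```python
# Share of the total held by the top fraction p of a Pareto(alpha) population:
#   share = p ** (1 - 1/alpha),  valid for alpha > 1 (so the mean is finite).
def top_share(p: float, alpha: float) -> float:
    return p ** (1 - 1 / alpha)

for alpha in (1.16, 1.5, 2.0, 3.0):
    print(f"alpha = {alpha:4.2f}: top 20% holds {top_share(0.2, alpha):.0%} of the total")
```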

ISTM that this is still marginally relevant to the OP’s query about “authentic Bell Curves”, since what we’re discussing is relevant to the level of quantitative exactitude we need to distinguish between, for example, the behavior “a few effects are comparatively big while most are comparatively small” (Pareto principle) and the behavior “most effects are middling while a few are bigger or smaller” (bell-curve-ish).

But if the mods or the OP think we’re straying too far from the topic here, feel free to resume the discussion elsewhere.