When have you ever seen or used a stem-and-leaf or box-and-whiskers graph in the wild?

Ditto.

But I’ll never see another Sankey diagram without thinking of Aella’s birthday gangbang chart…

That really cries out for more explanation or a cite. I’m not doubting you, I’m just ignorant of whatever the rest of this story is.

It’s reached KnowYourMeme. Not safe for work if your boss actually reads the text. :laughing:

Aella has a substack you can google if you want to know more.

That’s the same kind of diagram as that famous map of Napoleon’s incursion into Russia.

And not safe for work, indeed.

I had to make a few box-and-whisker graphs (by hand!) during one of my grad-school statistics classes, in the late '80s. I see them every now and again, in online articles where there’s data analysis being done, and I still know how to interpret them, but I haven’t had to create one in over 35 years.

I don’t recognize the term “stem-and-leaf graph,” and I don’t know that I’ve ever seen one in the wild (or recognized it for what it is).

Thank you. Great cite.

I am retired and I am the Boss. Very interesting. Time to read further of this Aella person.


@kenobi_65:
I don’t think of stem-and-leaf tables as graphs. They’re a different sort of utterly non-visual categorization of values. In CS terms we’d call that a bucketization of data values. So I quibble a bit with the OP on this stuff too.
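To make the "bucketization" point concrete, here is a rough sketch in Python. It assumes non-negative integers, with the tens part as the stem and the ones digit as the leaf. The function names and the sample values are made up for illustration.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Bucket non-negative integers: stem = value // 10, leaf = value % 10."""
    buckets = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        buckets[stem].append(leaf)
    return dict(buckets)

def render(buckets):
    """Format the buckets as text, keeping empty stems so gaps stay visible."""
    lines = []
    for stem in range(min(buckets), max(buckets) + 1):
        leaves = "".join(str(d) for d in buckets.get(stem, []))
        lines.append(f"{stem:>2} | {leaves}")
    return "\n".join(lines)

data = [12, 15, 21, 23, 23, 38, 41, 47, 49]  # made-up sample values
table = stem_and_leaf(data)
print(render(table))
```

The result is really just a dictionary of lists. Printed with one row per stem, the row lengths happen to look like a sideways histogram, which is probably why it gets taught alongside graphs.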


As to the formal box+whiskers, it seems to be a way of visualizing the basic stats of one sample set drawn from some much larger blob of data. With the unstated assumption that you’d take lots of sample sets of the same data and convert each sample set into an individual box+whiskers, then plot the series of sample sets as a series of box+whiskers icons.

I’m just a bit lost on what sort of data blob and samples would make sense to do this with.

Typical for a vaccine study might be a box showing the log of the antibodies in the blood samples of all the participants on days zero, 3, 8, 10, 14, and 21. So six little boxes with whiskers, each summarizing the data for a day post vaccination.
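As a rough illustration of that layout, the sketch below computes one five-number summary per sampling day. The titer values are invented, and both the log base (10) and the "inclusive" quartile method are assumptions, not anything a particular study prescribes.

```python
import math
import statistics

# Hypothetical antibody titers keyed by days post vaccination (invented numbers).
titers_by_day = {
    0: [10, 12, 9, 11, 10, 13],
    3: [15, 20, 14, 18, 22, 16],
    8: [80, 120, 95, 150, 110, 90],
    10: [300, 450, 380, 520, 410, 350],
    14: [900, 1200, 1100, 1500, 1000, 950],
    21: [1100, 1400, 1250, 1600, 1300, 1150],
}

def summarize(values):
    """Five-number summary of log10(values): the numbers one box would show."""
    logs = sorted(math.log10(v) for v in values)
    q1, med, q3 = statistics.quantiles(logs, n=4, method="inclusive")
    return {"min": logs[0], "q1": q1, "median": med, "q3": q3, "max": logs[-1]}

# One summary, and so one box with whiskers, per day.
summaries = {day: summarize(vals) for day, vals in titers_by_day.items()}
for day, s in summaries.items():
    print(f"day {day:>2}: median log10 titer = {s['median']:.2f}")
```

Each entry in `summaries` corresponds to one of the six little boxes, and plotting them side by side shows how the whole distribution shifts over time.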

I frequently use box plots at work. It is an easy way to compare the distribution of multiple groups within the entire data set. The only caution that I would have is that every software package seems to do the box and whiskers slightly differently.

It may be slight differences in calculating the 25th, 50th (median), and 75th percentiles. How do you calculate a median with an even number of data points? There are different ways to do it.
Mostly it is how to define the length of the whiskers, and the outliers. Each program seems to use something slightly different.
For that reason, I always recommend describing how the box and whisker plot is made, which most of my clients (and a few of my colleagues) do not do.
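Here is a small sketch of how much the conventions can matter, using Python's standard `statistics` module. It assumes the common 1.5 × IQR whisker rule (Tukey's convention), which is only one of several in use. The helper name `box_stats` and the sample values are made up.

```python
import statistics

def box_stats(values, method, k=1.5):
    """Box-plot numbers under one quartile method and a k * IQR whisker rule."""
    data = sorted(values)
    q1, med, q3 = statistics.quantiles(data, n=4, method=method)
    iqr = q3 - q1
    lo_fence, hi_fence = q1 - k * iqr, q3 + k * iqr
    # Whiskers reach to the most extreme points still inside the fences.
    inside = [x for x in data if lo_fence <= x <= hi_fence]
    return {
        "q1": q1, "median": med, "q3": q3,
        "whisker_low": min(inside), "whisker_high": max(inside),
        "outliers": [x for x in data if x < lo_fence or x > hi_fence],
    }

sample = [1, 2, 3, 4, 5, 6, 7, 12]  # made-up values
exclusive = box_stats(sample, "exclusive")
inclusive = box_stats(sample, "inclusive")
print("exclusive:", exclusive)
print("inclusive:", inclusive)

# Even-sized samples also leave the median itself open to convention.
even = [1, 2, 3, 4]
print(statistics.median(even), statistics.median_low(even), statistics.median_high(even))
```

With this sample the two quartile methods agree on the median but not on the quartiles, and the value 12 is a whisker endpoint under one method and a flagged outlier under the other. Same data, same nominal rule, two different pictures.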

The one built into Excel doesn’t calculate any of that. It just plots the bar based on numbers you enter into an array. You can calculate whatever values you want to shove into that array.

Yeah, as with any other graph, it’s often important to label what you’ve done.