Accuracy, precision, something else?

I have a fever. When I take my temperature, it varies by a degree or two from reading to reading. However, the thermometer gives me the results to the nearest tenth of a degree.

Does that mean it’s precise but not accurate? Looking online, precision seems to be about repeatability, not number of significant digits. I don’t know how accurate it is, I guess, because I’m not going to shove it in my ear thirty or forty times and log it.

Anyway, factual question is, what measurement means “repeatable”, which one means “returns the actual value” (on average, I guess?) and which one means “returns results to the nearest (hilarious) value”?

Accuracy is a metric of how close the measurement device can get to the theoretically true value without bias. In many cases, this is driven by the means of calibration; e.g. if you are calibrating one thermometer with a heat source controlled by another thermometer, it can only be as accurate as, or less accurate than, the controlling thermometer.

Precision is the metric for how small the deviation of measurement from the true value can be, assuming ‘perfect’ accuracy (no bias), i.e. the +/- value on a measurement scale. This is generally driven by the resolution of the measurement scale, and sometimes limited by fundamental physical constraints such as diffraction, or the anatomical limits like optical resolution of the unaided human eye.

Repeatability is a complicated issue which combines the innate accuracy (i.e. how well “zero’d” the device remains from measurement to measurement), precision, the ability of the user to perform and read the measurement consistently (particularly with an analog device), and the innate variability of the object or phenomenon being measured. Repeatability can only be assessed statistically by assessing a sampling of measurements in relation to the hypothetical population of all measurements and then applying that to a distribution (or using Bayesian methods, updating a prior inference with posterior measurements).

In the case of the o.p.’s example, human body temperature will vary depending upon the method of measurement and normal fluctuations in body temperature even absent any aggravating factors (illness, diet, exertion, et cetera), so repeated measurements need to be taken to establish a baseline or verify that an abnormal measurement actually reflects the condition. The thermometer can be highly repeatable but not accurate (bias), and as precise as the tick marks or number of figures on the display allow for, assuming that it can be consistent (not affected by thermal expansion and contraction or electronic ‘noise’).

This is all part of a field of science and metaphysics known as measurement theory, and can become complex and tangled in definitions of what is ‘true’ measurement, particularly when dealing with highly dynamic systems or where the act of measurement influences the phenomenon being measured.

Stranger

Over there in the USA, you use that outdated measure of average temperature for a human of 98.6°F.

At the end of the nineteenth century, Wunderlich determined 37.0 °C as the average normal value of a healthy person’s body temperature within the normal range 36.2–37.5 °C. He used a mercury thermometer under the armpits of about 25,000 people.

Assuming that your actual temperature was constant, if the thermometer registered the same value, down to the decimal point, every time, it would be very precise. If this value was not correct, it would not be very accurate.

If the thermometer registers different values, but shows them with apparent high precision (tenths of a degree), what you usually have is a device with “empty resolution.” The noise and uncertainty in the reading is far greater than the resolution of the device, so even though it shows 4 digits, it really can only be trusted to 3 digits (or fewer). It would take many measurements before the readings averaged out and the 4th digit was useful.
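To put a rough number on “many measurements”, here is a minimal sketch. It assumes the readings scatter independently around a constant true temperature with a standard deviation of about 1 degree (a guess based on the OP’s “a degree or two”), and it takes “useful” to mean a standard error of 0.05 degrees, i.e. half of the last displayed digit. Those two numbers and the function name `readings_needed` are illustrative assumptions, not properties of any real thermometer.

```python
import math
from fractions import Fraction


def readings_needed(noise_sd, target_se):
    """Smallest n for which noise_sd / sqrt(n) <= target_se.

    Assumes independent readings with the same spread, so the standard
    error of the average of n readings is noise_sd / sqrt(n).
    """
    # Fractions built from decimal strings keep the arithmetic exact,
    # which avoids an off-by-one from floating-point rounding in ceil().
    ratio = Fraction(noise_sd) / Fraction(target_se)
    return math.ceil(ratio ** 2)


# Assumed values: 1.0 degree of scatter, 0.05 degree target.
print(readings_needed("1.0", "0.05"))  # 400
```

Under those assumptions it would take on the order of 400 readings before the tenths digit of the average means much, which is why nobody does this with an ear thermometer.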

2+2=4.001 gives a very precise sum, but not an accurate one.

Resolution and repeatability are components of precision.
Precision, drift, and calibration are components of accuracy.
Accuracy is often the thing that rolls up everything you care about, but not always. For example, you could have a thermometer that was defective or calibrated wrong, but with excellent precision, and it could do a fine job of telling you when your fever started to come down.
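A small sketch of why the miscalibrated-but-precise thermometer is still useful for watching a fever break. The temperatures and the 1.5 degree offset are made-up numbers for illustration only; the point is that a constant bias cancels out when you look at changes between readings.

```python
# Hypothetical values: a thermometer that always reads 1.5 °F too high
# (poor accuracy) but with negligible scatter (excellent precision).
BIAS_F = 1.5
true_temps_f = [102.1, 101.5, 100.8, 99.9]
readings_f = [t + BIAS_F for t in true_temps_f]


def changes(values):
    """Change from each value to the next, rounded to tenths."""
    return [round(b - a, 1) for a, b in zip(values, values[1:])]


print(changes(true_temps_f))  # [-0.6, -0.7, -0.9]
print(changes(readings_f))    # [-0.6, -0.7, -0.9]
```

Every individual reading is wrong by 1.5 degrees, yet the downward trend is reported exactly.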

And, there are some things with way better precision than accuracy. The idea of a “datum” in surveying is a good example. You could have a small island with a few buildings, some streets, a dock, various other things, and of course a coastline. You might survey all of these things with great precision, such that your map can tell you within an inch how far your front door is from the end of the dock. However, you might only know true latitude and longitude to within ten feet. Experts in these matters will create a datum for that particular island, which in effect chooses a somewhat arbitrary latitude and longitude for some reference point on the island. If somebody comes along with a GPS receiver, they can select your island’s datum instead of one of the modern global ones such as “WGS84”. Their results will then agree nicely with your map.

For an even more extreme example, consider your house inside that map. It might have 1/8" precision in wall placement such that the exterior dimensions are very predictable, but putting them on a map involves a relatively much bigger uncertainty.

Throughout the day, your core body temperature typically fluctuates between 36.5–37.5 °C (97.7–99.5 °F). It doesn’t remain constant; instead, it varies in sync with your circadian rhythm.

So, your thermometer may be precise and accurate, but you’re not.

Back in high school science lab, we were taught some basic rules of thumb. For example, you have to use some common sense to estimate the precision of your thermometer readings. If it is a graduated column with marks every 0.1 degree (or the electronic equivalent), then maybe you can read to within plus or minus half of that. There may also be other, potentially larger, uncertainties depending on the instrument.

Now, one reason your readings vary by a degree or two from reading to reading is that your temperature varies during the day, so I would not assume that your thermometer is broken without further evidence.

People do repeat measurements for statistical reasons: under certain assumptions (independent readings with the same spread), the standard error of your average value is inversely proportional to the square root of the number of measurements.

Thanks! I think the “something else” I was looking for is “empty precision.”

Here are four readings, almost no time between measurements in each ear, in order:

Right ear: 101.9, 100.6

Left ear: 103.4, 102.5

I endeavored to use it the same way each time (could be user error, of course, but at least it’s repeated user error).

Could also just be a low quality thermometer. Thanks, everyone, for the explanations!

Couldn’t each ear be at a different temperature? Not necessarily a mysterious thermometer problem. In any case, your uncertainty does not at all appear to be dominated by the thermometer precision.

You could say (simply averaging the four readings) that you got a value of 102.1 ± 0.6.
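For anyone wondering where a figure like 102.1 ± 0.6 could come from: one reading that reproduces it is the mean of the four readings plus or minus the standard error of that mean (sample standard deviation divided by the square root of the number of readings). That interpretation is an assumption on my part, but the arithmetic matches.

```python
import math
import statistics

# The four ear readings from the earlier post, in degrees Fahrenheit.
readings_f = [101.9, 100.6, 103.4, 102.5]

mean = statistics.mean(readings_f)
sample_sd = statistics.stdev(readings_f)  # n - 1 in the denominator
standard_error = sample_sd / math.sqrt(len(readings_f))

print(f"{mean:.1f} ± {standard_error:.1f}")  # 102.1 ± 0.6
```

Note that this lumps both ears together, so any real left/right difference ends up counted as measurement scatter.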

Which still doesn’t give the average normal temperature for any one of those 25,000. Some people run cooler than average, some people run warmer than average.

Each ear was off by a degree from reading to reading.

Just did it again:

Left ear: 101.9, 100.0
Right ear: 98.0, 100.8, 101.1 (I did three here, because the first one seems obviously wrong and was probably user error)

I took readings as quickly as I could record them – left, left, right, right, right.

For starters, I would try using a different thermometer and/or a different point, like under the tongue. I don’t see why a thermometer should be that crappy.

My daughter just bought me a new under-the-tongue one. For my next trick, I’ll look up how long I have to wait after eating or drinking.

ETA: 30 minutes.

NB there are also gallium-alloy medical thermometers (doubt anyone is going to sell you mercury)

CVS doesn’t sell any non-digital thermometers at all.

Anyway, three readings with the new one (orally!): 101.3, 101.5, 101.4

So, that’s a lot more consistent. Tried the ear one again, still all over the map.
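To compare the consistency of the two thermometers with numbers rather than by eye, here is a minimal sketch using the readings already posted in this thread. The grouping labels and the `spread` helper are just illustrative names, and with two or three readings per group the standard deviations are very rough.

```python
import statistics

# Readings quoted in the thread, in degrees Fahrenheit.
readings_f = {
    "oral (new thermometer)": [101.3, 101.5, 101.4],
    "left ear": [101.9, 100.0],
    "right ear": [98.0, 100.8, 101.1],
}


def spread(values):
    """Return (range, sample standard deviation) of a list of readings."""
    return max(values) - min(values), statistics.stdev(values)


for label, values in readings_f.items():
    value_range, sd = spread(values)
    print(f"{label}: range {value_range:.1f}, sd {sd:.2f}")
```

The oral readings span 0.2 degrees, while the ear readings span 1.9 and 3.1 degrees, so the difference in repeatability is more than tenfold on this small sample.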

Mercury is readily available. It has a number of uses such as gold mining and to create compounds. It is pretty expensive, over $200 a lb. in small quantities. That surprises me because there should be a lot of unused and reclaimed mercury available.

I did not mean that 1 lb of mercury for mad science experiments is not available; I meant that for a pharmacy to sell a mercury thermometer is today illegal in, e.g., the entire EU.

Not that there is some intrinsic insurmountable problem with digital thermometers, but I have seen at least one glass thermometer with a reference mark on the glass where someone actually calibrated it :)

One or two tenths of a degree being the order of precision we should expect.

I have also come across this definition (often using an archery target as an analogy) in many texts, but it differs from how I was taught to understand the concept of precision during my time as a specialist in efficiency and emissions measurement, and from how it is generally used by engineers in my experience.

Regardless of terminology, it is useful to distinguish the following attributes of a measured value: how close the measurement is to a theoretical “true” value, how similar the result will be if the same measurement is taken again, and to what number of significant figures the result is stated.

Metrology texts use the term “precision” to refer to the second of these three, whereas I understand it to mean the third.

I would describe the OP’s thermometer as showing “false precision”, which is also often seen from students with fancy calculators. Yes, your calculator will give you twelve decimal places, but that doesn’t mean that your answer is actually that precise.

If significant digits are used correctly, then the number of digits is a measure of precision. But they often aren’t.
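One common convention for using significant digits “correctly” is to round the value to the decimal place of the first significant digit of its uncertainty. A minimal sketch of that convention follows; the function name is made up for illustration, and it only handles the simple case (it does not round to tens or hundreds when the uncertainty is 10 or more).

```python
import math


def format_with_uncertainty(value, uncertainty):
    """Round value to the decimal place of the uncertainty's leading digit."""
    if uncertainty <= 0:
        raise ValueError("uncertainty must be positive")
    # Position of the first significant digit of the uncertainty,
    # e.g. 0.587 -> -1 (tenths), 0.02 -> -2 (hundredths).
    exponent = math.floor(math.log10(uncertainty))
    decimals = max(-exponent, 0)
    return f"{value:.{decimals}f} ± {uncertainty:.{decimals}f}"


print(format_with_uncertainty(102.1, 0.587))         # 102.1 ± 0.6
print(format_with_uncertainty(3.14159265359, 0.02))  # 3.14 ± 0.02
```

The second call is the calculator case: twelve digits in, but only the ones the uncertainty supports come out.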