Of course it isn’t. Like so many things in life, this is a matter of calibration. In the December 2025 issue of The Stata Journal, Nick Cox summarizes current thinking on simplicity:
Simplicity as a goal or guide, even if elusive or illusive, is often recommended, or on the contrary, recommended against. Discussions range from the epigrammatic to entire monographs (for example, Sober (2015)). I like Alfred North Whitehead’s (1920, 163) summary, “Seek simplicity and distrust it”, as well as Sydney Brenner’s (1997; 2019, 67) warning against Occam’s Broom, “used to sweep under the carpet any unpalatable facts that did not support the hypothesis”. For the record, there is no evidence that Albert Einstein said that “A theory should be made as simple as possible, but not simpler”, or something to that effect (Calaprice 2011, 475). But someone else undoubtedly did.
I added the Reddit link explaining Occam’s broom to the above quote. Wiktionary’s definition: “(humorous, philosophy) The dishonest tendency to conceal relevant facts.”
It can scarcely be denied that the supreme goal of all theory is to make the irreducible basic elements as simple and as few as possible without having to surrender the adequate representation of a single datum of experience.
The model that accurately captures reality is the best model.
The best model is therefore as complex as the slice of reality it intends to represent.
For models of spherical cows falling in vacuums that can be very simple indeed.
Conversely, for a model of everything going on in a single human cell, the simplest accurate model will have thousands or tens of thousands of interacting components.
I mean, it’s fit for purpose. When we were developing mathematical models (for implementation into business processes), I suggested that the simplest model was the best first model, and then after that you could start considering how close to best you were. Sometimes that simplest, first model was very good, sometimes it wasn’t.
All I really want to add here is about this fellow Occam, sometimes celebrated, sometimes maligned.
“Occam’s Broom” is the disreputable refuge of dishonest scientists. It has no place in ethical scientific practice.
“Occam’s Razor”, however, has been unfairly maligned. It’s quite closely aligned with the apocryphal story of what Einstein allegedly said, though what he really did say was close. The basis of my belief here is entirely statistical – it’s the belief that among a bunch of competing hypotheses or explanations for some observed phenomenon, the simplest one that fits all the known facts is the one that’s likely to be true more often than alternative ones that are more convoluted or posit facts not in evidence. This is by no means always the case. I just said “correct more often than not”.
Small disagreements with Newtonian physics at extreme scales of distance or velocity, where relativity becomes relevant, would be more likely ascribed to measurement errors than anything else. The idea that a whole new realm of physics was about to be born to explain these small discrepancies would have seemed ridiculous. But how often does someone like Einstein come along to reveal a whole new depth of understanding? It’s usually safer, on average, to stick with Occam than to rely on the next Einstein.
Depends on your perspective. If pi = 3 because you first calculate pi and then round to 3, then 3 is actually more complicated (because it requires an extra step, rounding) than pi.
The meta-challenge is knowing when your model is good enough. E.g. pi=3 is plenty good for most mental math and DIY estimation. But woe betide the folks who compute a satellite orbit using pi=3. Same model; both good enough and not good enough depending on what it’s applied to.
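To put rough numbers on the pi=3 point, here is a minimal sketch. The 1 m garden bed and the 6771 km orbit radius (roughly Earth's radius plus a 400 km altitude) are my own illustrative assumptions, not figures from the thread.

```python
import math

def circumference(radius, pi=math.pi):
    """Circumference of a circle, with the value of pi left swappable."""
    return 2 * pi * radius

def abs_error_with_pi_3(radius):
    """Absolute circumference error from using pi = 3 (same units as radius)."""
    return abs(circumference(radius) - circumference(radius, pi=3))

# The relative error is the same at every scale: about 4.5%.
relative_error = (math.pi - 3) / math.pi

# Hypothetical DIY case: a round garden bed with a 1 m radius.
garden_error_m = abs_error_with_pi_3(1.0)

# Hypothetical orbit case: a circular orbit with a 6771 km radius.
orbit_error_km = abs_error_with_pi_3(6771.0)

print(f"relative error: {relative_error:.1%}")
print(f"garden bed: off by {garden_error_m:.2f} m")
print(f"orbit: off by {orbit_error_km:.0f} km")
```

Same 4.5% in both cases. Being under 30 cm short on edging for a garden bed is an annoyance; being off by well over a thousand kilometres per orbit is not.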
Particularly for predictive models, it is very challenging to know, not guess, what factors will be important in the near future that are presently just lost in the sauce.
So I’ll suggest overall that the OP question: “Is the simplest model the best model?” is itself a very simplistic reductionist question to ask. Certainly worth discussing, and I don’t mean any of this as a criticism.
But we really need to elaborate all the ways it’s too simple of a model to actually use to discuss the requisite complexity of models in general.
I think the simplest model that meets all the criteria necessary for a successful solution is the best model. As Scotty dryly commented, “The more you overthink the plumbing, the easier it is to stop up the drain.”
Agreed. My gut level response is, “I can’t answer this question at that level of generality.” The thread title was a wrapper for a somewhat whimsical quote I came across and thought I might share. (I wanted this to be a debate though, as opposed to MPSIMS.)
The quote was buried in an article about quadratic approximation in a statistical context. The underlying issue reflected the concern that the researcher veering away from linearity might be increasing accuracy or they might just be fitting their curve to noise. If your model is intended to predict scenarios that are outside the historical record, quadratic approximations can go very wrong. If your model is intended to understand the historical record, specifically one aspect of the historical record, then throwing in extra variables can help rule out the possibility of confounding factors.
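A toy sketch of the curve-fitting worry, not the method from the article: the data are made up (the "truth" is assumed to be y = x, and the five noise values are hand-picked), and `polyfit`, `predict` and `sse` are small helpers written for this illustration only.

```python
def polyfit(xs, ys, degree):
    """Least-squares polynomial fit via the normal equations.

    Returns coefficients [c0, c1, ...] for c0 + c1*x + c2*x**2 + ...
    """
    n = degree + 1
    # Normal equations (A^T A) c = A^T y, where A[i][j] = xs[i] ** j.
    ata = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    aty = [sum((x ** i) * y for x, y in zip(xs, ys)) for i in range(n)]
    # Gauss-Jordan elimination with partial pivoting on the augmented matrix.
    m = [row + [rhs] for row, rhs in zip(ata, aty)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def predict(coeffs, x):
    """Evaluate the fitted polynomial at x."""
    return sum(c * x ** k for k, c in enumerate(coeffs))

def sse(coeffs, xs, ys):
    """Sum of squared residuals over the given points."""
    return sum((y - predict(coeffs, x)) ** 2 for x, y in zip(xs, ys))

# Made-up "historical record": truth is y = x plus hand-picked noise.
xs = [-2, -1, 0, 1, 2]
noise = [0.3, -0.2, -0.3, -0.1, 0.3]
ys = [x + e for x, e in zip(xs, noise)]

linear = polyfit(xs, ys, 1)
quadratic = polyfit(xs, ys, 2)

# In-sample fit: the quadratic can never do worse than the line.
sse_linear = sse(linear, xs, ys)
sse_quadratic = sse(quadratic, xs, ys)

# Out-of-sample: predict well outside the observed range (truth is 10).
linear_at_10 = predict(linear, 10)
quadratic_at_10 = predict(quadratic, 10)

print(sse_linear, sse_quadratic)
print(linear_at_10, quadratic_at_10)
```

On this data the quadratic looks far better in-sample (squared error of roughly 0.004 versus 0.319), because it has bent itself around the noise. At x = 10 the line predicts about 10.1 and the quadratic about 24.8, against a true value of 10.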
So, it depends. That’s probably the answer to all overly general questions: it depends.
Griffen provided a good counter-example: pi estimation. 3 is arguably simpler than 3.14, which is simpler than 3.14159, which is simpler than the underlying transcendental number. Or not: these sorts of discussions can run aground when the central term - simplicity - has not been adequately defined. Which is why Einstein’s original treatment wasn’t particularly epigrammatic.
That was how it struck me too. Hence the aphorism about “as simple as possible, but no simpler.” That trailing disclaimer is essential to the functioning of the maxim.
Ideally you want all the signal and none of the noise. Leaving out parameters that are just noise is the way to achieve simplicity. Leaving out parameters that are signal is how you achieve the failure of too simple. It’s the parameters that are some of each, and especially some of each to differing degrees depending on the values of yet other parameters that bring the devil out to play in your details. And laugh in your face.
I almost included a paragraph on overfitting risks of too-complex models.
So I’d argue strongly against this. Simplicity absolutely should be both the goal and the guide. Occam’s Razor is a really good philosophy. If two solutions are equally correct, then the simpler one is better. Obviously, if one is more correct, that’s just the price you pay for a better understanding of the world. But it’s equally wrong to add unnecessary complexity as it is to sweep facts under the rug to protect your simple model, IMO.
“Correct” and “simpler” do a lot of heavy lifting. From another perspective, the more complex model may be easier to work with, e.g. an incremental model versus an elegant, simple, but impractical model, such as Runge-Kutta numerical integration versus Sundman’s Theorem.
Or maybe consider two models for losing weight: (1) eat less calories than you burn, (2) keep to 2000 cal per day, per the attached meal plan, and every M W F run five miles or do X sets of stadiums, and attend Sat pm weight loss support meetings…
Though the Runge-Kutta method is a better model than Sundman’s in most circumstances. In fact, in practice you can use both to produce a model that is even more complex but more accurate.
Though we have to separate algorithms from models. It’s fairly common for a complicated, less accurate model to be used in place of a simpler, more accurate one, because in practice it can be implemented as an algorithm that runs on a computer. That’s pretty much the case for all numerical methods: they were mostly an intellectual curiosity before computers came along (why would you use a more complicated, worse model when there is a simpler, more accurate one?).
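For anyone who hasn't seen one, this is roughly what an incremental numerical method looks like as an algorithm. It is a minimal sketch of the classical fourth-order Runge-Kutta step applied to a toy equation I picked for illustration (dy/dt = y with y(0) = 1, whose exact answer at t = 1 is e). It has nothing to do with the three-body problem that Sundman's result addresses.

```python
import math

def rk4_step(f, t, y, h):
    """Advance y by one step of size h using classical 4th-order Runge-Kutta."""
    k1 = f(t, y)
    k2 = f(t + h / 2, y + h * k1 / 2)
    k3 = f(t + h / 2, y + h * k2 / 2)
    k4 = f(t + h, y + h * k3)
    return y + h * (k1 + 2 * k2 + 2 * k3 + k4) / 6

def integrate(f, y0, t0, t1, steps):
    """Integrate dy/dt = f(t, y) from t0 to t1 in equal RK4 steps."""
    h = (t1 - t0) / steps
    t, y = t0, y0
    for _ in range(steps):
        y = rk4_step(f, t, y, h)
        t += h
    return y

# Toy problem: dy/dt = y, y(0) = 1, so the exact value at t = 1 is e.
approx = integrate(lambda t, y: y, 1.0, 0.0, 1.0, 10)
error = abs(approx - math.e)

print(approx, error)
```

Ten steps already land within about 1e-5 of the closed-form answer here, which is the appeal: a mechanical recipe a computer can grind through, no elegant series required.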
Though again, that’s just an implementation detail. The fact that this actually works better demonstrates that there is a better, more complicated model (taking into account metabolism, genetics, psychology, etc.) than just weight loss = calories eaten - calories burnt.
This is a really good example of a simple but less accurate model being worse in practice than a complex, more accurate model. It’s not that weight loss = calories eaten - calories burnt is wrong, per se. It just doesn’t explain all the details and the devil, as they say, is in the details. In this case the extra complexity is worthwhile and adds value.