Is the simplest model the best model?

The question nicely illustrates the rejoinders “It depends” and “For what purpose?” Epigrams about simplicity invariably have undefined terms and undefined context. You need at least a few sentences to say something meaningful, and even then comparisons to other contexts could add clarity.


Nick Cox mentions Sober’s 2015 work, Ockham’s Razors: A User’s Manual. I haven’t read it but here’s a review:

https://ndpr.nd.edu/reviews/ockhams-razors-a-users-manual/

Intro paragraph from the academic journal Notre Dame Philosophical Reviews:

Elliott Sober’s book is a welcome and impressive contribution to current philosophy of science literature on confirmation and scientific reasoning. Ockham’s razor is, roughly, the idea that simpler or more parsimonious explanations, hypotheses, or models should be preferred, other things being equal. While the idea that simplicity is a theoretical virtue is familiar to scientists and philosophers and some philosophical literature exists on the topic, Sober’s book is the only up-to-date philosophical monograph on the topic I know of. Moreover, Sober’s book is clearly written, broad in its scope, extremely astute, and generally unafraid of relevant technicalities while remaining user-friendly. In short, this is a book well worth reading.

314 pages!

Woah, there’s some astuteness in paragraph 2. I add bolding:

…Two main themes emerge from this historical overview. The first is that there are, as Sober’s title suggests, several versions of Ockham’s razor. Thus, Sober distinguishes the razor of silence from the razor of denial. The first recommends agnosticism about causes that are unneeded to explain the phenomena, while the second recommends inferring that superfluous causes do not exist (p. 59). The second theme… . Hence, a shift towards secularism and away from a teleological conception of nature in science and philosophy in the twentieth century led to new rationales for Ockham’s razor.

Put me in the razor of silence camp. I doubt I’ll buy the book, but I plan to finish the review.

You could argue that is adding complexity without producing a more accurate model :wink:

In general, I’m not a fan of debating semantics but I feel like the answer really depends on how you want to define your terms.

I could, for example, say that the best model is the one with the greatest financial ROI. For most examples of real world use of modeling, it would be darn near impossible to debate that without delving into ivory tower mysticism, even though the “more obviously clear” answers are more in the realm of things like “the one that best matches observations, even minus constant rebalancing” or “the one that has the most grounded and least debatable foundations”.

Hard agree.

I think I’m missing your point. ROI can be criticized because it doesn’t take into account opportunity cost, inflation, or risk. Nothing mystical about that. Though maybe your use of ROI is embedded in a wider model that takes such things into account.

Or maybe you are saying that you can’t debate choices of objective functions on the level of maximizing profit vs maximizing some measure of social welfare. While descriptions of the world can be debated, values are trickier.

Or maybe you are saying that you can’t tell whether your ROI model best predicts firm behavior. If so, I would disagree.

Apologies: I just don’t follow. I do agree with your thesis.

Let’s say that I build a model to tell me when to invest and how much to buy.

If I plonk $4b into developing the model for ultimate perfection and it takes 30 days per run to spit out an answer, gobbling several tens of thousands of dollars in electricity per day of running, then it’s going to be a long road to getting more value out of my model than I put into it.

If I can spitball a good enough number on the back of an envelope and start cranking out reliable, above-market results tomorrow, then that’s a far better ROI.

Models are, generally, simplifications of reality and/or require finer and finer voxel sizes or larger agent counts or more numerous simulation runs, etc. to produce more accurate results. At some point the complexity (e.g. R&D) of fleshing out the simplifications into more nuanced and complete versions, or the power and time costs of running the model, are going to swamp what you actually get back in value from the model.

Ultimately, models are generated to fulfill some goal. The goal is generally going to be the measure, not the model itself. And, in real world use, most models are probably built to produce a financial profit.

So the best model is the one that has the best financial ROI over its lifetime.
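To put rough numbers on that, here is a minimal back-of-the-envelope sketch. The $4b development cost and 30 days per run come from the example above; everything else (the $30,000 per day electricity figure standing in for “several tens of thousands”, the value each run returns, the run counts, and the helper names `lifetime_roi` and `break_even_runs`) is a made-up assumption purely for illustration.

```python
import math

def lifetime_roi(dev_cost, cost_per_run, value_per_run, runs):
    """Simple ROI over the model's lifetime: (value - cost) / cost."""
    total_cost = dev_cost + cost_per_run * runs
    total_value = value_per_run * runs
    return (total_value - total_cost) / total_cost

def break_even_runs(dev_cost, cost_per_run, value_per_run):
    """Smallest number of runs that pays back the development cost."""
    margin = value_per_run - cost_per_run
    if margin <= 0:
        return None  # each run loses money, so it never pays back
    return math.ceil(dev_cost / margin)

# "Ultimate perfection" model: $4b to build, 30 days per run.
# Assumed: $30,000/day in electricity and $50m of value per run.
big_dev = 4_000_000_000
big_run_cost = 30 * 30_000
big_value = 50_000_000
big_runs = 120  # roughly ten years of back-to-back 30-day runs

# Back-of-envelope model. Assumed: $10,000 of effort, free to run,
# $1,000 of value per daily answer, about 2,500 trading days.
env_dev = 10_000
env_run_cost = 0
env_value = 1_000
env_runs = 2_500

roi_big = lifetime_roi(big_dev, big_run_cost, big_value, big_runs)
roi_envelope = lifetime_roi(env_dev, env_run_cost, env_value, env_runs)
big_break_even = break_even_runs(big_dev, big_run_cost, big_value)
env_break_even = break_even_runs(env_dev, env_run_cost, env_value)

print(f"big model:  ROI {roi_big:.2f}, breaks even after {big_break_even} runs")
print(f"envelope:   ROI {roi_envelope:.2f}, breaks even after {env_break_even} runs")
```

With these invented figures the big model needs 82 thirty-day runs (more than six years) just to pay back its development cost, while the envelope model pays for itself in 10 days. Change the assumed values and the ranking can flip, which is sort of the point: the metric is the payoff, not the model.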

By “real world use” are you excluding scientific models?

No, there you’d be judging based on lives saved, but the more you invest into “weather prediction sim”, the less you’re investing into “volcano explosion sim” and “asteroid prediction sim”. There’s still a tradeoff where you’re balancing budget to payoff.

I was just saying that there’s probably more people at work building models of things in the private sector than the public so financial outcome is more often liable to prove to be the relevant metric rather than a lives saved one.

To that I would add that while complexity sometimes scales with cost, that is not always the case. Parsimonious and variable-heavy models can be equally costly to run, so the question turns to predictive power, robustness, reliability, or other criteria.

One reason that simple models are better than complex ones is that with a simple model it’s harder to forget that what you have got is in fact only a model.

I’ve been reading recently about the Ricardian Vice, which is the tendency to:

  1. Find a real world problem
  2. Create a model of it, which is to say, make a bunch of simplifying assumptions
  3. Solve the problem in the model
  4. Assume that you have solved the problem in the real world.

You can find this everywhere, from people drawing crayon lines on maps of the Middle East while asking “why don’t they just build a pipeline” to the Global Financial Crash, which tragedy was threnodied by a Greek chorus of financial whizzkids repeating “but the models said that shouldn’t ever happen”.

Even with a simple model you might be tempted to ignore everything in the real world that you didn’t put into the model, but you’ve less excuse for doing so.

The discussion around Occam’s razor (or more generally the principle of parsimony) is often framed in terms of “simple” or “simplest” and, ironically, that is a dangerous oversimplification. You often hear it presented as “the simplest explanation is the correct one” or even just “the simplest explanation is the best.” “Simple” is an ambiguous and hugely load-bearing word here.

I think it helps to reframe it - and this is closer to the original wording anyway - as “when making an explanation you should introduce the fewest new assumptions possible” - let me give an example to show why this is important.

Say I offer an explanation for the history of life on Earth: that it likely evolved from some primordial process around energy gradients that became self-replicating molecules and gradually life, and then it evolved through selective pressures over billions of years into countless forms of life with incredible diversity, and that these naturalistic processes alone can explain that history and that diversity… and a creationist merely offers “God did it”.

The common misunderstanding of Occam’s razor could justify selecting the latter explanation as being simpler. You’re offering a complex process with incredible diversity and variation and construction over time. Your opponent is offering divine intervention. It sounds simpler. But it doesn’t introduce the fewest new assumptions. We can explain the history of life on Earth pretty well with a naturalistic model. Abiogenesis is a separable hypothesis which is somewhat speculative (a new assumption) but it’s a far more reasonable new assumption than invoking God because it’s grounded in our understanding of chemistry and not in wishful thinking. Our scientific explanation is relatively complicated. But it doesn’t require massive new assumptions. It uses things that we’ve established to be true (or at least extremely consistently good explanations that have held up time after time when tested) through repeated rigorous study and model building. In comparison, “God” is quite a bit less known and more contentious.

So “God did it” is simpler. That matches the lay misconception of Occam’s razor better. Which is why the lay misconception is wrong. “God did it” requires a massive new assumption, which is why it actually violates Occam’s razor.

When Napoleon asked Laplace why he didn’t explain the role of God in his models of the solar system, Laplace said “I have no need of that hypothesis” – not that God doesn’t exist, but that his model of the solar system works without having to invoke God, so why invoke God? It makes the explanation worse. Less parsimonious. That was actually a bit of a bold example of scientific and philosophical rigor in a time when it was common to work God into every explanation unnecessarily to satisfy cultural expectations. But it was rigorous and exactly in the spirit of Occam’s razor.

Let me give you a hypothetical example. I go to the grocery store. I see my neighbor’s car in the parking lot. I go inside and see my neighbor shopping. After I’ve purchased my groceries and leave, I see my neighbor’s car is gone from the parking lot. When I drive home, I see his car in his own driveway. The most obvious explanation? He drove himself to the grocery store, bought some groceries, and drove home.

An alternate explanation? He drove to the grocery store to buy groceries, loaded the groceries into his car, and then an alien space ship showed up, lifted his car with a tractor beam, and flew him home, dropping his car off in his driveway. He unpacked his groceries and went inside.

These two competing explanations are the same number of steps (arguably, depending on how you define steps) and so they’re equally “simple”, but one is MASSIVELY preferred over the other. We know that people drive their cars to get places. It’s routine, it’s established. Introducing an alien space ship with a tractor beam is a MASSIVE new assumption. Two, actually. There’s no reason to assume that tractor beams are practical tech even for aliens. And what’s more damning is that it introduces no new explanatory power. The mundane explanation about driving home fully explains all observations. There is nothing the alien space ship explains that isn’t already explained. There is no reason to make the alien space ship explanation. It adds nothing. And you might think “the driving home explanation is a little simpler anyway, so it would win anyway” and you’re right - I could probably come up with a more airtight example - but it’s somewhat irrelevant. The introduction of the new assumption of the alien space ship is the massive crime, not that it arguably increases the number of steps.

In my opinion, this is what Occam’s Razor (and more broadly the principle of parsimony) was designed to say: examine why you’re adding elements to your explanation. When are they new assumptions? Do those new assumptions give more explanatory power? If not, discard those new assumptions barring compelling evidence that would change how much explanatory power they offer.

What’s the function of the model?

I mean there are models whose function is to provide a simplified version of something so one aspect of reality can be considered at a time. The “assume a spherical cow” case.

Those that are trying to best predict what will occur in the future.

Those that are intended to make good enough predictions at minimal cost (however you define the cost).

And those that are just trying to explain what happened to the leftovers.

And why is the model complex? Sometimes a simple model gets layers added onto it to explain why it was not making good predictions - everything revolves around Earth, pretty simple, gets complicated with epicycles.

Of course simplest is not always best. But generally for most purposes the simple explanation that appears to do the job needed is the default best until it isn’t doing the job well enough.

From a statistical point of view, a simple model that fits the data fairly well is often better than a more complicated model that fits the data extremely well. This is because there is such a thing as “over-fitting”: the additional complexity ends up modeling the noise in your individual observations rather than the underlying signal, and so the model will perform worse on new data.

As a result, when testing model accuracy, many methods apply a penalty related to the complexity of the model. AIC and BIC, for example, add a term that grows with the number of fitted parameters.
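The over-fitting point can be seen in a tiny toy example. This is only a sketch under invented assumptions: the true signal is taken to be y = x, the “noise” is a fixed ±0.5 pattern rather than anything random, the simple model is a least-squares line, and the complicated model is the degree-5 polynomial that passes through every training point. The new points are scored against the noise-free signal.

```python
def fit_line(xs, ys):
    """Ordinary least-squares line; returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def interpolate(xs, ys, x):
    """Lagrange polynomial through every training point (degree n - 1)."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total

def mse(predictions, targets):
    """Mean squared error between two equal-length sequences."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)

# Training data: assumed signal y = x plus a fixed, made-up noise pattern.
train_x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
noise = [0.5, -0.5, 0.5, -0.5, 0.5, -0.5]
train_y = [x + e for x, e in zip(train_x, noise)]

# New points the models have not seen, scored against the true signal.
new_x = [0.5, 1.5, 2.5, 3.5, 4.5]
new_y = list(new_x)

intercept, slope = fit_line(train_x, train_y)

def simple(x):
    return intercept + slope * x

def wiggly(x):
    return interpolate(train_x, train_y, x)

simple_train = mse([simple(x) for x in train_x], train_y)
wiggly_train = mse([wiggly(x) for x in train_x], train_y)
simple_new = mse([simple(x) for x in new_x], new_y)
wiggly_new = mse([wiggly(x) for x in new_x], new_y)

print(f"training error: line {simple_train:.3f}, polynomial {wiggly_train:.3f}")
print(f"new-data error: line {simple_new:.3f}, polynomial {wiggly_new:.3f}")
```

On the training data the polynomial looks perfect (error 0 against about 0.229 for the line), but on the new points its error is about 0.681 against about 0.015 for the line. The polynomial spent its extra flexibility chasing the ±0.5 wiggles, which is exactly the noise.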