A thought I’ve often had, which seems relevant but in a way I’m not entirely sure how to draw out, is that there are three steps involved in differentiation as it’s normally conceived, and correspondingly three steps involved in integration, and it’s useful to see these separated and how they operate:
By a punctile function, I’ll mean any old function whose inputs are individual points, whatever that means. By a sum-extensive function, I’ll mean a function whose inputs are regions, and such that whenever smaller regions are combined to make a big region, the function’s value on the big region is the sum of its values on the smaller regions. By a mean-extensive function, I’ll mean a function whose inputs are regions, and such that whenever smaller regions are combined to make a big region, the function’s value on the big region is the weighted average of its values on the smaller regions, each region weighted in proportion to its size.
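To make the two region-based definitions concrete, here is a minimal sketch in Python. Everything in it is an assumption chosen for illustration: regions are intervals given by their endpoints, `total` is a hypothetical sum-extensive function, and `average` is the matching mean-extensive one. Exact fractions are used so the equalities can be checked without rounding.

```python
from fractions import Fraction

def size(a, b):
    """Size of the interval [a, b]."""
    return b - a

def total(a, b):
    """A hypothetical sum-extensive function on intervals."""
    return b * b - a * a

def average(a, b):
    """The matching mean-extensive function: total divided by size."""
    return total(a, b) / size(a, b)

# Split the big interval [0, 3] into the smaller intervals [0, 1] and [1, 3].
a, m, b = Fraction(0), Fraction(1), Fraction(3)

# Sum-extensive: the value on the big region is the sum of the values on the parts.
is_sum_extensive = total(a, b) == total(a, m) + total(m, b)

# Mean-extensive: the value on the big region is the size-weighted average of the parts.
weighted = (size(a, m) * average(a, m) + size(m, b) * average(m, b)) / size(a, b)
is_mean_extensive = average(a, b) == weighted
```

Both checks come out true for this example; a punctile function, by contrast, would simply take a single number `x` as input.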
The first step in differentiation is this: you have a punctile function. You turn this into a sum-extensive function defined over intervals, whose value on an interval is the difference between the original function’s values at the interval’s endpoints.
The second step in differentiation is this: you have a sum-extensive function. You turn this into a mean-extensive function, simply by dividing its output on each region by that region’s size.
The last step in differentiation is this: You have a mean-extensive function. You turn this into a punctile function, whose value at a point is the limiting value of the mean-extensive function as its input regions shrink towards that point.
(We can do all this in various higher-dimensional ways, not just for intervals, but I’ll talk in terms of intervals for now.)
So differentiation goes punctile original -> sum-extensive differences -> mean-extensive rates -> punctile instantaneous rates.
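The chain above can be sketched as three small Python functions, one per step. This is only an illustration under stated assumptions: the names `differences`, `rates`, and `instantaneous` are hypothetical, and the limit in the last step is approximated by a small fixed interval width `h` rather than actually taken.

```python
def differences(f):
    """Punctile -> sum-extensive: the difference of f across the interval [a, b]."""
    return lambda a, b: f(b) - f(a)

def rates(d):
    """Sum-extensive -> mean-extensive: divide each output by the interval's size."""
    return lambda a, b: d(a, b) / (b - a)

def instantaneous(r, h=1e-6):
    """Mean-extensive -> punctile: approximate the limit with a tiny interval.

    h is an assumed small half-width standing in for "shrinking towards the point".
    """
    return lambda x: r(x - h, x + h)

def cube(x):
    """An example punctile original."""
    return x ** 3

derivative = instantaneous(rates(differences(cube)))
slope_at_2 = derivative(2.0)  # approximately 3 * 2**2 = 12
```

Each function consumes one kind of object and produces the next, so dropping `differences` and starting from a measured sum-extensive function works just as well.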
But in many physical examples, the only one of these functions we see directly is the sum-extensive one giving differences.
When we measure something’s speed, we typically have direct access to the distance it has travelled over particular time intervals, but nothing else. (Rulers and metronomes feel primitive in ways which speedometers don’t). We then infer by calculation its average speed over those time intervals (dividing by the amount of time taken), or its instantaneous speed at particular times (approximated by using very small time intervals). And we do not even start with a punctile original here (we do not have some distinguished 0 point in the world; we measure distances as differences from the start).
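As a small worked example of this inference, here is a sketch in Python. The measurements are made up for illustration: each entry records only a time interval and the distance covered during it, which is the sum-extensive data. The average speeds are then calculated from it.

```python
# Hypothetical measurements: (start_time_s, end_time_s, distance_m) for each interval.
legs = [(0, 2, 10), (2, 5, 30), (5, 6, 20)]

def average_speed(start, end, distance):
    """Mean-extensive rate: distance covered divided by time taken."""
    return distance / (end - start)

# Average speed over each measured interval.
leg_speeds = [average_speed(*leg) for leg in legs]

# Distances add up over combined intervals (sum-extensive)...
total_distance = sum(distance for _, _, distance in legs)
total_time = legs[-1][1] - legs[0][0]

# ...while the overall speed is the time-weighted average of the leg speeds.
overall_speed = total_distance / total_time
```

Note that no position function appears anywhere: the calculation starts from differences alone.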
Integration is the opposite of this:
The first step in integration is to take a punctile function, and turn it into the mean-extensive function giving its average value over intervals.
The next step in integration is to take a mean-extensive function, and turn it into a sum-extensive function, simply by multiplying its outputs by its inputs’ sizes.
The last step in integration is to take a sum-extensive function defined over intervals, and turn it into a punctile function, by choosing an arbitrary point and function value at that point (the famed +C), and then assigning values at other points in accordance with the specified differences. [This last step can also fail to be possible in higher-dimensional contexts; this is the difference between “conservative” and “non-conservative” fields…]
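The arbitrary choice in this last step can be seen in a short Python sketch. The data is hypothetical: `diffs` lists the difference across each of four consecutive intervals, and `reconstruct` is an illustrative helper that picks the value `c` at the first point and accumulates from there.

```python
from itertools import accumulate

# Hypothetical sum-extensive data: differences across [0,1], [1,2], [2,3], [3,4].
diffs = [1, 3, 5, 7]

def reconstruct(diffs, c):
    """Assign a value to each endpoint, starting from the arbitrary value c."""
    return list(accumulate(diffs, initial=c))

values_c0 = reconstruct(diffs, 0)  # [0, 1, 4, 9, 16]
values_c5 = reconstruct(diffs, 5)  # [5, 6, 9, 14, 21]
```

The two results differ everywhere by the same constant, and both reproduce exactly the same differences, which is all the sum-extensive function ever specified.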
This last step of integration as ordinarily conceived is least important, since it involves those arbitrary choices [and can fail to be possible]. So let’s ignore it. Integration, thus, goes punctile original -> mean-extensive averages -> sum-extensive accumulations.
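The two remaining steps can be sketched in Python as follows. This is an approximation under stated assumptions, not a definitive method: the names `averages` and `accumulations` are hypothetical, and the average over an interval is estimated by sampling the function at evenly spaced midpoints.

```python
def averages(f, samples=10_000):
    """Punctile -> mean-extensive: approximate the average of f over [a, b].

    Uses evenly spaced midpoint samples; the sample count is an assumed setting.
    """
    def avg(a, b):
        width = (b - a) / samples
        return sum(f(a + (i + 0.5) * width) for i in range(samples)) / samples
    return avg

def accumulations(avg):
    """Mean-extensive -> sum-extensive: multiply each output by the interval's size."""
    return lambda a, b: avg(a, b) * (b - a)

def square(x):
    """An example punctile original."""
    return x * x

integral = accumulations(averages(square))
area = integral(0.0, 3.0)  # approximately 3**3 / 3 = 9
```

The result is a function of intervals rather than of points, and its values on adjacent intervals add up, as a sum-extensive function's should.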
And in the same way, let us ignore the first step of differentiation, so that we conceive of it as taking sum-extensive differences -> mean-extensive rates -> punctile instantaneous rates.
As noted before, I think the thing we witness most directly in many familiar physical cases is the sum-extensive quantity, and the others are inferred from this.
So differentiation takes a thing we witness directly, and produces from it the corresponding calculated entity, while integration conversely takes a calculated entity, and produces from it the corresponding sort of thing we would witness directly. And somehow, this seems less natural: the thing we would witness directly is something we would start by witnessing, not end by seeking to find out. And so this maybe has something to do with why differentiation seems conceptually natural and integration conceptually unnatural for some?
My thoughts are inchoate and I don’t know, but something like this may be in play. This is not a polished post; it is a stream-of-thought ramble, but I’ll put it out there for now.