I have a continually changing number of items and each item can be any age greater than or equal to 0.
My problem is that Excel (not a statistical tool, I know) interpolates a percentile between the items in the set. So I have a finite number of items, and the 95th percentile provided by Excel lies somewhere between two of them.
How accurate is this? Should I be truncating the 95th percentile at the value closest to, but less than, the Excel-provided value? Should I be using a completely different method or tool for this? That seems like it makes sense, as time is continuous but my items are discrete.
Virtually every time a percentile is used, it's used on discrete measurements. Even in cases where the underlying quantity is continuous, you'd be dealing with discrete sampling. You'll only ever have an exact 95th percentile if the number of observations is divisible by 20.
Within sample, it doesn’t really matter what you do. If, for example, there are 86 items, then the top 4 are above the 95th %tile and the bottom 82 are below it. Any number between those two would serve perfectly well in informing everyone whether or not they were above the 95th %tile.
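To illustrate the point (a minimal sketch with made-up data; the variable names are mine): with 86 items, any cutoff strictly between the 4th- and 5th-largest values splits the sample into exactly the same two groups.

```python
# Toy data: 86 items with values 86, 85, ..., 1 (largest first).
data = sorted(range(1, 87), reverse=True)

fourth, fifth = data[3], data[4]  # 4th- and 5th-largest values

# Any cutoff strictly between them classifies the items identically:
for cutoff in (fifth + 0.1, (fourth + fifth) / 2, fourth - 0.1):
    above = sum(1 for x in data if x > cutoff)
    assert above == 4  # always the same top 4 items are "above the 95th %tile"
```

So within the sample, every choice of cutoff in that interval tells each item the same thing about which side of the 95th %tile it is on.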
Now if you were using that sample to predict the 95th %tile for another group (e.g., a later set of students to take the same test), it would probably be better to interpolate. In this case, because 0.05 × 86 = 4.3, you'd report 0.7 × the fourth-best item + 0.3 × the fifth-best (or you could use more complicated multi-point interpolation). You might round this number to the extent that the measures themselves only had discrete outcomes. (E.g., if they were prices, it would make sense to report the 95th %tile to the nearest cent.)
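The interpolation above can be sketched as follows (a minimal illustration of the rank-from-the-top method described here, not Excel's exact PERCENTILE formula; the function name is mine):

```python
def p95_from_top(values):
    """95th %tile by linear interpolation between order statistics,
    counting 0.05 * n positions down from the top."""
    s = sorted(values, reverse=True)  # largest first
    pos = 0.05 * len(s)               # e.g. 0.05 * 86 = 4.3
    k = int(pos)                      # whole part: k-th largest item
    frac = pos - k                    # fractional part: 0.3
    if frac == 0:
        # n divisible by 20: the position lands exactly on an item
        return s[k - 1]
    # weight (1 - frac) on the k-th largest, frac on the (k+1)-th largest
    return (1 - frac) * s[k - 1] + frac * s[k]
```

With 86 items, this returns 0.7 × the 4th-largest value + 0.3 × the 5th-largest, matching the arithmetic above; with a multiple of 20 (say 80 items), the fractional part is zero and an actual item is returned, which is the "exact 95th percentile" case mentioned earlier.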