I actually asked this question a while ago but I didn’t get a satisfactory answer.
Probably because I wasn’t specific enough.

If I have a data set with a mean of 0 and a standard deviation of 1, and I assume that it is normally distributed, then I would expect about 95% of the data to lie between -2 and 2. Similarly, I can construct any confidence interval, for any data set with a known mean and standard deviation, if I assume a normal distribution.
If my data set has kurtosis of 400 then I know that the confidence intervals are wrong as the tails are rather fat and the distribution is peaked around the mean.

My question is, has anybody figured out how to incorporate skewness and/or kurtosis to calculate reasonably accurate confidence intervals?

As long as you have iid draws, you can compute the 2.5% and 97.5% quantiles of your sample and use that as a (consistent) estimate of the population quantiles. This will work regardless of the distribution that you’re drawing from. That’s not really a confidence interval in the sense that the phrase is usually used, but I think it matches up well with what you’re trying to do.
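A rough sketch of that in Python, in case it helps. The function names are made up for illustration, and the linear interpolation between order statistics is just one common convention for sample quantiles, not the only one.

```python
import math

def sample_quantile(sorted_xs, q):
    """Quantile q (0..1) of an already sorted sample, by linear interpolation."""
    pos = q * (len(sorted_xs) - 1)
    lo = math.floor(pos)
    hi = min(lo + 1, len(sorted_xs) - 1)
    frac = pos - lo
    return sorted_xs[lo] + frac * (sorted_xs[hi] - sorted_xs[lo])

def empirical_interval(sample, coverage=0.95):
    """Central interval that contains roughly `coverage` of the sample."""
    xs = sorted(sample)
    tail = (1.0 - coverage) / 2.0
    return sample_quantile(xs, tail), sample_quantile(xs, 1.0 - tail)

if __name__ == "__main__":
    import random
    random.seed(0)
    # Fat-tailed toy data: ratio of two normals (Cauchy-like).
    data = [random.gauss(0, 1) / random.gauss(0, 1) for _ in range(10000)]
    print(empirical_interval(data))
```

No distributional assumption goes into this, only that the draws are iid.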

Without having an assumption on the distribution of the population, you cannot meaningfully construct confidence intervals. At the very least, it has never been discussed in any of the statistics classes I’ve taken. I don’t have a ton of advanced statistics knowledge, but I would think that there are any number of distributions with the first few moments the same. There might be a standard distribution you’re supposed to assume if you calculate the skewness and kurtosis to be significantly different from a normal distribution, but I’m not familiar with it.

The best way to construct confidence intervals without assuming an underlying distribution is to do them empirically as ultrafilter mentioned.

If I’m understanding you correctly: When you have a mean and a standard deviation and no other information, one would typically assume or approximate the distribution as Gaussian. And you’re asking, if one has a mean, standard deviation, and kurtosis and/or skew, if there’s some other distribution (or more precisely, family of distributions with those three or four parameters) which one should assume as an approximation?

What everyone stated above is correct. We generally assume a normal distribution when constructing confidence intervals because (due to the central limit theorem) normal distributions crop up all the time, particularly when dealing with conglomerations of small perturbations. If you are observing data with a kurtosis of 400 then you are clearly not observing a normal distribution. The question is what distribution you are observing.

If you have a large number of data points, you can use the empirical confidence intervals as described by ultrafilter. But if you have only 50 data points, then your confidence intervals are going to be dependent on just a few outlier points.

If you have an understanding of the process that went into making the data, then that might also suggest some distribution. If all this fails, and if your data is sort of bell shaped, then you might want to look at generalized normal distributions, which have an additional “shape parameter” alongside the mean and variance that adds kurtosis or skewness to the normal distribution. You can plug in your observed skewness and kurtosis and solve for this shape parameter using the moment formulas for that family. This is called method of moments estimation.
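Here is a minimal sketch of that method of moments step. It assumes the symmetric version of the generalized normal (density proportional to exp(-(|x - mu|/alpha)^beta)), whose kurtosis is Gamma(5/beta) * Gamma(1/beta) / Gamma(3/beta)^2 and whose variance is alpha^2 * Gamma(3/beta) / Gamma(1/beta). That version can only match kurtosis, not skewness, and the function names and search bounds are my own choices.

```python
import math

def gennorm_kurtosis(beta):
    """Kurtosis (normal = 3, not excess) of the symmetric generalized normal."""
    log_k = (math.lgamma(5.0 / beta) + math.lgamma(1.0 / beta)
             - 2.0 * math.lgamma(3.0 / beta))
    return math.exp(log_k)

def fit_shape(kurtosis, lo=0.05, hi=50.0, tol=1e-10):
    """Solve gennorm_kurtosis(beta) == kurtosis by bisection.

    Kurtosis falls as beta grows (beta=1 is Laplace, beta=2 is normal),
    so the root is bracketed whenever the target lies between the ends.
    """
    if not gennorm_kurtosis(hi) < kurtosis < gennorm_kurtosis(lo):
        raise ValueError("kurtosis outside the range covered by [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gennorm_kurtosis(mid) > kurtosis:
            lo = mid   # still too fat-tailed, need a larger beta
        else:
            hi = mid
    return 0.5 * (lo + hi)

def fit_scale(std, beta):
    """Scale alpha that gives the requested standard deviation."""
    return std * math.exp(0.5 * (math.lgamma(1.0 / beta) - math.lgamma(3.0 / beta)))

if __name__ == "__main__":
    beta = fit_shape(400.0)          # sample kurtosis from the question
    print(beta, fit_scale(1.0, beta))
```

Keep in mind that a sample kurtosis as large as 400 is itself driven by a handful of extreme points, so the fitted shape will be noisy.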

At the end of the day it’s probably also good to check that your data fits the distribution you assumed. This can be done using the Kolmogorov–Smirnov test.
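For what it’s worth, the test statistic is easy to compute by hand. The sketch below (hypothetical function names) measures the largest gap between the empirical CDF and a fully specified candidate CDF, then compares it with the usual large-sample 5% critical value of about 1.358/sqrt(n). If the candidate’s parameters were fitted from the same data, that critical value is too lenient, so treat the result as a rough check only.

```python
import math

def normal_cdf(x, mean=0.0, std=1.0):
    """CDF of a normal distribution, used here as an example candidate."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))

def ks_statistic(sample, cdf):
    """Largest vertical distance between the empirical CDF and `cdf`."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # The empirical CDF jumps from i/n to (i+1)/n at x; check both sides.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d

def ks_rejects_at_5_percent(sample, cdf):
    """Asymptotic two-sided test; assumes `cdf` was not fitted to `sample`."""
    return ks_statistic(sample, cdf) > 1.358 / math.sqrt(len(sample))
```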

On thinking about it some more, you might also get some use out of Chebyshev’s inequality. No matter what the distribution, nor how non-Gaussian it is, no more than 1/k^2 of it can lie beyond k standard deviations from the mean (so, for instance, for a 99% confidence interval, 10 standard deviations will always suffice). Note that this is only an upper bound: It might still be possible for the confidence intervals to be an arbitrarily small fraction of the standard deviation, depending on the distribution, and it’s even possible for a perfectly well-defined distribution to have infinite standard deviation.
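To make the arithmetic concrete: setting 1/k^2 equal to the allowed tail mass gives k = 1/sqrt(1 - coverage). A small sketch, with function names of my own choosing:

```python
import math

def chebyshev_k(coverage):
    """Number of standard deviations that is guaranteed to hold `coverage`."""
    if not 0.0 < coverage < 1.0:
        raise ValueError("coverage must be strictly between 0 and 1")
    return 1.0 / math.sqrt(1.0 - coverage)

def chebyshev_interval(mean, std, coverage):
    """Distribution-free interval; needs a finite standard deviation."""
    k = chebyshev_k(coverage)
    return mean - k * std, mean + k * std

if __name__ == "__main__":
    for c in (0.90, 0.95, 0.99):
        print(c, chebyshev_k(c))
```

For 95% coverage this gives about 4.47 standard deviations, compared with 1.96 under a normal assumption.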

Chebyshev’s inequality is pretty harsh. For a 95% confidence interval it basically assumes you have a three-point distribution with 95% of the data at one point in the middle and the remaining 5% split between the two ends. This is great for proving theorems where you want to consider every possible case, but except in the most extreme cases its confidence limits will be way overestimated. To get confidence intervals that are smaller multiples of the SD, we get back to assuming or fitting an alternate parametric distribution.
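You can check that worst case numerically. The sketch below (illustrative names, sigma defaulting to 1) builds the three-point distribution with mass 1 - 1/k^2 at the center and 1/(2k^2) at each of -k*sigma and +k*sigma, then confirms that its standard deviation really is sigma, so the bound is met exactly.

```python
import math

def three_point_worst_case(k, sigma=1.0):
    """Points and probabilities of the distribution where Chebyshev is tight."""
    tail = 1.0 / (2.0 * k * k)
    points = (-k * sigma, 0.0, k * sigma)
    probs = (tail, 1.0 - 2.0 * tail, tail)
    return points, probs

def std_dev(points, probs):
    """Standard deviation of a discrete distribution."""
    mean = sum(p * x for x, p in zip(points, probs))
    var = sum(p * (x - mean) ** 2 for x, p in zip(points, probs))
    return math.sqrt(var)

# 95% interval: k = 1/sqrt(0.05), roughly 4.47 standard deviations.
k = 1.0 / math.sqrt(0.05)
points, probs = three_point_worst_case(k)
print(points, probs, std_dev(points, probs))
```

Any distribution that spreads its mass more smoothly than this will need fewer standard deviations for the same coverage.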