I am looking for some good practical guidelines on testing whether data are normally distributed. I have probably tended to err on the side of using nonparametric statistics. I usually plot a histogram, check for some known “gotchas,” and then run the Shapiro-Wilk test built into my statistics package (currently STATISTICA v. 7), except when there are tied (more than one sample with the same result) or censored (below detection limit) data. I also usually try log-transforming the data to see if that makes a difference. Perhaps fortunately, I have not run across much data that appear normally distributed once the censored values are set aside. A new project has guidelines that require us to test for normality, specifically citing the Anderson-Darling and Ryan-Joiner tests. Rather than continuing to push buttons and check boxes in a stats package, I’m trying to do this the right way and understand the underlying strengths and weaknesses of these methods. If it makes a difference, these are samples from groundwater monitoring wells.
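For concreteness, here is a minimal sketch of my current workflow in Python/SciPy (which I gather implements the same Shapiro-Wilk test as `scipy.stats.shapiro`); the concentration values are made up purely for illustration:

```python
import numpy as np
from scipy import stats

# Hypothetical, untied, uncensored well concentrations (made up for illustration)
x = np.array([3.2, 4.1, 2.8, 5.0, 3.9, 4.4, 2.5, 6.1, 3.3, 4.8])

# Shapiro-Wilk on the raw data
w, p = stats.shapiro(x)
print(f"raw data: W = {w:.3f}, p = {p:.3f}")

# Same test after a log transform, since concentrations are often closer to lognormal
w_log, p_log = stats.shapiro(np.log(x))
print(f"log data: W = {w_log:.3f}, p = {p_log:.3f}")
```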
In researching this a bit further, I have looked around online and found the suggested tests plus a few others that are not built into my stats package, including the D’Agostino-Pearson K-squared test. Some of these appear to be available in Excel add-ins or in R, which I have not used but have been meaning to learn.
Anderson-Darling: I have read differing recommendations about appropriate sample sizes for this test, and I’m not sure whether it should be used with censored data.
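In case it helps anyone answer, SciPy also exposes this as `scipy.stats.anderson`; here is a sketch of how I would run it on complete (uncensored) data, again with made-up numbers:

```python
import numpy as np
from scipy import stats

x = np.array([3.2, 4.1, 2.8, 5.0, 3.9, 4.4, 2.5, 6.1, 3.3, 4.8])  # made-up values

result = stats.anderson(x, dist="norm")
print(f"A-squared = {result.statistic:.3f}")
# The test reports critical values at fixed significance levels rather than a p-value
for crit, sig in zip(result.critical_values, result.significance_level):
    verdict = "reject" if result.statistic > crit else "fail to reject"
    print(f"  {sig}% level: critical value {crit:.3f} -> {verdict} normality")
```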
The Ryan-Joiner test appears to be available in MINITAB, but I cannot find much more about it; I have not yet downloaded the trial software or tracked down a reference for the test in the primary literature.
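From what I have pieced together so far, the Ryan-Joiner statistic is essentially the correlation coefficient of the normal probability plot (closely related to Shapiro-Francia), so it seems I could compute it myself; a hedged sketch using `scipy.stats.probplot`, assuming that reading is right:

```python
import numpy as np
from scipy import stats

x = np.array([3.2, 4.1, 2.8, 5.0, 3.9, 4.4, 2.5, 6.1, 3.3, 4.8])  # made-up values

# probplot returns the ordered data vs. normal quantiles plus a least-squares fit;
# r is the probability-plot correlation, which (as I understand it) is what
# Ryan-Joiner compares against tabled critical values.
(osm, osr), (slope, intercept, r) = stats.probplot(x, dist="norm")
print(f"probability-plot correlation r = {r:.4f}")
```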
The D’Agostino-Pearson K-squared (omnibus) test seems very promising for use where there are tied data. I found an Excel add-in called SOLVERSTAT that purports to compute this statistic.
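SciPy appears to implement this one too, as `scipy.stats.normaltest`; a minimal sketch with made-up data that include ties, since ties are my usual problem:

```python
import numpy as np
from scipy import stats

# Made-up data with ties (repeated 3.2 and 4.1). The test combines skewness
# and kurtosis, so very small n triggers a warning; n of 20+ is generally suggested.
x = np.array([3.2, 3.2, 4.1, 4.1, 2.8, 5.0, 3.9, 4.4, 2.5, 6.1,
              3.3, 4.8, 3.7, 4.0, 5.2, 2.9, 4.6, 3.5, 4.2, 3.8])

k2, p = stats.normaltest(x)
print(f"K-squared = {k2:.3f}, p = {p:.3f}")
```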