In the thread How to distinguish between minority opinions and misinformation, many posters said a scientist must formulate a hypothesis when conducting research. And indeed, this has been hammered into me all my life, starting in grade school. But I’ve also read that a scientist should formulate a null hypothesis (not a hypothesis) when performing research, and the results can either “reject” or “not reject” the null hypothesis. Which makes perfect sense to me.
So here’s what I’m wondering about: when someone performs research and formulates a hypothesis, is the null hypothesis automatically the opposite of that hypothesis? Conversely, when someone formulates a null hypothesis, is the hypothesis automatically its opposite? Also, is it possible that one could exist, but not the other?
I have no expertise in this area.
I will just add that Wikipedia says a null hypothesis is used in statistics. It is not used to prove the “opposite” of something but, rather, to test whether something occurred by chance.
Investopedia makes the same case.
The best way I’ve seen it expressed is that, when doing a study, you always start with a decision between two options that you are trying to make. The “default option” is the option you would take on that decision if you were not allowed to collect any data at all (and as such, it is kind of a political/cultural decision). The null hypothesis is what must be true about the data to be collected in the study for the default option to be justified.
For example, in a study where the decision is whether to use Fartromax™ on cancer patients, the default option would be not to use Fartromax™, on the grounds that most substances aren’t good cancer treatments, so we don’t use them as such without getting data. The null hypothesis is then that patients given Fartromax™ do no better than patients not given Fartromax™.
Then you do the study, and decide whether the data collected is consistent with the null hypothesis or not. If you decide that it is not, then you are justified in switching from the default option to the alternative option in your original decision.
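To make the decision procedure concrete, here is a minimal sketch of one way to test a null hypothesis like the Fartromax™ one above: a permutation test in plain Python. Fartromax™ is the thread’s made-up drug, and the outcome scores below are invented purely for illustration — the point is only the logic of reject/fail-to-reject.

```python
import random

random.seed(0)

# Hypothetical outcome scores (higher is better); invented for illustration.
treated = [5.1, 6.3, 4.8, 7.0, 5.9, 6.4, 5.5, 6.8]
control = [4.2, 3.9, 5.0, 4.6, 4.1, 4.8, 3.7, 4.4]

observed_diff = sum(treated) / len(treated) - sum(control) / len(control)

# Permutation test: under the null hypothesis (the treatment makes no
# difference), the group labels are arbitrary, so randomly reshuffling
# them should produce a difference as large as the observed one fairly
# often. The p-value estimates how often that actually happens.
pooled = treated + control
n = len(treated)
trials = 10_000
count = 0
for _ in range(trials):
    random.shuffle(pooled)
    diff = sum(pooled[:n]) / n - sum(pooled[n:]) / n
    if diff >= observed_diff:
        count += 1

p_value = count / trials
print(f"observed difference: {observed_diff:.2f}, p = {p_value:.4f}")
if p_value < 0.05:
    print("Reject the null hypothesis: switch from the default option.")
else:
    print("Fail to reject: stick with the default option.")
```

Note that the test never “proves” the drug works; it only says whether the data is too surprising to be consistent with the default option.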
That is essentially correct. The null hypothesis is that whatever event you are examining occurred by chance, or due to effects other than the proposed cause. If “the null hypothesis is rejected”, it means that the likelihood that the event (or collection of events) occurred by chance is below some statistical threshold; usually 5% is used. So, for instance, if your hypothesis is that a pair of six-sided dice is biased, you roll them a hundred times each in twenty independent trials and look at the difference between the expected distribution and your experimental results. If that difference exceeds the expected variance more often than chance would predict across your twenty trials, the null hypothesis (that the deviation is just due to random variation) is rejected, and you conclude that there is some factor biasing your results.
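The dice example above is the textbook setting for a chi-square goodness-of-fit test. Here is a minimal sketch for a single die, with invented counts (deliberately weighted toward sixes); the 11.070 cutoff is the standard critical value for a chi-square distribution with 5 degrees of freedom at the 5% level.

```python
# Counts from 600 hypothetical rolls of one six-sided die, faces 1..6.
# These numbers are invented for illustration and skewed toward sixes.
observed = [90, 95, 92, 88, 100, 135]
total = sum(observed)        # 600 rolls
expected = total / 6         # 100 per face under the null (fair die)

# Pearson chi-square statistic: sum of (O - E)^2 / E over the faces.
chi2 = sum((o - expected) ** 2 / expected for o in observed)

# Critical value for chi-square, 5 degrees of freedom, 5% significance.
CRITICAL_5DF_05 = 11.070

print(f"chi-square = {chi2:.2f}")
if chi2 > CRITICAL_5DF_05:
    print("Reject the null hypothesis: the die appears biased.")
else:
    print("Fail to reject: the deviations are consistent with chance.")
```

With these counts the statistic comes out around 15.6, above the cutoff, so the null hypothesis of a fair die is rejected — though, as the next post notes, that alone doesn’t tell you *how* the die is biased.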
Note that just because the null hypothesis is rejected does not mean that your preferred hypothesis is verified; just that there is some source of bias in excess of random variation of a normal distribution (or whatever distribution you are using). This is the origin of the “correlation is not causation” soundbite, which is technically correct, but it does not mean that you cannot use correlation as evidence of a causal effect if you can show a causal mechanism and an apparent absence of other influences.
The frequentist approach is most useful in experiments where you can run repeated controlled trials, or in situations where you have a large amount of relatively homogeneous data, like astronomical observations. It does not work very well in the analysis of most real-world data, where there can be many confounding variables, unless you can use some kind of analysis of variance (ANOVA) or other technique to demonstrate a dominant effect of particular parameters, and even then it is rare to have clean enough data or large enough sample sets to formulate and reject a null hypothesis.
Stranger