My point is that testing the hypothesis often isn’t as straightforward as it seems, and you can end up doing bad science without realizing it. Bad statistical methods are everywhere (see the replication crisis), and even when people seem to be using good statistical methods, there can be hidden flaws.
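To make the "hidden flaws" point concrete, here is a minimal sketch of one well-known pitfall: testing many outcomes and reporting whichever one comes out significant. The choice of example, the function names, and the numbers are all assumptions for illustration, not taken from any particular study. It assumes independent tests and that every null hypothesis is true, in which case each p-value is uniformly distributed on [0, 1].

```python
import random


def familywise_error_rate(alpha: float, num_tests: int) -> float:
    """Chance of at least one false positive across independent tests
    when every null hypothesis is actually true."""
    return 1.0 - (1.0 - alpha) ** num_tests


def simulate_any_false_positive(alpha: float, num_tests: int,
                                num_studies: int, seed: int = 0) -> float:
    """Fraction of simulated null studies with at least one 'significant' result."""
    rng = random.Random(seed)
    studies_with_hit = 0
    for _ in range(num_studies):
        # Assumption: under a true null, each p-value is a uniform draw on [0, 1].
        if any(rng.random() < alpha for _ in range(num_tests)):
            studies_with_hit += 1
    return studies_with_hit / num_studies


if __name__ == "__main__":
    print(familywise_error_rate(0.05, 20))
    print(simulate_any_false_positive(0.05, 20, 10_000))
```

With a 0.05 threshold and 20 independent outcomes, the formula gives roughly a 64% chance that at least one test looks significant even though nothing real is going on, and each individual test was, on its own, done "correctly".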
Apart from the papers and statistical tests, there is a social layer in science that sometimes works to filter out bad ideas (parapsychology; man, that is typo-prone!) and sometimes lets problems go unnoticed (e.g. power poses, much of the priming literature). In the very long run, the data pushes towards correct interpretations, but overcoming the influence of particular personalities or social groups can take decades (“science advances one funeral at a time”).
Edit: There is actually a paper on the funeral effect: http://www.nber.org/digest/mar16/w21788.html