I read the news reports with great interest, since I am not entirely convinced that heavy marijuana use is as benign as people say… But I wasn’t willing to pay $30 to read the research paper. The one comment on the actual paper says basically the same thing as Mr. Lior. What I can’t believe is that it got reported so widely and none of the mainstream media pointed out even the basic flaws, for example the sample size of 20…

Hell, 20 gets into “the plural of anecdote is not data” range. Both The Journal of Neuroscience and the MSM should be ashamed of themselves.

I don’t know, 20 participants is a lot for some studies that involve in-depth examination of the brain. In social psychology, large sample sizes are needed because the effect being studied is very small and easily overwhelmed by random noise. But if the effect is really large, you don’t need a large sample size to see it. My question, more than anything, is how long the effect lasts.
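To put rough numbers on the sample-size point, here is a minimal simulation sketch. It is not the paper's analysis: it assumes two groups of 10 (20 subjects total), normally distributed measurements with a standard deviation of 1, and a plain pooled two-sample t-test. The effect sizes (0.2, 0.8, 2.0 standard deviations) and the function names are made up for illustration.

```python
import random
import statistics

N_PER_GROUP = 10  # assumption: 20 subjects split evenly into two groups

# Two-sided 5% critical value of Student's t with 18 degrees of freedom
# (2 * 10 - 2). Hard-coded so the sketch needs only the standard library.
T_CRIT_DF18 = 2.101


def pooled_t(a, b):
    """Two-sample t statistic with pooled variance (equal group sizes)."""
    n = len(a)
    pooled_var = (statistics.variance(a) + statistics.variance(b)) / 2
    standard_error = (pooled_var * 2 / n) ** 0.5
    return (statistics.mean(a) - statistics.mean(b)) / standard_error


def estimated_power(effect_size, trials=2000, seed=0):
    """Fraction of simulated studies in which the effect reaches p < 0.05."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        control = [rng.gauss(0.0, 1.0) for _ in range(N_PER_GROUP)]
        treated = [rng.gauss(effect_size, 1.0) for _ in range(N_PER_GROUP)]
        if abs(pooled_t(treated, control)) > T_CRIT_DF18:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    for d in (0.2, 0.8, 2.0):  # hypothetical small, medium-large, huge effects
        print(f"effect size {d}: detected in {estimated_power(d):.0%} of runs")
```

Under these assumptions a huge effect shows up in nearly every simulated study of 20 people, while a small one is missed most of the time, which is the trade-off described above.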

It is actually worse than 20. There were only 7 self-reported casual users. I suppose you could get away with 7 subjects if the difference were dramatic, like measuring the impact of bullets in the head on brain functioning: 7 people would clearly be enough to conclude that being shot in the head impairs functioning. Anyway, the effect wasn’t large, not large enough to be statistically significant. The author came up with a new term, “trending toward statistically significant”, which, if I remember correctly from statistics class, is no different from statistically insignificant, at least at a high confidence level such as 90% or 95%.

Also the value of “statistical significance” is wide open to debate. As I recall (no reference at hand) the guy who came up with the notion of statistical significance hated the idea that it would be applied as a filter for scientific results.

So let us say for the sake of argument you have very high confidence in the statistics arising from your data (error, variance and so on) and you have demonstrated that the results are “statistically significant” with respect to your hypothesis. That usually means that, if there were really no effect, results this extreme would turn up by chance less than 5% of the time, just from bad die rolls in the sample. So 1 in 20 such studies are bogus? But more importantly, the meaning of the results is still open to debate. So you reject the null hypothesis. Have you detected a correlation or a causation? Is there a hidden variable controlling your sample? There may be no way to tell from such elementary forms of analysis, and yet that is often the key consideration made by reviewers in deciding whether science is good or bad.

More the way it is applied and interpreted. The p level is a useful measure of whether a result is likely due to chance, and that’s how it should be used, but with a higher p threshold (a higher % attributed to chance) the likelihood of making a Type I error (rejecting the null hypothesis when it is true) is higher. Recent* discussion has called for the threshold p value (0.05 has long been “the norm”) to be lower than 0.05 (i.e. lower than 5% likelihood due to chance), more like 0.005 or 0.001, to reduce the chance of Type I errors. If you do a statistical test and your p value is < 0.0001 then you can be pretty sure you’re looking at an actual effect, taking into account other factors (e.g. r value, a measure of the strength of the association), other variables involved in the analysis, common sense, etc.

You are right though that alternatives to null-hypothesis testing have been proposed (e.g. Bayesian stats) to directly address this, but p values are still useful when applied correctly.

*Recent as in recently brought up, yet again, because this has been going on for a long time. Most people who use statistics are aware that 0.05 should be used with great caution.
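To make the Type I error point concrete, here is a small sketch that simulates studies in which the null hypothesis is true by construction and counts how often they still come out “significant”. It assumes two groups of 10 drawn from the same normal distribution with a known standard deviation of 1, so a simple z-test applies; the group size, trial count, and function names are illustrative choices, not anything from the paper.

```python
import random
from statistics import NormalDist, mean


def null_p_value(rng, n=10):
    """Two-sided z-test p-value for two groups with NO real difference."""
    a = [rng.gauss(0.0, 1.0) for _ in range(n)]
    b = [rng.gauss(0.0, 1.0) for _ in range(n)]
    # Known sigma = 1, so the standard error of the difference is sqrt(2 / n).
    z = (mean(a) - mean(b)) / (2 / n) ** 0.5
    return 2 * (1 - NormalDist().cdf(abs(z)))


def false_positive_rate(threshold, trials=10000, seed=1):
    """Share of null-effect studies that fall below the p threshold."""
    rng = random.Random(seed)
    hits = sum(null_p_value(rng) < threshold for _ in range(trials))
    return hits / trials


if __name__ == "__main__":
    for threshold in (0.05, 0.005):
        rate = false_positive_rate(threshold)
        print(f"threshold {threshold}: {rate:.2%} false positives")
```

The false positive rate tracks the threshold: roughly 5% of these no-effect studies pass at 0.05, and roughly a tenth as many at 0.005.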

I think statistical significance is a first-level filter. If a study can’t show statistical significance, then it’s hardly worth publishing, much less getting widely quoted in the popular press like this one. I suspect way more than 5% of studies are wrong. How many medical or astronomical studies published 100 or even 50 years ago are actually correct? Certainly not 95%; my bet is that closer to 95% are at least partially wrong. As you suggest, statistical significance is only a first step. In the case of this study there is also the question of correlation versus causation, among many other issues.
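A back-of-the-envelope sketch of why more than 5% of “significant” findings can be wrong even when everyone uses p < 0.05 honestly: the share of false positives among significant results depends on how many tested hypotheses are true to begin with and on statistical power. The input numbers below are purely hypothetical, picked only to show the arithmetic, and the function name is invented.

```python
def share_of_false_positives(prior_true, power, alpha=0.05):
    """Among results that reach significance, the fraction with no real effect.

    prior_true: assumed fraction of tested hypotheses that are actually true
    power:      assumed chance a real effect reaches significance
    alpha:      significance threshold (false positive rate under the null)
    """
    false_pos = alpha * (1 - prior_true)   # null is true, test says "effect"
    true_pos = power * prior_true          # effect is real and detected
    return false_pos / (false_pos + true_pos)


if __name__ == "__main__":
    # Hypothetical scenario: long-shot hypotheses, underpowered studies.
    print(f"{share_of_false_positives(0.1, 0.5):.0%}")   # prints 47%
    # Hypothetical scenario: plausible hypotheses, well-powered studies.
    print(f"{share_of_false_positives(0.5, 0.8):.0%}")   # prints 6%
```

So with long-shot hypotheses and small samples, nearly half of the significant results could be false under these made-up inputs, before bias or sloppy methods even enter the picture.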

Marijuana is a drug, and a very popular one, and if someone wants to show that it is a dangerous drug for young people, then they damn well should follow the protocols for doing a drug test. I am no expert on clinical drug testing, but I’ve listened to enough Dr./Entrepreneurs describe the process to know this isn’t close.

The blogger says this about the study.

This is quite possibly the worst paper I’ve read all year (as some of my previous blog posts show I am saying something with this statement). Here is a breakdown of some of the issues with the paper:

Yes. Probably a lot more than 5% of studies are wrong, but not necessarily due to not understanding statistics, more due to either misfeasance, malfeasance, or nonfeasance, or often a combination, when there is some reward in store for a positive result. Look at In the Pipeline on bad ALS studies in mice. As you can see, the published results are not only wrong, but often are so completely wrong as to be ludicrous. The difference between a published +35% lifespan increase due to minocycline and the new study’s results of -2% is pretty dramatic…