
Why nobody wants to work with 10% false discovery rate (FDR)

jmages · microarray, NGS, Statistics

When I was giving workshops on statistics and data analysis for a major American vendor of biological data analysis software, we explained basic concepts of statistical design using ANOVA (analysis of variance). We used a simple expression microarray data set containing a factor of interest (Trisomy 21: yes/no), a random factor (subject ID) and a nested factor (samples were derived from different organs). This allowed us to build a nice and simple ANOVA model. The hitch, however, was that not a single gene was significantly differentially expressed between Trisomy 21 subjects and normal controls at the usual 1% or 5% false discovery rate thresholds (FDR-adjusted p ≤ 0.01 or p ≤ 0.05).
When we then suggested using a 10% FDR, meaning that, on average, one in ten genes on the result list would be a false positive, a murmur went through the audience: “What? What are they thinking? Seriously?”. This option is clearly not popular at all. But why? Looking at the resulting list of some 20 genes, over 90% are located on chromosome 21. Doesn’t that make a lot of sense? Admittedly, the biological significance of a gene list is not always that obvious, but there are many ways to check it. You might draw on biological expectations (such as finding many TNF-alpha-responsive genes in an immunological experiment) or check Gene Ontology (GO) and pathway involvement. And there is always the possibility of double-checking your NGS or microarray results with alternative methods, for example qPCR or ELISAs.
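What an FDR threshold of 5% or 10% usually refers to in practice is the Benjamini-Hochberg step-up procedure: sort the per-gene p-values, find the largest rank k (out of m tests) with p(k) ≤ k/m · q, and report those k genes. The sketch below is a minimal pure-Python illustration; the 20 p-values are invented for this example (not the actual Trisomy 21 data) and are deliberately chosen so that nothing passes at a 5% FDR while several genes pass at 10%.

```python
def benjamini_hochberg(pvals, q):
    """Return the number of discoveries at FDR level q (BH step-up)."""
    m = len(pvals)
    ranked = sorted(pvals)
    # Find the largest 1-based rank k with p_(k) <= k/m * q;
    # all genes up to and including rank k are reported.
    k = 0
    for i, p in enumerate(ranked, start=1):
        if p <= i / m * q:
            k = i
    return k

# 20 hypothetical per-gene p-values (invented for illustration)
pvals = [0.004, 0.009, 0.013, 0.019, 0.024, 0.031,
         0.05, 0.10, 0.15, 0.20, 0.30, 0.40, 0.50,
         0.60, 0.70, 0.75, 0.80, 0.85, 0.90, 0.95]

print(benjamini_hochberg(pvals, 0.05))  # → 0 (nothing passes at 5% FDR)
print(benjamini_hochberg(pvals, 0.10))  # → 5 (five genes pass at 10% FDR)
```

The point of the example: moving from q = 0.05 to q = 0.10 does not double your error rate per gene, it relaxes the expected *fraction* of false positives in the reported list, which can be the difference between an empty gene list and a usable one.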

All in all, we try to adapt our selection criteria to the overall layout of the experiment. Last but not least, your aim and your success are the most important factors.
By the way, our final argument for the 10% false positive rate was always: what if you were in a casino and in each game you had only a 10% probability of losing? Not too bad, is it?


A man who ‘rejects’ a hypothesis provisionally, as a matter of habitual practice, when the significance is at the 1% level or higher, will certainly be mistaken in not more than 1% of such decisions. For when the hypothesis is correct he will be mistaken in just 1% of these cases, and when it is incorrect he will never be mistaken in rejection. […] However, the calculation is absurdly academic, for in fact no scientific worker has a fixed level of significance at which from year to year, and in all circumstances, he rejects hypotheses; he rather gives his mind to each particular case in the light of his evidence and his ideas.

Sir Ronald A. Fisher, from Statistical Methods and Scientific Inference (1956)