99% impossible: a valid, or falsifiable, internal meta-analysis

Vosgerau, Joachim; Simonsohn, Uri; Nelson, Leif D.; Simmons, Joseph P.

Several researchers have relied on, or advocated for, internal meta-analysis, which involves statistically aggregating multiple studies in a paper to assess their overall evidential value. Advocates of internal meta-analysis argue that it provides an efficient approach to increasing statistical power and solving the file-drawer problem. Here we show that the validity of internal meta-analysis rests on the assumption that no studies or analyses were selectively reported. That is, the technique is only valid if (a) all conducted studies were included (i.e., an empty file drawer), and (b) for each included study, exactly one analysis was attempted (i.e., there was no p-hacking). We show that even very small doses of selective reporting invalidate internal meta-analysis. For example, the kind of minimal p-hacking that increases the false-positive rate of a single study to just 8% increases the false-positive rate of a 10-study internal meta-analysis to 83%. If selective reporting is approximately zero, but not exactly zero, then internal meta-analysis is invalid. For an internal meta-analysis to be valid, (a) it would need to contain exclusively studies that were properly preregistered, (b) those preregistrations would have to be followed in all essential aspects, and (c) the decision of whether to include a given study in the internal meta-analysis would have to be made before any of those studies are run.
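
The compounding described above can be illustrated with a small simulation. The sketch below is not the authors' procedure and is not meant to reproduce the 8% and 83% figures. It rests on assumptions made only for illustration: the true effect is zero, each study offers two analyses whose z-scores correlate at 0.5, the p-hacked report keeps the larger of the two, studies are combined with Stouffer's method, and the test is one-sided at z > 1.96. The function name `simulate` and all of its parameters are hypothetical.

```python
import math
import random


def simulate(n_sims=20000, k=10, rho=0.5, z_crit=1.96, seed=1):
    """Estimate false-positive rates when the true effect is zero.

    Every setting here is an illustrative assumption, not a value
    taken from the paper.
    """
    rng = random.Random(seed)
    resid = math.sqrt(1.0 - rho ** 2)
    hits = {"honest_single": 0, "hacked_single": 0,
            "honest_meta": 0, "hacked_meta": 0}
    for _sim in range(n_sims):
        honest_sum = 0.0
        hacked_sum = 0.0
        for _study in range(k):
            # Two analyses of the same null study, correlated at rho.
            z1 = rng.gauss(0.0, 1.0)
            z2 = rho * z1 + resid * rng.gauss(0.0, 1.0)
            # Assumed minimal p-hacking: report the more favorable one.
            reported = max(z1, z2)
            hits["honest_single"] += z1 > z_crit
            hits["hacked_single"] += reported > z_crit
            honest_sum += z1
            hacked_sum += reported
        # Stouffer's method: sum of z-scores divided by sqrt(k).
        hits["honest_meta"] += honest_sum / math.sqrt(k) > z_crit
        hits["hacked_meta"] += hacked_sum / math.sqrt(k) > z_crit
    return {
        "honest_single": hits["honest_single"] / (n_sims * k),
        "hacked_single": hits["hacked_single"] / (n_sims * k),
        "honest_meta": hits["honest_meta"] / n_sims,
        "hacked_meta": hits["hacked_meta"] / n_sims,
    }


if __name__ == "__main__":
    for name, rate in simulate().items():
        print(f"{name}: {rate:.3f}")
```

Under these assumptions the honest single-study and meta-analytic rates both stay near the nominal level, while the p-hacked meta-analytic rate comes out several times larger than the p-hacked single-study rate. The reason is that the small upward bias in each reported z-score adds up across the k studies, whereas the sampling noise averages out.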