Why public evidence convergence predicts failure, not success — and what this means for target prioritisation.
4.1 — The Novelty Confound
The most parsimonious explanation for our findings is that public evidence convergence measures research attention, and research attention flows toward novel and uncertain biology. Mechanisms that are well understood and clinically validated generate fewer publications, fewer GWAS follow-up studies, and fewer CRISPR screens, because there is simply less left to discover. Mechanisms that are novel, contested, or under active investigation generate dense publication trails and rapid evidence accumulation. Novel mechanisms, in turn, carry higher clinical risk: first-in-class drugs have historically had lower approval rates than follow-on drugs targeting validated biology [5, 6, 18]. The stealth cohort is therefore enriched for follow-on programs targeting well-understood mechanisms, which is exactly the population in which clinical success is most likely.
4.2 — Distinguishing Convergence from Genetic Support
Our findings do not contradict Nelson et al. or Minikel et al. [1, 2]. Genetic support is a binary, mechanistically specific signal: a genetic variant alters a target's function and produces a phenotype consistent with the disease. Convergence is an aggregate, time-resolved velocity measure that integrates ten heterogeneous evidence types, most of which (literature mining, differential expression, pathway annotation) are not mechanistic in any direct sense. Genetic support validates the causal relationship between target perturbation and disease; convergence tracks the volume and velocity of attention. The two signals are categorically different, and we argue that conflating them in target prioritisation tooling is a source of systematic bias.
4.3 — The Research Attention Paradox
If public evidence is a measure of attention, then high convergence is a marker of unresolved scientific debate, not settled validation. The targets that attract the most public scientific interest are precisely the ones where the underlying biology is least settled — where competing groups are publishing contradictory findings, where new screens are being run because old ones produced inconclusive results, where the literature is dense because the question is not yet answered. Established mechanisms with clinically successful precedent generate comparatively little new science, because there is little new science to do.
4.4 — Survivorship Bias
The absolute hurdle rates we observe (85.3% to 93.8% for unsupported programs) exceed the published industry benchmarks of 52% to 58% from BIO/Informa, Hay et al., and Wong et al. [5, 6, 12]. The discrepancy is itself informative: it indicates that failures are under-recorded in our source data. ChEMBL preferentially captures programs that are publicly reported, indexed, and tracked. Stealth programs that fail are systematically underrepresented; they may never reach a publication, a press release, or a regulatory filing. Our analysis cannot fully correct for this, and the absolute rates should therefore be read as relative comparisons between cohorts rather than as population-level estimates. Crucially, however, survivorship bias would have to be implausibly differential, heavily favouring the stealth cohort, to fully explain a 28-percentage-point composite gap.
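To give a sense of scale for the survivorship argument, the sketch below computes what fraction of failed programs would have to be missing from a database for a given true success rate to appear as a given observed rate. It is a back-of-envelope model and not part of our analysis: it assumes every success is recorded and that failures go unrecorded independently of one another. The function name `missing_failure_fraction` and the choice of inputs (the endpoints of the ranges quoted above) are illustrative assumptions.

```python
def missing_failure_fraction(true_rate: float, observed_rate: float) -> float:
    """Fraction of failures that must be unrecorded for `true_rate` to
    appear as `observed_rate`.

    Toy model (assumption): all successes are recorded; each failure is
    kept with the same probability.
    """
    if not (0.0 < true_rate <= observed_rate < 1.0):
        raise ValueError("expected 0 < true_rate <= observed_rate < 1")
    # observed = s / (s + f * kept), with s = true_rate and f = 1 - true_rate.
    # Solving for the kept share of failures:
    kept = true_rate * (1.0 - observed_rate) / (observed_rate * (1.0 - true_rate))
    return 1.0 - kept


# Illustrative inputs only: benchmark and observed ranges quoted in the text.
for true_rate in (0.52, 0.58):
    for observed_rate in (0.853, 0.938):
        m = missing_failure_fraction(true_rate, observed_rate)
        print(f"true={true_rate:.2f} observed={observed_rate:.3f} "
              f"-> failures missing={m:.1%}")
```

Under this toy model, what matters for the composite gap is the difference in missing-failure fractions between the two cohorts, not the level in either one alone.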
4.5 — Case Study: PD-1/PD-L1
The PD-1 axis offers an instructive precedent. The discovery of PD-1 by Yasumasa Ishida and Tasuku Honjo (Ishida et al., 1992) [7] preceded clinical proof of concept by approximately two decades, and the public evidence base remained sparse throughout that period. Topalian et al.'s landmark clinical proof-of-concept paper (NEJM, 2012) [8] preceded, rather than followed, the explosion of PD-1/PD-L1 publications that now dominates the oncology literature. This is a common pattern in transformative therapeutics: the science follows the medicine. By the time public evidence has converged around a target, the most informative clinical signals have often already been generated internally.
4.6 — Implications for Target Prioritisation
Current target prioritisation frameworks that weight public evidence convergence positively — whether explicitly through scoring rubrics or implicitly through analyst attention — may be introducing systematic bias toward higher-risk targets. We do not argue that public evidence is uninformative. We argue that target prioritisation should distinguish carefully between evidence types that validate mechanism (chiefly human genetic support) and evidence types that track attention (literature volume, publication velocity, pathway annotation density). Failing to make this distinction risks penalising precisely the kind of well-understood, clinically tractable biology that drives the bulk of approved medicines.
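One way to implement the distinction drawn above is to keep mechanism-validating evidence and attention-tracking evidence in separate channels instead of summing them into one score. The sketch below is a minimal illustration under stated assumptions: the evidence-type labels, the `TargetEvidence` container, and the `split_evidence` helper are hypothetical names chosen for this example, and do not reflect the schema of any particular prioritisation tool or of our convergence score.

```python
from dataclasses import dataclass

# Hypothetical grouping of evidence types (illustrative labels only).
MECHANISM_TYPES = {"human_genetic_support"}
ATTENTION_TYPES = {
    "literature_volume",
    "publication_velocity",
    "pathway_annotation_density",
}


@dataclass(frozen=True)
class TargetEvidence:
    target: str
    scores: dict[str, float]  # evidence type -> score in [0, 1]


def split_evidence(ev: TargetEvidence) -> tuple[float, float]:
    """Return (mechanism_score, attention_score) as separate channel means.

    A channel with no available evidence types is reported as 0.0.
    """
    def mean_of(types: set[str]) -> float:
        vals = [ev.scores[t] for t in types if t in ev.scores]
        return sum(vals) / len(vals) if vals else 0.0

    return mean_of(MECHANISM_TYPES), mean_of(ATTENTION_TYPES)


# Usage with made-up scores for a placeholder target.
example = TargetEvidence(
    "TARGET_A",
    {"human_genetic_support": 1.0, "literature_volume": 0.2, "publication_velocity": 0.4},
)
mechanism, attention = split_evidence(example)
print(f"{example.target}: mechanism={mechanism:.2f}, attention={attention:.2f}")
```

Reporting the two channels side by side leaves the weighting decision to the analyst, so a quiet literature does not silently lower the ranking of a genetically supported target.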
4.7 — Limitations
This is an observational study; causal inference is not possible from these data. ChEMBL coverage and completeness vary across companies, modalities, and eras. The convergence score aggregates heterogeneous evidence types that differ in mechanistic specificity, and alternative weighting schemes may produce somewhat different magnitudes (though the direction is robust to reasonable perturbations). Lead-time calculations assume that public evidence precedes, rather than reflects, internal pharmaceutical decision-making; this holds on average but not universally. Finally, we cannot fully exclude unmeasured confounders correlated with both convergence and clinical risk.