Evidence Convergence and Clinical Development Success

Abstract

We analysed 161,932 drug-target-disease triplets from the Open Targets Platform and ChEMBL to characterise the relationship between public scientific evidence convergence and clinical development outcomes. Our central finding is a "stealth phenomenon": 80.4% of the 70,320 unique target-disease pairs entering clinical development had zero prior public evidence. Among the 16.2% of triplets (26,251 of 161,932) with convergence support, we observe a counterintuitive inverse association with success at every phase: Phase I→II, 68.1% vs. 85.3% (OR = 0.37, P < 10⁻³⁰⁰); Phase II→III, 73.6% vs. 87.0% (OR = 0.41, P = 1.5 × 10⁻²⁹⁵); Phase III→Approval, 81.9% vs. 93.8% (OR = 0.30, P = 3.0 × 10⁻¹⁸⁸). The composite probability of success is 41.0% for convergence-supported programs vs. 69.6% for unsupported programs. We observe an inverted dose-response, in which greater convergence intensity predicts lower success, and consistency across 19 of 21 therapeutic areas. We argue that public evidence convergence is a proxy for novelty and competitive research attention, not mechanistic validation, and that this challenges prevailing target prioritisation frameworks.

Section 01

Introduction

The genetics-success hypothesis has shaped a decade of drug development strategy. This analysis extends the question to all forms of public scientific evidence.

The foundational observation by Nelson et al. (2015) that drug targets with human genetic support are twice as likely to succeed in clinical development has profoundly influenced pharmaceutical R&D strategy [1]. Minikel et al. (2024) refined this finding, demonstrating that the effect persists after controlling for confounders such as disease area, target druggability, and historical trial intensity [2]. The genetics-success association has become a cornerstone of target prioritisation frameworks across the industry, encoded into platform tools, investment theses, and portfolio review criteria [3, 10, 11].

However, a critical question remains largely unaddressed: does this principle extend to other forms of public scientific evidence? The Open Targets Platform aggregates evidence from ten distinct sources [4, 9] — literature mining (Europe PMC), CRISPR functional genomics screens (Project Score), GWAS credible sets, ClinVar and EVA somatic variants, cancer biomarkers, UniProt curated variants, Expression Atlas differential expression, gene burden association studies, IMPC mouse phenotypes, and Reactome pathway annotations. Each source captures a distinct facet of biological signal, and together they constitute a comprehensive view of what the public scientific community considers to be a credible target.

We hypothesised that if genetic support predicts success because it provides mechanistic validation, then broader evidence convergence — the accumulation of multiple independent evidence streams over time — should show a similar or stronger positive association with clinical outcomes. The intuition is straightforward: more evidence should mean more validation, and more validation should mean fewer surprises in the clinic. Our analysis of 161,932 drug-target-disease triplets reveals the opposite. Across every phase transition, every quartile of convergence intensity, and nearly every therapeutic area, public evidence convergence is inversely associated with clinical development success.

Section 02

Methods

Data integration across Open Targets Platform (v24.12) and ChEMBL, with convergence scoring, phase-specific analysis, and temporal validation.

2.1 — Data Sources

We integrated evidence from the Open Targets Platform release v24.12, encompassing all ten standard evidence types, with clinical phase tracking from ChEMBL. After deduplication and harmonisation of target identifiers (Ensembl gene IDs) and disease identifiers (EFO terms), the working corpus comprised 161,932 unique drug-target-disease triplets, mapping to 70,320 unique target-disease pairs entering clinical development at any phase.
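The relationship between triplets and pairs can be illustrated with a minimal sketch. The `Triplet` type, its field names, and the identifiers in the example are hypothetical placeholders rather than the pipeline actually used; the sketch only shows how several drugs sharing a target and a disease collapse into a single target-disease pair.

```python
from typing import Iterable, NamedTuple, Set, Tuple


class Triplet(NamedTuple):
    """Hypothetical record; field names are illustrative assumptions."""
    drug_id: str      # e.g. a ChEMBL molecule ID
    target_id: str    # e.g. an Ensembl gene ID
    disease_id: str   # e.g. an EFO term


def unique_triplets(rows: Iterable[Triplet]) -> Set[Triplet]:
    """Drop exact duplicate drug-target-disease rows."""
    return set(rows)


def unique_pairs(rows: Iterable[Triplet]) -> Set[Tuple[str, str]]:
    """Collapse triplets to (target, disease) pairs, ignoring the drug."""
    return {(t.target_id, t.disease_id) for t in rows}


# Placeholder identifiers, not real records.
rows = [
    Triplet("DRUG_A", "TARGET_1", "DISEASE_X"),
    Triplet("DRUG_A", "TARGET_1", "DISEASE_X"),  # exact duplicate
    Triplet("DRUG_B", "TARGET_1", "DISEASE_X"),  # same pair, different drug
    Triplet("DRUG_B", "TARGET_2", "DISEASE_X"),
]

print(len(unique_triplets(rows)))  # 3
print(len(unique_pairs(rows)))     # 2
```

Because many drugs can share one target-disease pair, the pair count is always at most the triplet count, as with the 70,320 pairs and 161,932 triplets reported here.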

2.2 — Convergence Characterisation

For each target-disease pair, we computed a time-resolved convergence trajectory. The cumulative weighted evidence score S(t) aggregates contributions from all available evidence sources up to year t, weighted by source-specific reliability. The Convergence Score C(t) is a normalisation that constrains the metric to the unit interval:

Definition

C(t) = S(t) / [S(t) + 1.644], where the constant 1.644 approximates the one-sided 95% normal critical value (≈ 1.645) and sets the half-saturation point of the scale: C(t) = 0.5 when S(t) = 1.644, so a single high-confidence evidence stream produces C(t) ≈ 0.5. Convergence Velocity V(t) = C(t) − C(t−1) captures the year-on-year rate of evidence accumulation. Lead time is defined as the interval (in years) between the year of peak convergence velocity and the year of clinical entry; the median lead time is 5.0 years and the mean is 6.9 years.
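The definitions above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the yearly S(t) values in the example are invented, and the source-specific reliability weights that produce S(t) are not given in the text and are therefore taken as an input.

```python
from typing import Dict, Optional

K = 1.644  # half-saturation constant quoted in the definition


def convergence_score(s: float, k: float = K) -> float:
    """C = S / (S + k); maps a non-negative score S into [0, 1)."""
    if s < 0:
        raise ValueError("S(t) is assumed to be non-negative")
    return s / (s + k)


def convergence_trajectory(cumulative_s: Dict[int, float]) -> Dict[int, float]:
    """Normalise a year -> S(t) mapping, ordered by year."""
    return {year: convergence_score(s) for year, s in sorted(cumulative_s.items())}


def velocities(c_by_year: Dict[int, float]) -> Dict[int, float]:
    """V(t) = C(t) - C(t-1), defined only where year t-1 is present."""
    return {
        year: c - c_by_year[year - 1]
        for year, c in c_by_year.items()
        if year - 1 in c_by_year
    }


def lead_time(cumulative_s: Dict[int, float], clinical_entry_year: int) -> Optional[int]:
    """Years between peak convergence velocity and clinical entry.

    Returns None when no velocity can be computed. Ties in peak velocity
    resolve to the earliest year (an assumption; the text does not say).
    """
    v = velocities(convergence_trajectory(cumulative_s))
    if not v:
        return None
    peak_year = max(v, key=v.get)
    return clinical_entry_year - peak_year


# Invented cumulative scores for one hypothetical target-disease pair.
example_s = {2010: 0.0, 2011: 0.5, 2012: 1.644, 2013: 2.0}
print(convergence_score(1.644))      # 0.5
print(lead_time(example_s, 2017))    # 5
```

In the example, velocity peaks in 2012, when S(t) reaches 1.644 and C(t) reaches 0.5, so a clinical entry in 2017 gives a lead time of 5 years.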

2.3 — Probability of Success Analysis

Phase-specific hurdle rates were computed as the proportion of programs entering a given phase that subsequently advanced to the next. Confidence intervals use Wilson score intervals appropriate for binomial proportions [16]. Group comparisons use Pearson chi-squared with Yates continuity correction; odds ratios are reported with Haldane-Anscombe correction where zero cells appear. Dose-response analysis stratifies the convergence-supported cohort into quartiles and uses Spearman rank correlation to test monotonicity. Therapeutic area subgroup analysis uses the EFO disease ontology rolled up to 21 top-level areas. Temporal validation compares pre-2020 and post-2020 cohorts to test whether the association is artefactual to a specific era of pharmaceutical history.
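The interval and effect-size calculations described above can be sketched with the standard library alone. The function names are hypothetical, and the counts in the example are back-calculated from the rounded Phase I→II percentages in this paper, so they are illustrative approximations rather than the underlying data.

```python
import math
from typing import Tuple


def wilson_interval(successes: int, n: int, z: float = 1.959964) -> Tuple[float, float]:
    """Two-sided Wilson score interval for a binomial proportion."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half


def odds_ratio(a: float, b: float, c: float, d: float) -> float:
    """Odds ratio for a 2x2 table.

    a, b = group 1 advanced / did not advance
    c, d = group 2 advanced / did not advance
    Haldane-Anscombe correction (+0.5 to every cell) if any cell is zero.
    """
    if 0 in (a, b, c, d):
        a, b, c, d = (x + 0.5 for x in (a, b, c, d))
    return (a * d) / (b * c)


def yates_chi2(a: int, b: int, c: int, d: int) -> float:
    """Pearson chi-squared statistic with Yates continuity correction (1 df)."""
    n = a + b + c + d
    den = (a + b) * (c + d) * (a + c) * (b + d)
    if den == 0:
        raise ValueError("every row and column total must be non-zero")
    return n * max(abs(a * d - b * c) - n / 2, 0) ** 2 / den


# Approximate Phase I -> II counts reconstructed from rounded rates:
# supported 68.1% of 10,093; unsupported 85.3% of 40,633.
sup_yes, sup_no = 6873, 10093 - 6873
uns_yes, uns_no = 34660, 40633 - 34660

lo, hi = wilson_interval(sup_yes, sup_yes + sup_no)
print(f"{lo:.3f} {hi:.3f}")                                  # 0.672 0.690
print(f"{odds_ratio(sup_yes, sup_no, uns_yes, uns_no):.2f}")  # 0.37
```

With these reconstructed counts, the sketch reproduces the reported Phase I→II interval of [67.2, 69.0] and odds ratio of 0.37 to the stated precision.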

Section 03

Results

The stealth phenomenon, the inverse association, the dose-response gradient, and the consistency across therapeutic areas.

3.1 — The Stealth Phenomenon

Of the 70,320 unique target-disease pairs entering clinical development in our corpus, 56,542 (80.4%) had zero prior public evidence in the Open Targets Platform at the time of clinical entry. Only 13,778 pairs (19.6%) had any prior public evidence at all, and only 9,145 (13.0%) had measurable convergence dynamics — that is, sufficient longitudinal evidence to compute a convergence trajectory.

Metric	Value
Total unique (target, disease) pairs	70,320
Zero prior public evidence	56,542 (80.4%)
Any prior public evidence	13,778 (19.6%)
Measurable convergence dynamics	9,145 (13.0%)
Median lead time	5.0 years
Mean lead time	6.9 years

Table 1. Coverage of public evidence among clinical-stage target-disease pairs.

The Stealth Phenomenon

Four out of five target-disease pairs enter clinical development without any prior public evidence in the Open Targets Platform. The vast majority of drug development operates outside the public evidence landscape entirely.

3.2 — Convergence-Supported Programs Have Lower Success Rates

Restricting attention to programs with at least some convergence support, we compared phase-specific hurdle rates against the unsupported (stealth) cohort. At every transition, convergence-supported programs were substantially less likely to advance, with odds ratios between 0.30 and 0.41.

Phase Transition	Supported	Supported 95% CI	Supported n	Unsupported	Unsupported 95% CI	Unsupported n	OR	P
Phase I → II	68.1%	[67.2, 69.0]	10,093	85.3%	[84.9, 85.6]	40,633	0.37	<10⁻³⁰⁰
Phase II → III	73.6%	[72.7, 74.4]	11,142	87.0%	[86.8, 87.3]	62,871	0.41	1.5 × 10⁻²⁹⁵
Phase III → Approval	81.9%	[80.8, 82.9]	5,016	93.8%	[93.5, 94.1]	32,177	0.30	3.0 × 10⁻¹⁸⁸
Composite (I → Approval)	41.0%	—	—	69.6%	—	—	—	—

Table 2. Phase-specific hurdle rates by convergence support status.
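The composite row in Table 2 is consistent with multiplying the three phase-specific hurdle rates. The text does not state how the composite was computed, so treating it as a simple product is an assumption; the sketch below only checks that the product reproduces the reported figures.

```python
import math

# Phase-specific hurdle rates from Table 2 (I->II, II->III, III->Approval).
supported = [0.681, 0.736, 0.819]
unsupported = [0.853, 0.870, 0.938]


def composite_pos(rates):
    """Composite probability of success as the product of phase rates.

    Assumes a program must clear each transition in sequence.
    """
    return math.prod(rates)


pos_supported = composite_pos(supported)
pos_unsupported = composite_pos(unsupported)

print(round(pos_supported * 100, 1))                      # 41.0
print(round(pos_unsupported * 100, 1))                    # 69.6
print(round((pos_unsupported - pos_supported) * 100, 1))  # 28.6
```

The difference between the two composites is about 28.6 percentage points.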

Key figures: OR 0.37 (Phase I→II); composite PoS 41.0% vs. 69.6% (supported vs. unsupported); 19 of 21 therapeutic areas consistent.

3.3 — Inverted Dose-Response

Within the convergence-supported cohort, we stratified programs into quartiles by convergence intensity. The relationship is monotonic and inverted: each successive quartile of higher convergence is associated with a lower pooled hurdle rate, falling from 77.4% in Q1 to 68.0% in Q4.

Quartile	Pooled Hurdle Rate	95% CI	n
Q1 (lowest convergence)	77.4%	[76.4, 78.4]	6,581
Q2	74.9%	[73.8, 75.9]	6,556
Q3	71.8%	[70.7, 72.9]	6,559
Q4 (highest convergence)	68.0%	[66.9, 69.1]	6,555

Table 3. Pooled hurdle rates by convergence intensity quartile.

Spearman rank correlation between hurdle outcome and convergence dynamics was negative for both the peak velocity (ρ = −0.082) and the maximum confidence (ρ = −0.094) summary statistics. Both correlations are weak in magnitude but highly significant at this sample size.
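As a quick check of the monotonic ordering in Table 3, the sketch below computes a Spearman rank correlation over the four quartile-level rates. This is an aggregate illustration over four points only, not the program-level correlation reported above, and the helper names are hypothetical. It uses the textbook formula that holds when there are no ties.

```python
from typing import List, Sequence


def ranks(values: Sequence[float]) -> List[int]:
    """Ranks from 1 (smallest) to n (largest); assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman_no_ties(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman rho via 1 - 6 * sum(d^2) / (n * (n^2 - 1)); valid without ties."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length of at least 2")
    if len(set(x)) != len(x) or len(set(y)) != len(y):
        raise ValueError("this simplified formula does not handle ties")
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * d2 / (n * (n * n - 1))


quartile = [1, 2, 3, 4]                  # Q1 (lowest) to Q4 (highest convergence)
hurdle_rate = [77.4, 74.9, 71.8, 68.0]   # pooled hurdle rates from Table 3

print(spearman_no_ties(quartile, hurdle_rate))  # -1.0
```

A value of −1.0 here only says that the four quartile rates are strictly decreasing. The much smaller program-level coefficients reflect the large outcome variation among individual programs within each quartile.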

Monotonic Gradient

Success rates decline consistently with increasing convergence intensity. This is the opposite of what validation-based frameworks would predict — and it holds whether convergence is measured by peak velocity or maximum confidence.

3.4 — Therapeutic Area Consistency

The inverse association is not driven by any single disease area. Across the 21 top-level therapeutic areas in the EFO ontology, 19 show the same inverse pattern. Infectious disease shows the strongest effect (OR 0.23), cancer is the largest cohort (OR 0.39, n = 91,925), and nervous system disorders show an intermediate effect (OR 0.50). Only two areas do not show a statistically significant inverse pattern: visual system (OR 0.84, not statistically significant) and pregnancy/perinatal conditions (OR 1.18, not statistically significant). In both cases the cohorts are small.

3.5 — Temporal Stability

To rule out the possibility that the inverse association is an artefact of a specific era of pharmaceutical history, we split the cohort at 2020 and re-ran the analysis. The pre-2020 odds ratio was 0.37; the post-2020 odds ratio was 0.32. Far from weakening, the inverse association is somewhat stronger in the later cohort, which is consistent with an environment in which target prioritisation frameworks have become more reliant on aggregated public evidence.

Section 04

Discussion

Why public evidence convergence predicts failure, not success — and what this means for target prioritisation.

4.1 — The Novelty Confound

The most parsimonious explanation for our findings is that public evidence convergence measures research attention, and research attention flows toward novel and uncertain biology. Mechanisms that are well understood and clinically validated generate fewer publications, fewer GWAS follow-up studies, and fewer CRISPR screens, because there is simply less to discover. Mechanisms that are novel, contested, or under active investigation generate dense publication trails and rapid evidence accumulation. Novel mechanisms, in turn, carry higher clinical risk: first-in-class drugs have historically had lower approval rates than follow-on drugs targeting validated biology [5, 6, 18]. Under this explanation, the stealth cohort is enriched for follow-on programs targeting well-understood mechanisms, which is exactly the population in which clinical success is most likely.

4.2 — Distinguishing Convergence from Genetic Support

Our findings do not contradict Nelson et al. or Minikel et al. [1, 2]. Genetic support is a binary, mechanistically specific signal: a genetic variant alters a target's function and produces a phenotype consistent with the disease. Convergence is an aggregate, time-resolved velocity measure that integrates ten heterogeneous evidence types, most of which are not mechanistic in any direct sense (literature mining, differential expression, pathway annotation). Genetic support validates the causal relationship between target perturbation and disease; convergence tracks the volume and velocity of attention. The two signals are categorically different, and the conflation of them in target prioritisation tooling is, we argue, a source of systematic bias.

4.3 — The Research Attention Paradox

If public evidence is a measure of attention, then high convergence is a marker of unresolved scientific debate, not settled validation. The targets that attract the most public scientific interest are precisely the ones where the underlying biology is least settled — where competing groups are publishing contradictory findings, where new screens are being run because old ones produced inconclusive results, where the literature is dense because the question is not yet answered. Established mechanisms with clinically successful precedent generate comparatively little new science, because there is little new science to do.

The Research Attention Paradox

The targets that attract the most public scientific interest are precisely the ones where the biology is least settled. High convergence velocity is a marker of scientific excitement, not scientific confidence.

4.4 — Survivorship Bias

The absolute hurdle rates we observe (85.3% to 93.8% for unsupported programs) exceed published industry benchmarks of 52% to 58% from BIO/Informa, Hay et al., and Wong et al. [5, 6, 12]. This discrepancy is informative. ChEMBL preferentially captures programs that are publicly reported, indexed, and tracked. Stealth programs that fail are systematically underrepresented; they may never reach a publication, a press release, or a regulatory filing. Our analysis cannot fully correct for this, so the hurdle rates are best read as a basis for relative comparison between cohorts rather than as population-level estimates. Crucially, however, survivorship bias would have to be implausibly differential, heavily favouring the stealth cohort, to fully explain the 28.6-percentage-point composite gap (69.6% vs. 41.0%).

4.5 — Case Study: PD-1/PD-L1

The PD-1 axis offers an instructive precedent. Yasumasa Ishida and Tasuku Honjo's discovery of PD-1 (Ishida et al., 1992) [7] preceded clinical development by approximately two decades, and the public evidence base remained sparse throughout that period. Topalian et al.'s landmark clinical proof-of-concept paper (NEJM, 2012) [8] preceded — rather than followed — the explosion of PD-1/PD-L1 publications that now dominates oncology literature. This is a common pattern in transformative therapeutics: the science follows the medicine. By the time public evidence has converged around a target, the most informative clinical signals have often already been generated internally.

4.6 — Implications for Target Prioritisation

Current target prioritisation frameworks that weight public evidence convergence positively — whether explicitly through scoring rubrics or implicitly through analyst attention — may be introducing systematic bias toward higher-risk targets. We do not argue that public evidence is uninformative. We argue that target prioritisation should distinguish carefully between evidence types that validate mechanism (chiefly human genetic support) and evidence types that track attention (literature volume, publication velocity, pathway annotation density). Failing to make this distinction risks penalising precisely the kind of well-understood, clinically tractable biology that drives the bulk of approved medicines.

4.7 — Limitations

This is an observational study; causal inference is not possible from these data. ChEMBL coverage and completeness vary across companies, modalities, and eras. The convergence score aggregates heterogeneous evidence types that differ in mechanistic specificity, and alternative weighting schemes may produce somewhat different magnitudes (though the direction is robust to reasonable perturbations). Lead time calculations assume that public evidence precedes — rather than reflects — internal pharmaceutical decision-making, which is true on average but not universally. Finally, we cannot fully exclude unmeasured confounders correlated with both convergence and clinical risk.

Section 05

Conclusion

Public evidence convergence is a proxy for novelty and risk, not for validation.

Across 161,932 drug-target-disease triplets, we find that public scientific evidence convergence is inversely associated with clinical development success at every phase, in nearly every therapeutic area, and in both pre- and post-2020 cohorts. The inverse association is monotonic in convergence intensity, persists after stratification, and strengthens rather than weakens over time. Together, these findings challenge the intuitive notion that "more evidence equals a better target" — at least when the evidence in question is heterogeneous public data aggregated from the literature, functional genomics screens, and pathway annotations.

The most plausible interpretation is that evidence convergence tracks the location of the scientific frontier rather than the location of validated biology. Frontier targets are inherently riskier: their mechanisms are less settled, their phenotypic readouts are less mature, and the failure modes have not yet been exhaustively characterised. Target prioritisation frameworks should therefore distinguish carefully between mechanistic validation (chiefly human genetic support, which has a defensible causal interpretation) and research attention (literature volume, publication velocity, and aggregate evidence accumulation), rather than collapsing them into a single composite score.

Most strikingly, the 80.4% stealth phenomenon suggests that the vast majority of successful drug development happens entirely outside the public evidence landscape. Four out of five clinical programs enter development without any prior public scientific evidence in Open Targets — and as a class, those programs succeed at substantially higher rates than their convergence-supported counterparts. This is not an argument against open science. It is an argument for epistemic humility about what public evidence aggregation can and cannot tell us about the future of any individual drug program.

References

[1] Nelson, M.R. et al. The support of human genetic evidence for approved drug indications. Nature Genetics, 2015.
[2] Minikel, E.V. et al. Refining the impact of genetic evidence on clinical success. Nature, 2024.
[3] King, E.A. et al. Are drug targets with genetic support twice as likely to be approved? PLOS Genetics, 2019.
[4] Buniello, A. et al. Open Targets Platform: new developments. Nucleic Acids Research, 2025.
[5] BIO/Informa Pharma Intelligence. Clinical Development Success Rates 2011–2020. 2022.
[6] Hay, M. et al. Clinical development success rates for investigational drugs. Nature Biotechnology, 2014.
[7] Ishida, Y. et al. Induced expression of PD-1. EMBO Journal, 1992.
[8] Topalian, S.L. et al. Safety, activity, and immune correlates of anti-PD-1 antibody in cancer. NEJM, 2012.
[9] Ochoa, D. et al. Open Targets Platform: accelerating systematic drug-target identification. Nucleic Acids Research, 2021.
[10] Hingorani, A.D. et al. Improving the odds of drug development success through human genomics. Scientific Reports, 2019.
[11] Finan, C. et al. The druggable genome and support for target identification. Science Translational Medicine, 2017.
[12] Wong, C.H. et al. Estimation of clinical trial success rates. Biostatistics, 2019.
[13] Dowden, H. & Munro, J. Trends in clinical success rates and therapeutic focus. Nature Reviews Drug Discovery, 2019.
[14] DiMasi, J.A. et al. Innovation in the pharmaceutical industry. Journal of Health Economics, 2016.
[15] Authors. Convergence dynamics analysis. Manuscript in preparation, 2026.
[16] Wilson, E.B. Probable inference, the law of succession. JASA, 1927.
[17] Lanthier, M. et al. An improved approach to measuring drug innovation. Health Affairs, 2013.
[18] Eder, J. et al. The discovery of first-in-class drugs. Nature Reviews Drug Discovery, 2014.
[19] Swinney, D.C. & Anthony, J. How were new medicines discovered? Nature Reviews Drug Discovery, 2011.
[20] Paul, S.M. et al. How to improve R&D productivity. Nature Reviews Drug Discovery, 2010.
[21] Cook, D. et al. Lessons learned from the fate of AstraZeneca's drug pipeline. Nature Reviews Drug Discovery, 2014.
[22] Morgan, P. et al. Impact of a five-dimensional framework on R&D productivity. Nature Reviews Drug Discovery, 2018.