June 2025

Can AI predict clinical trial outcomes?

We asked five leading LLMs to predict the outcome of a high-profile clinical trial.

Specifically, we asked each model to predict the full results of Summit Therapeutics' much-watched HARMONi trial.

Last year, Summit's ivonescimab, an anti-PD-1xVEGF bispecific antibody, stunned the biotech world when it released data positioning it as potentially better than Keytruda in non-small cell lung cancer (NSCLC). Ivonescimab, originally developed by Chinese biotech Akeso, helped draw the industry's attention to the growing potential of Chinese biotechs to develop impactful, innovative medicines.

Last week, Summit announced disappointing data from its HARMONi study, which did not show a statistically significant improvement in overall survival (OS) as of the date of the analysis. Because an OS benefit is required for FDA approval, a huge question among investors is: "when the data are mature, will there be a statistically significant OS benefit?"

See how AI answered this question:

Are the LLMs' answers perfect? No (ChatGPT o3's statistical analysis contained significant mathematical errors).

Are they helpful? Well, an estimated probability of trial success ranging from 35% to 75% is not very helpful on its own. But the overall analysis, and some of the sources cited, do have value.

How much value these answers provide depends on your role and expertise. If you have access to sophisticated tools for simulating clinical trials, or you are a hedge fund analyst covering immuno-oncology, maybe the LLMs are not that helpful, except as a sanity check.

But if you don't specialize in these areas, need a quick answer, or are doing a first draft of an analysis, these responses are much more informative than pulling the average oncology trial success rates from the literature.

The AI answers are just a starting point. They give you a head start, but there is always room for you to run farther.

Is this a fair test of LLM capabilities?

These examples represent the floor of what these models are capable of. I did not test different prompts, ask follow-up questions, provide additional context or information to the models, or design any custom tools or workflows to help them accomplish the task more effectively.

Given the right tools and guidance, these LLMs could perform significantly better.

For example, an expert statistician could write custom software for statistical analysis of oncology study results, and give the LLMs access to this software through their "tool calling" functionality.
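As a rough illustration, the sketch below shows what such a tool might look like. The function name, its parameters, and the accompanying schema are all hypothetical; the math is a textbook normal approximation to the log-rank test (Schoenfeld's formula), not a validated trial simulator, and the schema assumes an OpenAI-style tool-calling interface.

```python
import math


def normal_cdf(x: float) -> float:
    """Standard normal CDF, computed from the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def normal_ppf(p: float) -> float:
    """Inverse of normal_cdf, found by bisection (accurate enough here)."""
    lo, hi = -10.0, 10.0
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if normal_cdf(mid) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def os_success_probability(hazard_ratio: float, events: int,
                           alpha: float = 0.05) -> float:
    """Approximate probability that a final OS analysis reaches two-sided
    significance at level `alpha`, given an assumed true hazard ratio and
    the number of death events at the analysis (1:1 randomization).

    Under the normal approximation, the log-rank Z statistic has mean
    -ln(HR) * sqrt(events) / 2, so power = Phi(mean - z_alpha).
    """
    mean_z = -math.log(hazard_ratio) * math.sqrt(events) / 2.0
    z_alpha = normal_ppf(1.0 - alpha / 2.0)
    return normal_cdf(mean_z - z_alpha)


# Hypothetical wiring: an OpenAI-style tool schema the LLM could call.
OS_TOOL_SCHEMA = {
    "type": "function",
    "function": {
        "name": "os_success_probability",
        "description": "Estimate the probability that a trial's final OS "
                       "analysis is statistically significant.",
        "parameters": {
            "type": "object",
            "properties": {
                "hazard_ratio": {"type": "number"},
                "events": {"type": "integer"},
                "alpha": {"type": "number"},
            },
            "required": ["hazard_ratio", "events"],
        },
    },
}
```

With this in hand, the model can answer "what if the true hazard ratio is 0.8 with 400 events?" by calling the tool instead of doing arithmetic in its head, which is exactly where o3 went wrong above.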

Or, you could just keep pushing the AI to improve (the same way you would coach an intern). Ask the AI to identify errors in its reasoning, tell it to go deeper on a certain topic, challenge its thinking, etc.

These techniques significantly improve the quality of the output because they leverage human expertise and judgment. Humans + AI > AI alone.

Making AI work for you

Maybe asking AI to predict clinical trial results isn't that useful to you. Maybe that isn't part of your job, or maybe you already have well-researched perspectives on the trials that matter to you.

But this use case doesn't take advantage of the full potential of LLMs.

The true power of AI comes with scale -- the ability to produce hundreds or thousands of these responses almost for free.

You can use no-code tools (or vibe coding) to create AI workflows and agents to do this work at scale.

Reading every news release and updating your valuation models automatically.

Sifting through each ASCO abstract and identifying which data represent potential improvements to the standard of care.

Parsing thousands of earnings transcripts to identify shifts in prescriber behavior and diagnosis rates.
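To make "at scale" concrete, here is a minimal sketch of the fan-out pattern behind all three examples: one prompt applied to many documents concurrently. The `call_llm` argument is a placeholder for whichever model client you use, not a real library call, so the orchestration logic is illustrative rather than tied to any vendor.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable


def run_at_scale(
    documents: Iterable[str],
    prompt: str,
    call_llm: Callable[[str], str],
    max_workers: int = 8,
) -> list[str]:
    """Apply one prompt to many documents concurrently.

    `call_llm` is a stand-in for your model client: it takes the full
    prompt text and returns the model's answer as a string. Results come
    back in the same order as the input documents.
    """
    def answer_one(doc: str) -> str:
        return call_llm(f"{prompt}\n\n---\n\n{doc}")

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(answer_one, documents))
```

With a real client plugged in, the same loop handles ten ASCO abstracts or ten thousand earnings transcripts; the marginal cost of each extra answer is close to zero.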

You still need a human to review the data, to stitch together the LLMs and integrate external resources, and to ask the right questions.

But fully integrating these tools into your workflow can fundamentally change how you work, for the better.

Try it yourself

If you aren't using AI in this way, we recommend taking a couple of hours on the weekend and trying to build an AI workflow yourself.

Pick a task that is fairly repetitive, and that is important enough that you'd be very happy to have AI do it for you, but not so important that it's critical for your job.

If you can code, use AI to help design and implement a solution. If you can't code, use a no-code tool, or, if you're feeling adventurous, try getting a Jupyter notebook set up on your computer and using it to run LLM-generated code.

We've built several such tools ourselves. If you'd like to try them out, or if we can help you build your own tools, let us know.

How can AI help you?

How does AI help you? What are its limitations? What keeps you up at night?

We are learning about this (exciting and terrifying) technology just like you. We want to know what you think.

Let's compare notes