Until now, the sheer amount of work required to bring intelligent automation systems up to speed and validate them for life sciences R&D purposes, e.g. to transform adverse event case intake, has threatened to undermine the business case. Now the large language models (LLMs) that power Generative AI are bringing down those barriers and bolstering compliance. The opportunity centres on on-the-fly data discovery, ‘in context’ learning, and narrative extrapolation, all in a way that is ‘explainable’ to regulators. Ramesh Ramani and RaviKanth Valigari, AI experts at ArisGlobal, explore what’s possible.
Where high volumes of information exist across different formats and arrive via different channels (as in safety monitoring, for instance), there is a significant administrative overhead involved in distilling significant findings and making them usable. It is here that the latest advances in artificial intelligence (AI) and machine learning (ML) offer substantial process transformation potential: not only greater efficiency, but also significantly improved accuracy, once the software knows what it is looking for.
Generative AI (GenAI) technology, built on LLMs, is lighting the way here: quickly understanding what to look out for and ably summarising key findings for the user, crucially without the need for painstaking ‘training’ by overstretched teams, or validation of each configuration.
Building Trust
In a drug development context, safety and regulatory requirements present an enormous data burden, one that consumes vast resources and usually carries a time-based penalty (e.g. linked to prompt adverse event notification/safety reporting, or affecting speed to market). While process automation solutions have existed for some time to lighten the manual load and enhance efficiency, two main sticking points have remained: how to swiftly train modern AI algorithms so that they pick up on only what’s significant; and how to satisfy the authorities’ need for accuracy and transparency.
LLMs (the vast models that underpin GenAI tools) and advanced natural language processing (NLP) techniques such as retrieval-augmented generation (RAG) are now being applied to fill these gaps and make advanced automation a safe and reliable reality in key life sciences R&D processes, crucially without the need for continuous, painstaking oversight. (In simple terms, RAG allows an LLM to draw on proprietary data alongside publicly available information at the point of use, giving it a bigger pool of knowledge, and context, to work from, without the model itself having to be retrained.)
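To make the RAG pattern concrete, the following is a minimal Python sketch, illustrative rather than production code: embed() is a toy hashed bag-of-words stand-in for a real embedding model, and in a live system the grounded prompt assembled at the end would be sent to whichever LLM endpoint a given deployment uses.

```python
import numpy as np

def embed(text: str, dim: int = 256) -> np.ndarray:
    """Toy stand-in for a real embedding model: hashed bag of words."""
    vec = np.zeros(dim)
    for token in text.lower().split():
        vec[hash(token) % dim] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm else vec

def retrieve(query: str, documents: list[str], top_k: int = 2) -> list[str]:
    """Rank proprietary documents by similarity to the query."""
    q = embed(query)
    return sorted(documents, key=lambda d: -float(q @ embed(d)))[:top_k]

def build_grounded_prompt(query: str, documents: list[str]) -> str:
    """The RAG step: combine retrieved proprietary context with the
    question, ready to send to an LLM. No model retraining involved."""
    context = "\n---\n".join(retrieve(query, documents))
    return f"Answer using only the context provided.\n\nContext:\n{context}\n\nQuestion: {query}"

# Hypothetical in-house documents the base LLM has never seen.
sops = [
    "Adverse event reports must be triaged within 24 hours of receipt.",
    "Serious unexpected reactions require expedited regulatory submission.",
    "Literature cases are screened weekly for valid ICSR criteria.",
]
print(build_grounded_prompt("How quickly must a new AE report be triaged?", sops))
```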
Context Matters: Applying GenAI-type Techniques to New Data
The biggest breakthrough is that specialised applications can now apply GenAI techniques, contextually, to data they have never seen before, learning from and processing the contents on the fly.
For drug developers, this has the potential to transform numerous labour-intensive processes, ranging from dynamic data extraction associated with adverse event (AE) intake; to safety case narrative generation; to narrative theme analysis in safety signal detection; to the drafting of safety reports. And solutions for all of these use cases are coming down the line.
Importantly, carefully combined LLM and RAG capabilities are sufficiently transparent and explainable to regulators for the technology to be accepted as safe and reliable. Responsible AI and AI compliance are particularly critical in life sciences use cases, so it is essential that companies deploy solutions that are proven and transparent. The LLM/RAG approach addresses potential concerns about data security and privacy too, as it does not require potentially sensitive patient data for algorithm training. It also stands up to validation by way of periodic sampling by human team members, sampling that can be recalibrated as confidence in the technology’s performance grows, ensuring that efforts to monitor its integrity do not undermine the significant efficiency gains on offer.
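As a rough illustration of how that calibrated sampling might be implemented, the short Python sketch below scales the human-review rate down as the system’s observed accuracy improves. The thresholds and rates are assumptions chosen for illustration, not regulatory guidance.

```python
import random

def sampling_rate(observed_accuracy: float) -> float:
    """Relax the human-review rate as measured accuracy improves.
    Thresholds here are illustrative assumptions only."""
    if observed_accuracy < 0.90:
        return 1.00  # review every case while accuracy is unproven
    if observed_accuracy < 0.95:
        return 0.25  # spot-check one case in four
    return 0.05      # mature state: a 5% ongoing QC sample

def select_for_review(case_ids: list[str], observed_accuracy: float) -> list[str]:
    """Randomly draw the cases a human team member will double-check."""
    rate = sampling_rate(observed_accuracy)
    return [c for c in case_ids if random.random() < rate]

cases = [f"CASE-{i:04d}" for i in range(1, 101)]
print(select_for_review(cases, observed_accuracy=0.93))
```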
Circumventing the Endless Validation Cycle
The trouble with ML solutions up to now has been the training burden. In adverse event recording, for instance, systems had to be shown what to look for in the information arriving via a range of different channels and formats before they could extract and process it. Each new source type also required a fresh configuration of the software, pushing up the training overhead and the overall expense, including the maintenance burden each time the technology was updated.
LLMs make it possible to bypass the need to train AI models or algorithms on what to look out for and what something means, so that a single technology solution can handle all variations of incoming data. RAG patterns can play an important role here, by explaining a standard operating procedure to an LLM in natural language, so that the system knows what to do with each of many thousands of forms, without special configuration for each respective format, as the sketch below illustrates.
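In this simplified Python sketch, a single natural-language SOP drives extraction from any incoming document, in place of per-format configuration. The SOP wording, field list, and example text are illustrative assumptions; the resulting prompt would be passed to the LLM of choice.

```python
# Hypothetical SOP, written once in natural language rather than
# configured separately for every source format.
SOP = """You are an adverse event intake assistant.
From the document below, extract: patient initials, age, sex,
suspect drug(s), reported reaction(s), onset date, and reporter type.
If a field is absent, return null. Respond as JSON."""

def build_intake_prompt(document_text: str) -> str:
    """The same instruction handles an email, an OCR'd fax, or a scanned
    CIOMS form; no per-source configuration is required."""
    return f"{SOP}\n\nDocument:\n{document_text}"

example = "Pt J.D., 54M, developed severe headache two days after starting drug X."
print(build_intake_prompt(example))
```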
The potential impact is impressive. Applying LLM-RAG technology to transform AE case intake has been shown to deliver efficiency gains upwards of 65%, with data extraction accuracy and quality above 90% in early pilots. In safety case narrative generation, the same technology is already demonstrating 80-85% consistency in the summaries it creates. And that is from a standing start, without prior exposure.
Retrieving Key Data in Context = Unprecedented Process Streamlining
The ability to retrieve data in context, rather than via a ‘Ctrl+F’-style keyword search (e.g. surfacing everything in a content set that literally mentions headaches), could transform a range of processes linked to safety/adverse event discovery and reporting.
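The difference is easy to demonstrate with an off-the-shelf embedding model (the sentence-transformers model named below is just one common open choice). A semantic search scores narratives by meaning, so a migraine report surfaces for a ‘headache’ query even when the literal string never appears:

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")  # one common open model

narratives = [
    "Patient reported a throbbing pain in the head after the second dose.",
    "Subject experienced mild nausea resolving within 24 hours.",
    "Severe migraine with aura began three days into treatment.",
]

# Score each narrative against the query by meaning, not string match.
scores = util.cos_sim(model.encode("headache"), model.encode(narratives))[0]
for text, score in sorted(zip(narratives, scores.tolist()), key=lambda p: -p[1]):
    print(f"{score:.2f}  {text}")
```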
Certainly, it lays the foundation for drug developers to substantially streamline some of their most demanding data-based processes. In due course these will also include the drafting of hefty regulatory safety reports, with advanced automation generating the preliminary narrative, and narrative theme analysis in safety signal detection. Here there is vast scope for the technology to help distil trends that have not been captured in the structured data, such as a history of drug abuse, or of patients living with obesity, across 500 case narratives of potential interest, as sketched below. The potential is extremely exciting.
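One rough sketch of how such theme analysis might be wired up, assuming an LLM performs the per-narrative classification: the classify() function below substitutes a trivial keyword check purely so the example runs end to end, and the themes and narratives are invented for illustration.

```python
from collections import Counter

THEME_CUES = {
    "history of drug abuse": ("abuse", "substance misuse"),
    "living with obesity": ("obesity", "obese", "bmi"),
}

def classify(narrative: str) -> list[str]:
    """Stand-in for an LLM call that reads a narrative and returns the
    themes it supports; a naive keyword check keeps this runnable."""
    text = narrative.lower()
    return [theme for theme, cues in THEME_CUES.items()
            if any(cue in text for cue in cues)]

def theme_trends(narratives: list[str]) -> Counter:
    """Tally themes across a batch (e.g. 500 case narratives of interest)."""
    return Counter(theme for n in narratives for theme in classify(n))

batch = [
    "54M with prior substance misuse developed tachycardia on drug X.",
    "Obese patient (BMI 38) reported dyspnoea after dose escalation.",
]
print(theme_trends(batch))
```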
It is this kind of development that is now being avidly discussed at meetings of the industry’s new global GenAI Council. Any hesitation about adopting smarter automation out of reliability or compliance fears has now been superseded by a hunger to embrace new iterations of the technology which directly address those concerns and offer tangible step changes in productivity and efficiency.