Cloud migration in genomic research: The answer to large data sets?

EPM's Q&A with Jack Dix, head of healthcare strategy and consulting at Kainos, examines technology in genomics research and how cloud data transformation can help to build better patient outcomes.

Genomics research involves a huge volume of data that needs to be processed, and here Jack highlights why Genomics England’s shift to the cloud has helped researchers to manage this data and improve collaboration – helping to increase understanding of disease origins, identify new treatments, and improve patient outcomes.

Q. Can you explain the importance of technology in genomics research? 

To understand the importance of technology in genomics research, you only have to look at the Human Genome Project. When this project began, it was cutting edge. Using the technology available at the time, it took 10 years to sequence the first human genome, and 13 years to fully complete the project. With advances in technology, that same process can now be completed in under 24 hours. As a result, millions of genomes have been sequenced globally, making the study of diseases associated with genetic mutations more accurate than ever.  

The challenge now isn’t a lack of data, but having too much. The volume and quality of patient records, genomics data and other critical biometric data are growing daily. So, the evolution of technology is key to ensuring continued innovation in genomics research.

Q. Could you tell us more about Kainos’ work with Our Future Health? 

Kainos’ work with Our Future Health is a great example of a public sector, citizen-facing journey that’s delivering real value. The programme encourages people from all walks of life to share their genomic data so researchers can improve the detection, prevention, and treatment of disease. 

Our Future Health recognised this goal could only be achieved via a user journey that made registering and sharing data accessible to a diverse range of people. Kainos worked with Our Future Health to build the “front door” to the programme – its online citizen portal. Our aim was to design a data-sharing process that was as user-friendly and frictionless as possible, to prevent dropout and deliver against the programme’s ambition to recruit 5 million volunteers who are representative of the UK population. Our experience of similar projects and knowledge of the genomics landscape were invaluable in creating a portal that enables people to share their data for the greater good.

Q. How can moving to cloud data support genomic researchers? 

The data available to genomics researchers is a rich resource for discovery. However, storing and analysing this data requires High-Performance Computing (HPC) that delivers a level of processing power and scalability that the legacy, on-premises systems used by many labs simply cannot offer. This can result in significant delays to processing, valuable data going unused and considerable ongoing costs – all of which affect the delivery of research that is critical to advancing drug discovery, improving diagnostic yield, and powering precision medicine initiatives.

Working in partnership with AWS (Amazon Web Services), we have supported organisations such as Genomics England to deliver high-performance computing architectures in the cloud that are optimised for both speed and cost. At Genomics England, the cloud has enabled the migration of petabytes of genomic data and allowed researchers to perform common tasks that previously took 25 hours in just 23 seconds.

So, migrating to cloud plays a vital role in enabling genomics researchers to access this valuable data asset. It offers the storage and management capabilities needed to get the most out of massive data sets and enables researchers to access and analyse data much more effectively. 

Q. How can cloud data transformation support better patient outcomes? 

Cloud technologies enable researchers to leverage the wealth of genomics data available at scale and extract valuable insights. Achieving this through legacy systems is significantly more challenging, time-consuming and costly, limiting the data available for research and the analysis that can be conducted on it.

Cloud has also led to a democratisation of data, helping to advance collaboration by allowing researchers to share large amounts of data with fellow scientists and partners. No matter where researchers are based, cloud enables them to work together to unearth new discoveries with designed-in security and privacy controls.  

There are also huge savings to be made in operational costs by migrating to cloud. It eradicates the need to invest in or maintain storage hardware, allowing labs to refocus valuable time, money, and resources on research.

Q. Are there any challenges with cloud data transformation for genomics? 

Moving from a legacy system to cloud always brings challenges, and that isn’t unique to the field of genomics. However, the volume of data in genomics can exacerbate these challenges, and any disruption to services could damage research projects considerably. So, cloud migration projects in the genomics sector must be executed efficiently, which requires a unique understanding of challenges in the field.

Moreover, many genomics labs don’t have the in-house digital skills required to make a success of cloud in the long term. So, it’s vital that staff involved in the migration project are empowered to support and extend the new platform after initial migration, and that users of the platform, such as academic researchers, are appropriately trained to maximise the benefits afforded by new research environments and analytical capabilities.

Finally, the data held by genomics researchers is also extremely sensitive. So, any migration of data to the cloud must uphold the security standards required to safeguard the end-user, aligning to emerging principles and standards for trusted research environments (TREs). 

Q. Anything else you would like to add? 

The promise of genomics research in helping us to understand and tackle global health issues cannot be overstated. Yet, it is equally important to build citizen trust around how data is used. This is still an emerging field, and while there are guidelines and models for best practice, no common standards exist yet.

Investment in TREs, designed to enable the access and use of data within a secure environment, is increasing, with initiatives being funded at national and regional levels. A recent review commissioned by the UK’s Department of Health and Social Care, conducted by Professor Ben Goldacre, has recommended that TREs become ‘the norm’ to help build public trust through improved transparency, which we see as critical to accelerating genomics research whilst protecting patients’ privacy.

Organisations are starting to apply these recommendations. One example is the TRE accreditation process launched by Our Future Health (OFH) in January, which shows great promise in establishing common requirements for platforms, data governance, security and privacy that support a Five Safes framework – safe projects, safe people, safe data, safe settings and safe outputs.
