Revolutionising our understanding of cancer risk using a huge variety of data to advance early detection of the disease – it sounds as ambitious as it is exciting, but is it possible? We caught up with Professor Antonis Antoniou, who will be working to shape the proposal for the Cancer Data Driven Detection (CD3) initiative, to talk partnerships, team science and the challenges facing implementation of all this data clout…
Firstly, congratulations on landing the directorship of CD3 – tell us about the initiative…
Thank you! It’s a great pleasure and a privilege to have the opportunity to lead CD3.
I believe the CD3 initiative is exactly what is needed to move the field of data-driven cancer detection forward and I look forward to working with the rest of the cancer research community to achieve just that.
Our vision is that CD3 will provide a much deeper understanding of who is most at risk of developing cancer and thus greatly advance our ability to prevent, detect and diagnose cancer early. To do this, we’ll build an open and inclusive network of multidisciplinary researchers – the CD3 research community. This community will leverage the UK’s population-scale electronic health record infrastructure and its strengths in cancer multi-omics, epidemiology and advanced analytics, as well as its participatory approach to research.
Through infrastructure and assets, we hope to transform our ability to perform cancer risk factor discovery and cancer risk modelling. That will mean establishing UK-wide multidisciplinary expertise and capacity in cancer data science and analytics – but there will be many other important parts of this as well. Assembling FAIR-compliant multimodal data resources is key – and that will include clinical, epidemiological, multi-omic, behavioural, environmental and imaging data (from linked healthcare and administrative datasets at UK population-wide scale and from consented cohorts). The community will develop and apply advanced analytical techniques to these datasets to identify novel cancer risk factors and build multifactorial models with clinical utility to improve cancer risk predictions for both asymptomatic and symptomatic populations.
Central to CD3 will be partnerships. Not only with patients, the public and practitioners, but also with other key initiatives and infrastructures across health data science.
This is, of course, just the beginning for CD3 – what do you think the greatest challenges will be as the initiative progresses?
Enabling access to the different datasets and assembling the datasets together will be key. However, current governance may not permit centralisation. To achieve scale, it’ll be necessary to ensure analyses can be carried out across multiple secure data environments. It’ll therefore be critical to work with our partners from Health Data Research UK, Administrative Data Research UK and The Alan Turing Institute, all of whom have expertise in these areas. This will ensure federated analytics across cancer-specific data resources – with the relevant IT infrastructure and secure data environments – are put in place.
Another challenge is that we’ll be assembling heterogeneous datasets with varying data types and formats. These will clearly need to be harmonised if we are to extract all the potential from aggregated data whilst minimising biases. For this, it’s vital we adopt data harmonisation standards for cancer risk factors and outcomes.
It’s also true that the regulatory environment around the use and governance of data is changing rapidly, so this will require careful monitoring throughout the CD3 programme.
This really does signify a pretty big shift in the way researchers of all disciplines could think about the data they generate – and the data which they could access – do you think the cancer research community will be receptive to this change in mind-set?
I find open and team science very attractive, and this was one of the key reasons I applied for this position. The CD3 initiative is a result of extensive consultations with the wider research community, which shows that there is a growing recognition of the importance of open and team science.
There is already a realisation in the scientific community that large-scale collaborative studies and access to large-scale data resources lead to impactful research – especially in cancer population health sciences. And, of course, there are existing resources – for example UK Biobank, Genomics England, the linked healthcare datasets across the four nations in the UK and the resources being developed by HDR UK, ADR UK and The Alan Turing Institute. So, I believe that the cancer research community is already receptive to the idea of accessing and sharing data.
As such resources become more widely available, they are creating unique opportunities to improve cancer risk prediction. They also encourage researchers working on similar problems to work collaboratively instead of competing. This can help minimise inefficiencies in the system.
I believe resources such as those that will be generated by CD3 and open science will nurture and enable research excellence across the entire UK.
Predicting cancer risk is one thing, but to really have an impact what other challenges do we need to overcome in terms of population stratification and early detection activity?
There are several challenges that will have to be addressed before we can implement cancer risk assessment and population stratification in routine clinical practice.
I think governance issues and attitudes around the use of data for cancer risk prediction, and the subsequent integration of risk prediction tools within electronic health records, need to be thought about carefully. So does the health system’s ability to reach people for cancer risk assessment without increasing inequalities.
Health systems also need to address the acceptability of risk-stratified early detection to the public and practitioners. Part of that will mean an improvement in the training of the healthcare workforce in multifactorial cancer risk assessment and, of course, ensuring there is sufficient capacity within the healthcare system. Importantly, we need to understand the benefits and harms of such activities.
Although CD3 is a discovery and translational research effort, the initiative will consider the challenges of future implementation. Involvement of NHS staff and healthcare system stakeholders throughout the programme will be important. Central to CD3 will also be our patients and public partners who will be supported to co-design and co-produce the CD3 efforts, with a specific focus on inclusion of diverse communities and under-served groups.
Tell us a little about your academic and research interests up to this point…
I’m a data scientist with expertise in cancer risk prediction and cancer prevention.
My research has focused on understanding the risk conferred by cancer susceptibility genes, such as BRCA1, BRCA2 and PALB2, and on advancing cancer risk prediction through the development of multifactorial cancer risk prediction models and tools that are used in routine clinical practice. These include the BOADICEA breast and ovarian cancer risk prediction model and its implementation in the free-access CanRisk tool – endorsed by several national and international clinical management guidelines to support decisions on cancer risks.
The research in my group currently focuses on four areas: improving cancer risk stratification by developing multifactorial algorithms; identifying and characterising novel cancer risk factors; developing user-friendly tools to facilitate personalised risk prediction in routine clinical practice; and operationalising routine multifactorial cancer risk assessment in clinical practice through trials and feasibility studies. These align closely with the strategic objectives of the CD3 initiative.
Cancer Data Driven Detection (CD3) is a national research initiative dedicated to using data to revolutionise our understanding of cancer risk and allow early detection of cancers.
Cancer Research UK are leading the project with support from Health Data Research UK, The Alan Turing Institute and the Economic and Social Research Council’s Administrative Data Research UK programme.
Our Early Detection and Diagnosis of Cancer Roadmap set out that at present we lack a deep understanding of who is most at risk of developing cancer and that addressing this would transform our ability to detect and prevent it.
This programme will be an exemplar of our Research Data Strategy, demonstrating the value of taking a data-driven approach to cancer research.