Medical data paves the way for machine learning

A European consortium aims to transform the field of prostate cancer care by unlocking the potential of big data and big data analytics.

The servers at HZDR and CASUS will host the PIONEER Prostate Cancer Big Data Platform.
Source: HZDR/Oliver Killig

The Center for Advanced Systems Understanding (CASUS) at the Helmholtz-Zentrum Dresden-Rossendorf (HZDR) has joined PIONEER, a 12.8 million euro project funded by the public-private partnership Innovative Medicines Initiative 2 (IMI2). The European consortium aims to transform the field of prostate cancer care by unlocking the potential of big data and big data analytics.

Spread across Europe, databases from clinical studies, public registries and electronic health records contain clinical data from thousands of prostate cancer patients. PIONEER collects, anonymizes and assembles these diverse data sets. CASUS takes over the task of providing a new centralized data and analytics platform for PIONEER. The cloud-based platform will provide data access and machine learning analytics capabilities for researchers in both academia and industry. PIONEER operates both a central and a federated model of data sharing. For the federated model, CASUS will take on the challenge of establishing a federated analytics network. The use of both data sharing models has allowed PIONEER to maximize both data protection and data utilization.

“The expertise of HZDR’s CASUS in large-scale data management will provide a secure, scalable and sustainable infrastructure to host the PIONEER Prostate Cancer Big Data Platform. We are excited to embark on this next stage of PIONEER with CASUS,” said Prof. James N’Dow, Academic lead for PIONEER and Adjunct Secretary General of the European Association of Urology.

Besides providing the PIONEER Big Data Platform cloud infrastructure, CASUS will also set up and support federated data analysis for all members of the consortium. For Dr. Michael Bussmann, Scientific Head of the Görlitz-based research center, this aspect is of paramount importance: “By developing advanced machine learning algorithms, we expect to come up with better predictive models of patient outcomes and disease progression. The focus is on established and new clinical and biological indicators, so-called biomarkers. We will try to find out if and how recording such biomarkers improves predictions throughout a prostate cancer patient’s care pathway.”

Safeguarding data access and data protection

Many stakeholders throughout the healthcare system collect medical data. Data-driven machine learning is considered a powerful tool for analyzing these data. For such analysis to work, however, all data must be available in a common format and all data protection concerns must be adequately addressed.

PIONEER operates with two data access models: a central and a federated model. In the central model, a copy of the data is transferred to PIONEER, converted and stored in a central data warehouse for research. In the federated model, data owners standardize their own data sets and set up analytical tools within their own data environment, supplying PIONEER with aggregated results from requested analytic tasks. In this data access model, the data does not leave its original site. Data from a variety of sources are temporarily “linked” in order to address specific remote queries. PIONEER is thus bringing the analysis to the data. Within PIONEER, CASUS will be responsible for the coordination and management of both data utilization models.
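The federated model can be illustrated with a minimal Python sketch. It is not the consortium's actual software: the class `Site`, the record field `psa`, the function `pooled_mean` and all values are hypothetical and chosen only for illustration. Each site answers a query with aggregated results (a count and a sum), and only those aggregates are combined centrally.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class Aggregate:
    """Summary a site releases in response to a query (no patient rows)."""
    count: int
    total: float


class Site:
    """Hypothetical data owner; its records stay in its own environment."""

    def __init__(self, name: str, records: List[Dict[str, float]]) -> None:
        self.name = name
        self._records = records  # raw records are never returned to callers

    def run_query(self, field: str) -> Aggregate:
        # The analysis runs locally; only the count and the sum leave the site.
        values = [r[field] for r in self._records if field in r]
        return Aggregate(count=len(values), total=float(sum(values)))


def pooled_mean(aggregates: Iterable[Aggregate]) -> float:
    """Combine per-site aggregates into one mean over all sites."""
    aggregates = list(aggregates)
    n = sum(a.count for a in aggregates)
    if n == 0:
        raise ValueError("no site returned any values")
    return sum(a.total for a in aggregates) / n


# Illustrative toy values only (assumption, not real patient data).
sites = [
    Site("site_a", [{"psa": 4.0}, {"psa": 6.0}]),
    Site("site_b", [{"psa": 11.0}]),
]
results = [site.run_query("psa") for site in sites]
print(pooled_mean(results))
```

A real federated network would add authentication, query approval and safeguards such as suppressing results based on very few patients; the sketch leaves these out to keep the data flow visible.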

Within PIONEER, the data has been redacted to ensure sufficient anonymity such that the identification of the person to whom the data relates is virtually impossible. The data within PIONEER’s Big Data Platform is not classified as personal data and as such the use of the data complies with all applicable data protection laws at the EU level. These data fall outside the scope of the EU’s General Data Protection Regulation while maintaining their clinical relevance.

Open questions in prostate cancer research

In general, PIONEER aims to both identify and close knowledge gaps in prostate cancer research. Among the most pressing open questions determined so far are: What are the relevant tumor-specific and patient-specific variables that affect the prognosis of prostate cancer patients suitable for active surveillance? What is the natural history of prostate cancer patients undergoing conservative management (i.e., watchful waiting) and what is the impact of comorbidities and life expectancy on long-term outcomes? By scrutinizing data from diverse populations of prostate cancer patients across different stages of the disease (and from different European countries), PIONEER is expected to provide evidence-based answers to these questions to facilitate improved shared decision-making between physicians and patients. The final goal is not only to improve prostate cancer-related outcomes but also to increase healthcare system efficiency and the overall quality of health and social care.
