HDRN Canada launches learning series with a crash course in federated analysis
“This series has been long in the making,” remarked Dr. Kim McGrail, Scientific Director of HDRN Canada, as she kicked off the inaugural session of the Federated Analysis: State of the Science Collective Learning Series. Curated by HDRN Canada, this seven-part limited series offers an in-depth exploration of federated analysis, from the current practice of distributed analysis to artificial intelligence and federated learning, a scope unprecedented in the Canadian context.
To build the series, HDRN Canada assembled a steering committee of academic and industry experts from across Canada in data analysis, data linkage, health services research, health equity, and biostatistics, led by Dr. McGrail. “Pretty early on in the planning of this series, we realized certain critical elements were missing, such as common definitions for the type of data analyses that we used,” explained Dr. McGrail.
Pooled analysis, for example, is the analysis of individual data that are combined from multiple locations and/or multiple different sources of data. “This is what we think of as the traditional analysis when we’re doing multi-regional research. We put all the data in a single place—it’s pooled,” said McGrail. “Distributed data is the idea that they are stored across multiple organizations, institutions or data centres, whereas federated data is distributed data that are able to be analyzed together while remaining separate.”
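The distinction between pooled and federated analysis can be illustrated with a minimal Python sketch. It is not drawn from the session: the site names, values, and function names are hypothetical, and a simple mean stands in for a real analysis. In the pooled version every individual record is copied to one place; in the federated version each site shares only aggregate values, yet both arrive at the same estimate.

```python
from statistics import fmean

# Hypothetical individual-level measurements held at three sites (made-up values).
site_data = {
    "site_a": [4.0, 6.0, 8.0],
    "site_b": [10.0, 12.0],
    "site_c": [5.0],
}


def pooled_mean(sites):
    """Pooled analysis: move every record to a single place, then analyze."""
    pooled = [value for records in sites.values() for value in records]
    return fmean(pooled)


def local_summary(records):
    """Runs inside one site; only these aggregate values leave the site."""
    return {"n": len(records), "total": sum(records)}


def federated_mean(sites):
    """Federated analysis: combine per-site summaries, never the raw records."""
    summaries = [local_summary(records) for records in sites.values()]
    n = sum(s["n"] for s in summaries)
    total = sum(s["total"] for s in summaries)
    return total / n


print(pooled_mean(site_data), federated_mean(site_data))
```

A mean is a convenient case because it can be rebuilt exactly from counts and totals. More complex models generally need purpose-built federated methods, which this sketch does not attempt to show.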
With new developments in statistical analysis, federated analysis can be a powerful tool for researchers, added Dr. McGrail, noting that it’s not a one-size-fits-all method but rather an additional option for data analysis. “We’re not assuming that there’s any one particular approach that’s a single solution or a silver bullet. We’re not saying that federated analysis is going to take over and be the only way that we use data, but more options can make more things possible,” she said.
How do we approach federated analysis? According to Dr. Robert Platt, CNODES Executive Co-Lead, three components are needed: a common protocol and analytic plan applied consistently, data harmonization, and a common data model. “This means everybody’s working from exactly the same protocol and analytic steps,” said Platt. “We standardize the data using this common protocol approach so that those datasets all look fairly identical.” These are the first steps in making data ready, interoperable and reusable for analysis. In other words, harmonizing or standardizing the data is key to building a common data model: a standardized logical infrastructure for organizing and operating on data from many sources, so that the same analytic code can be run in any province or territory across the country.
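As a rough illustration of the three components Platt describes, the Python sketch below maps two differently structured site datasets onto one shared layout and then runs identical analytic code on each. Everything in it is an assumption made for illustration: the field names, code lists, and site labels are hypothetical and do not come from CNODES or any real common data model.

```python
# Hypothetical per-site mappings from local column names to the shared field names.
SITE_MAPPINGS = {
    "province_x": {"id": "patient_id", "gender": "sex", "age": "age_years"},
    "province_y": {"pid": "patient_id", "sex_code": "sex", "age_yrs": "age_years"},
}

# Hypothetical code list: each site records sex differently.
SEX_CODES = {"M": "male", "F": "female", "1": "male", "2": "female"}


def harmonize(record, mapping):
    """Rename local fields and standardize values into the shared layout."""
    out = {mapping[key]: value for key, value in record.items() if key in mapping}
    out["sex"] = SEX_CODES[str(out["sex"])]
    out["age_years"] = int(out["age_years"])
    return out


def count_by_sex(records):
    """The common analytic step: identical code runs at every site."""
    counts = {}
    for record in records:
        counts[record["sex"]] = counts.get(record["sex"], 0) + 1
    return counts


# Made-up raw records as each site might store them locally.
raw = {
    "province_x": [
        {"id": "a1", "gender": "F", "age": 70},
        {"id": "a2", "gender": "M", "age": 64},
    ],
    "province_y": [{"pid": 17, "sex_code": 2, "age_yrs": "58"}],
}

results = {
    site: count_by_sex([harmonize(row, SITE_MAPPINGS[site]) for row in rows])
    for site, rows in raw.items()
}
print(results)
```

Keeping the site-specific logic in the mappings and leaving the analytic function untouched is what lets one protocol run unchanged in every jurisdiction.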
Federating data helps to address data localization, data sovereignty, ongoing community engagement and responsibilities regarding the use of data. Despite the potential benefits of federated analysis, several challenges remain, including privacy laws and organizational differences. “There’s data stewardship and an approach to risk and decision-making about whether data can flow outside of institutions or across borders,” said McGrail. “There’s the size and complexity of data, for example, genomic information that can be very complicated, or data file sizes get so large that often they need to stay where they are because of the technical complexities of providing protection.”
But, as Platt noted, the technical and methodological problems presented by federated analysis are manageable. “The challenges here are primarily at the level of the legal, political, and financial side of getting the energy and funding behind data curation, preparation and documentation,” said Platt. Continued McGrail: “We’ll need some form of common data infrastructure, ideally integrated with existing structures, meaning we don’t necessarily need to build entirely new things. We can leverage the investments that have already been made.”
That’s why HDRN Canada launched the Federated Analysis: State of the Science Collective Learning Series: to foster a common understanding of federated analysis and its untapped potential to address research ambitions and challenges. “We want to translate the lessons learned into tools and resources available to the broader research community,” said McGrail. “And in some cases, this is going to mean developing new tools and possibly advocating for policies or investments or other resources that can help support multi-regional research.”
Federated Analysis: State of the Science Collective Learning Series runs monthly until July. It may be followed by a conference to showcase learnings developed over the life of the series. Watch past recordings on HDRN Canada’s YouTube channel.