Data Standards and Infrastructure
Charge / Overview
Deliver IT standards and infrastructure for IHCC that make population-scale genomic and biomolecular data accessible across international borders.
- Create a federated solution with one or two central hubs for discovery across the IHCC network.
The solution would contain:
  - Federated data discovery/authentication alignment
  - Interoperability / harmonization of data
  - Cohort level metadata representation
  - Federated research analyses
  - Clinical applications
  - Findable, Accessible, Interoperable, Reusable (FAIR) data analyses
  - Ethical, Legal and Social Implications (ELSI) of data sharing
- Create an Atlas.
A barrier to greater data reuse is that users (researchers, clinicians) cannot easily find relevant data through their queries. An Atlas is a short-term solution: an interactive, searchable database, updated regularly from the cohorts, that supports high-level metadata queries.
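As a rough illustration of what a high-level metadata query against an Atlas could look like, the sketch below filters cohort-level records by country, size, and available data types. Everything in it is an assumption made for illustration: the CohortRecord fields, the search_atlas function, and the sample entries are hypothetical and do not describe an actual IHCC schema or API.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class CohortRecord:
    """Hypothetical cohort-level metadata entry (no participant-level data)."""
    name: str
    country: str
    participants: int
    data_types: frozenset = field(default_factory=frozenset)


def search_atlas(records, country=None, min_participants=0, required_types=()):
    """Return the names of cohorts that satisfy every supplied filter."""
    required = set(required_types)
    return [
        r.name
        for r in records
        if (country is None or r.country == country)
        and r.participants >= min_participants
        and required <= r.data_types  # cohort must offer all requested data types
    ]


# Made-up entries standing in for metadata refreshed periodically from cohorts.
ATLAS = [
    CohortRecord("Cohort A", "CA", 50_000, frozenset({"genomic", "questionnaire"})),
    CohortRecord("Cohort B", "UK", 500_000, frozenset({"genomic", "imaging"})),
    CohortRecord("Cohort C", "UK", 20_000, frozenset({"questionnaire"})),
]

# Which cohorts hold genomic data on at least 100,000 participants?
print(search_atlas(ATLAS, required_types=["genomic"], min_participants=100_000))
```

Because only cohort-level descriptors are stored, a query of this kind helps a user locate candidate cohorts without exposing any individual-level data.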
- Create harmonized cohort metadata by first understanding the complexity of metadata across IHCC, and then harmonizing it semantically using established semantic mapping techniques, e.g.:
  - International Statistical Classification of Diseases and Related Health Problems (ICD-10)