SIMBioData: Standardized integration of multi-omics biomedical data
Research Project | 2 Project Members
As technological advances enable the collection of vast datasets of biomedical measurements, many ongoing studies attempt to decipher various aspects of human health from such data. Although the focus has been primarily on genetic information, other data modalities, such as the abundances of RNAs and proteins within cells and tissues, relate more directly to phenotypes. However, these latter modalities pose significantly greater data analysis challenges, and so far the emphasis in large consortia has been almost exclusively on data production, curation and storage. Efforts to standardize analysis methods, so that they can be applied on a large scale without the need for subjective choices, are virtually nonexistent. Moreover, while measures have been put in place to ensure that the data generated in scientific studies satisfies the FAIR principles, FAIRification of methods does not help in addressing issues of data quality, internal consistency, and interpretability of analysis outputs. We propose that, to fully harness the potential of the wealth of omics data for biomedical research, it is essential to establish a standardized, sustainable and evolvable method infrastructure for extracting biophysically meaningful quantities and the underlying regulatory information. In particular, only by providing standardized methods that extract such quantities in a transparent manner will it become possible to quantitatively compare and integrate results from omics data across different modalities and experimental approaches. In addition, we expect that our project will provide an ideal prototype for the analysis component of the SwissBioData initiative, which is scheduled to start after the completion of our project.