UNIverse - Public Research Portal

Open Research Data (ORD) best practices for computational macromolecular models. (Short title: ModelArchive)

Research Project | 01.01.2023 - 31.12.2024

Biological macromolecules such as proteins, DNA, or RNA are essential for almost all biological processes. To gain insight into their function, life science research relies on accurate information about their 3D structure. Typically, such structures are determined experimentally at atomic resolution by X-ray crystallography, NMR and, increasingly, single-particle cryo-EM. In recent years, computational methods for structure prediction have made impressive progress, achieving near-experimental accuracy in predicting the 3D structures of proteins. This breakthrough has major implications for structure-based approaches in many research fields, including life sciences, biomedical research, ecology, protein engineering, biotechnology and green chemistry. Not surprisingly, the journal Nature Methods named protein structure prediction its "Method of the Year 2021".

Since the creation of the Protein Data Bank (PDB) in 1971, the structural biology community has pioneered open research data principles. The PDB (https://www.wwpdb.org) is the single global archive of 3D structures of biological macromolecules determined by experimental techniques, but it does not accept structures obtained through computational modelling. As a consequence, computational models are often stored in undefined locations in a variety of incompatible formats, and lack essential metadata indicating their usability (e.g. model quality estimates or licence information).

Following a recommendation from an international community workshop, we have developed an archive for computed macromolecular structures (https://modelarchive.org) and an extension of the mmCIF data format to store metadata. With the technical infrastructure of ModelArchive now established, we are in a good position to further develop the corresponding ORD practices in our community. This includes promoting best practices for data and metadata interoperability standards, collaborating with scientific journals and funding agencies to establish deposition policies, improving the reusability of protein models by promoting accuracy estimates, and interlinking with other ORD resources to make models easily findable and accessible.

Funding

Open Research Data (ORD) best practices for computational macromolecular models. (Short title: ModelArchive)

Foundations / Associations (GrantsTool), 01.2023-12.2024 (24 months)
PI: Schwede, Torsten

Members (7)

Torsten Schwede, Principal Investigator
Leila Alexander, Project Member
Stefan Bienert, Project Member
Xavier Robin, Project Member
Gabriel Studer, Project Member
Gerardo Tauriello, Project Member