TRIPLE is a newly funded European initiative that will enable an unprecedented level of interoperable data sharing between researchers from any scientific domain. Innovative solutions will facilitate access to both public research data and selectively shared private research data. The project is coordinated by SIB, drawing on its expertise in data FAIRification, knowledge representation and open databases, and brings together partners from Belgium and the Czech Republic. Through the discoveries it will enable, TRIPLE will benefit the scientific community at large as well as society, with one of its first applications being the search for organisms that could help degrade pollutants.

Box 1. Safely storing private data in vaults

Solid Pods are a new technology that can secure data in vaults. This way, sensitive datasets, such as those from the early stages of research, can remain private while still being made accessible to defined groups when desired, and then to everyone once appropriate.

Leveraging a wealth of previously unexploited data

Imagine a future in which one can access, at once, data from open resources as well as private data selectively shared by others, such as unpublished data. Access to such an untapped, aggregated body of knowledge would be a tremendous boost for research reproducibility and would accelerate discoveries. This is the goal of TRIPLE (Transforming RDF Interoperability with Solid Pods for next Level Experience). This new project, funded by the European CHIST-ERA call to foster open research data (ORD), is led by SIB and brings together the Institute of Organic Chemistry and Biochemistry at the Czech Academy of Sciences (IOCB Prague) and the University of Ghent (Belgium).

“Acting as the coordinator of such a European project is a recognition of both our long-term expertise on the front of open research data and of our capability in bringing together multidisciplinary actors at a large scale,” says Christophe Dessimoz, SIB’s Executive Director. “TRIPLE aims to build a cornerstone on which integrated searches can be performed over public-private research data.”


Box 2. A first use case: identifying organisms that break down pollutants using SIB Resources

As a first use case of TRIPLE, a search will be conducted for organisms capable of breaking down pollutants, a process known as bioremediation. This is particularly challenging since it requires retrieving, from public databases and private datasets, the proteins within organisms that perform the appropriate biochemical reactions. This will be done in particular using the SIB Resources Rhea, UniProtKB and OMA. A new methodology will be developed to simplify such complex queries for domain scientists.
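To give a sense of what such a query can look like, the sketch below assembles a federated SPARQL query in Python. It is an illustration only and not TRIPLE's methodology: the helper build_federated_query is hypothetical, and the endpoint URL, prefixes and triple patterns are assumptions modelled on the public Rhea and UniProt SPARQL services. The script only builds and prints the query text; it does not contact any endpoint.

```python
def build_federated_query(variables, local_patterns, remote_patterns, prefixes=None):
    """Assemble a SPARQL SELECT query with one SERVICE block per remote endpoint.

    Hypothetical helper for illustration: it only concatenates text and does
    not validate the SPARQL it produces.
    """
    lines = [f"PREFIX {name}: <{iri}>" for name, iri in (prefixes or {}).items()]
    lines.append("SELECT " + " ".join(f"?{v}" for v in variables))
    lines.append("WHERE {")
    # Patterns evaluated by the endpoint that receives the query.
    lines.extend(f"  {pattern}" for pattern in local_patterns)
    # Patterns delegated to other endpoints through SERVICE clauses.
    for endpoint, patterns in remote_patterns.items():
        lines.append(f"  SERVICE <{endpoint}> {{")
        lines.extend(f"    {pattern}" for pattern in patterns)
        lines.append("  }")
    lines.append("}")
    return "\n".join(lines)


# Assumed prefixes, endpoint and patterns: reactions described in Rhea are
# joined with the proteins annotated as catalysing them in UniProtKB.
query = build_federated_query(
    variables=["reaction", "protein", "organism"],
    prefixes={
        "rh": "http://rdf.rhea-db.org/",
        "up": "http://purl.uniprot.org/core/",
        "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    },
    local_patterns=["?reaction rdfs:subClassOf rh:Reaction ."],
    remote_patterns={
        "https://sparql.uniprot.org/sparql": [
            "?protein up:annotation/up:catalyticActivity/up:catalyzedReaction ?reaction .",
            "?protein up:organism ?organism .",
        ],
    },
)
print(query)
```

A private dataset exposed through its own endpoint could, under the same assumptions, be added as a further entry in remote_patterns, which is the kind of joint public-private search the use case describes.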

Innovative solutions building on open research data expertise

To achieve this aim, TRIPLE will combine the multidisciplinary expertise of the project partners in making research data open and in enabling seamless interoperability with private data stored in secure data vaults known as Solid Pods (see box 1).

“A powerful way to make the most of research data is by making datasets well-documented and interoperable, such that they can then be queried jointly, i.e. in conjunction. By creating knowledge graphs from data, existing interconnections can be leveraged to extract new insights from the wealth of data through federated queries. All these skills are core to SIB,” says Ana Claudia Sima of the Knowledge Representation unit at SIB’s Vital-IT group. 


Three SIB groups will thus contribute their know-how to the project. The Vital-IT group brings expertise in data FAIRification in the context of large European public-private partnerships, knowledge representation and metabolic modelling. The Swiss-Prot group and the Comparative Genomics group develop interconnected open software and databases, such as the SIB Resources UniProtKB and Rhea (Swiss-Prot) and the OMA browser (Comparative Genomics). Our teams are joining forces with the Bioinformatics group at IOCB Prague and the Internet Technology and Data Science Lab (IDLab) at the University of Ghent, which bring their knowledge of open chemoinformatics tools, the semantic web and the use of Solid Pods.

Far-reaching impacts from the scientific community to society

TRIPLE will benefit software developers and data producers by providing improved documentation and new tools to make complex federated queries more efficient. Life science researchers and others interested in the resulting data will also be able to mine online resources more powerfully and integrate their own data while ensuring it is FAIR. Society at large will eventually benefit from the results and knowledge gained from this increased use of data (see box 2).

SIB Members involved: