Two further resources developed by SIB Groups have been designated as being of crucial importance for the life sciences in the latest round of the selection process led by ELIXIR at the European level: Cellosaurus – a cell line encyclopedia – developed by Amos Bairoch at the University of Geneva in the context of the CALIPHO Group, and Rhea – a biochemical reactions knowledgebase – developed by the Swiss-Prot Group led by Alan Bridge.
“After UniProtKB/Swiss-Prot and STRING, Cellosaurus and Rhea are the third and fourth Swiss-made resources to be added to the ELIXIR Core Data Resource portfolio: this is excellent news and a key step in ensuring their sustainability and long-term access to high-quality biological data for users from around the world”, explains Christine Durinx, SIB Executive Director.
What are ELIXIR Core Data Resources?
ELIXIR Core Data Resources are a set of European data resources of fundamental importance to the wider life-science community and the long-term preservation of biological data. Identification of the ELIXIR Core Data Resources involves a careful evaluation of the multiple facets of the data resources. The details of the selection criteria are described in the F1000R ELIXIR track article 'Identifying ELIXIR Core Data Resources'.
We asked Group Leader Amos Bairoch of Cellosaurus, and members of the Rhea team at Swiss-Prot (Kristian Axelsen, Senior Biocurator; Alan Bridge, Group Leader; and Anne Morgat, Resource Manager), to shed light on the implications of this announcement and to tell us about their resources.
Could you describe the resource in a nutshell?
Cellosaurus / Amos: The Cellosaurus is a database of cell lines, i.e. cultures of human and animal cells that can be grown for prolonged periods in vitro, used in the context of life sciences research. It contains a wealth of information on over 125,000 cell lines that have been created since the 1950s.
Rhea / Kristian: Rhea is a knowledgebase of biochemical reactions described using chemical species from the ChEBI (Chemical Entities of Biological Interest) dictionary of small molecules. Rhea is the reference vocabulary for enzyme and transport protein annotation in UniProt.
For what kind of research, or for which types of research questions, is the resource instrumental?
Cellosaurus / Amos: Almost everyone doing in vitro research makes use of cell lines: they are essential for instance to study vaccine production, drug metabolism, gene function, or even artificial skin generation. Depending on their research questions, scientists need to know what cell lines to use and where to find them: this is the sort of information they will find in the Cellosaurus.
Rhea / Alan: Rhea provides essential support for computational approaches to studying and engineering enzymes and the metabolic systems in which they function. These approaches have the potential to revolutionize our understanding of metabolism and to open up new routes to the production of biofuels, drugs, and other desirable chemicals.
Can you cite one particularly exciting example of an application of the resource, or of a recent study done using it?
Cellosaurus / Amos: Up to 30% of all research carried out on cancer cell lines is estimated to be affected by cell line misidentification or contamination. To alleviate this fundamental problem, a very important application of Cellosaurus is the CLASTR tool, which compares the STR (short tandem repeat) profiles of users' cell lines against those stored in the resource and reports the most similar ones.
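To give a feel for what comparing STR profiles involves, here is a minimal sketch in Python. It is not CLASTR's code: it assumes a commonly used scoring scheme, the Tanabe score (twice the number of shared alleles divided by the total number of alleles in both profiles, counted over the loci the two profiles have in common). The function name `tanabe_score` and the two example profiles are hypothetical and made up for illustration only.

```python
def tanabe_score(query, reference):
    """Return a Tanabe-style similarity (0.0 to 1.0) between two STR profiles.

    Each profile is a dict mapping a locus name to the set of alleles
    observed at that locus. Only loci present in both profiles are compared.
    """
    shared_loci = query.keys() & reference.keys()
    # Alleles found in both profiles at the same locus.
    shared = sum(len(query[locus] & reference[locus]) for locus in shared_loci)
    # All alleles reported by either profile at the compared loci.
    total = sum(len(query[locus]) + len(reference[locus]) for locus in shared_loci)
    return 2 * shared / total if total else 0.0


# Hypothetical profiles, invented for this example (not real cell line data).
query = {"D5S818": {"11", "12"}, "TH01": {"7"}, "TPOX": {"8", "12"}}
reference = {"D5S818": {"11", "12"}, "TH01": {"7", "9.3"}, "TPOX": {"8"}}

print(f"Similarity: {tanabe_score(query, reference):.0%}")
```

In this toy example the two profiles share 4 alleles out of 10 reported in total, giving a similarity of 80%. In practice, a score like this would be computed against every profile stored in the resource and the best matches listed first; which threshold counts as "the same cell line" is a separate question that this sketch does not address.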
Rhea / Anne: One recent study used Rhea and UniProt in the identification of potential metabolites as early biomarkers for neurodevelopmental defects, and therapeutic targets for schizophrenia.
How did it get to where it is now? What is it about the SIB ecosystem that enabled its inception and success?
Cellosaurus / Amos: The Cellosaurus originated as a thesaurus of cell lines in the context of neXtProt, a SIB Resource. We wanted to record what cell lines had been used in experiments producing data sets captured in neXtProt. This small-scale effort grew into a full-fledged knowledge resource, and in May 2015 it became available as a standalone resource on Expasy, the SIB bioinformatics resource portal. A big thanks to the Expasy team and in particular Elisabeth Gasteiger (User Experience & Support Manager, Swiss-Prot Group) for deploying the Cellosaurus on the web and contributing to this success!
Rhea / Alan: SIB provides an excellent environment for knowledge resource development, working to identify and support emerging and mature knowledge resources in much the same way as ELIXIR does at the European level. SIB support has been absolutely crucial to the development of Rhea, which is closely coordinated with that of UniProt and other SIB resources such as ENZYME and SwissLipids.
Rhea / Anne: Also crucial are collaborations with our international partners – particularly the ChEBI team at EMBL-EBI, as well as the Gene Ontology Consortium and Reactome, with whom we are working closely to improve the coverage and interoperability of our respective knowledge resources, and teams from other ELIXIR nodes such as ELIXIR-CZ, with whom we collaborate closely on the implementation of advanced chemical structure searches in Rhea.
How do you anticipate its CDR status will influence its development?
Cellosaurus / Amos: It will help in getting the Cellosaurus better funded and will allow us to further develop the resource’s query system and its interoperability with other resources, for instance through a semantic model, an API, and a SPARQL-based search engine. These developments will greatly benefit bioinformatics users of the resource in both academia and industry.
Rhea / Alan: Core Data Resources are recognized as being of fundamental importance for the long-term preservation of biological data. CDR status will boost the profile of Rhea enormously within the scientific (user) community and will, we hope, eventually help secure more stable funding.
Read more about the complete set of recently selected resources, the process and the importance of the Core Data Resources on ELIXIR’s website.
Don’t miss the upcoming February streamed course "A guided tour through Cellosaurus" and register here. And get ready for the October streamed course "Mining enzyme data in UniProtKB using Rhea" here.