SIB and other leading infrastructures and biodiversity information experts will make crucial knowledge on our planet’s species openly available in FAIR, machine-readable and AI-ready formats. The ‘Disentis roadmap’ aims to extract and link biodiversity data and knowledge from an estimated 500 million pages of research publications by 2035.

The Disentis roadmap has so far been signed by 24 major natural history collections, research infrastructures, journal publishers and global biodiversity networks, as well as 38 individual experts across five continents. In addition to SIB, the signatories include the National Museum of Natural History in Paris, the Royal Botanic Gardens, Kew, the Global Biodiversity Information Facility, Pensoft Publishers, and the Biodiversity Information Standards community.

See all roadmap signatories

Enhancing biodiversity data discovery, access and reuse

Scientists have collected a wealth of data on the natural world over the past 300 years, including species descriptions, species distributions, and insights into drivers of environmental change. These data are vital for halting the ongoing biodiversity crisis, implementing One Health approaches and training accurate AI models. However, much of this knowledge is not fully open, accessible or connected. This is a major roadblock to scientific progress, evidence-based policies and informed decisions.

The Disentis roadmap is a 10-year plan to ‘liberate’ these data from research publications. SIB contributed to its drafting, is a signatory, and will support its implementation as part of an international collaboration. The roadmap falls under a wider area of work by our Environmental Bioinformatics group to integrate biodiversity and environmental knowledge from multiple sources for more meaningful analyses – and supports our mission to unlock the potential of biodata to enable innovation for a better future.

An open science framework linking new species data with published knowledge

The project will extract information from digitized PDF articles using text mining technologies and annotation workflows – such as those developed by Plazi, a digital taxonomic literature repository working closely with the SIB Text Mining group. The biodiversity data liberated from scholarly publications are made openly available in the Biodiversity Literature Repository, a community on Zenodo, which is hosted by CERN. This digital library can then feed into other platforms that serve as key complementary sources of knowledge for biodiversity research today, including the Biodiversity PMC resource and other open, linked data infrastructures. As a result, data on new species, and the physical location of the stored specimens cited, become openly accessible in near-real time and remain available for long-term access.
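To give a sense of what reusing these liberated data can look like in practice, here is a minimal Python sketch that prepares a search against Zenodo's public records API, restricted to one community. It is an illustration only and not part of the roadmap: the community identifier `biosyslit` is our assumption for the Biodiversity Literature Repository, the helper names `build_search_url` and `summarise_hits` are hypothetical, and the response fields read (`hits.hits`, `doi`, `metadata.title`) should be checked against the current Zenodo documentation.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Public Zenodo search endpoint for records.
ZENODO_RECORDS_API = "https://zenodo.org/api/records"

# Assumption: community identifier of the Biodiversity Literature Repository.
BLR_COMMUNITY = "biosyslit"


def build_search_url(taxon: str, size: int = 5) -> str:
    """Build a search URL for records mentioning `taxon` in the community."""
    params = {
        "q": f'"{taxon}"',          # quoted so the name is searched as a phrase
        "communities": BLR_COMMUNITY,
        "size": size,               # number of records per page
    }
    return f"{ZENODO_RECORDS_API}?{urlencode(params)}"


def summarise_hits(payload: dict) -> list[dict]:
    """Reduce a decoded JSON search response to DOI and title per record."""
    hits = payload.get("hits", {}).get("hits", [])
    return [
        {"doi": h.get("doi"), "title": h.get("metadata", {}).get("title")}
        for h in hits
    ]


if __name__ == "__main__":
    # Only prints the URL; fetching it is left to the reader.
    print(build_search_url("Formica rufa"))
```

The sketch deliberately stops short of sending the request. The URL can be fetched with any HTTP client, and the decoded JSON passed to `summarise_hits` to list the DOIs and titles of matching records.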

The roadmap’s specific goals for 2035 are that:

  • all major public biodiversity research funders and academic publishers will encourage and enable publication of data adhering to the FAIR principles (findable, accessible, interoperable and reusable);
  • biodiversity-focused publications will be accessible in machine-actionable formats, with all non-copyrightable parts of articles flowing into public data repositories;
  • published research on biodiversity will be ‘fully AI-ready’, that is, openly available for AI training and properly labelled for ingestion by machine-learning models, within appropriate ethical and legal frameworks;
  • dedicated funding from research and infrastructure grants will be reserved for ensuring access to biodiversity data and knowledge.

The final ‘Biodiversity Libroscope’ will fill a much-needed niche of next-generation literature services and tools delivering high-quality data and other research objects (such as images, references and taxonomic features) on biological taxa, their relations with each other and with the environment, and their impact on and importance for nature conservation, ecosystem services, and people.

Data science experts joining forces with publishers and biodiversity practitioners

The Disentis roadmap is the output of a symposium on biodiversity knowledge held in August 2024, which brought together leading biodiversity, open science, and data management experts – including representatives from SIB, who contributed bioinformatics expertise on data infrastructure, information extraction and linking, legal considerations, and downstream use of data. 

The symposium and roadmap are a follow-up to the 2014 Bouchout Declaration for Open Biodiversity Knowledge Management, signed by over 300 global institutions and biodiversity experts.