eBioDiv

Funding Agency: Swissuniversities
PI: Swiss Institute of Bioinformatics/Haute École spécialisée de Suisse occidentale, Geneva
Dates: September 2021–August 2023

The Earth’s scholarly knowledge about species diversity (biodiversity) is contained in a corpus of several hundred million pages of publications spanning more than 250 years, with arbitrary starting points of 1753 for plants and 1758 for animals. Each year an estimated 19,000 animal and plant species are newly described, and a multiple of that number of data augmentations are added to the approximately 1.9M species already known. The data about each species are contained in highly structured taxonomic treatments and figures. Increasingly, these treatments include implicit links to the data used to describe and augment them, such as -omic and digitized specimen data produced by SwissBioCollection. Because of this structure, the data can be extracted automatically to a high degree, bidirectionally linked between the literature and the cited resources, made FAIR, and immediately reused by data aggregators such as GBIF and by other researchers.

The proposed eBioDiv will, on the one hand, provide a service for Swiss biodiversity scientists to access and disseminate their research data about species in legacy and prospective publications, and to access data about their collections, scientists, and specimens. It will complement the recently funded SwissBioCollection program and genomic data. On the other hand, importing treatments into the Swiss Institute of Bioinformatics Literature Services (SIBiLS) and Europe PMC (ePMC) opens them to text and data mining through dedicated SIBiLS tools and by the life science community. To a large extent, this customized service is based on existing services such as the Biodiversity Literature Repository (BLR), Plazi TreatmentBank (TB), and Zenodo. The long-term goal is to integrate this service into the SIBiLS portfolio, linking the biodiversity research data infrastructure in Switzerland with the -omics infrastructure.

This project will complement and make use of the Horizon 2020-funded BiCIKL research infrastructure development by providing specific annotation services for Swiss-based scientists and by implementing a production-level import of taxonomic treatments into SIBiLS. In addition to Patrick Ruch’s team at HES-SO and SIB (Swiss Institute of Bioinformatics) and Beat Esterman’s team at BFH, the project builds on direct contributions from Plazi and the Natural History Museum of Bern (NMBE).

The Swiss-based BLR and TB are the world’s leading services for liberating biodiversity data imprisoned in PDF (Portable Document Format) files and are major contributors to the Global Biodiversity Information Facility (GBIF), where the FAIR data are reused.