Text Mining and Biodiversity Research Infrastructure

May 24, 2023

Conference:	SwissText 2023
Location:	Espa. de l’Europe 11, Neuchatel, Switzerland
Date and Time:	Mon, 12 Jun 2023 12:00 AM UTC +01:00h
Session:	Text Mining and Biodiversity Research Infrastructure
Description:	We will introduce the “biodiversity PMC” built and maintained by SIBiLS, Zenodo and Plazi, making use of the recently revised copyright law in Switzerland.

Scientific knowledge on biodiversity is imprisoned in a corpus of hundreds of millions of pages of scientific publications that grows daily. This knowledge is needed to better understand the dynamics and dimensions of the global biodiversity crisis, the impact of climate change on the distribution of species, and viral spillover from animals to humans. It is very difficult to access because it is unstructured, locked in print or print-like formats such as Portable Document Format (PDF) that are difficult for machines to process, or closed access. The value of access to millions of machine-actionable articles in PMC, including millions of supplementary data files, and to tens of millions of abstracts in PubMed, together with tools to annotate, mine and discover new facts, is obvious. These tools could be used for text and data mining (TDM) and annotation of biodiversity literature, but a PMC/PubMed equivalent has not been available for publications in the biodiversity domain, hence the need for a BiodiversityPMC!

In this workshop we will introduce the fledgling “biodiversity PMC” built and maintained by SIBiLS, Zenodo and Plazi, making use of the recently revised copyright law in Switzerland. Legal, institutional and technical aspects, from processing to long-term storage, access and annotation of the data, will be discussed. This will be complemented by the research questions driving this effort: discovering known biodiversity, extracting traits to study the impact of climate change, annotating biotic interactions to understand viral spillover, and building question-answering systems.