EJT and GBIF Unveil New Portal to Enhance Access to Taxonomic Data

In a significant advancement for scientific research accessibility, the European Journal of Taxonomy (EJT) has launched a new portal hosted by the Global Biodiversity Information Facility (GBIF). This pioneering platform allows researchers to explore taxonomic data directly through the EJT’s website, a first for a taxonomic journal.
The portal facilitates the exploration of detailed taxonomic treatments and the citation of specimens, enhancing the transparency and usability of data cited in the journal’s research articles. This development represents a crucial step in transforming data from traditional formats like print and PDF into structures that integrate seamlessly with knowledge graphs, large language models, and other artificial intelligence tools.
The initiative builds on the existing collaboration between Plazi and GBIF, which has already established a comprehensive research data lifecycle and led to 1,800 journal articles citing material freed by Plazi. The data liberation efforts have been primarily supported by grants from the Arcadia Fund, contracts with EJT, the Paris Muséum national d’Histoire naturelle, the European Commission, and Swissuniversities.
The newly introduced EJT portal stands out by utilizing data from Plazi’s TreatmentBank. This service has converted taxonomic publications into over 53,000 treatment datasets in GBIF, accounting for 48% of all datasets submitted to the facility. Unlike traditional GBIF datasets, which are based on the digitization of natural history collections, these datasets include material citations within taxonomic treatments, providing a rich source of expert-verified taxonomic identities.
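For readers who want to look at material citations outside the portal, the following is a minimal sketch of how a query against GBIF's public occurrence search API might be built. It assumes the `basisOfRecord=MATERIAL_CITATION` filter is used to select records derived from literature; the function name and the example species are illustrative only and do not come from the EJT portal itself. The sketch only constructs the URL and does not perform a network request.

```python
from urllib.parse import urlencode

# Public GBIF occurrence search endpoint.
GBIF_OCCURRENCE_SEARCH = "https://api.gbif.org/v1/occurrence/search"


def material_citation_query(scientific_name: str, limit: int = 20) -> str:
    """Build a search URL for material citations of one taxon.

    Hypothetical helper for illustration; it is not part of any
    GBIF or Plazi client library.
    """
    params = {
        # Restrict results to records cited in publications,
        # as opposed to digitized collection specimens.
        "basisOfRecord": "MATERIAL_CITATION",
        "scientificName": scientific_name,
        "limit": limit,
    }
    return f"{GBIF_OCCURRENCE_SEARCH}?{urlencode(params)}"


if __name__ == "__main__":
    print(material_citation_query("Apis mellifera", limit=5))
```

The resulting URL can be opened in a browser or passed to any HTTP client; the response format and available filters are defined by GBIF, not by this sketch.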
The datasets offer state-of-the-art research results, encompassing not only new species discoveries but also re-examinations of previously identified specimens. They are particularly valuable for documenting species from underreported geographic areas, providing the only digital evidence available for some species.
Conversion from non-digital sources involves sophisticated tools that transform PDFs into structured, machine-readable formats. This process includes the identification and annotation of semantic structures such as taxonomic names and material citations. Although automation has significantly streamlined the conversion process, variations in the original texts can lead to errors, which are addressed through quality control measures and human curation. The portal also includes a mechanism for reporting errors, which are then immediately corrected and updated in the GBIF database.
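To give a rough sense of what annotating a material citation involves, here is a deliberately simplified sketch. It is not Plazi's tooling: the regular expressions, the `MaterialCitation` record, and the sample citation string are all invented for illustration, and real citations vary far more than these patterns allow, which is precisely why quality control and human curation remain necessary.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical patterns; real material citations are much more varied.
COORDS = re.compile(
    r"(?P<lat>\d+(?:\.\d+)?)°\s*(?P<ns>[NS])[,;]?\s*"
    r"(?P<lon>\d+(?:\.\d+)?)°\s*(?P<ew>[EW])"
)
YEAR = re.compile(r"\b(1[89]\d{2}|20\d{2})\b")


@dataclass
class MaterialCitation:
    raw: str
    country: Optional[str] = None
    latitude: Optional[float] = None
    longitude: Optional[float] = None
    year: Optional[int] = None


def parse_material_citation(raw: str) -> MaterialCitation:
    """Extract a few structured fields from one citation string."""
    record = MaterialCitation(raw=raw)

    # Assumption: the country is an upper-case token before the first separator.
    head = re.split(r"[•:;,]", raw, maxsplit=1)[0].strip()
    if head.isupper():
        record.country = head.title()

    # Convert hemisphere letters to signed decimal degrees.
    match = COORDS.search(raw)
    if match:
        lat, lon = float(match["lat"]), float(match["lon"])
        record.latitude = -lat if match["ns"] == "S" else lat
        record.longitude = -lon if match["ew"] == "W" else lon

    # Take the first plausible four-digit year as the collecting year.
    year = YEAR.search(raw)
    if year:
        record.year = int(year.group(1))
    return record


if __name__ == "__main__":
    # Made-up citation, not taken from any EJT article.
    sample = "MADAGASCAR • 1 specimen; Toliara; 23.35° S, 43.67° E; 12 Mar. 2015"
    print(parse_material_citation(sample))
```

Fields that cannot be recognized are left as `None` rather than guessed, so that incomplete records can be flagged for a human curator.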
Collaboration with publishers has also led to improvements in how articles are structured and published, reducing potential sources of error. The Biodiversity Data Journal and other publications have started producing articles in formats that facilitate error-free data conversion, a practice that more authors and publishers will hopefully adopt.
The EJT portal not only enhances the visibility of the journal’s articles but also improves access to them. Each material citation and treatment links directly to the associated articles, significantly increasing the potential for discovery. The portal offers various dashboards that provide insights into the content of the articles and the actors involved, paving the way for new metrics to evaluate scientific contributions.
As part of a broader suite of user interfaces, the portal supports diverse research needs, providing access to taxonomic names through the extended Catalogue of Life, ChecklistBank, and Synospecies, and enabling content searches via BiodiversityPMC and image searches via Ocellus.
At the recent Bouchout+10 Symposium in Disentis, the EJT portal was showcased as a model for future efforts to liberate global biodiversity knowledge from the confines of scientific literature, promising a new era of accessibility and transparency in taxonomic research.