ZooKeys at 10
Today’s 10th anniversary of ZooKeys is a great day for biodiversity. The discoveries reported in the press release and celebrated in the 770th issue of the journal are magnificent and, unlike the bulk of taxonomic articles, open access.
The real impact of ZooKeys, its sister Biodiversity Data Journal and the many other natural history journals hosted by Pensoft is neither reported by Pensoft nor widely recognized by taxonomists or the wider scientific community: the revolution caused, and enabled, by the technical changes adopted in the publishing of ZooKeys and the subsequent journals.
Yes, ZooKeys was an early adopter of the open access paradigm. Ten years ago this was novel, and it took quite a courageous decision for a commercial publisher to delve into a largely unknown business model: one in which publication has to be paid for, with the consequence that the article is afterwards open to anybody in the world. Open access is now widely required by science funders, but most fellow taxonomists probably still don’t realize what it means that anybody, well beyond the few colleagues in the field, has access to the publication of a new species or other relevant results.
But this is only the beginning. Another very big step was to make ZooKeys the first taxonomy journal accepted at PubMed Central, thus expanding the coverage of the largest archive of biomedical literature to include taxonomy. This happened by moving away from traditional print/PDF publishing and joining Plazi to develop, together with the US National Library of Medicine, the first domain-specific flavor of the widely used Journal Article Tag Suite, which is used to import scholarly articles into PubMed and PubMed Central. This was not just a technical change. It had another consequence that remains unknown to almost everybody.
During the Linnaeus 250 anniversary celebration in Paris, Pensoft’s president Lyubomir Penev must have been convinced by Plazi’s contribution “1758 Binomen – 2008 e-publications” that publishing has to change so that machines can understand its content. In this context, that content is the many taxonomic treatments that taxonomists communicate in their millions of publications, covering every single report of a new species and its subsequent augmentations.
Today the FAIR principles are a core element of open science, referring to findable, accessible, interoperable and reusable data elements. A tag set that allows tagging elements in a publication, from taxonomic treatments to geographic coordinates, scientific names and materials cited, also allows annotating them with persistent identifiers so that they can be cited, or linking them to standard vocabularies such as the widely used Darwin Core. This system enables automatic, immediate annotation and dissemination of taxonomic data across platforms.
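To make the idea concrete, here is a minimal Python sketch of what reading tagged elements and mapping them to Darwin Core terms could look like. The XML element names and the sample treatment are hypothetical and heavily simplified; they are not the actual tag set used by ZooKeys, and only the Darwin Core term names (genus, specificEpithet, decimalLatitude, decimalLongitude) are taken from the real vocabulary.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified markup for illustration only.
# Real tagged articles are far richer and use different element names.
SAMPLE = """
<treatment id="treatment-1">
  <taxon-name>
    <part type="genus">Aus</part>
    <part type="species">bus</part>
  </taxon-name>
  <material-citation>
    <latitude>46.95</latitude>
    <longitude>7.45</longitude>
  </material-citation>
</treatment>
"""

# Assumed mapping from the sample's name-part types to Darwin Core terms.
PART_TO_DWC = {"genus": "dwc:genus", "species": "dwc:specificEpithet"}


def treatment_to_darwin_core(xml_text):
    """Turn one tagged treatment into a flat dict keyed by Darwin Core terms."""
    root = ET.fromstring(xml_text)
    # The treatment's own identifier is what makes the record citable.
    record = {"treatmentId": root.get("id")}
    for part in root.iter("part"):
        term = PART_TO_DWC.get(part.get("type"))
        if term:
            record[term] = part.text.strip()
    citation = root.find("material-citation")
    if citation is not None:
        record["dwc:decimalLatitude"] = float(citation.findtext("latitude"))
        record["dwc:decimalLongitude"] = float(citation.findtext("longitude"))
    return record


record = treatment_to_darwin_core(SAMPLE)
print(record)
```

Because every value sits in an explicitly tagged element, no guessing or text mining is needed: a machine can read the name, the coordinates and the treatment identifier directly and pass them on in a shared vocabulary.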
ZooKeys was the first journal to champion the automatic minting of ZooBank IDs for new taxa. As GBIF today celebrates its one-billionth record, based on 39,570 datasets, ZooKeys is among the 22,698 datasets extracted by Plazi from scholarly articles, in this case fully automatically. This stands in stark contrast to Plazi’s effort to convert data imprisoned in PDF-based publications, which has to be done for all but the Pensoft publications.
In other words, Pensoft shows the way forward: how to avoid an extremely costly expedition to discover known biodiversity, the complexity of merely digitizing volumes being demonstrated by the Biodiversity Heritage Library. Having a tag set in place for taxonomic and nomenclatural elements allows citation well beyond the usual citation of articles. Treatments cite treatments, often with qualifiers such as synonymies, references to protologues or simply augmentations of earlier treatments, which lends itself well to linked open data applications like Synospecies. In fact, this opens the door to creating the catalogue of life by machine, and with that frees the time of thousands of catalogue editors to do science.
Citable treatments also allow cited specimens to be linked to the referenced digital objects increasingly produced by projects like iDigBio or, in the near future, by the recently accepted ESFRI DiSSCo infrastructure. At the same time, persistent identifiers for treatments allow specimens to be linked with the respective data in publications.
All ZooKeys figures, and the articles themselves, are submitted to the Biodiversity Literature Repository, which allows individual figures to be cited. Together with the input by Plazi, this open access repository now includes over 180,000 scholarly images and 30,000 articles, all heavily annotated with links to related items, such as the taxonomic treatment that cites a figure, or an article that includes or cites other figures. The 50,000+ figures submitted by Pensoft are part of an emerging image-based index to the taxonomic literature.
But all the data made available this way add up to another billion: a billion facts that will be available in the OpenBiodiv knowledge management system, based on facts produced daily by Pensoft and extracted by Plazi with support from the Arcadia Fund and a productive collaboration with Zenodo at CERN.
It’s time to celebrate and be happy about what has been achieved during the last ten years. But this success story should also be an encouragement to look optimistically to the next ten. If I had a wish, I would like to see an open biodiversity knowledge management system as the basis of all the knowledge we generate daily, and to have the billion facts speak for themselves to stimulate discovering the known.
Last but not least, I wish Pensoft a lot of energy and enthusiasm, innovation and entrepreneurship in support of biodiversity research, and increasing awareness and adoption of this so far rather quiet revolution.