Desktop data mining and extraction

GoldenGATE editor versions

The GoldenGATE Document Editor is a visual editor for marking up documents in XML. It is designed to do most of the markup automatically; manual work is reduced to correcting the output of automated components. For these corrections, there are many specialized dialogs and document views, which display the required information in a concise fashion and provide high-level assistance to the user. In addition, the editor provides assistance for editing and marking up documents manually. A flexible, plug-in-based software architecture allows for quickly integrating new components, and for deploying upgrades of existing ones. This is not restricted to components for automated markup and document views, but also comprises handling of different data formats, and different types of data storage, e.g. the local file system, databases, and web-based data providers.

The automated markup for taxonomic documents includes finding taxonomic names, determining their genus, species, etc., and obtaining an LSID for them. In addition, there are functions for marking up taxonomic treatments and their inner structure, i.e. which part of the treatment provides a morphological description of a taxon, which one lists materials examined, etc. A parser for extracting individual collecting events from the lists of materials examined is under development.
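As a rough illustration of what finding taxonomic names and splitting them into genus and species can involve, the sketch below matches candidate binomials with a regular expression and keeps only those whose genus appears in a small lexicon. This is not GoldenGATE's method: the pattern, the `KNOWN_GENERA` lexicon, and the function name are assumptions made for the example, and LSID lookup is left out entirely.

```python
import re
from typing import List, NamedTuple


class TaxonName(NamedTuple):
    genus: str
    species: str


# Hypothetical toy lexicon; a real tagger would rely on much larger resources.
KNOWN_GENERA = {"Formica", "Apis"}

# Assumption: a binomial is a capitalized word followed by a lowercase epithet.
_BINOMIAL = re.compile(r"\b([A-Z][a-z]+)\s+([a-z]{2,})\b")


def find_binomials(text: str) -> List[TaxonName]:
    """Return genus/species pairs whose genus is in the toy lexicon."""
    names = []
    for match in _BINOMIAL.finditer(text):
        genus, species = match.group(1), match.group(2)
        # The lexicon check filters out ordinary phrases such as "Workers of".
        if genus in KNOWN_GENERA:
            names.append(TaxonName(genus, species))
    return names
```

For example, `find_binomials("Workers of Formica rufa were found near Apis mellifera hives.")` yields the two pairs (Formica, rufa) and (Apis, mellifera). Abbreviated genera, subspecies, and authorities are not handled by this sketch.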

Online documentation is available at the community portal, which also offers support.

Standard version

This is a stable version that takes HTML documents as input. Minor updates are installed when the program is launched.

Imagine (PDF based) version

This is an alpha version. If Internet access is allowed during startup, the program asks the user to install any available updates.

The main goal of the Imagine version is to provide a markup tool for born-digital PDFs, and thus to shortcut the OCR or text conversion process, as well as to allow incremental markup and upload to SRS.

If you are interested in using this new version, please contact Plazi Info.

GoldenGATE Web services

Asynchronous web service

The asynchronous web service provides a bibliographic reference parser (Bib Ref Parser), a date tagger, a geo-coordinate tagger, a materials citation extractor, a quantity tagger, a taxon name tagger, and a TaxPub materials citation extractor.
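The general shape of an asynchronous call, where a job is submitted and its result is fetched later, can be sketched in-process as follows. This is a local simulation using a thread pool, not the GoldenGATE Web Services interface: the class `ToyAsyncService`, its methods, the function name `date-tagger`, and the toy `tag_dates` function are all hypothetical.

```python
import re
import uuid
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Callable, Dict


def tag_dates(text: str) -> str:
    """Toy stand-in for a date tagger: wrap ISO dates in <date> tags."""
    return re.sub(r"\b(\d{4}-\d{2}-\d{2})\b", r"<date>\1</date>", text)


class ToyAsyncService:
    """Hypothetical in-process model of a submit-then-fetch service."""

    def __init__(self, functions: Dict[str, Callable[[str], str]]) -> None:
        self._functions = functions
        self._pool = ThreadPoolExecutor(max_workers=2)
        self._jobs: Dict[str, Future] = {}

    def submit(self, function_name: str, text: str) -> str:
        """Start a job in the background and return its identifier at once."""
        job_id = uuid.uuid4().hex
        self._jobs[job_id] = self._pool.submit(self._functions[function_name], text)
        return job_id

    def is_done(self, job_id: str) -> bool:
        """Report whether the job has finished, without blocking."""
        return self._jobs[job_id].done()

    def fetch(self, job_id: str, timeout: float = 5.0) -> str:
        """Wait up to `timeout` seconds for the job and return its result."""
        return self._jobs[job_id].result(timeout=timeout)

    def close(self) -> None:
        """Wait for running jobs and release the worker threads."""
        self._pool.shutdown(wait=True)
```

A caller would create the service with `ToyAsyncService({"date-tagger": tag_dates})`, call `submit`, continue with other work, and later call `fetch` with the returned job identifier.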

Synchronous web service

The synchronous web service tests the synchronous version of GoldenGATE Web Services, in which each call consists of a simple request and response, with no user interaction possible.