Thursday, 3 October 2013

i4Life Part 4: Download and Web Services

Download and Web Services 

Previous posts in this series:
Part 1: Improving the world's taxonomic data indexing (includes full data flow diagram)
Part 2: Global Species Databases
Part 3: The Catalogue of Life

There are a number of ways to access the Catalogue of Life. To check species names or classifications, you can use the online search and browse interface. You can also download the results of a search using the export option on any results page. However, if you need access to all of the Catalogue, as partners in the i4Life project do, the Download Service and Web Services are the best ways to transfer large quantities of data.

The Download Service has a graphical user interface that is accessed through a password-protected page on the i4Life website. Web Services can be accessed from anywhere through a URL, and the instructions for doing so are on the Web Services page of the Catalogue of Life website. Both methods give access to the Catalogue of Life database and allow its contents to be transferred in Darwin Core Archive format, an essential step in the i4Life data flow shown in the diagram above.

Anyone can start an export of the Catalogue data using the Download Service (see image below) once they have registered and been given a password. You select the data you require and press a button, and the data is automatically downloaded to your computer as a zip file. If you do not want the whole Catalogue, the form lets you narrow the export to a specific taxon by selecting it from the drop-down box for each rank, so you can limit your export to a particular order, family, genus and so on. For any chosen taxon you can also download just the classification without the species names or, alternatively, the species names without the classification.

You do not need any programming skills to operate the Download Service, but you may need to understand relational data tables and the relevant software to do anything useful with the data once you have exported it. As soon as the Dynamic Checklist is updated each month, users can download all or part of the refreshed Catalogue of Life in this way.

i4Life Download Service interface
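For readers wondering what "doing something useful" with an exported zip file might look like, here is a minimal Python sketch. It assumes the export follows the standard Darwin Core Archive layout, with a meta.xml descriptor at the top level of the zip that names the core data file and its columns. The function name read_core_rows is made up for illustration, and the sketch reads the data file name from meta.xml rather than guessing what the Download Service calls it. The delimiter handling is deliberately simple and only covers tab- or comma-separated files.

```python
import csv
import io
import xml.etree.ElementTree as ET
import zipfile

# Namespace used by the Darwin Core text guide's meta.xml descriptor.
NS = "{http://rs.tdwg.org/dwc/text/}"


def read_core_rows(archive, limit=5):
    """Return up to `limit` rows of the archive's core data file as dicts.

    `archive` is a path or file-like object pointing at the exported zip.
    (Illustrative helper, not part of any Catalogue of Life tooling.)
    """
    with zipfile.ZipFile(archive) as zf:
        core = ET.fromstring(zf.read("meta.xml")).find(NS + "core")
        location = core.find(NS + "files/" + NS + "location").text
        # meta.xml writes the delimiter as an escape sequence, e.g. "\t".
        delimiter = core.get("fieldsTerminatedBy", ",").replace("\\t", "\t")
        header_lines = int(core.get("ignoreHeaderLines", "0"))

        # Column index -> short term name (last part of the term URI).
        columns = {
            int(f.get("index")): f.get("term").rsplit("/", 1)[-1]
            for f in core.findall(NS + "field")
            if f.get("index") is not None
        }
        id_element = core.find(NS + "id")
        if id_element is not None and id_element.get("index") is not None:
            columns.setdefault(int(id_element.get("index")), "id")

        rows = []
        with zf.open(location) as raw:
            text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
            reader = csv.reader(text, delimiter=delimiter)
            for line_no, record in enumerate(reader):
                if line_no < header_lines:
                    continue  # skip the header line(s) declared in meta.xml
                rows.append(
                    {name: record[i] for i, name in columns.items() if i < len(record)}
                )
                if len(rows) >= limit:
                    break
        return rows
```

Calling read_core_rows on the path of a downloaded export would return the first few taxon records as dictionaries, which is a quick way to check what an export contains before loading it into a database.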

Obtaining the Catalogue of Life through the Download Service requires human involvement: someone has to select options and click buttons. Our partners and collaborators generally want a more automated way of getting the Catalogue of Life data into their systems, and prefer to control the activation of this process at their end. This is where Web Services come in. Web Services are the Catalogue of Life's equivalent of an API (Application Programming Interface). What is an API? Well, it is a method that allows one person's website to plug into another. The instructions that the Catalogue of Life supplies on its Web Services information page enable a programmer to set up this exchange. If someone else builds something using this method they may call it a Catalogue of Life application (or widget or tool), and it can become a fixed part of their own e-infrastructure or website.

Our partners use Web Services to do different things. For example, IUCN uses them to provide a link-out service (see image below) from its Red List website. Previously, when someone searched the IUCN Red List website for a species that had not yet had its conservation status assessed, the search returned 'not found'. Now the user's taxon name (in the form of a text string) is used to dynamically query the Catalogue of Life database using Web Services. If there is a match, the IUCN Red List website uses the information that Web Services returns. The response comes back in a format that computers can transfer and interpret (either XML or PHP). More code on the IUCN Red List website then displays it in a user-friendly way: a name with a hyperlink back to the Catalogue of Life taxon record. This lets the IUCN Red List user know that a species with this name does exist (i.e. they haven't spelt it wrong, and here it is in the Catalogue of Life!), but that it has not yet been assessed.

IUCN Red List link-out to Catalogue of Life
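To make the link-out idea more concrete, here is a rough Python sketch of the kind of lookup a partner website might perform. Everything specific in it is an assumption made for illustration: the base URL is a placeholder, the name and format query parameters are guesses, and the response is assumed to be a list of result elements that each contain a name and a url. The real address, parameters and response layout are described on the Web Services page of the Catalogue of Life website, and this is not IUCN's actual code.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

# Hypothetical placeholder: take the real address from the Web Services page.
BASE_URL = "https://example.org/col/webservice"


def build_query_url(name, base_url=BASE_URL):
    """Build a lookup URL for a taxon name (parameter names are assumptions)."""
    query = urllib.parse.urlencode({"name": name, "format": "xml"})
    return f"{base_url}?{query}"


def parse_matches(xml_text):
    """Extract (name, url) pairs from an XML response of the assumed shape:

    <results><result><name>...</name><url>...</url></result></results>
    """
    root = ET.fromstring(xml_text)
    return [
        (result.findtext("name"), result.findtext("url"))
        for result in root.findall("result")
    ]


def fetch_text(url):
    """Fetch a URL over HTTP and return the body as text."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return response.read().decode("utf-8")


def link_out(name, fetch=fetch_text):
    """Return (name, url) for the first match, or None if nothing matches.

    `fetch` can be swapped for a stub so the logic can be tried offline.
    """
    matches = parse_matches(fetch(build_query_url(name)))
    return matches[0] if matches else None
```

A website could call link_out with whatever the user typed into its search box. If the result is not None, it can display the name as a hyperlink back to the Catalogue of Life record; otherwise it falls back to its usual 'not found' message.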

This is one use of Web Services, but they can be used in many different ways to build different applications on other websites. The advantage for the IUCN Red List and other partners is clear: Web Services enable a real-time display of data from the current version of the Dynamic Checklist. The benefit for the Catalogue of Life is that it increases our user base by promoting the data via large, high-profile biodiversity websites. It is not only partner websites that use Web Services. Many individual and commercial users also access the Catalogue of Life this way, leading to a satellite distribution of the Catalogue worldwide. Any non-commercial user is free to use the latest edition of the Dynamic and Annual Checklists but, when using it in another system, is required to abide by the Terms of Use and notify the Species 2000 Secretariat. The reason for this is the Catalogue of Life's commitment to ensuring that proper credit and attribution go to the Global Species Databases, the knowledge base upon which the Catalogue is built. By tracking users we can make sure that we are fulfilling our requirements as suppliers of this data.

Next up: i4Life Part 5: Global Biodiversity Partners
