Wednesday, 30 October 2013

Example uses of the Catalogue of Life: Media libraries

Anyone who has ever searched an online image database looking for a particular species by scientific name will know it can be a frustrating exercise. Often the name you enter brings back a completely different species from the one you are looking for, or else it returns nothing at all. Yet this may not be due to the species identification skills of the photographer, or the online library's lack of images, but rather a result of the limitations of the controlled vocabularies that index these media libraries. For specialists that want to find images by scientific name, and for nature photographers with identification expertise that want to sell them, the integration of the Catalogue of Life taxonomy into these systems could certainly help yield better results. The Catalogue of Life uniquely offers a simplified and unified hierarchical classification across all organisms plus one accepted name for each species, and as such, has the potential to be used as an indexing mechanism for this kind of file management.

Commercial (and social) online contributor-based media libraries (such as iStockphoto, Shutterstock, Corbis Images, Getty Images, Alamy, Dreamstime) have experienced, and continue to experience, exponential growth. Indexing and retrieval effectiveness across an international market are key to their usability and profitability. Nature is arguably the most photographed and filmed area of life, and keywording or tagging such images for later retrieval ideally includes identifying key organisms present in each shot. So why is it that managers of these media libraries find it difficult to index nature-related information in an efficient and accurate way for their specialist users and contributors?

Fig 1: Just some of the international common names held in the Catalogue of Life for this well-known and widespread species of conifer, Pseudotsuga menziesii

花旗松 (hua qi song) - China
British Columbia fir - Canada, France
Douglas spruce - Canada, USA
Douglas-fir - UK, USA
Oregon pine - UK, USA
Douglas d'Orégon - France
sapin de Douglas - Canada, France
Douglasfichte - Germany
Amerikai duglászfenyő - Hungary
Abete di Douglas - Italy
Douglasgran - Norway

The naming of organisms, both scientific and common, is rife with synonymy (different name, same species) and homonymy (same name, different species). Common-use names are both language and location dependent (see Fig 1), and scientific names, while internationally recognised, can change periodically as competing academic views take precedence, as was shown in a previous blog post on elephants. In addition, the ability to name a subject depends on the expertise of the contributor or the index creator, leading to varying levels of specificity: what is an animal to one person may be a pig to another, and Sus scrofa domestica to someone else! The sheer number and complexity of names can cause a great deal of confusion, as is shown in the example at the end of this post.

For contributor-based media libraries, controlled vocabularies are one of the most effective ways to control synonyms, arrange terms into hierarchies (to broaden or narrow search terms based on level of expertise), and determine other related or associated terms. As museum collection managers have shown for centuries, the Linnaean taxonomy of binomials in a rank-based classification is the most effective controlled vocabulary to deal with these issues in relation to species. An accepted scientific name offers a unique and universal code for every species and can act as the indexing tag for all other names - for example with plants, it can index common names, horticultural cultivar names, food ingredients and natural products. Yet the current controlled vocabularies and tagging systems used by media libraries are not adequately curating species, leading to missing, limited or incorrect search results. More names need to be added to these vocabularies, existing content re-tagged and an expert-curated taxonomy decided upon, before they can hope to serve expert users and handle the predicted continued expansion of content. Unfortunately no single, complete, electronic list of accepted species names exists anywhere, let alone one with associated common names and synonyms. But adding the most comprehensive list of accepted species and rank names, and guiding contributors through the controlled vocabulary to ultimately choose one (as the defining tag), would be a step forward.
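As a rough illustration of an accepted name acting as the indexing tag for all other names, here is a minimal Python sketch. The common names come from Fig 1; the data structure and the function names (`build_index`, `resolve_tag`) are assumptions made for this example and do not describe how any media library or the Catalogue of Life actually stores names. It also ignores homonymy, where one name would need to point to several species.

```python
from typing import Optional

# Hypothetical miniature vocabulary: one accepted name indexes all other names.
# The common names are taken from Fig 1; everything else is illustrative.
ACCEPTED_NAME = "Pseudotsuga menziesii"
COMMON_NAMES = [
    "Douglas-fir",
    "Oregon pine",
    "sapin de Douglas",
    "Douglasgran",
    "Abete di Douglas",
]


def normalise(name: str) -> str:
    """Lower-case and collapse whitespace so lookups ignore trivial differences."""
    return " ".join(name.lower().split())


def build_index(accepted: str, other_names: list[str]) -> dict[str, str]:
    """Map every known name (including the accepted one) to the accepted name."""
    index = {normalise(accepted): accepted}
    for name in other_names:
        index[normalise(name)] = accepted
    return index


INDEX = build_index(ACCEPTED_NAME, COMMON_NAMES)


def resolve_tag(tag: str) -> Optional[str]:
    """Return the accepted name for a contributor's tag, or None if unknown."""
    return INDEX.get(normalise(tag))
```

With this in place, a contributor typing `oregon pine` or `Douglasgran` would end up with the same defining tag, while an unrecognised name returns `None` and could be queued for expert review.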

The Catalogue of Life is working to complete an inventory of life on earth where all known species (~1.9m plants, animals, fungi and micro-organisms) are named, documented and made available on the web. This global quality-assured checklist currently holds over 1.4m accepted names, 1m synonyms and 0.5m common names (in multiple languages) and is expanding every month. The Catalogue of Life is already used as an indexing mechanism by the world’s largest online biodiversity providers (European Nucleotide Archive, Encyclopedia of Life, IUCN Red List, GBIF) and as a synonym expansion search tool (i.e. type in one name and it will find resources that include all known synonyms too) for text-based resources such as the Biodiversity Heritage Library and the Dictionary of Natural Products. While science has different motivations and methods for updating and curating its datasets from those of the commercial media library, both desire the same end product - a quality-controlled, up-to-date, sustainable, multilingual and internationally relevant taxonomy. Utilising the expertise of the curators that supply the Catalogue of Life is the best option for the rapid enhancement of controlled vocabularies for nature-related collections.
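The idea of synonym expansion can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the name group (the two names in use for the snow leopard), the sample texts and the function names `expand` and `search` are invented for the example, and this is not the mechanism used by the Biodiversity Heritage Library or any other service.

```python
# Hypothetical groups of names that refer to the same taxon.
SYNONYM_GROUPS = [
    {"Panthera uncia", "Uncia uncia"},
]

# Invented sample texts standing in for a text-based resource.
DOCUMENTS = {
    "doc1": "A text that mentions Uncia uncia.",
    "doc2": "A text that mentions Panthera uncia.",
    "doc3": "A text that mentions Panthera leo.",
}


def expand(name: str) -> set[str]:
    """Return every name known to refer to the same taxon as `name`."""
    for group in SYNONYM_GROUPS:
        if name in group:
            return set(group)
    return {name}


def search(name: str, documents: dict[str, str]) -> list[str]:
    """Find documents that mention the name or any of its known synonyms."""
    wanted = {n.lower() for n in expand(name)}
    return sorted(
        doc_id
        for doc_id, text in documents.items()
        if any(n in text.lower() for n in wanted)
    )
```

Searching for either snow leopard name returns both `doc1` and `doc2`, whereas a plain string search would return only one of them.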

The example of istockphoto.com 

Fig 2: Gentianella amarella
When submitting a species image, such as the plant in Fig 2 to the istockphoto image library, you are asked to tag it with appropriate keywords. For this image it seemed logical to include the following: 

  • Gentianella amarella -  the scientific name
  • Autumn Gentian - the UK common name
  • Gentianaceae - the plant family
  • Gentian - the common name for the family, and lastly
  • Northern Gentian - another known common name for this plant from Canada. 


What is returned from the image manager is as follows:

  • Northern Gentian is 'unknown'
  • Autumn Gentian is 'unknown'
  • Gentianella amarella is 'unknown'
  • Gentianaceae is recognised but as a synonym of Lisianthus 
  • Gentian is recognised and can be included

This example shows the current limitations of the controlled vocabulary of istockphoto in not adequately dealing with the indexing of this species. Apart from not recognising the common names of a relatively well known wildflower in the UK and Canada, it also is unable to recognise the scientific species name. Furthermore, it is erroneously matching the whole plant family Gentianaceae to the name Lisianthus. Lisianthus is a commonly used name for the cultivars of one species of Gentianaceae in the small genus Eustoma. However, in science Lisianthus is the name of a different genus in Gentianaceae, and Gentianaceae has 78 possible genera, of which Lisianthus is just one.

What this means for the Gentianella amarella image seeker is that they will probably experience the frustration noted at the start of this post. Only a broader search term (in this case 'Gentian') will help find their species, and that will return many unwanted images that they will need to wade through to find one that they want. If Gentianella amarella had been in the vocabulary, both image contributor and end user would have had a greater chance of success.



Tuesday, 29 October 2013

Taxon of the Day: Dillwynella voightae

Dillwynella voightae

Today's Taxon of the Day has been produced by Thomas Kunze. He writes:

Dillwynella voightae Kunze, 2011 is a small snail from the marine gastropod family Skeneidae and is in the Catalogue of Life care of the WoRMS Mollusca database. The shell is whitish and porcelain-like, with little sculpture, and has a maximum diameter of only 5.8 mm. It lives at a depth of around 600 meters on sunken wood. Its diet includes both wood fiber and the bacteria that live on the wood. It is named in honour of Dr. Janet R. Voight, who on a research expedition collected 18 specimens from a piece of wood dredged from the ocean floor. This type locality, off the coast of Louisiana in the Gulf of Mexico, is the only place it has so far been found.

Whale falls, sunken wood, algae, fish bones, tortoiseshells or squid jaws form a special habitat that is widespread in the oceans. Pieces of wood or complete trees are transported into the sea and then at some point sink. In the deep parts of the ocean nutrients are rare, and a whole range of living things like bacteria, snails and bivalves will settle on the fallen organic matter and slowly biodegrade it.

Empty shell
As you will see in the author string, this species was described by me. You might ask how species are found and described nowadays. Well, in this particular case, I was visiting the Field Museum in Chicago looking for members of the Skeneidae family for my PhD, and while looking through their collection I found a jar labelled Dillwynella sp. I knew this meant that the museum had done a preliminary assessment of family, but as yet had not identified the specimens down to species level. As is procedure when involved in scientific research, I was able to get 4 of the 18 specimens shipped from the US to Sweden, where I was working at the time. After close examination it became obvious to me that these specimens were rather different to all other known species of Dillwynella, and so a description of a new species seemed appropriate and was published in 2011 in The Nautilus.

There is only one other species of Dillwynella known from the Atlantic, D. modesta (Dall, 1889) with the other 7 known species found off the coast of New Zealand and Japan. All marine gastropods (like slugs, snails, winkles, tingles and so on) are contributed by WoRMS Mollusca to the Catalogue of Life.

CoL Annual Checklist: Dillwynella voightae
CoL contributor: WoRMS Mollusca
Image copyright: T Kunze

Wednesday, 23 October 2013

Taxon of the Day: Menura

Lyrebird - Menura novaehollandiae
Today's Taxon of the Day has once again been produced by Thomas Kunze. He writes:

Many birds are well known for their beautiful song; for example, the blackbird and nightingale are held in high musical regard here in Europe. Today’s Taxon of the Day has one of the most extraordinary birdsongs ever heard on earth, with David Attenborough describing it as “possibly the most elaborate, complex and beautiful”. The genus Menura from Australia contains two species commonly known as lyrebirds, both of which are listed in the Catalogue of Life. Not only can these birds mimic the sound of a whole range of other forest birds such as the kookaburra, but human-made noises too, such as camera clicks, street works, chainsaws and car alarms, all of which are recorded in the BBC video. Who else can do this but the Australian lyrebirds?!?

These sounds are made by the male lyrebird when trying to attract a partner (why else?). He forms a hill of earth to stand on, or places himself on a branch, and starts to sing. The females are attracted to this unique vocal performance and snatch a view of the male's astonishing tail feathers before deciding if he is appropriate for mating. The lyrebird's special capability of adopting all kinds of noises into its song is what makes it unique. Once there were just forest noises in the birdsong, but now that all sorts of sounds have encroached into the bird’s habitat, they too are included. To be able to produce this variety of sound the lyrebird has developed one of the most elaborate syrinxes (vocal organs) of all birds.

There are two species in the genus Menura: Menura alberti Bonaparte, 1850 and Menura novaehollandiae Latham, 1802. Together they form the family Menuridae, a member of the order Passeriformes (aka perching birds). The original distribution of both species is the mountain forests of south-eastern Australia.

The species Menura alberti has been assessed by the Catalogue of Life's partner IUCN Red List as Near Threatened.

CoL Annual Checklist: Menura alberti  
CoL contributor: ITIS Global
Image copyright: By Melburnian (Own work) [GFDL, CC-BY-SA-3.0 or CC-BY-2.5], via Wikimedia Commons

Tuesday, 22 October 2013

i4Life Part 5: Global Biodiversity Partners

Global Biodiversity Programmes

Previous posts in this series:
Part 1: Improving the world's taxonomic data indexing
Part 2: Global Species Databases
Part 3: The Catalogue of Life

In the last post we looked at how the Catalogue of Life shares its data with partners and collaborators through the Download Tool and Web Services. So who are the Catalogue of Life's partners and collaborators? Well the Catalogue of Life is itself a partnership between the officiating ITIS and Species 2000 organisations and the confederation of 139 contributing expert-curated taxonomic databases. But now, as a result of the i4Life project the collaborative reach has grown even further to include leading global biodiversity programmes and international research groups.

Today the Catalogue of Life is used as a common index for taxa in the catalogues of five global biodiversity programmes - IUCN Red List, Global Biodiversity Information Facility, European Nucleotide Archive, Barcoding initiatives and Encyclopedia of Life. This index has acted as a backbone for a growing harmonisation between these catalogues, making it easier to share names that are present and identify those missing in each. This process also includes the recognition of data stored under synonymic names, because in addition to the 1.4M plus species names held, the Catalogue of Life currently contains over a million synonyms too. This allows partners to enhance their own catalogues by including names from the Catalogue of Life that they do not have, meaning a more comprehensive taxon search can be conducted by users in each of their own data portals. Through the Piping Tool (our next post!) the Catalogue of Life can receive names from partners that it doesn't hold, but can only include them once they have been assessed by taxonomic experts of the contributing Global Species Database (GSD) for that taxon. Once the names have been placed (or not), updated GSD checklists are sent back to the Catalogue of Life for inclusion in the next edition of the Dynamic Checklist, before once again being made available to partners through the Download Tool and Web Services. Many of the additional names that are circulating may be synonyms or even misspellings, but until this i4Life data flow is complete it is not possible to know exactly how many valid species names they represent. Names that the Catalogue of Life is missing, while probably a small fraction of the total, are highly significant. While the Catalogue of Life is the largest expert-curated species indexing mechanism currently out there, if it cannot index all the names that global biodiversity programmes hold, its taxonomy is not as useful to partner programmes as it otherwise would be.
However, the lack of Global Species Databases for some taxa means this is an incremental rather than an exhaustive process. Through i4Life the e-infrastructure is now in place to keep this data flow moving. In the meantime, the Catalogue of Life, global biodiversity programmes and Global Species Databases are all enhancing the quality of their data and for the end user this will mean more agreement across data portals and less confusion.
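The sorting of partner names described above can be pictured with a small Python sketch. It is a simplification under stated assumptions: the function name `reconcile`, the tiny name lists and the deliberate misspelling are invented for illustration, and the real Piping Tool workflow relies on expert assessment by the relevant GSD rather than on lookups like these.

```python
def reconcile(partner_names, accepted_names, synonym_to_accepted):
    """Sort a partner's names into direct matches, synonym matches and unplaced names."""
    direct = []
    via_synonym = {}
    unplaced = []
    for name in partner_names:
        if name in accepted_names:
            direct.append(name)
        elif name in synonym_to_accepted:
            # Data stored under a synonym can be linked to the accepted name.
            via_synonym[name] = synonym_to_accepted[name]
        else:
            # New species, misspelling or unknown synonym: needs expert review.
            unplaced.append(name)
    return direct, via_synonym, unplaced


# Illustrative data only; which name counts as accepted is an assumption here.
ACCEPTED = {"Panthera leo", "Uncia uncia"}
SYNONYMS = {"Panthera uncia": "Uncia uncia"}
PARTNER_NAMES = ["Panthera leo", "Panthera uncia", "Panthera leoo"]

direct, via_synonym, unplaced = reconcile(PARTNER_NAMES, ACCEPTED, SYNONYMS)
```

In this toy run the misspelt `Panthera leoo` ends up in `unplaced`, which corresponds to the names that would be passed to GSD experts for assessment before they could enter the next checklist.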

It has been no easy task establishing this exchange: setting up these global partnerships and agreeing appropriate methods to make sharing species information as painless as possible, in addition to developing the e-infrastructure and making it operational and sustainable. But today the Catalogue of Life delivers a refreshed instance of its taxonomy in an internationally recognised data exchange format (a key achievement of i4Life) on a monthly basis for use in the heterogeneous catalogues and portals of global biodiversity programme partners. This level of networking, cooperation and integration demonstrates the commitment in the biodiversity community to shared goals and a desire to achieve them collectively.

Below is a brief overview of the global biodiversity programmes that form i4Life's Global Biodiversity Partners, and how they are exploring or aggregating different aspects of biodiversity knowledge, including global species distribution modelling, genome and sequence diversity, species identification using DNA barcodes and conservation status. What is common among them all is the need for a taxonomic index from which all their other data can radiate - this is where the Catalogue of Life comes in. For more information please find a link to each of their data portals.

IUCN Red List
The IUCN Red List is a database of information related to a species' risk of extinction and conservation needs. Information is presented at the species level and therefore the Red List has, at its core, a taxonomic backbone. It is now widely recognised as one of the fundamental tools to support conservation planning, management, monitoring and decision making, with, among other things, growing value for broadening and strengthening our understanding of human impact on biodiversity.

Website

Global Biodiversity Information Facility (GBIF)
GBIF is a distributed digital infrastructure which builds upon the collective efforts and contributions of thousands of scientists in hundreds of institutions across the world through aggregation of their data. It also serves many different communities. Its rich biodiversity data, in particular its distribution data, are widely used by different organisations in science and society. The Convention on Biological Diversity and other international conventions, land-use planners and the agricultural sector are all asking for new services which GBIF can help to deliver. The Catalogue of Life taxonomy feeds into the GBIF infrastructure.

Website

European Nucleotide Archive (ENA)
The European Nucleotide Archive provides a comprehensive, accessible and publicly available repository for nucleotide sequence data. Nucleotide sequence information is crucial to our understanding of biology, from genetics and molecular interactions through to organism-wide processes. Free access to nucleotide sequence data is therefore essential for life science research. As large-scale sequencing becomes faster and cheaper, the need to deposit, search and analyse information in a central archive that is publicly available and easily accessible continues to grow.

Website

Barcoding Initiatives
The various “barcoding of life” initiatives like BOLD, CBOL, ECBOL, iBOL or QBOL are currently some of the major sources of new species. BOL projects and principles are helping scientists to discover substantial numbers of cryptic species. New tools have been created to help identify existing taxa and new tools are currently under development to discover the vast majority of the species biodiversity that remains unknown. In 2004, CBS-KNAW launched the Mycobank initiative, introducing new standards and methods for the deposit and the registration of new species names and associated data. Unlike existing species registration systems, it can handle nomenclatural, taxonomical, geographical, bibliographical, morphological, physiological, chemical, electrophoretic and other molecular data.

Website (Mycobank)

Encyclopedia of Life (EoL)
The goal of the Encyclopedia of Life is to compile and make available over the Internet as much information as possible about the world’s species of plants, animals and microorganisms. It started as a collaborative effort involving several of the world’s leading science institutions - Harvard University, the Field Museum, the Marine Biological Laboratory, the Smithsonian Institution, the Biodiversity Heritage Library, and the Missouri Botanical Garden - and includes a role for the general public and other international partners too.

Website

Next up: Piping Tool



Wednesday, 16 October 2013

Taxon of the Day: Smeagol manneringi

Unidentified species of Smeagol

Today's Taxon of the Day has been produced by Thomas Kunze for all the Tolkienists out there. He writes:

Smeagol manneringi Climo, 1980 is a marine slug originally found on a limestone gravel beach in New Zealand. It lives in the intertidal zone, well hidden under rocks, and has a strange appearance and quite drab exterior. A recent phylogenetic analysis allowed Climo to place it in Pulmonata, an unranked and informal group of gastropods. In 2010 Neusser et al. placed it in the Eupulmonata, a group that also includes the slug genus Arion (with the Spanish slug, one of a number of species that we commonly find eating our gardens) and the highly appreciated culinary Burgundy snail.

“A slug to find them ...”
Climo named it after the famous character Smeagol from Tolkien’s notable novels The Lord of the Rings and The Hobbit. After finding the Ring (“My Precious”) the hobbit Smeagol becomes Gollum, living underground most of the time, as the slug Smeagol does. As a further reason for this name Climo mentioned that the slug Smeagol looks quite strange from the outside, but after studying its internal anatomy the pulmonate relation could be revealed rather easily.

Up to now, four further species of the formerly monotypic genus Smeagol have been described in the family Smeagolidae.


CoL Annual Checklist page: Smeagol manneringi
CoL contributor: WoRMS
Image copyright: Katharina M. Jörger, Isabella Stöger, Yasunori Kano, Hiroshi Fukuda, Thomas Knebelsberger & Michael Schrödl (top); Guillermogp Guillermo García-Pimentel Ruiz [Public domain], via Wikimedia Commons (bottom)


Monday, 14 October 2013

Taxon of the Day: Hamamelis

Hamamelis virginiana
Last month’s Catalogue of Life Dynamic Checklist saw the arrival of many new Global Species Databases, including one that contains the genus Hamamelis in the plant family Hamamelidaceae. This taxon includes the species H. virginiana L. (Witch-hazel), H. vernalis Sarg. (Springtime Witch-hazel) and H. ovalis S.W. Leonard (Big Leaved Witch-hazel), all found in North America, H. japonica Sieb. & Zucc. (Japanese Witch-hazel) from Japan and H. mollis Oliv. (Chinese Witch-hazel) from China. As a result of the new Global Species Database (GSD) the Catalogue of Life Dynamic Checklist now lists all the above species. However, the Annual Checklist, updated once per year, currently lists just four. This is because, prior to the new GSD's arrival, Hamamelis was formed using a proto-GSD that combined the ITIS Regional database (covering North America) with the Catalogue of Life China database. This meant H. japonica, not found in either of these locations, was unfortunately excluded. Next year all five species will be in the Annual Checklist in addition to the Dynamic Checklist as a result of the new GSD.
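Why a proto-GSD built from regional lists can miss a species is easy to see with sets. In the Python sketch below, the assignment of species to each regional list is an assumption based on the native ranges given above, and the variable and function names are invented for illustration.

```python
# Assumed contents of the two regional sources, based on the native ranges above.
ITIS_REGIONAL = {"Hamamelis virginiana", "Hamamelis vernalis", "Hamamelis ovalis"}
COL_CHINA = {"Hamamelis mollis"}

# The five species listed once the new global database arrived.
NEW_GSD = ITIS_REGIONAL | COL_CHINA | {"Hamamelis japonica"}


def coverage_gap(complete, *regional_lists):
    """Combine regional lists into a proto-GSD and report what it misses."""
    proto_gsd = set().union(*regional_lists)
    return proto_gsd, complete - proto_gsd


proto_gsd, missing = coverage_gap(NEW_GSD, ITIS_REGIONAL, COL_CHINA)
```

Here `proto_gsd` holds four species and `missing` holds only `Hamamelis japonica`, matching the difference between the Annual and Dynamic Checklists described above.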

A deciduous shrub or small tree with a short trunk, Witch-hazel bears many spreading, twisted branches. Because of its ability to flower at a time when other plants are dormant, it is a widely grown garden plant with many known cultivars. It reproduces mainly by seed and has capsules that burst open explosively when mature, launching their contents a fair distance from the parent plant.

The species we have here at Reading University in the Harris Garden is H. virginiana, which is now showing some beautiful autumnal colour. Soon the red leaves will drop off, leaving the branches bare as if ready for winter, but before long they will burst out into bright yellow, twisted ribbon-like flowers. If you can’t wait or are unable to see it, the time-lapse video below does a good job of recording this event, albeit with a Hamamelis cultivar elsewhere.



The species epithets describe different aspects of appearance, distribution or flowering time. The epithet japonica means from Japan and virginiana is Latin for Virginia, probably a result of its native eastern North American distribution. Then mollis in Latin means "soft", referring to the felted leaves of this species, and vernalis translates to spring in Latin, referring to the later flowering time of this species. Finally, ovalis means oval, a likely reference to the shape of the leaves. The common name comes from the historical use of the twisted branches as ‘witching sticks’ used as dowsers in the search for water. Hazel describes the resemblance of the leaf shape to that of the hazelnut (i.e. Corylus).

Hamamelis or Witch-hazel has a well-known eponymously named homeopathic remedy, where extracts, lotions and salves are produced from the bark, twigs and leaves of the plant. For centuries it was used to cure a whole range of bodily ills, but these days it is mainly used for minor problems such as bruises, sores and inflammations. This is because the parts of the plant used contain compounds which reportedly have astringent, anti-irritant, antioxidant and anti-inflammatory properties.

The species H. ovalis  was quite recently described to science (2004), which at that time in the plant world was almost the equivalent of finding a new species of mammal. For a detailed account of its discovery see the following web page.

CoL Dynamic Checklist: Search on Hamamelis 
CoL contributor: World Plants
Image copyright: By Jason Hollinger [CC-BY-2.0], via Wikimedia Commons

Thursday, 10 October 2013

Taxon of the Day: Panthera

Today's post has been produced by Thomas Kunze. He writes:

Previously on Taxon of the Day we featured two members of the cat family Felidae – the cheetah and the Scottish wild cat. Today we turn our attention to the genus Panthera also in Felidae, which includes just four species, all described by Linnaeus, and all extremely well known. They are:

From top down: Tiger, Lion, Jaguar, Leopard
Lion, Panthera leo (Linnaeus, 1758)
Tiger, Panthera tigris (Linnaeus, 1758)
Jaguar, Panthera onca (Linnaeus, 1758)
Leopard, Panthera pardus (Linnaeus, 1758)

These four species, often referred to as the big cats, are easily recognisable and can be seen in many zoos around the world. They have roaring as one of their common traits, in addition to all being top predators in their habitats. In the wild, the adults of all species live solitarily, meeting up only for mating, except for lions, which live in prides. Lions are mostly associated with sub-Saharan savannah landscapes in Africa, although there is also a small population living in western India.

The tiger is the world’s biggest living cat, distributed in different Asian habitats from the tropics in South East Asia to Russia in the north. Its range was once much larger than it is today.

The smallest member of the big cats is the leopard which was once present all across Africa, the Arabian Peninsula and the Far East Caucasus. Nowadays, it has a very scattered distribution but still maintains a wide range in sub-Saharan Africa.

The jaguar is the only big cat living in the New World with a range from Central America, down to the Amazon Basin and northern Argentina. There have also been a few recorded in the very south of the United States. 

Sightings of black or white offspring of certain Panthera species are highly sought-after. Melanism, a black pigmentation of the hairs, creates the black coloured fur which can occur in the jaguar and the leopard. In both, the rosette pattern is still visible, especially in good light (see image below). These variants are commonly referred to as Black Panthers. White tigers have a white basic colouring of the fur but are not considered true albinos, because of their blue eye colour and black stripes.

A black Panthera onca 

Recent taxonomic approaches based on molecular data, like that used by the Catalogue of Life’s partner IUCN Red List, also include the snow leopard, Uncia uncia (Schreber, 1775), in the genus Panthera as Panthera uncia. All species are listed by the IUCN Red List, with the tiger as Endangered, the lion as Vulnerable and the leopard and jaguar as Near Threatened. Worse still, some subspecies of tiger and leopard are Critically Endangered. Like other mammals, these species are provided by ITIS Global to the Catalogue of Life.

CoL Annual Checklist: Panthera 
CoL contributor: ITIS Global
Image copyright: See page for author [CC-BY-SA-3.0], via Wikimedia Commons (top), Public Domain (bottom)


The Catalogue of Life is Moving!


The idea for the Catalogue of Life developed in the early 1990s shortly after Frank Bisby’s arrival at Reading University. Initial funding led to the first release of the Catalogue in 2000 with over 200 thousand species. The initial aim was to have substantially completed the Catalogue of Life by this date, but it became clear that far less taxonomic data was available in a readily accessible and electronic form than was first expected. However, this first publication, containing around 10% of known species, proved an important step in realising future grants and projects that have now built the Catalogue to over 1.4m species. The steady growth of the Catalogue accompanied the increasing use of and dependence on the internet, not just by scientists but by the general public. This made a web-accessible list of all living things an extremely timely and welcome project for both individual and institutional users. Over the past five years the Catalogue has been supported by two substantial Framework 7 e-infrastructure grants: 4D4Life and i4Life. These grants have allowed the Catalogue to continue its steady growth towards the target of 1.9m known species, despite it becoming increasingly hard to identify sources of high quality data to fill the steadily reducing number of taxonomic gaps.

Catalogue of Life continues to grow every year
However, while grant-based investment in the e-infrastructure for the Catalogue of Life has been steady and substantial, it remains difficult to find funding to generate the underlying data, especially because funding has a regional basis and the Catalogue is a truly global collaboration. The achievement of exceeding 70% coverage of all species means that the Catalogue has moved from a research project to a product that is sufficiently complete to be of value to individual and project-based users. There is now a steady flow of requests to use the Catalogue of Life as a complete list of species for reference in other projects. It is now used by the major biological data portals ENA, GBIF and IUCN to provide a reference taxonomy to which their data can be linked. It is used by commercial publishers and some search engines, as well as providing a species index for the EDIT platform. Through these partners the Catalogue of Life is providing unique reference material with which biologists can assess the current state of global biodiversity.

4D4Life project meeting at Reading

The Catalogue of Life at Reading University was the major research activity of the late Prof Frank Bisby, the last academic to hold the established Chair of Botany at Reading. The project developed from one person on one computer to a dedicated laboratory filled with active staff developing both the content and the infrastructure for the Catalogue. Content was developed in close collaboration with ITIS, who continue to provide the taxonomic backbone of CoL as well as species level datasets.  The electronic infrastructure developed in collaboration with Cardiff University and ETI in the Netherlands. Frank Bisby’s drive to complete this project led to many long days in the lab, a huge international telephone bill and the close identity of Reading University and Catalogue of Life generated by Frank’s frequent speeches at international conferences where he tirelessly persuaded other scientists that they should join this project. The sudden death of Frank during the 4Life projects led to a more distributed management of the Catalogue of Life with the secretariat remaining active at Reading University but an increasingly important role for the international team of directors for Species 2000 and for the Catalogue of Life Global Team who oversee content and policy for the Catalogue.
Fern checklist
Filling gaps - ferns have recently been added to the Catalogue

The link with ITIS, established in the first days of Catalogue of Life, provided strong support throughout this period of change. Alastair Culham, project leader for i4Life, stepped in to manage the completion of the 4D4Life project and, supported by the excellent i4Life team, has converted the Catalogue of Life into a product with international presence. The editorial continuity of the Catalogue of Life has been ensured by the steady work of its Executive Editor Dr Yuri Roskov, who has now been with the project for more than a decade. Yuri continues to bring ideas that help to complete the Catalogue yet remains strict about the quality of content. Currently the i4Life team at Reading spans six nationalities, each bringing their personal views to the development of Catalogue of Life. At the end of the i4Life project the day-to-day running and management of Catalogue of Life will be transferred to Naturalis in the Netherlands, who have committed salaries and resources to running Catalogue of Life for the next five years. Naturalis will be the first host in a rolling five-year programme that gives all appropriate organisations the opportunity to care for and build this indispensable resource. The process of building the Catalogue of Life will never be completed because thousands of new species of life are discovered and named every year. However, we expect the original target of 1.9m species to be reached by the end of the decade if we continue to add species at the current rate.

Friday, 4 October 2013

Catalogue of Life in Munich

Posters at the conference
Last month Yuri Roskov and Thomas Kunze attended the 106th Annual Meeting of the German Zoological Society 2013 in Munich, Germany. Here is their account:

Almost every year since 1890 the Annual Meeting of the German Zoological Society has brought together zoologists from all specialisms: neurobiologists, ecologists, physiologists and taxonomists, to name but a few. This year in southern Germany, over five hundred zoologists came for four days to the main building of the Ludwig-Maximilians-University Munich to exchange their ideas in numerous lectures and poster presentations. Of course that is where we had to be as well.

On behalf of the Catalogue of Life we presented two posters: "The Catalogue of Life: plant species for zoologists" and "Towards a Global Inventory of Animal Species". The aim was to show how the Catalogue of Life can be a good source of taxonomic data for groups in which the user does not have expertise, so a conference with such a wide range of participants was an ideal place to test this. The posters showed potential users how complete the Catalogue currently is for both plants and animals, and how they can access and use this data for their own work. For example, checking species names, concepts and the classification of related taxa in the Catalogue might be useful in habitat mapping or food chain analysis. For zoologists, the Catalogue of Life is an easy-to-use source of primary taxonomic knowledge on plants, fungi, microorganisms, bacteria and viruses, enabling them to link their own taxa with hosts, parasites, food sources, symbionts and other members of ecological associations.

Furthermore, we met contributors to our datasets and looked for new partners, especially from the insect groups where the Catalogue still has gaps. Overall this meeting gave us a great opportunity to reach many biologists and explain our product to them directly.


 Posters can be viewed on the i4Life events page.

Thursday, 3 October 2013

i4Life Part 4: Download and Web Services

Download and Web Services 


Previous posts in this series:
Part 1: Improving the world's taxonomic data indexing (includes full data flow diagram)
Part 2: Global Species Databases
Part 3: The Catalogue of Life

There are a number of ways to access the Catalogue of Life. To check species names or classifications you can use the online search and browse interface. You can also download the results of that search using the export option on any results page. However, if you are a user who needs access to all of the Catalogue, like partners in the i4Life project, Download and Web Services are the best way to transfer large quantities of data.

The Download Service has a graphical user interface that is accessed through a password-protected page on the i4Life website. Web Services are accessed through a URL from anywhere. The instructions on how to do this are found on the Web Services page of the Catalogue of Life website. Both methods give access to the Catalogue of Life database and allow its contents to be transferred in DarwinCore Archive format, an essential step in the i4Life data flow shown in the diagram above. Anyone can start an export of the Catalogue data using the Download Service (see image below) once they have registered and been given a password. You select the data you require and press a button, and the data is automatically downloaded to your computer as a zip file. If you do not want the whole Catalogue, the form allows you to narrow down the export to a specific taxon by selecting it from the drop-down box for each rank. So if you want to limit your export to a specific order, family, genus etc., you can. For any chosen taxon you can also download just the classification without the species names, or just the species names without the classification. No programming skills are needed to operate the Download Service, but you may need to understand relational data tables and the relevant software to do anything useful with the data once you have exported it. As soon as the Dynamic Checklist is updated each month, users can download all or part of this refreshed instance of the Catalogue of Life using the Download Service in this way.

i4Life Download Service interface
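
For readers wondering what "doing something useful" with an exported zip file might look like, here is a minimal Python sketch. It assumes the archive contains a tab-delimited file called `taxa.txt` with a header row using the Darwin Core terms `scientificName`, `taxonRank` and `family`. Those file and column names are assumptions made for illustration, so check the `meta.xml` file inside your own download for the real layout.

```python
import csv
import io
import zipfile


def read_taxa(archive, member="taxa.txt"):
    """Return the rows of one tab-delimited file inside the zip as dicts.

    `archive` may be a file path or a file-like object.
    Assumption: `member` has a header row; the name "taxa.txt" is hypothetical.
    """
    with zipfile.ZipFile(archive) as zf:
        with zf.open(member) as raw:
            text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
            reader = csv.DictReader(text, delimiter="\t", quoting=csv.QUOTE_NONE)
            return list(reader)


def species_in_family(rows, family):
    """List the scientific names of species-rank rows in the given family."""
    return [
        row["scientificName"]
        for row in rows
        if row.get("family") == family and row.get("taxonRank") == "species"
    ]
```

With a real download you would call something like `species_in_family(read_taxa("my_export.zip"), "Delphinidae")`, where `my_export.zip` stands in for whatever the Download Service saved to your computer.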

This process of obtaining the Catalogue of Life using the Download Service requires human involvement through clicking buttons and selecting options. Our partners and collaborators generally want a more automated way of getting the Catalogue of Life data into their systems, preferring to control the activation of this process at their end. This is where Web Services come in. Web Services are the Catalogue of Life’s equivalent to APIs (Application Programming Interfaces). What is an API? Well, it is a method that allows one person’s website to plug into another. The instructions that the Catalogue of Life supplies on its Web Services information page enable a programmer to set up this exchange. If someone else builds something using this method they may call it a Catalogue of Life application (or widget or tool) and it can become a fixed part of their own e-infrastructure or website.

Our partners use Web Services to do different things. For example, IUCN use it as a link-out service (see image below) from their Red List website. Previously, when someone searched the IUCN Red List website for a species that had not yet had its conservation status assessed, the search would have returned ‘not found’. What now happens is that the user's taxon name (in the form of a text string) is used to dynamically query the Catalogue of Life database using Web Services. If there is a match, the IUCN Red List website uses the information returned by Web Services. The response is in a format that computers can transfer and interpret (either XML or PHP). More code on the IUCN Red List website then displays it in a user-friendly way: a name with a hyperlink back to the Catalogue of Life taxon record. This lets the IUCN Red List user know that a species with this name does exist (i.e. they haven’t spelt it wrong, and here it is in the Catalogue of Life!), but it is not yet assessed.

IUCN Red List link-out to Catalogue of Life
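
To make the link-out idea more concrete, here is a rough Python sketch of the three steps: build a query URL from the user's text string, read the XML that comes back, and turn the first match into a hyperlink. This is not IUCN's actual code. The endpoint `https://example.org/col/webservice`, the `name` and `format` parameters, and the `<result>`, `<name>` and `<url>` elements are all assumptions made for illustration, so follow the Web Services page of the Catalogue of Life website for the real details.

```python
import html
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Assumption: placeholder endpoint, not the real Web Services address.
BASE_URL = "https://example.org/col/webservice"


def build_query_url(name, base_url=BASE_URL):
    """Build the URL that asks the service about one taxon name."""
    return base_url + "?" + urlencode({"name": name, "format": "xml"})


def parse_matches(xml_text):
    """Pull name/url pairs out of an XML response (element names assumed)."""
    root = ET.fromstring(xml_text)
    return [
        {
            "name": result.findtext("name", default=""),
            "url": result.findtext("url", default=""),
        }
        for result in root.iter("result")
    ]


def link_out(xml_text):
    """Return an HTML link for the first match, or None if nothing matched."""
    matches = parse_matches(xml_text)
    if not matches:
        return None
    first = matches[0]
    return '<a href="{}">{}</a>'.format(
        html.escape(first["url"], quote=True), html.escape(first["name"])
    )
```

In a live setup the XML text would be fetched from the address produced by `build_query_url`, for example with `urllib.request.urlopen`, and the `None` case would fall back to the usual ‘not found’ message.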

This is one use of Web Services, but they can be used in many different ways to build different applications on other websites. The advantage for IUCN Red List and other partners is clear: it enables a real-time display of data from the current updated version of the Dynamic Checklist. For the Catalogue of Life the benefit is that it increases our user base, promoting the data via large, high-profile biodiversity websites. It is not only partner websites that use Web Services. Many individual and commercial users are accessing the Catalogue of Life this way, leading to a satellite distribution of the Catalogue worldwide. Any non-commercial user is free to use the latest edition of the Dynamic and Annual Checklists but is required, when using it in another system, to abide by the Terms of Use and notify the Species 2000 Secretariat. The reason for this is the Catalogue of Life's commitment to ensure proper credit and attribution goes to Global Species Databases, the knowledge base upon which the Catalogue is built. By tracking users we can make sure that we are fulfilling our requirements as suppliers of this data.

Next up: i4Life Part 5: Global Biodiversity Partners

Tuesday, 1 October 2013

Taxon of the Day: Delphinidae

Delphinidae
Dolphins are a group of species that get a lot of positive attention, with some believing they will bring healing and psychic power to us humans who interact with them. We don't like to go for the popularity vote on Taxon of the Day, but these mammals are currently newsworthy for being the main subject of the extraordinary winning shot in this year's Wildlife Photographer of the Year competition here in the UK. Like all 1.4+M species found in the Catalogue of Life, they are of course special.

The Catalogue of Life lists 37 species in the family Delphinidae, the taxon that holds all oceanic dolphins (fresh-water dolphins are found elsewhere). The common names of some species in this group can be misleading, with a handful of them often referred to as whales, including the well-known Orca or Killer whale. However, although dolphins, porpoises, and whales all belong to the order Cetacea, the Delphinidae are united by a number of shared characteristics including a single blowhole, streamlined bodies (i.e. wide in the middle and narrow at each end), and a beak-like nose.

The majority of species have been assessed for conservation status by the Catalogue of Life's partner the IUCN Red List, with a few currently classified as Vulnerable or Near Threatened and the species Cephalorhynchus hectori listed as Endangered.

Did you know that dolphins, because of their excellent hearing, sonar capabilities and underwater vision, have been used by the US Navy for decades to locate things in the sea? Once they have undergone two years of training they are ready to embark on a mission such as mine-sweeping off the coast of Croatia.

The etymology of the name dolphin is interesting, as the word varies very little across most languages. To hear an in-depth, inspired and entertaining overview of its origins, listen to this podcast from the lively Dolphin Communication Project.


CoL Annual Checklist: Delphinidae
CoL contributor: ITIS Global
Image copyright: Public Domain