31 research outputs found

    Profound Climatic Effects on Two East Asian Black-Throated Tits (Ave: Aegithalidae), Revealed by Ecological Niche Models and Phylogeographic Analysis

    Although a number of studies have assessed the effects of geological and climatic changes on species distributions in East Asia, we still have limited knowledge of how these changes have affected avian species in south-western and southern China. Here, we study paleo-climatic effects on two subspecies of an East Asian bird, the black-throated tit (A. c. talifuensis–concinnus), by combining phylogeography with Ecological Niche Models (ENMs). We sequenced three mitochondrial DNA markers from 32 populations (203 individuals) and used phylogenetic inference to reconstruct the intra-specific relationships among haplotypes. Population genetic analyses were undertaken to gain insight into the demographic history of these populations. We used ENMs to predict the distribution of the target species during three periods: the last inter-glacial (LIG), the last glacial maximum (LGM) and the present. We found three highly supported, monophyletic mtDNA lineages and different historical demography among lineages in A. c. talifuensis–concinnus. These lineages formed a narrowly circumscribed intra-specific contact zone. The estimated times of lineage divergence were about 2.4 Ma and 0.32 Ma respectively. ENM predictions were similar between the present and the LGM but substantially reduced during the LIG. ENM reconstructions and molecular dating suggest that Pleistocene climate changes triggered and shaped the genetic structure of the black-throated tit. Interestingly, in contrast to the profound impacts of other glacial cycles, ENMs and phylogeographic analysis suggest that the LGM had a limited effect on these two subspecies. ENMs also suggest that Pleistocene climatic oscillations enabled the formation of the contact zone and thus support the refuge theory.

    Taxonomic Data Quality Control for CoL-China

    No full text
    High-quality species checklists are important in biodiversity data: they help answer what and how many species are present in a country or a region, and they are often used as backbones in biodiversity databases. Catalogue of Life, China (CoL-China), hosted by the Species 2000 China Node in the Chinese Academy of Sciences, has published 16 annual versions since 2008, and these have been used very widely in China for supporting biodiversity research, conservation decisions and citizen science. Taxonomic data quality is one of the reasons for its popularity, due to a systematic workflow that guarantees quality control. Our goals of quality management are to ensure that the contents of CoL-China comply with the standard of the global Catalogue of Life (CoL); to ensure that data items such as the taxonomy system, accepted names, Chinese names, common names, synonyms, distributions, literature, and data sources are complete and accurate; to improve the scientific value and reliability of CoL-China; and to ensure the smooth release of each annual version. Several measures were implemented in our quality control workflow:

    A professional organization ensures the scientific credentials of CoL-China. An editorial board of 31 authoritative scientists acts as the decision-making body and leads the Species 2000 China Node in establishing rules and goals for making and publicizing CoL-China. More than 300 taxonomists from institutes of the Chinese Academy of Sciences work on different taxon groups, such as animals and plants. A working group composed of taxonomists and information scientists manages the procedure of annual checklist production and compiles the checklists of the various taxon groups into CoL-China.

    Authoritative data sources ensure the quality of CoL-China from the outset. All selected data sources are peer-reviewed taxonomic papers, dissertations or mature databases. Selection of each data source is controlled by a specialist in the relevant taxon group. Principles for filtering data sources are as follows. Completeness: at a minimum, the data source should hold a checklist of a family and contain most of the required fields, such as accepted name with authorship, classification and distribution; fields like common name and reference are optional. Science: the data source should be maintained by the professional community, and all the data items should be checked by experts for each species. Timeliness: maintenance of the data source should comply with leading-edge practices for taxonomic research.

    A taxonomic data management tool was developed to implement the data quality workflow. This is a platform for multi-person collaboration, which allows experts who study the same taxonomic group to work on the same datasets together. It can collect and manage multi-dimensional species data, meeting the requirements of CoL-China. The tool provides several convenient functions, e.g., batch import of taxon data, a visualization tool for editing taxonomic trees, and extraction of taxonomic data from labeled free text. An automatic checking process with 28 steps (Fig. 1) was implemented in this tool to verify each item for all species.

    Artificial intelligence helps improve quality control. Taxonomic data is in free text in many data sources, and it is easy to make mistakes when retrieving data items manually. We developed an artificial intelligence (AI) tool for extracting distribution data from free text, which significantly helps to improve data quality.

    Unique identifiers are introduced for each taxon imported into CoL-China so that taxon concepts can be tracked. This helps control data quality from the original source to the CoL-China database.

    Major progress has been made by CoL-China in listing the known species of China, but large gaps still exist in some taxon groups, especially insects, which affects the overall quality of CoL-China. Another challenge is how to keep CoL-China up to date with new discoveries. As a next step, we will focus on these problems and continue to keep our data quality assurance and data quality control mechanisms consistent with those of CoL.
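
    The completeness principle described above (required fields versus optional fields) can be illustrated with a small check. This is a minimal sketch and not the 28-step checking process of the CoL-China tool, whose details are not given here. The field names and the function `check_completeness` are hypothetical, chosen only for illustration.

```python
from typing import Dict, List

# Hypothetical field names; the abstract only names the concepts
# (accepted name with authorship, classification, distribution).
REQUIRED_FIELDS = ("accepted_name", "authorship", "classification", "distribution")
OPTIONAL_FIELDS = ("common_name", "reference")


def check_completeness(record: Dict[str, str]) -> List[str]:
    """Return the required fields that are missing or blank in a taxon record."""
    missing = []
    for field in REQUIRED_FIELDS:
        value = record.get(field)
        if value is None or not str(value).strip():
            missing.append(field)
    return missing


# Toy records for illustration only.
complete = {
    "accepted_name": "Passer montanus",
    "authorship": "(Linnaeus, 1758)",
    "classification": "Aves; Passeriformes; Passeridae",
    "distribution": "Beijing",
}
incomplete = {
    "accepted_name": "Passer montanus",
    "authorship": "  ",  # blank counts as missing
    "classification": "Aves; Passeriformes; Passeridae",
}

print(check_completeness(complete))
print(check_completeness(incomplete))
```

    A real workflow would chain many such checks (name format, synonym consistency, and so on); this sketch covers only the presence of required fields.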

    An AI-based Wild Animal Detection System and Its Application

    No full text
    The rapid accumulation of biodiversity data and the development of deep learning methods bring opportunities for detecting and identifying wild animals automatically with artificial intelligence. In this paper, we introduce an AI-based wild animal detection system. It is composed of acoustic and image sensors, network infrastructure, species recognition models, and a data storage and visualization platform, following the technical chain of the Internet of Things (IoT) as applied to biodiversity detection. The workflow of the system is as follows:

    Deploying sensors for different detection targets. The acoustic sensor is composed of two microphones for picking up sounds from the environment and an edge computing box for judging and sending back the sound files. The acoustic sensor is suitable for monitoring birds, mammals, chirping insects and frogs. The image sensor is composed of a high-performance camera that can be controlled to record its surroundings automatically and a video analysis edge box running a model for detecting and recording animals. The image sensor is suitable for monitoring waterbirds in locations without visual obstructions.

    Adopting different networks according to signal availability. Network infrastructure is critical for the detection system and for transferring the data collected by sensors. We use the existing network when 4G/5G signals are available, and build special networks using mesh networking technology in areas without signals. Using multiple network strategies lowers the cost of monitoring.

    Recognizing species from sounds, images or videos. AI plays a key role in our system. We have trained acoustic models for more than 800 Chinese birds and some common chirping insects and frogs, which can be identified from sound files recorded by acoustic sensors. For video and image data, we have also trained models for recognizing 1300 Chinese birds and 400 mammals, which help to discover and count animals captured by image sensors. Moreover, we propose a special method for detecting species through features of voices and images together with niche features of animals. It is a flexible framework that adapts to different combinations of acoustic and image sensors. All models were trained with labeled voices, images and distribution data from the Chinese species database ESPECIES.

    Saving and displaying machine observations. The original sound, image and video files, together with the identification results, are stored in a data platform deployed on the cloud for extensible computing and storage. We have developed visualization modules in the platform that display sensors on maps using WebGIS and show curves of the number of records and species for each day, real-time alerts from sensors capturing animals, and other parameters.

    For storing and exchanging records of machine observations, together with information about sensors, models and key network nodes, we have proposed a collection of data fields extended from Darwin Core and built a data model to represent where, when and which sensors observe which species. The system has been applied in several projects since last year. For example, we have deployed 50 sensors across the city of Beijing for detecting birds; they have now harvested more than 300 million records and detected 320 species, effectively filling data gaps for Beijing birds in both taxonomic coverage and the time dimension. Next steps will focus on improving the AI models to identify species with higher accuracy, popularizing this system in biodiversity detection, and building a mechanism for sharing and publishing machine observations.
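
    The idea of a record that states where, when and which sensor observed which species can be sketched as follows. This is only an illustration, not the authors' data model: `scientificName`, `eventDate`, `decimalLatitude`, `decimalLongitude` and `basisOfRecord` are existing Darwin Core terms, while `sensorID`, `modelName` and `confidence` are hypothetical extension fields assumed here. The observations are made-up toy data. The `daily_summary` helper mirrors the per-day curves of record and species counts mentioned above.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, Tuple


@dataclass(frozen=True)
class MachineObservation:
    # Darwin Core terms
    scientificName: str
    eventDate: str            # ISO 8601 timestamp, e.g. "2023-05-01T05:12:00"
    decimalLatitude: float
    decimalLongitude: float
    # Hypothetical extension fields (assumed, not from the source)
    sensorID: str
    modelName: str
    confidence: float
    basisOfRecord: str = "MachineObservation"


def daily_summary(observations: Iterable[MachineObservation]) -> Dict[str, Tuple[int, int]]:
    """Map each day to (number of records, number of distinct species)."""
    records = defaultdict(int)
    species = defaultdict(set)
    for obs in observations:
        day = obs.eventDate[:10]  # the date part of the ISO timestamp
        records[day] += 1
        species[day].add(obs.scientificName)
    return {day: (records[day], len(species[day])) for day in sorted(records)}


# Toy data for illustration only.
observations = [
    MachineObservation("Passer montanus", "2023-05-01T05:12:00", 39.9, 116.4,
                       "acoustic-01", "bird-audio-v1", 0.93),
    MachineObservation("Pica serica", "2023-05-01T06:40:10", 39.9, 116.4,
                       "acoustic-01", "bird-audio-v1", 0.88),
    MachineObservation("Passer montanus", "2023-05-02T05:03:41", 40.0, 116.3,
                       "image-07", "bird-image-v1", 0.97),
]

summary = daily_summary(observations)
print(summary)
```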

    Notes of Life: A platform for recording species observations driven by artificial intelligence

    No full text
    Biodiversity research is entering a big data era with the rapid increase in the abundance of biodiversity data, especially the large number of species images. How to use artificial intelligence to mine big biodiversity data in support of wildlife observation and recognition has become a new trend and a hot topic. In this research, we integrate large numbers of species images, including higher plants, birds and insects, and use a state-of-the-art deep learning technique for images to train automatic species recognition models. Currently, we have a model that can recognize more than 900 Chinese birds with a top-1 accuracy of 81% and a top-5 accuracy of 95% (top-n accuracy is the probability that the correct answer is among the top n predicted results), and more models are coming soon. Based on these models, we developed a platform named Notes of Life (NOL, http://nol.especies.cn), which includes a website and a mobile application (app) for assisting biological scientists and citizen scientists in recognizing and recording wildlife. Users can upload their observation records and images of wildlife through our mobile app while they are investigating in the wild. The website is used for bulk data uploading and management. Species images can be classified by taxon-specific, plug-in recognition models that speed up the process of identification. There is an expert module in NOL where citizen scientists can work interactively with information provided by biological scientists, and post a species image identification request to experts when they cannot recognize the species by themselves or with the models. The expert module improves the quality of citizen science data and compensates for the shortcomings of the automatic species recognition models. Above all, NOL embraces the idea that scientific research supports citizen science and citizen science gives feedback to science, with the aim of finding a sustainable way to collect more, and more reliable, data for biodiversity research.
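
    The top-n accuracy metric defined in the abstract can be computed as in the following minimal sketch. The function name `top_n_accuracy` and the toy scores are assumptions for illustration; they are not taken from the NOL models.

```python
from typing import Sequence


def top_n_accuracy(scores: Sequence[Sequence[float]], labels: Sequence[int], n: int) -> float:
    """Fraction of samples whose true class is among the n highest-scoring classes."""
    if len(scores) != len(labels) or not labels:
        raise ValueError("scores and labels must be non-empty and of equal length")
    hits = 0
    for row, label in zip(scores, labels):
        # Class indices ordered from highest to lowest score.
        ranked = sorted(range(len(row)), key=lambda i: row[i], reverse=True)
        if label in ranked[:n]:
            hits += 1
    return hits / len(labels)


# Toy example: 4 images, 3 candidate species (classes 0, 1, 2).
scores = [
    [0.7, 0.2, 0.1],  # true class 0: correct at top-1
    [0.1, 0.3, 0.6],  # true class 1: missed at top-1, found at top-2
    [0.5, 0.4, 0.1],  # true class 2: missed at top-1 and top-2
    [0.2, 0.5, 0.3],  # true class 1: correct at top-1
]
labels = [0, 1, 2, 1]

top1 = top_n_accuracy(scores, labels, 1)
top2 = top_n_accuracy(scores, labels, 2)
print(top1, top2)  # 0.5 0.75
```

    Top-n accuracy can only stay equal or grow as n increases, which is why the reported top-5 figure is higher than the top-1 figure.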

    MapBio: Mapping Biodiversity of China

    No full text
    MapBio is a project initiated by the Chinese Academy of Sciences, which aims at integrating species distribution data from different sources and mapping the biodiversity of China to support biodiversity research and biodiversity conservation decisions. Species distribution data may be found in journal articles, books and different databases in various formats, and most species distributions are described in free text. MapBio is trying to build up a workflow for collecting this free text, parsing it into standardized data and projecting distributions onto a map for each species in China. A map module of MapBio is designed and implemented based on Web GIS to visualize species distributions on a map at different levels, e.g., occurrence points, county, province, distribution range, protected area, waterbody, biogeographic realm. Since the completeness of distribution data is very important for assessing biodiversity, we developed a tool in MapBio for analysis of the gaps in distribution data. Based on the species distribution data, especially the occurrence data, MapBio provides an integrated modeling tool for helping users to build species niche models. MapBio is an open access project. Users can get data and services from it easily for biodiversity research and conservation, and also can contribute their own biodiversity data to MapBio

    Discussion of the Method for Constructing Animal Traits

    No full text
    Trait data in biology can be extracted from text and structured for reuse within and across taxa. For example, body length is one trait applicable to many species and "body length is about 170 cm" is one trait data point for the human species. Trait data can be used in more detailed analyses to describe species evolution and development processes, so it has begun to be valued by more than just taxonomists. The EOL (Encyclopedia of Life) TraitBank provides an example of a trait database. Current trait databases are in their infancy. Most are based on morphological data such as shape, color, structural and sexual characteristics. In fact, some data such as behavioral and biological characteristics may be similarly included in trait databases. To build a trait database we constructed a list of controlled vocabulary to record the states of various terms. These terms may exhibit common characteristics: they can be grouped as conceptual (subject) and descriptive (delimiter) terms. For example, in "the shoulder height is 65–70 cm", "shoulder height" is the conceptual term and "65–70 cm" is the descriptive term. Conceptual terms may be part of an interdependent hierarchical structure. Examples in morphology, physiology and conservation or protection status demonstrate how parts or systems may be broken into smaller measurable (quantifiable) or enumerable pieces. Descriptive terms modify or delimit the parameters of conceptual terms. These may be numerical with distinguishing units, counts, or other adjectives, or enumerable with special nouns. Although controlled vocabularies about animals are complex, they can be normalized using the RDF (Resource Description Framework) and OWL (Web Ontology Language) standards. Next, we extract traits from two main types of existing descriptions: tabular data, which is more easily digested by machine, and descriptive text, which is complex. Pure text often needs to be extracted manually or by NLP (computerized natural language processing). Sometimes machine learning methods can be used. Moreover, different human languages may demand different extraction methods. Because the number of recordable traits exceeds current collection records, the database structure should be optimized for retrieval speed. For this reason, key-value databases are more suitable for storage of trait data than relational databases. EOL used the database Virtuoso for TraitBank, which is a non-relational database. Using existing mature tools and standards of ontology, we can construct a preliminary workflow for animal trait data, but some tools and specifications for data analysis and use need to await additional data accumulation.
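
    The split into a conceptual term and a descriptive term, and the key-value storage suggested above, can be sketched for simple English measurement sentences. This is a hedged illustration, not the authors' extraction method: the regular expression, the `TraitValue` type, the `parse_trait` function and the taxon label used as a key are all assumptions, and real descriptive text would need far more robust NLP.

```python
import re
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class TraitValue:
    """Descriptive term: a numeric value or range with its unit."""
    low: float
    high: float
    unit: str


# Matches sentences such as "the shoulder height is 65–70 cm"
# or "body length is about 170 cm" (hypothetical, deliberately narrow).
PATTERN = re.compile(
    r"^(?:the\s+)?(?P<concept>[a-z ]+?)\s+is\s+(?:about\s+)?"
    r"(?P<low>\d+(?:\.\d+)?)(?:\s*[–-]\s*(?P<high>\d+(?:\.\d+)?))?"
    r"\s*(?P<unit>[a-z]+)$",
    re.IGNORECASE,
)


def parse_trait(sentence: str) -> Optional[Tuple[str, TraitValue]]:
    """Return (conceptual term, descriptive term), or None if the sentence does not match."""
    match = PATTERN.match(sentence.strip().rstrip("."))
    if match is None:
        return None
    low = float(match.group("low"))
    high = float(match.group("high")) if match.group("high") else low
    return match.group("concept").lower(), TraitValue(low, high, match.group("unit"))


# Key-value storage: (taxon, conceptual term) -> descriptive term.
store: Dict[Tuple[str, str], TraitValue] = {}
parsed = parse_trait("the shoulder height is 65–70 cm")
if parsed is not None:
    concept, value = parsed
    store[("example taxon", concept)] = value  # the taxon label is a placeholder

print(store)
```

    Keying on the pair of taxon and conceptual term makes a lookup a single dictionary access, which reflects the retrieval-speed argument for key-value stores made in the abstract.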

    The China Program on Species Diversity Information Systems

    No full text
    A program to integrate species diversity information systems was launched by the Chinese Academy of Sciences (CAS) in January 2018, with funding from the CAS Earth project, a Strategic Priority Research Program of CAS. The program will create a series of data products, such as China flora online, species catalogues, distribution maps, software tools for data mining and knowledge discovery based on big data and artificial intelligence technology, and a service platform and portal highlighting species diversity information in China. The products and platform will provide robust data to support decision making on biodiversity conservation, fundamental research on biodiversity evolution and spatial patterns, and species identification for citizen science. China flora online will include 35,000 species of higher plants in China and an online editing environment for botanists to maintain the floral records. The trait database will include structured data on animals, plants and fungi, such as the weight, height, length, color and shape of organisms. The species catalogue will be the annually updated version of the Catalogue of Life, China. The distribution maps will show the spatial pattern for each species of vertebrate animal and higher plant. Cell phone apps will help users to identify plants in the field easily and quickly. The mechanism and workflow for data collection, integration, public sharing and quality control will be built up in the next few years.

    Identifying Habitat Elements from Bird Images Using Deep Convolutional Neural Networks

    No full text
    With the rapid development of digital technology, bird images have become an important part of ornithology research data. However, due to the rapid growth of bird image data, it has become a major challenge to effectively process such a large amount of data. In recent years, deep convolutional neural networks (DCNNs) have shown great potential and effectiveness in a variety of tasks regarding the automatic processing of bird images. However, no research has been conducted on the recognition of habitat elements in bird images, which is of great help when extracting habitat information from bird images. Here, we demonstrate the recognition of habitat elements using four DCNN models trained end-to-end directly based on images. To carry out this research, an image database called Habitat Elements of Bird Images (HEOBs-10) and composed of 10 categories of habitat elements was built, making future benchmarks and evaluations possible. Experiments showed that good results can be obtained by all the tested models. ResNet-152-based models yielded the best test accuracy rate (95.52%); the AlexNet-based model yielded the lowest test accuracy rate (89.48%). We conclude that DCNNs could be efficient and useful for automatically identifying habitat elements from bird images, and we believe that the practical application of this technology will be helpful for studying the relationships between birds and habitat elements