    Data reliability in citizen science: learning curve and the effects of training method, volunteer background and experience on identification accuracy of insects visiting ivy flowers

    • Citizen science, the involvement of volunteers in the collection of scientific data, can be a useful research tool. However, data collected by volunteers are often of lower quality than those collected by professional scientists.
    • We studied the accuracy with which volunteers identified insects visiting ivy (Hedera) flowers in Sussex, England. In the first experiment, we examined the effects of training method, volunteer background and prior experience. Fifty-three participants were trained for the same duration using one of three different methods (pamphlet, pamphlet + slide show, pamphlet + direct training). Almost immediately following training, we tested the ability of participants to identify live insects on ivy flowers to one of 10 taxonomic categories and recorded whether their identifications were correct or incorrect, without providing feedback.
    • The results showed that the type of training method had a significant effect on identification accuracy (P = 0.008). Participants identified 79.1% of insects correctly after using a one-page colour pamphlet, 85.6% correctly after using the pamphlet and viewing a slide show, and 94.3% correctly after using the pamphlet in combination with direct training in the field.
    • As direct training cannot be delivered remotely, in the following year we conducted a second experiment, in which a different sample of 26 volunteers received the pamphlet plus slide show training three times. In this experiment participants also received c. 2 minutes of additional training material, either videos of insects or stills taken from the videos. Testing showed that identification accuracy increased from 88.6% to 91.3% to 97.5% across the three successive tests. We also found a borderline significant interaction between the type of additional material and the test number (P = 0.053), such that the video gave fewer errors than stills in the first two tests only.
    • The most common errors made by volunteers were confusions of honey bees and social wasps with their hover fly mimics. We also tested six experts who achieved nearly perfect accuracy (99.8%), which shows what is possible in practice.
    • Overall, our study shows that two or three sessions of remote training can be as good as one session of direct training, even for relatively challenging taxonomic discriminations that include distinguishing models and mimics.

    Changepoint analysis for efficient variant calling

    We present CAGe, a statistical algorithm which exploits high sequence identity between sampled genomes and a reference assembly to streamline the variant calling process. Using a combination of changepoint detection, classification, and online variant detection, CAGe is able to call simple variants quickly and accurately on the 90-95% of a sampled genome which differs little from the reference, while correctly learning the remaining 5-10% that must be processed using more computationally expensive methods. CAGe runs on a deeply sequenced human whole genome sample in approximately 20 minutes, potentially reducing the burden of variant calling by an order of magnitude after one memory-efficient pass over the data. © 2014 Springer International Publishing Switzerland
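
The abstract does not describe how CAGe detects changepoints, so the following is only a generic illustration of the idea, not the authors' method. It is a minimal sketch of binary segmentation with a least-squares cost, applied to a made-up list of per-window mismatch counts against a reference. The function names, the `penalty` value and the data are all assumptions introduced for the example.

```python
from typing import List, Tuple


def sse(values: List[float]) -> float:
    """Sum of squared deviations from the segment mean."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(values: List[float]) -> Tuple[int, float]:
    """Return (index, gain) of the single split that most reduces the SSE."""
    total = sse(values)
    best_idx, best_gain = 0, 0.0
    for i in range(1, len(values)):
        gain = total - sse(values[:i]) - sse(values[i:])
        if gain > best_gain:
            best_idx, best_gain = i, gain
    return best_idx, best_gain


def changepoints(values: List[float], penalty: float, offset: int = 0) -> List[int]:
    """Binary segmentation: split recursively while the gain exceeds the penalty."""
    idx, gain = best_split(values)
    if idx == 0 or gain <= penalty:
        return []
    left = changepoints(values[:idx], penalty, offset)
    right = changepoints(values[idx:], penalty, offset + idx)
    return left + [offset + idx] + right


# Hypothetical mismatch counts per genomic window (not data from the paper).
mismatches = [0, 1, 0, 0, 1, 9, 8, 10, 9, 0, 1, 0]
cps = changepoints(mismatches, penalty=5.0)
print(cps)
```

In this toy setting, the windows between two detected changepoints with a high mean mismatch count would be the ones handed to a slower, more thorough caller, while the low-divergence windows could be processed cheaply.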