Museum specimen data reveal emergence of a plant disease may be linked to increases in the insect vector population
The emergence rate of new plant diseases is increasing due to novel introductions, climate change, and changes in vector populations, posing risks to agricultural sustainability. Assessing and managing future disease risks depends on understanding the causes of contemporary and historical emergence events. Since the mid-1990s, potato growers in the western United States, Mexico, and Central America have experienced severe yield loss from Zebra Chip disease and have responded by increasing insecticide use to suppress populations of the insect vector, the potato psyllid, Bactericera cockerelli (Hemiptera: Triozidae). Despite the severe nature of Zebra Chip outbreaks, the causes of emergence remain unknown. We tested the hypotheses that (1) B. cockerelli occupancy has increased over the last century in California and (2) such increases are related to climate change, specifically warmer winters. We compiled a data set of 87,000 museum specimen occurrence records across the order Hemiptera collected between 1900 and 2014. We then analyzed changes in B. cockerelli distribution with a hierarchical occupancy model, using changes in background species lists to correct for collecting effort. We found evidence that B. cockerelli occupancy has increased over the last century. However, these changes appear to be unrelated to climate change, at least at the scale of our analysis. To the extent that species occupancy is related to abundance, our analysis provides the first quantitative support for the hypothesis that B. cockerelli population abundance has increased, but further work is needed to link B. cockerelli population dynamics to Zebra Chip epidemics. Finally, we demonstrate how this historical macro-ecological approach provides a general framework for comparative risk assessment of future pest and insect vector outbreaks.
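The idea behind an occupancy model can be illustrated with a minimal sketch. This is the basic single-season occupancy likelihood, not the hierarchical model fitted in the study; the function name `site_likelihood`, the parameters `psi` (occupancy probability) and `p` (per-visit detection probability), and the treatment of each collecting event as a "visit" are assumptions made for illustration only.

```python
from math import prod


def site_likelihood(history, psi, p):
    """Likelihood of one site's detection history (hypothetical helper).

    history: list of 0/1 flags, one per collecting event ("visit") at the site
    psi:     probability that the site is occupied by the species
    p:       probability of detecting the species on a visit, given occupancy
    """
    # Probability of this exact history if the site is occupied.
    detected = prod(p if y else 1.0 - p for y in history)
    # An all-zero history can also arise because the site is unoccupied.
    never_present = (1.0 - psi) if not any(history) else 0.0
    return psi * detected + never_present


# An all-zero history is ambiguous: the species may be absent or merely missed.
print(site_likelihood([0, 0], psi=0.5, p=0.5))
# A single detection rules out absence.
print(site_likelihood([1, 0], psi=0.5, p=0.5))
```

Separating `psi` from `p` is what allows changes in occupancy to be distinguished from changes in collecting effort, which is the role the abstract assigns to the background species lists.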
Semi-Supervised Cause Identification from Aviation Safety Reports
We introduce cause identification, a new problem involving classification of incident reports in the aviation domain. Specifically, given a set of pre-defined causes, a cause identification system seeks to identify all and only those causes that can explain why the aviation incident described in a given report occurred. The difficulty of cause identification stems in part from the fact that it is a multi-class, multi-label categorization task, and in part from the skewness of the class distributions and the scarcity of annotated reports. To improve the performance of a cause identification system for the minority classes, we present a bootstrapping algorithm that automatically augments a training set by learning from a small amount of labeled data and a large amount of unlabeled data. Experimental results show that our algorithm yields a relative error reduction of 6.3% in F-measure for the minority classes in comparison to a baseline that learns solely from the labeled data.
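The general shape of bootstrapping from labeled and unlabeled data can be sketched as a self-training loop. This is a generic illustration, not the algorithm from the paper: it uses a single-label nearest-centroid classifier rather than a multi-label one, and the names `fit`, `predict`, `bootstrap`, `min_margin`, and `max_rounds` are all hypothetical.

```python
def centroid(points):
    """Component-wise mean of a list of equal-length feature vectors."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]


def dist2(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def fit(labeled):
    """Train a nearest-centroid model from (vector, label) pairs."""
    by_label = {}
    for x, y in labeled:
        by_label.setdefault(y, []).append(x)
    return {y: centroid(xs) for y, xs in by_label.items()}


def predict(model, x):
    """Return (label, margin); margin is the gap between the two nearest centroids."""
    ranked = sorted((dist2(c, x), label) for label, c in model.items())
    margin = ranked[1][0] - ranked[0][0] if len(ranked) > 1 else float("inf")
    return ranked[0][1], margin


def bootstrap(labeled, unlabeled, min_margin=1.0, max_rounds=10):
    """Grow the training set with confidently self-labeled examples."""
    labeled, pool = list(labeled), list(unlabeled)
    for _ in range(max_rounds):
        model = fit(labeled)
        confident, rest = [], []
        for x in pool:
            label, margin = predict(model, x)
            (confident if margin >= min_margin else rest).append((x, label))
        if not confident:
            break  # nothing new was added, so further rounds change nothing
        labeled.extend(confident)
        pool = [x for x, _ in rest]
    return fit(labeled), labeled


seed = [([0.0, 0.0], "a"), ([10.0, 10.0], "b")]
model, grown = bootstrap(seed, [[1.0, 1.0], [9.0, 9.0], [5.0, 5.0]])
print(len(grown))  # the ambiguous midpoint [5.0, 5.0] is never added
```

The confidence threshold is the main design choice in such a loop: set too low, it lets mislabeled examples reinforce themselves in later rounds, which is a particular concern for minority classes.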
Modeling Organization in Student Essays
Automated essay scoring is one of the most important educational applications of natural language processing. Recently, researchers have begun exploring methods of scoring essays with respect to particular dimensions of quality such as coherence, technical errors, and relevance to prompt, but there is relatively little work on modeling organization. We present a new annotated corpus and propose heuristic-based and learning-based approaches to scoring essays along the organization dimension, utilizing techniques that involve sequence alignment, alignment kernels, and string kernels.
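To give a sense of what a string kernel computes, here is a minimal sketch of a k-spectrum kernel over label sequences. The abstract does not say which kernels or which sequence representation the authors use, so the choice of the spectrum kernel, the paragraph-function labels, and the function names `ngrams`, `spectrum_kernel`, and `normalized_kernel` are assumptions for illustration.

```python
from collections import Counter
from math import sqrt


def ngrams(seq, k):
    """Count the contiguous length-k subsequences of seq."""
    return Counter(tuple(seq[i:i + k]) for i in range(len(seq) - k + 1))


def spectrum_kernel(a, b, k=2):
    """Inner product of the k-gram count vectors of two sequences."""
    ca, cb = ngrams(a, k), ngrams(b, k)
    return sum(count * cb[gram] for gram, count in ca.items())


def normalized_kernel(a, b, k=2):
    """Cosine-normalized spectrum kernel; 0.0 if either sequence is shorter than k."""
    denom = sqrt(spectrum_kernel(a, a, k) * spectrum_kernel(b, b, k))
    return spectrum_kernel(a, b, k) / denom if denom else 0.0


# Hypothetical paragraph-function labels for two essays.
essay_a = ["intro", "body", "body", "conclusion"]
essay_b = ["intro", "body", "conclusion"]
print(spectrum_kernel(essay_a, essay_b))    # number of shared bigram matches
print(normalized_kernel(essay_a, essay_b))  # similarity in [0, 1]
```

A kernel of this kind could be supplied to a kernel-based learner so that essays with similar sequences of organizational elements are treated as similar when predicting an organization score.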