Ensemble Kalman Filters with Resampling
Filtering is concerned with online estimation of the state of a dynamical
system from partial and noisy observations. In applications where the state of
the system is high dimensional, ensemble Kalman filters are often the method of
choice. These algorithms rely on an ensemble of interacting particles to
sequentially estimate the state as new observations become available. Despite
the practical success of ensemble Kalman filters, theoretical understanding is
hindered by the intricate dependence structure of the interacting particles.
This paper investigates ensemble Kalman filters that incorporate an additional
resampling step to break the dependency between particles. The new algorithm is
amenable to a theoretical analysis that extends and improves upon those
available for filters without resampling, while also performing well in
numerical examples. Comment: 32 pages, 5 figures
Covariance Operator Estimation: Sparsity, Lengthscale, and Ensemble Kalman Filters
This paper investigates covariance operator estimation via thresholding. For
Gaussian random fields with approximately sparse covariance operators, we
establish non-asymptotic bounds on the estimation error in terms of the
sparsity level of the covariance and the expected supremum of the field. We
prove that thresholded estimators enjoy an exponential improvement in sample
complexity compared with the standard sample covariance estimator if the field
has a small correlation lengthscale. As an application of the theory, we study
thresholded estimation of covariance operators within ensemble Kalman filters. Comment: 25 pages, 1 figure
Limits of use of social media for monitoring biosecurity events.
Compared to applications that trigger massive information streams, like earthquakes and human disease epidemics, the data input for agricultural and environmental biosecurity events (i.e., the introduction of unwanted exotic pests and pathogens) is expected to be sparse and less frequent. To investigate whether Twitter data can be useful for the detection and monitoring of biosecurity events, we adopted a three-step process. First, we confirmed that sightings of two migratory species, the Bogong moth (Agrotis infusa) and the Common Koel (Eudynamys scolopaceus), are reported on Twitter. Second, we developed search queries to extract the relevant tweets for these species. The queries were based on either the taxonomic name, the common name, or keywords that are frequently used to describe the species (symptomatic or syndromic). Third, we validated the results using ground truth data. Our results indicate that the common name queries provided a reasonable number of tweets that were related to the ground truth data. The taxonomic query resulted in datasets that were too small, while the symptomatic queries resulted in large datasets, but with highly variable signal-to-noise ratios. No clear relationship was observed between the tweets from the symptomatic queries and the ground truth data. Comparing the results for the two species showed that the level of familiarity with the species plays a major role. The more familiar the species, the more stable and reliable the Twitter data. This clearly presents a problem for using social media to detect the arrival of an exotic organism of biosecurity concern with which the public is unfamiliar.
Welvaert et al. Bogong moth and Common Koel surveillance
The Excel datasheets "Common Moth", "Common Koel 1", "Common Koel 2", "Common Koel 3", "Symp Moth 1", "Symp Moth2", "Symp Koel 1", "Symp Koel 2", "Symp Koel 3", and "Symp Koel 4" are relevance summaries (0=non-relevant, 1=relevant) of de-identified tweets (from Twitter). Tweets were produced using the Commonwealth Scientific & Industrial Research Organisation's Emergency Situation Awareness (ESA) system. They are derived from the searches defined within the associated manuscript. The Bogong moth survey data are field data collected from the summit ridge of Mount Gingera, Brindabella Ranges, Australia. Please contact the corresponding author, Peter Caley ([email protected]), for further information.
CUSUM charts for the Koel Common name queries.
Red dots indicate a deviation of the number of tweets, in particular the upper part of the chart points to an increase in tweets. UDB: Upper Decision Boundary; LDB: Lower Decision Boundary.
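To make the reading of such a chart concrete, the following is a minimal sketch of a two-sided tabular CUSUM on weekly counts. It is not the procedure used in the study: the function name `tabular_cusum`, the baseline length `baseline_n`, the reference value `k`, the decision boundary `h`, and the weekly counts themselves are all illustrative assumptions. Points where the upper sum exceeds `h` correspond to crossings of the Upper Decision Boundary, and points where the lower sum falls below `-h` correspond to crossings of the Lower Decision Boundary.

```python
from statistics import mean, stdev


def tabular_cusum(counts, baseline_n, k=0.5, h=4.0):
    """Two-sided tabular CUSUM on standardized counts.

    baseline_n, k and h are illustrative assumptions, not values from the study.
    Returns a list of (index, "upper" | "lower") for out-of-control points.
    """
    # Estimate the in-control mean and spread from the first baseline_n points.
    baseline = counts[:baseline_n]
    mu, sigma = mean(baseline), stdev(baseline)

    upper, lower = 0.0, 0.0
    flags = []
    for t, x in enumerate(counts):
        z = (x - mu) / sigma
        # Accumulate deviations beyond the slack k; reset at zero.
        upper = max(0.0, upper + z - k)
        lower = min(0.0, lower + z + k)
        if upper > h:
            flags.append((t, "upper"))  # above the Upper Decision Boundary
        elif lower < -h:
            flags.append((t, "lower"))  # below the Lower Decision Boundary
    return flags


# Made-up weekly tweet counts: a stable period followed by a jump.
weekly_counts = [5, 6, 4, 5, 6, 4, 5, 5, 12, 13, 14]
flags = tabular_cusum(weekly_counts, baseline_n=8)
print(flags)
```

Standardizing against a baseline period keeps `k` and `h` in units of standard deviations, so the same boundaries can be reused across queries with very different tweet volumes.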
Overview of the queries used in this study.
Three types of queries are distinguished: (1) a taxonomic query using the taxonomic classification, (2) a common name query using the common name of the species, and (3) a symptomatic query that searches for tweets that indicate the presence of the species without mentioning either the taxonomic or common name.
Lag parameters used to fit AR models for the Twitter time series.
Validation of Bogong moth Twitter data against ground truth data collected in surveys.
The Twitter data are represented as time series of the weekly counts, while the survey data are shown as bar charts. The grey shaded area delimits the time period in which tweets couldn’t be reliably captured.
Validation of Koel Twitter data against historical monthly sightings.
The Twitter data are represented as time series of the weekly counts, while the historical data are shown as monthly bars that are replicated for each migration season. The grey shaded area delimits the time period in which tweets couldn’t be reliably captured.
CUSUM charts for the Koel symptomatic queries.
Red dots indicate a deviation of the number of tweets, in particular the upper part of the chart points to an increase in tweets. UDB: Upper Decision Boundary; LDB: Lower Decision Boundary.