On the Evidence for Clustering in the Arrival Directions of AGASA's Ultrahigh Energy Cosmic Rays
Previous analyses of cosmic rays above 40 EeV observed by the AGASA
experiment have suggested that their arrival directions may be clustered.
However, estimates of the chance probability of this clustering signal vary
from 10^{-2} to 10^{-6} and beyond. It is essential that the strength of this
evidence be well understood in order to compare it with anisotropy studies in
other cosmic ray experiments. We apply two methods for extracting a meaningful
significance from this data set: one can scan for the cuts which optimize the
clustering signal, using simulations to determine the appropriate statistical
penalty for the scan. This analysis finds a chance probability of about 0.3%.
Alternatively, one can optimize the cuts with a first set of data, and then
apply them to the remaining data directly without statistical penalty. One can
extend the statistical power of this test by considering cross-correlation
between the initial data and the remaining data, as long as the initial
clustering signal is not included. While the scan is more useful in general, in
the present case only splitting the data set offers an unbiased test of the
clustering hypothesis. Using this test we find that the AGASA data is
consistent at the 8% level with the null hypothesis of isotropically
distributed arrival directions.
Comment: 14 pages, 3 figures. Unbiased test expanded to include
cross-correlation between initial and later data sets for greater statistical
power; minor revisions to discussion. Accepted by Astropart. Phy
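The scan-with-penalty idea in the abstract above can be illustrated with a minimal sketch. It is a toy under stated assumptions, not the authors' analysis: events are drawn isotropically over the whole sky (a real experiment has a declination-dependent exposure), only the angular cut is scanned (no energy threshold), and the function names, event count, and cut values are all hypothetical. The point it shows is that the best single-cut chance probability must be compared against the same "best of the scan" quantity in simulated isotropic data sets.

```python
import numpy as np

def isotropic_directions(n, rng):
    """Draw n unit vectors uniformly on the sphere (toy stand-in for a real exposure map)."""
    v = rng.normal(size=(n, 3))
    return v / np.linalg.norm(v, axis=1, keepdims=True)

def pair_counts(dirs, max_angles_deg):
    """Count event pairs separated by less than each candidate angular cut."""
    cosines = np.clip(dirs @ dirs.T, -1.0, 1.0)
    upper = np.triu_indices(len(dirs), k=1)  # each pair once, no self-pairs
    sep_deg = np.degrees(np.arccos(cosines[upper]))
    return np.array([(sep_deg < a).sum() for a in max_angles_deg])

def scan_penalized_probability(data_counts, sim_counts):
    """Return (best single-cut p, p after the penalty for having scanned the cuts)."""
    def per_cut_p(counts):
        # Fraction of simulated sets with at least as many pairs, for each cut.
        return (sim_counts >= counts).mean(axis=0)

    best_p = per_cut_p(data_counts).min()
    # Repeat the same scan on every simulated set to see how small the
    # best p tends to be under isotropy.
    sim_best_p = np.array([per_cut_p(c).min() for c in sim_counts])
    return best_p, (sim_best_p <= best_p).mean()

rng = np.random.default_rng(0)
angles = [2.5, 5.0, 10.0]  # hypothetical angular cuts in degrees
sims = np.array([pair_counts(isotropic_directions(50, rng), angles)
                 for _ in range(500)])
toy_data = pair_counts(isotropic_directions(50, rng), angles)  # isotropic "data"
best_p, penalized_p = scan_penalized_probability(toy_data, sims)
print(f"best single-cut p = {best_p:.3f}, scan-penalized p = {penalized_p:.3f}")
```

By construction the penalized probability is never smaller than the best single-cut probability, which is why quoting the latter alone overstates the evidence.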
Plant succession on gopher mounds in Western Cascade meadows: consequences for species diversity and heterogeneity
Pocket gophers have the potential to alter the dynamics of grasslands by creating mounds that bury existing vegetation and locally reset succession. Gopher mounds may provide safe sites for less competitive species, potentially increasing both species diversity and vegetation heterogeneity (spatial variation in species composition). We compared species composition, diversity and heterogeneity among gopher mounds of different ages in three montane meadows in the Cascade Range of Oregon. Cover of graminoids and forbs increased with mound age, as did species richness. Contrary to many studies, we found no evidence that mounds provided safe sites for early successional species, despite their abundance in the soil seed bank, or that diversity peaked on intermediate-aged mounds. However, cover of forbs relative to that of graminoids was greater on mounds than in the adjacent meadow. Variation in species composition was also greater within and among mounds than in adjacent patches of undisturbed vegetation, suggesting that these small-scale disturbances increase heterogeneity within meadows.
High-Dimensional Density Ratio Estimation with Extensions to Approximate Likelihood Computation
The ratio between two probability density functions is an important component
of various tasks, including selection bias correction, novelty detection and
classification. Recently, several estimators of this ratio have been proposed.
Most of these methods fail if the sample space is high-dimensional, and hence
require a dimension reduction step, the result of which can be a significant
loss of information. Here we propose a simple-to-implement, fully nonparametric
density ratio estimator that expands the ratio in terms of the eigenfunctions
of a kernel-based operator; these functions reflect the underlying geometry of
the data (e.g., submanifold structure), often leading to better estimates
without an explicit dimension reduction step. We show how our general framework
can be extended to address another important problem, the estimation of a
likelihood function in situations where that function cannot be
well-approximated by an analytical form. One is often faced with this situation
when performing statistical inference with data from the sciences, due to the
complexity of the data and of the processes that generated those data. We
emphasize applications where using existing likelihood-free methods of
inference would be challenging due to the high dimensionality of the sample
space, but where our spectral series method yields a reasonable estimate of the
likelihood function. We provide theoretical guarantees and illustrate the
effectiveness of our proposed method with numerical experiments.
Comment: With supplementary materia
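To make the role of a density ratio concrete, here is a minimal sketch of one task the abstract above names, selection bias correction. It does not implement the spectral series estimator the abstract proposes: the ratio is taken as known in closed form for two hypothetical one-dimensional Gaussians, p = N(0, 1) and q = N(1, 1), purely to show how a ratio q(x)/p(x) reweights samples from p so that they estimate an expectation under q. All names and distributions are assumptions chosen for illustration.

```python
import numpy as np

def density_ratio(x):
    """q(x)/p(x) for p = N(0, 1) and q = N(1, 1); closed form for this toy case only."""
    # exp(-(x - 1)^2 / 2) / exp(-x^2 / 2) simplifies to exp(x - 1/2).
    return np.exp(x - 0.5)

def reweighted_mean(values, weights):
    """Self-normalized importance-weighted average."""
    return np.sum(weights * values) / np.sum(weights)

rng = np.random.default_rng(0)
x_p = rng.normal(loc=0.0, scale=1.0, size=200_000)  # samples from p only

# Weighting by the ratio corrects for having sampled from p instead of q.
estimate = reweighted_mean(x_p, density_ratio(x_p))
print(f"naive mean under p: {x_p.mean():.3f}; "
      f"reweighted estimate of E_q[X]: {estimate:.3f}")
```

In practice the ratio is unknown and has to be estimated from samples of both distributions, which is the problem the abstract addresses for high-dimensional sample spaces.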
Combining local- and large-scale models to predict the distributions of invasive plant species
Habitat-distribution models are increasingly used to predict the potential distributions of invasive species and to inform monitoring. However, these models assume that species are in equilibrium with the environment, which is clearly not true for most invasive species. Although this assumption is frequently acknowledged, solutions have not been adequately addressed. There are several potential methods for improving habitat-distribution models. Models that require only presence data may be more effective for invasive species, but this assumption has rarely been tested. In addition, combining modeling types to form ‘ensemble’ models may improve the accuracy of predictions. However, even with these improvements, models developed for recently invaded areas are greatly influenced by the current distributions of species and thus reflect near- rather than long-term potential for invasion. Larger scale models from species’ native and invaded ranges may better reflect long-term invasion potential, but they lack finer scale resolution. We compared logistic regression (which uses presence/absence data) and two presence-only methods for modeling the potential distributions of three invasive plant species on the Olympic Peninsula in Washington State, USA. We then combined the three methods to create ensemble models. We also developed climate-envelope models for the same species based on larger scale distributions and combined models from multiple scales to create an index of near- and long-term invasion risk to inform monitoring in Olympic National Park (ONP). Neither presence-only nor ensemble models were more accurate than logistic regression for any of the species. Larger scale models predicted much greater areas at risk of invasion. Our index of near- and long-term invasion risk indicates that <4% of ONP is at high near-term risk of invasion while 67-99% of the Park is at moderate or high long-term risk of invasion.
We demonstrate how modeling results can be used to guide the design of monitoring protocols and monitoring results can in turn be used to refine models. We propose that by using models from multiple scales to predict invasion risk and by explicitly linking model development to monitoring, it may be possible to overcome some of the limitations of habitat-distribution models.
Database anonymization services
The progress of technology and the development of powerful databases have made it possible to store and easily access continually increasing amounts of sensitive data about people. Since personal information is becoming common in many different databases, it is vital that this data be hidden to ensure privacy of the individuals whose records are stored in these repositories. Database anonymization is the key to securing these databases by ensuring that database users will be unable to reveal sensitive personal information by intelligently structuring their queries.
We analyzed the structure of the BiomData database which contains images and sound recordings of six biometric modalities acquired from hundreds of volunteers. To ensure the confidentiality of these volunteers, our goal was to prevent queries which would allow database users to obtain images of easily identifiable biometric data (facial images, for example) together with the corresponding images of modalities for which user's anonymity is required (fingerprint images, for example). ERUCES Tricryption® Engine was used to anonymize the links between the six biometric modality tables contained in the database, thereby enhancing privacy of volunteers who participate in the biometric collection study while promoting an open data sharing research environment.