The Bayesian Decision Tree Technique with a Sweeping Strategy
The uncertainty of classification outcomes is of crucial importance for many
safety critical applications including, for example, medical diagnostics. In
such applications the uncertainty of classification can be reliably estimated
within a Bayesian model averaging technique that allows the use of prior
information. Decision Tree (DT) classification models used within such a
technique give experts additional information by making this classification
scheme observable. The use of the Markov Chain Monte Carlo (MCMC) methodology
of stochastic sampling makes the Bayesian DT technique feasible to perform.
However, in practice, the MCMC technique may become stuck in a particular DT
which is far away from a region with a maximal posterior. Sampling such DTs
causes bias in the posterior estimates, and as a result the evaluation of
classification uncertainty may be incorrect. In a particular case, the negative
effect of such sampling may be reduced by giving additional prior information
on the shape of DTs. In this paper we describe a new approach based on sweeping
the DTs without additional priors on the favorite shape of DTs. The
performances of Bayesian DT techniques with the standard and sweeping
strategies are compared on synthetic data as well as on real datasets.
Quantitatively evaluating the uncertainty in terms of entropy of class
posterior probabilities, we found that the sweeping strategy is superior to the
standard strategy.
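The entropy measure mentioned in the abstract above can be illustrated with a minimal sketch. It assumes that each sampled DT yields a class-probability vector for a given input and that the vectors are averaged before the entropy is taken; the function name and the toy inputs are hypothetical, and the paper's exact procedure is not given in this abstract.

```python
import math

def predictive_entropy(tree_probs):
    """Average class probabilities over sampled trees and return (average, entropy in bits).

    tree_probs: list of class-probability vectors, one per sampled tree,
    all for the same input (an assumed data layout, not taken from the paper).
    """
    n_trees = len(tree_probs)
    n_classes = len(tree_probs[0])
    # Model averaging: mean probability of each class over the sampled trees.
    avg = [sum(p[c] for p in tree_probs) / n_trees for c in range(n_classes)]
    # Shannon entropy in bits; terms with zero probability contribute nothing.
    entropy = -sum(q * math.log2(q) for q in avg if q > 0)
    return avg, entropy

# Hypothetical example: two sampled trees that disagree completely.
avg, h = predictive_entropy([[1.0, 0.0], [0.0, 1.0]])
print(avg, h)
```

Low entropy indicates that the sampled trees agree on the class; entropy near its maximum indicates an uncertain classification.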
Unsupervised Feature Selection with Adaptive Structure Learning
The problem of feature selection has attracted considerable interest in the
past decade. Traditional unsupervised methods select the features which can
faithfully preserve the intrinsic structures of data, where the intrinsic
structures are estimated using all the input features of data. However, the
estimated intrinsic structures are unreliable/inaccurate when the redundant and
noisy features are not removed. Therefore, we face a dilemma here: one needs the
true structures of data to identify the informative features, and one needs the
informative features to accurately estimate the true structures of data. To
address this, we propose a unified learning framework which performs structure
learning and feature selection simultaneously. The structures are adaptively
learned from the results of feature selection, and the informative features are
reselected to preserve the refined structures of data. By leveraging the
interactions between these two essential tasks, we are able to capture accurate
structures and select more informative features. Experimental results on many
benchmark data sets demonstrate that the proposed method outperforms many
state-of-the-art unsupervised feature selection methods.
Recipes for sparse LDA of horizontal data
Many important modern applications require analyzing data with more variables than observations, called horizontal data for short. In such a situation the classical Fisher’s linear discriminant analysis (LDA) has no solution because the within-group scatter matrix is singular. Moreover, the number of variables is usually huge, and the classical solutions (discriminant functions) are difficult to interpret because they involve all available variables. The current aim is to develop fast and reliable algorithms for sparse LDA of horizontal data. The resulting discriminant functions depend on very few original variables, which facilitates their interpretation. The main theoretical and numerical challenge is how to cope with the singularity of the within-group scatter matrix. This work aims at classifying the existing approaches according to the way they tackle this singularity issue, and at suggesting new ones.
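The singularity referred to above follows from a rank count. A short sketch, assuming the usual definition of the within-group scatter matrix; the symbols (n observations, p variables, g groups G_k with means \bar{x}_k) are notation chosen here, not taken from the source:

```latex
S_W = \sum_{k=1}^{g} \sum_{i \in G_k} (x_i - \bar{x}_k)(x_i - \bar{x}_k)^{\top},
\qquad
\operatorname{rank}(S_W) \le n - g < p \quad \text{whenever } p > n .
```

Within each group the centered vectors sum to zero, so a group of size n_k contributes at most n_k - 1 linearly independent directions. The p-by-p matrix S_W therefore cannot have full rank when there are more variables than observations.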
Standard survey methods for estimating colony losses and explanatory risk factors in Apis mellifera
This chapter addresses survey methodology and questionnaire design for the collection of data pertaining to estimation of honey bee colony loss rates and identification of risk factors for colony loss. Sources of error in surveys are described. Advantages and disadvantages of different random and non-random sampling strategies and different modes of data collection are presented to enable the researcher to make an informed choice. We discuss survey and questionnaire methodology in some detail, for the purpose of raising awareness of issues to be considered during the survey design stage in order to minimise error and bias in the results. Aspects of survey design are illustrated using surveys in Scotland. Part of a standardized questionnaire is given as a further example, developed by the COLOSS working group for Monitoring and Diagnosis. Approaches to data analysis are described, focussing on estimation of loss rates. Dutch monitoring data from 2012 were used for an example of a statistical analysis with the public domain R software. We demonstrate the estimation of the overall proportion of losses and corresponding confidence interval using a quasi-binomial model to account for extra-binomial variation. We also illustrate generalized linear model fitting when incorporating a single risk factor, and derivation of relevant confidence intervals.
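The quasi-binomial estimate of an overall loss rate described above can be sketched as follows. The chapter itself uses R; this Python version is only an illustration of an intercept-only quasi-binomial fit, with the dispersion estimated from the Pearson statistic and a Wald interval built on the logit scale. The function name, the normal quantile 1.96 and the per-beekeeper counts are assumptions made here, not values from the chapter.

```python
import math

def quasi_binomial_loss_rate(lost, total, z=1.96):
    """Overall loss proportion with a dispersion-adjusted Wald interval.

    lost[i] and total[i] are colonies lost and colonies kept by beekeeper i.
    Returns (proportion, dispersion, (lower, upper)).
    """
    n_total = sum(total)
    p = sum(lost) / n_total  # intercept-only fit: pooled proportion
    # Pearson chi-square statistic relative to binomial variance.
    chi2 = sum((y - n * p) ** 2 / (n * p * (1 - p)) for y, n in zip(lost, total))
    phi = chi2 / (len(total) - 1)  # dispersion: chi-square over residual df
    # Standard error of the logit, inflated by the dispersion estimate.
    se_logit = math.sqrt(phi / (n_total * p * (1 - p)))
    logit = math.log(p / (1 - p))
    lower = 1 / (1 + math.exp(-(logit - z * se_logit)))
    upper = 1 / (1 + math.exp(-(logit + z * se_logit)))
    return p, phi, (lower, upper)

# Hypothetical counts for five beekeepers (not the Dutch monitoring data).
p, phi, ci = quasi_binomial_loss_rate([2, 0, 5, 1, 10], [10, 8, 12, 20, 15])
print(p, phi, ci)
```

A dispersion estimate above 1 signals extra-binomial variation and widens the interval relative to a plain binomial model.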
High connectivity among locally adapted populations of a marine fish (Menidia menidia)
Author Posting. © Ecological Society of America, 2010. This article is posted here by permission of Ecological Society of America for personal use, not for redistribution. The definitive version was published in Ecology 91 (2010): 3526–3537, doi:10.1890/09-0548.1. Patterns of connectivity are important in understanding the geographic scale of local adaptation in marine populations. While natural selection can lead to local adaptation, high connectivity can diminish the potential for such adaptation to occur. Connectivity, defined as the exchange of individuals among subpopulations, is presumed to be significant in most marine species due to life histories that include widely dispersive stages. However, evidence of local adaptation in marine species, such as the Atlantic silverside, Menidia menidia, raises questions concerning the degree of connectivity. We examined geochemical signatures in the otoliths, or ear bones, of adult Atlantic silversides collected in 11 locations along the northeastern coast of the United States from New Jersey to Maine in 2004 and eight locations in 2005 using laser ablation inductively coupled plasma mass spectrometry (ICP-MS) and isotope ratio monitoring mass spectrometry (irm-MS). These signatures were then compared to baseline signatures of juvenile fish of known origin to determine natal origin of these adult fish. We then estimated migration distances and the degree of mixing from these data. In both years, fish generally had the highest probability of originating from the same location in which they were captured (0.01–0.80), but evidence of mixing throughout the sample area was present. Furthermore, adult M. menidia exhibit highly dispersive behavior with some fish migrating over 700 km. The probability of adult fish returning to natal areas differed between years, with the probability being, on average, 0.2 higher in the second year. These findings demonstrate that marine species with largely open populations are capable of local adaptation despite apparently high gene flow. This work was funded by the National Science Foundation (grant OCE-0425830 to D. O. Conover and grant OCE-0134998 to S. R. Thorrold) and the New York State Department of Environmental Conservation.
NMJ-morph reveals principal components of synaptic morphology influencing structure–function relationships at the neuromuscular junction
The ability to form synapses is one of the fundamental properties required by the mammalian nervous system to generate network connectivity. Structural and functional diversity among synaptic populations is a key hallmark of network diversity, and yet we know comparatively little about the morphological principles that govern variability in the size, shape and strength of synapses. Using the mouse neuromuscular junction (NMJ) as an experimentally accessible model synapse, we report on the development of a robust, standardized methodology to facilitate comparative morphometric analysis of synapses (‘NMJ-morph’). We used NMJ-morph to generate baseline morphological reference data for 21 separate pre- and post-synaptic variables from 2160 individual NMJs belonging to nine anatomically distinct populations of synapses, revealing systematic differences in NMJ morphology between defined synaptic populations. Principal components analysis revealed that overall NMJ size and the degree of synaptic fragmentation, alongside pre-synaptic axon diameter, were the most critical parameters in defining synaptic morphology. ‘Average’ synaptic morphology was remarkably conserved between comparable synapses from the left and right sides of the body. Systematic differences in synaptic morphology predicted corresponding differences in synaptic function that were supported by physiological recordings, confirming the robust relationship between synaptic size and strength.