Mixtures of Spatial Spline Regressions
We present an extension of the functional data analysis framework for
univariate functions to the analysis of surfaces: functions of two variables.
The spatial spline regression (SSR) approach developed can be used to model
surfaces that are sampled over a rectangular domain. Furthermore, combining SSR
with linear mixed effects models (LMM) allows for the analysis of populations
of surfaces, and combining the joint SSR-LMM method with finite mixture models
allows for the analysis of populations of surfaces with sub-family structures.
Through the mixtures of spatial spline regressions (MSSR) approach developed,
we present methodologies for clustering surfaces into sub-families, and for
performing surface-based discriminant analysis. The effectiveness of our
methodologies, as well as the modeling capabilities of the SSR model, are
assessed through an application to handwritten character recognition.
Inferring the Latent Incidence of Inefficiency from DEA Estimates and Bayesian Priors
Data envelopment analysis (DEA) is among the most popular empirical tools for measuring cost and productive efficiency. Because DEA is a linear programming technique, establishing formal statistical properties for outcomes is difficult. We show that the incidence of inefficiency within a population of Decision Making Units (DMUs) is a latent variable, with DEA outcomes providing only noisy sample-based categorizations of inefficiency. We then use a Bayesian approach to infer an appropriate posterior distribution for the incidence of inefficient DMUs based on a random sample of DEA outcomes and a prior distribution on the incidence of inefficiency. The methodology applies to both finite and infinite populations, and to sampling DMUs with and without replacement, and accounts for the noise in the DEA characterization of inefficiency within a coherent Bayesian approach to the problem. The result is an appropriately up-scaled, noise-adjusted inference regarding the incidence of inefficiency in a population of DMUs. Keywords: Data Envelopment Analysis, latent inefficiency, Bayesian inference, Beta priors, posterior incidence of inefficiency
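As a rough illustration of the Beta-prior idea named in the keywords, the sketch below performs a plain conjugate Beta-binomial update of the incidence of inefficiency. It is a simplification under stated assumptions, not the paper's method: it treats each DEA classification as noise-free and assumes an infinite population (or sampling with replacement), whereas the abstract describes additional adjustments for classification noise and finite populations. The names `BetaDist` and `update_incidence` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BetaDist:
    """Beta(alpha, beta) distribution over the incidence of inefficiency."""
    alpha: float
    beta: float

    def mean(self) -> float:
        return self.alpha / (self.alpha + self.beta)

    def variance(self) -> float:
        total = self.alpha + self.beta
        return (self.alpha * self.beta) / (total * total * (total + 1.0))


def update_incidence(prior: BetaDist, n_sampled: int, n_flagged: int) -> BetaDist:
    """Conjugate update: n_flagged of n_sampled DMUs were classed as inefficient.

    Assumption (not from the abstract): classifications are error-free and
    the sampled DMUs are independent draws from an infinite population.
    """
    if not 0 <= n_flagged <= n_sampled:
        raise ValueError("n_flagged must lie between 0 and n_sampled")
    return BetaDist(prior.alpha + n_flagged,
                    prior.beta + (n_sampled - n_flagged))


# Uniform Beta(1, 1) prior; 6 of 20 sampled DMUs flagged as inefficient.
posterior = update_incidence(BetaDist(1.0, 1.0), n_sampled=20, n_flagged=6)
print(posterior, posterior.mean())
```

With a uniform prior the posterior is Beta(7, 15), so the posterior mean incidence is 7/22. A noise-adjusted version would replace the binomial likelihood with one that accounts for misclassification rates, which this sketch does not attempt.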
Mapping cognitive ontologies to and from the brain
Imaging neuroscience links brain activation maps to behavior and cognition
via correlational studies. Due to the nature of the individual experiments,
based on eliciting neural response from a small number of stimuli, this link is
incomplete, and unidirectional from the causal point of view. To come to
conclusions on the function implied by the activation of brain regions, it is
necessary to combine a wide exploration of the various brain functions and some
inversion of the statistical inference. Here we introduce a methodology for
accumulating knowledge towards a bidirectional link between observed brain
activity and the corresponding function. We rely on a large corpus of imaging
studies and a predictive engine. Technically, the challenges are to find
commonality between the studies without denaturing the richness of the corpus.
The key elements that we contribute are labeling the tasks performed with a
cognitive ontology, and modeling the long tail of rare paradigms in the corpus.
To our knowledge, our approach is the first demonstration of predicting the
cognitive content of completely new brain images. To that end, we propose a
method that predicts the experimental paradigms across different studies. Comment: NIPS (Neural Information Processing Systems), United States (2013)
Machine Learning in Wireless Sensor Networks: Algorithms, Strategies, and Applications
Wireless sensor networks monitor dynamic environments that change rapidly
over time. This dynamic behavior is either caused by external factors or
initiated by the system designers themselves. To adapt to such conditions,
sensor networks often adopt machine learning techniques to eliminate the need
for unnecessary redesign. Machine learning also inspires many practical
solutions that maximize resource utilization and prolong the lifespan of the
network. In this paper, we present an extensive literature review over the
period 2002-2013 of machine learning methods that were used to address common
issues in wireless sensor networks (WSNs). The advantages and disadvantages of
each proposed algorithm are evaluated against the corresponding problem. We
also provide a comparative guide to aid WSN designers in developing suitable
machine learning solutions for their specific application challenges. Comment: Accepted for publication in IEEE Communications Surveys and Tutorials
Multivariate regression analysis of atmospheric density in the region 30 to 110 km
Multivariate regression analysis of atmospheric density in region 30 to 100 km
Block designs for experiments with non-normal response
Many experiments measure a response that cannot be adequately described by a linear model with normally distributed errors and are often run in blocks of homogeneous experimental units. We develop the first methods of obtaining efficient block designs for experiments with an exponential family response described by a marginal model fitted via Generalized Estimating Equations. This methodology is appropriate when the blocking factor is a nuisance variable as, for example, occurs in industrial experiments. A D-optimality criterion is developed for finding designs robust to the values of the marginal model parameters and applied using three strategies: unrestricted algorithmic search, use of minimum-support designs, and blocking of an optimal design for the corresponding Generalized Linear Model. Designs obtained from each strategy are critically compared and shown to be much more efficient than designs that ignore the blocking structure. The designs are compared for a range of values of the intra-block working correlation and for exchangeable, autoregressive and nearest neighbor structures. An analysis strategy is developed for a binomial response that allows estimation from experiments with sparse data, and its effectiveness demonstrated. The design strategies are motivated and demonstrated through the planning of an experiment from the aeronautics industry.
A rule-of-thumb for the variable bandwidth selection in kernel hazard rate estimation
In nonparametric curve estimation, the choice of the type of smoothing parameter is critical for practical performance. The nearest neighbor bandwidth, as introduced by Gefeller and Dette (1992) for censored data in survival analysis, is specified by one parameter, namely the number of nearest neighbors. Bandwidth selection in this setting is rarely investigated, although it is not closely linked to the frequently studied fixed bandwidth. We introduce a selection algorithm in the hazard rate estimation context. The approach uses a newly developed link to the fixed bandwidth, which identifies the variable bandwidth as an additional smoothing step. The procedure gains further data adaptation after fixed bandwidth smoothing. Assessment by a Monte Carlo simulation and a clinical example demonstrates the practical relevance of the findings.
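To make the single-parameter nature of the nearest neighbor bandwidth concrete, here is a minimal sketch of a generic kernel hazard estimator for right-censored data in which the bandwidth at a point is the distance to its k-th nearest observation. This is an assumed textbook-style form (kernel-smoothed Nelson-Aalen increments with an Epanechnikov kernel), not the rule-of-thumb selector proposed in the abstract, and the function names are hypothetical.

```python
def epanechnikov(u: float) -> float:
    """Epanechnikov kernel, supported on [-1, 1]."""
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0


def nn_hazard(t: float, times, events, k: int) -> float:
    """Kernel hazard estimate at t with a k-nearest-neighbor bandwidth.

    times  : observed times (event or censoring)
    events : 1 if the time is an observed event, 0 if censored
    k      : number of nearest neighbors (the only smoothing parameter)

    Assumption: ties among observed times are handled only approximately.
    """
    n = len(times)
    if n != len(events):
        raise ValueError("times and events must have the same length")
    if not 1 <= k <= n:
        raise ValueError("k must lie between 1 and the sample size")

    pairs = sorted(zip(times, events))
    # Variable bandwidth: distance from t to its k-th nearest observation.
    r_k = sorted(abs(t - x) for x, _ in pairs)[k - 1]
    r_k = max(r_k, 1e-12)  # guard against a zero bandwidth

    total = 0.0
    for i, (x, d) in enumerate(pairs):
        # d / (n - i) is the Nelson-Aalen increment at the i-th ordered time
        # (0-based index, so n - i subjects are still at risk).
        total += d / (n - i) * epanechnikov((t - x) / r_k)
    return total / r_k


# Five uncensored observations; hazard estimate at t = 3 with k = 3.
print(nn_hazard(3.0, [1, 2, 3, 4, 5], [1, 1, 1, 1, 1], k=3))
```

Larger values of `k` widen the local bandwidth and smooth the estimate more; choosing `k` from data is the selection problem the abstract addresses.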