Meta learning of bounds on the Bayes classifier error
Meta learning uses information from base learners (e.g. classifiers or
estimators) as well as information about the learning problem to improve upon
the performance of a single base learner. For example, the Bayes error rate of
a given feature space, if known, can be used to aid in choosing a classifier,
as well as in feature selection and model selection for the base classifiers
and the meta classifier. Recent work in the field of f-divergence functional
estimation has led to the development of simple and rapidly converging
estimators that can be used to estimate various bounds on the Bayes error. We
estimate multiple bounds on the Bayes error using an estimator that applies
meta learning to slowly converging plug-in estimators to obtain the parametric
convergence rate. We compare the estimated bounds empirically on simulated data
and then estimate the tighter bounds on features extracted from an image patch
analysis of sunspot continuum and magnetogram images.
Comment: 6 pages, 3 figures, to appear in proceedings of 2015 IEEE Signal
Processing and SP Education Workshop
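To make the idea of bounding the Bayes error through a divergence concrete, here is a minimal sketch. It uses the classical Bhattacharyya-coefficient bounds as one illustrative choice; the abstract does not say which bounds the authors estimate, and the equal-variance Gaussian setting and all function names are assumptions made for this example only.

```python
import math


def bhattacharyya_bounds(bc, p1=0.5):
    """Lower and upper bounds on the Bayes error from the Bhattacharyya
    coefficient `bc` of the two class densities and the class prior `p1`."""
    p2 = 1.0 - p1
    upper = math.sqrt(p1 * p2) * bc
    lower = 0.5 - 0.5 * math.sqrt(max(0.0, 1.0 - 4.0 * p1 * p2 * bc ** 2))
    return lower, upper


def gaussian_bc(mu1, mu2, sigma):
    """Closed-form Bhattacharyya coefficient of two univariate Gaussians
    with a shared standard deviation (illustrative assumption)."""
    return math.exp(-((mu1 - mu2) ** 2) / (8.0 * sigma ** 2))


def exact_bayes_error(mu1, mu2, sigma):
    """Exact Bayes error for the same two Gaussians with equal priors:
    Phi(-|mu1 - mu2| / (2 sigma)), written with the complementary error function."""
    d = abs(mu1 - mu2) / sigma
    return 0.5 * math.erfc(d / (2.0 * math.sqrt(2.0)))


if __name__ == "__main__":
    low, high = bhattacharyya_bounds(gaussian_bc(0.0, 2.0, 1.0))
    print("bounds:", low, high)
    print("exact :", exact_bayes_error(0.0, 2.0, 1.0))
```

In practice the coefficient would be estimated from samples rather than computed in closed form; the closed form is used here only so that the bounds can be checked against a known Bayes error.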
Direct Estimation of Information Divergence Using Nearest Neighbor Ratios
We propose a direct estimation method for Rényi and f-divergence measures
based on a new graph theoretical interpretation. Suppose that we are given two
sample sets $X$ and $Y$, respectively with $N$ and $M$ samples, where
$\eta := M/N$ is a constant value. Considering the $k$-nearest neighbor ($k$-NN)
graph of $Y$ in the joint data set $(X, Y)$, we show that the average powered
ratio of the number of $X$ points to the number of $Y$ points among all $k$-NN
points is proportional to Rényi divergence of $X$ and $Y$ densities. A
similar method can also be used to estimate f-divergence measures. We derive
bias and variance rates, and show that for the class of $\gamma$-Hölder
smooth functions, the estimator achieves the MSE rate of
$O(N^{-2\gamma/(\gamma+d)})$. Furthermore, by using a weighted ensemble
estimation technique, for density functions with continuous and bounded
derivatives of up to the order $d$, and some extra conditions at the support
set boundary, we derive an ensemble estimator that achieves the parametric MSE
rate of $O(1/N)$. Our estimators are more computationally tractable than other
competing estimators, which makes them appealing in many practical
applications.
Comment: 2017 IEEE International Symposium on Information Theory (ISIT)
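The nearest-neighbor ratio idea in this abstract can be sketched roughly as follows. This is a simplified plug-in illustration, not the authors' exact estimator: for each point of the second sample it counts how many of its k nearest neighbors in the pooled data come from each sample, turns the count ratio into a density-ratio estimate, and averages its power. The function name, the brute-force distance computation, the `+ 1` in the denominator (used here to avoid division by zero), and the default `k` and `alpha` are all assumptions.

```python
import numpy as np


def renyi_nn_ratio(x, y, k=20, alpha=0.5):
    """Rough k-NN ratio estimate of the Renyi-alpha divergence between the
    densities of samples `x` (n, d) and `y` (m, d). Illustrative sketch only."""
    n, m = len(x), len(y)
    eta = m / n  # sample-size ratio
    z = np.vstack([x, y])  # pooled data: first n rows are x, last m rows are y
    # Squared Euclidean distance from every y point to every pooled point.
    d2 = ((y[:, None, :] - z[None, :, :]) ** 2).sum(axis=2)
    # A y point must not count itself as a neighbor.
    d2[np.arange(m), n + np.arange(m)] = np.inf
    # Indices of the k nearest pooled points for each y point (unordered).
    nn = np.argpartition(d2, k, axis=1)[:, :k]
    n_i = (nn < n).sum(axis=1)  # neighbors drawn from x
    m_i = k - n_i               # neighbors drawn from y
    ratio = eta * n_i / (m_i + 1.0)  # local density-ratio estimate
    j_alpha = np.mean(ratio ** alpha)
    return float(np.log(j_alpha) / (alpha - 1.0))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    a = rng.normal(size=(500, 2))
    b = rng.normal(size=(500, 2))
    print("same distribution   :", renyi_nn_ratio(a, b))
    print("shifted distribution:", renyi_nn_ratio(a, b + np.array([2.0, 0.0])))
```

The brute-force distance matrix keeps the sketch dependency-free; a spatial index would be the natural choice for larger samples. If the two samples barely overlap, the averaged ratio can reach zero and the estimate diverges, which this sketch does not guard against.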
Information Theoretic Structure Learning with Confidence
Information theoretic measures (e.g. the Kullback-Leibler divergence and
Shannon mutual information) have been used for exploring possibly nonlinear
multivariate dependencies in high dimension. If these dependencies are assumed
to follow a Markov factor graph model, this exploration process is called
structure discovery. For discrete-valued samples, estimates of the information
divergence over the parametric class of multinomial models lead to structure
discovery methods whose mean squared error achieves parametric convergence
rates as the sample size grows. However, a naive application of this method to
continuous nonparametric multivariate models converges much more slowly. In
this paper we introduce a new method for nonparametric structure discovery that
uses weighted ensemble divergence estimators that achieve parametric
convergence rates and obey an asymptotic central limit theorem that facilitates
hypothesis testing and other types of statistical validation.
Comment: 10 pages, 3 figures
The intrinsic value of HFO features as a biomarker of epileptic activity
High frequency oscillations (HFOs) are a promising biomarker of epileptic
brain tissue and activity. HFOs additionally serve as a prototypical example of
challenges in the analysis of discrete events in high-temporal resolution,
intracranial EEG data. Two primary challenges are 1) dimensionality reduction,
and 2) assessing feasibility of classification. Dimensionality reduction
assumes that the data lie on a manifold with dimension less than that of the
feature space. However, previous HFO analyses have assumed a linear manifold,
global across time, space (i.e. recording electrode/channel), and individual
patients. Instead, we assess both a) whether linear methods are appropriate and
b) the consistency of the manifold across time, space, and patients. We also
estimate bounds on the Bayes classification error to quantify the distinction
between two classes of HFOs (those occurring during seizures and those
occurring due to other processes). This analysis provides the foundation for
future clinical use of HFO features and guides the analysis for other discrete
events, such as individual action potentials or multi-unit activity.
Comment: 5 pages, 5 figures
Image patch analysis of sunspots and active regions. II. Clustering via matrix factorization
Separating active regions that are quiet from potentially eruptive ones is a
key issue in Space Weather applications. Traditional classification schemes
such as Mount Wilson and McIntosh have been effective in relating an active
region large scale magnetic configuration to its ability to produce eruptive
events. However, their qualitative nature prevents systematic studies of, for
example, an active region's evolution. We introduce a new clustering of active
regions that is based on the local geometry observed in Line of Sight
magnetogram and continuum images. We use a reduced-dimension representation of
an active region that is obtained by factoring the corresponding data matrix
comprised of local image patches. Two factorizations can be compared via the
definition of appropriate metrics on the resulting factors. The distances
obtained from these metrics are then used to cluster the active regions. We
find that these metrics result in natural clusterings of active regions. The
clusterings are related to large scale descriptors of an active region such as
its size, its local magnetic field distribution, and its complexity as measured
by the Mount Wilson classification scheme. We also find that including data
focused on the neutral line of an active region can result in an increased
correspondence between our clustering results and other active region
descriptors such as the Mount Wilson classifications and the $R$ value. We
provide some recommendations for which metrics, matrix factorization
techniques, and regions of interest to use to study active regions.
Comment: Accepted for publication in the Journal of Space Weather and Space
Climate (SWSC). 33 pages, 12 figures
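The pipeline described in this abstract (build a data matrix from local image patches, factor it, then compare two factorizations with a metric) can be sketched as below. This is a minimal illustration under assumptions: a truncated SVD stands in for the factorization, and a principal-angle distance between the spanned subspaces stands in for the metric. The abstract names neither choice, and the function names, patch size, and rank are hypothetical.

```python
import numpy as np


def patch_matrix(image, size=3):
    """Stack every overlapping size-by-size patch of a 2-D image as a column."""
    h, w = image.shape
    cols = [
        image[i:i + size, j:j + size].ravel()
        for i in range(h - size + 1)
        for j in range(w - size + 1)
    ]
    return np.array(cols).T  # shape: (size * size, number of patches)


def svd_dictionary(data, rank=2):
    """Leading `rank` left singular vectors: an orthonormal patch dictionary."""
    u, _, _ = np.linalg.svd(data, full_matrices=False)
    return u[:, :rank]


def subspace_distance(a, b):
    """Distance between the column spans of two orthonormal dictionaries,
    computed as the Euclidean norm of their principal angles."""
    cosines = np.linalg.svd(a.T @ b, compute_uv=False)
    angles = np.arccos(np.clip(cosines, -1.0, 1.0))
    return float(np.linalg.norm(angles))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    img_a = rng.normal(size=(32, 32))  # stand-ins for two active-region images
    img_b = rng.normal(size=(32, 32))
    dict_a = svd_dictionary(patch_matrix(img_a))
    dict_b = svd_dictionary(patch_matrix(img_b))
    print("distance:", subspace_distance(dict_a, dict_b))
```

A matrix of such pairwise distances over many regions could then be passed to any distance-based clustering routine.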
Actin Cytoskeleton and Golgi Involvement in Barley stripe mosaic virus Movement and Cell Wall Localization of Triple Gene Block Proteins.
Barley stripe mosaic virus (BSMV) induces massive actin filament thickening at the infection front of infected Nicotiana benthamiana leaves. To determine the mechanisms leading to actin remodeling, fluorescent protein fusions of the BSMV triple gene block (TGB) proteins were coexpressed in cells with the actin marker DsRed:Talin. TGB ectopic expression experiments revealed that TGB3 is a major elicitor of filament thickening, that TGB2 resulted in formation of intermediate DsRed:Talin filaments, and that TGB1 alone had no obvious effects on actin filament structure. Latrunculin B (LatB) treatments retarded BSMV cell-to-cell movement, disrupted actin filament organization, and dramatically decreased the proportion of paired TGB3 foci appearing at the cell wall (CW). BSMV infection of transgenic plants tagged with GFP-KDEL exhibited membrane proliferation and vesicle formation that were especially evident around the nucleus. Similar membrane proliferation occurred in plants expressing TGB2 and/or TGB3, and DsRed:Talin fluorescence in these plants colocalized with the ER vesicles. TGB3 also associated with the Golgi apparatus and overlapped with cortical vesicles appearing at the cell periphery. Brefeldin A treatments disrupted Golgi and also altered vesicles at the CW, but failed to interfere with TGB CW localization. Our results indicate that actin cytoskeleton interactions are important in BSMV cell-to-cell movement and for CW localization of TGB3.
Image patch analysis of sunspots and active regions. I. Intrinsic dimension and correlation analysis
The flare-productivity of an active region is observed to be related to its
spatial complexity. Mount Wilson or McIntosh sunspot classifications measure
such complexity but in a categorical way, and may therefore not use all the
information present in the observations. Moreover, such categorical schemes
hinder a systematic study of, for example, an active region's evolution. We
propose fine-scale quantitative descriptors for an active region's complexity
and relate them to the Mount Wilson classification. We analyze the local
correlation structure within continuum and magnetogram data, as well as the
cross-correlation between continuum and magnetogram data. We compute the
intrinsic dimension, partial correlation, and canonical correlation analysis
(CCA) of image patches of continuum and magnetogram active region images taken
from the SOHO-MDI instrument. We use masks of sunspots derived from continuum
as well as larger masks of magnetic active regions derived from the magnetogram
to analyze separately the core part of an active region from its surrounding
part. We find a relationship between complexity of an active region as
measured by Mount Wilson and the intrinsic dimension of its image patches.
Partial correlation patterns exhibit approximately a third-order Markov
structure. CCA reveals different patterns of correlation between continuum and
magnetogram within the sunspots and in the region surrounding the sunspots.
These results also pave the way for patch-based dictionary learning with a view
towards automatic clustering of active regions.
Comment: Accepted for publication in the Journal of Space Weather and Space
Climate (SWSC). 23 pages, 11 figures
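As an illustration of the canonical correlation analysis mentioned in this abstract, the sketch below computes canonical correlations between two sets of paired feature vectors, for example flattened continuum patches and the co-located magnetogram patches. It is a generic textbook formulation (whiten each block, then take singular values of the whitened cross-covariance), not the authors' code; the function name and the small ridge term `reg` are assumptions.

```python
import numpy as np


def canonical_correlations(x, y, reg=1e-8):
    """Canonical correlations between row-paired samples x (n, p) and y (n, q)."""
    x = x - x.mean(axis=0)
    y = y - y.mean(axis=0)
    n = len(x)
    # Sample covariances, with a tiny ridge so the Cholesky factors exist.
    sxx = x.T @ x / (n - 1) + reg * np.eye(x.shape[1])
    syy = y.T @ y / (n - 1) + reg * np.eye(y.shape[1])
    sxy = x.T @ y / (n - 1)
    lx = np.linalg.cholesky(sxx)  # sxx = lx @ lx.T
    ly = np.linalg.cholesky(syy)  # syy = ly @ ly.T
    # Whitened cross-covariance: inv(lx) @ sxy @ inv(ly).T
    whitened = np.linalg.solve(lx, sxy) @ np.linalg.inv(ly).T
    # Its singular values are the canonical correlations, largest first.
    return np.linalg.svd(whitened, compute_uv=False)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    cont = rng.normal(size=(2000, 3))   # stand-in for continuum patch features
    magn = rng.normal(size=(2000, 3))   # stand-in for magnetogram patch features
    magn[:, 0] = 2.0 * cont[:, 0] + 0.01 * rng.normal(size=2000)
    print(canonical_correlations(cont, magn))
```

Comparing the leading correlations computed inside a sunspot mask with those computed in the surrounding region is one way such an analysis could expose different correlation patterns.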
Capabilities for transdisciplinary research
Problems framed as societal challenges have provided fresh impetus for transdisciplinary research. In response, funders have started programmes aimed at increasing transdisciplinary research capacity. However, current programme evaluations do not adequately measure the skills and characteristics of individuals and collectives doing this research. Addressing this gap, we propose a systematic framework for evaluating transdisciplinary research based on the Capability Approach, a set of concepts designed to assess practices, institutions, and people based on public values. The framework is operationalized through a mixed-method procedure which evaluates capabilities as they are valued and experienced by researchers themselves. The procedure is tested on a portfolio of ‘pump-priming’ research projects in the UK. We find these projects are sites of capability development in three ways: through convening cognitive capabilities required for academic practice; cultivating informal tacit capabilities; and maintaining often unacknowledged backstage capabilities over durations that extend beyond the lifetime of individual projects. Directing greater attention to these different modes of capability development in transdisciplinary research programmes may be useful formatively in identifying areas for ongoing project support, and also in steering research system capacity towards societal needs.