Collaborative Layer-wise Discriminative Learning in Deep Neural Networks
Intermediate features at different layers of a deep neural network are known
to be discriminative for visual patterns of different complexities. However,
most existing works ignore such cross-layer heterogeneities when classifying
samples of different complexities. For example, if a training sample has
already been correctly classified at a specific layer with high confidence, we
argue that it is unnecessary to force the remaining layers to classify this
sample correctly; a better strategy is to encourage those layers to focus on
other samples.
In this paper, we propose a layer-wise discriminative learning method to
enhance the discriminative capability of a deep network by allowing its layers
to work collaboratively for classification. To this end, we introduce
multiple classifiers on top of multiple layers. Each classifier not only tries
to correctly classify the features from its input layer, but also coordinates
with other classifiers to jointly maximize the final classification
performance. Guided by the other companion classifiers, each classifier learns
to concentrate on certain training examples and boosts the overall performance.
Allowing for end-to-end training, our method can be conveniently embedded into
state-of-the-art deep networks. Experiments with multiple popular deep
networks, including Network in Network, GoogLeNet and VGGNet, on object
classification benchmarks of various scales, including CIFAR100, MNIST and
ImageNet, and scene classification benchmarks, including MIT67, SUN397 and
Places205, demonstrate the effectiveness of our method. In addition, we analyze
the relationship between the proposed method and classical conditional random
field models.
Comment: To appear in ECCV 2016. May be subject to minor changes before the
camera-ready version
Ecological Release and Venom Evolution of a Predatory Marine Snail at Easter Island
BACKGROUND: Ecological release is coupled with adaptive radiation and ecological diversification, yet little is known about the molecular basis of phenotypic changes associated with this phenomenon. The venomous, predatory marine gastropod Conus miliaris has undergone ecological release and exhibits increased dietary breadth at Easter Island. METHODOLOGY/PRINCIPAL FINDINGS: We examined the extent of genetic differentiation of two genes expressed in the venom of C. miliaris among samples from Easter Island, American Samoa and Guam. The population from Easter Island exhibits unique frequencies of alleles that encode distinct peptides at both loci. Levels of divergence at these loci exceed the levels of divergence observed at a mitochondrial gene region at Easter Island. CONCLUSIONS/SIGNIFICANCE: Patterns of genetic variation at two genes expressed in the venom of C. miliaris suggest that selection has operated at these genes and contributed to the divergence of venom composition at Easter Island. These results show that ecological release is associated with strong selection pressures that promote the evolution of new phenotypes
Functional Diversity and Structural Disorder in the Human Ubiquitination Pathway
The ubiquitin-proteasome system plays a central role in cellular regulation and protein quality control (PQC). The system is built as a pyramid of increasing complexity, with two E1 (ubiquitin activating), a few dozen E2 (ubiquitin conjugating) and several hundred E3 (ubiquitin ligase) enzymes. By collecting and analyzing E3 sequences from the KEGG BRITE database and literature, we assembled a coherent dataset of 563 human E3s and analyzed their various physical features. We found an increase in structural disorder of the system with multiple disorder predictors (IUPred - E1: 5.97%, E2: 17.74%, E3: 20.03%). E3s that can bind E2 and substrate simultaneously (single subunit E3, ssE3) have significantly higher disorder (22.98%) than E3s in which E2 binding (multi RING-finger, mRF, 0.62%), scaffolding (6.01%) and substrate binding (adaptor/substrate recognition subunits, 17.33%) functions are separated. In ssE3s, the disorder was localized in the substrate/adaptor binding domains, whereas the E2-binding RING/HECT-domains were structured. To demonstrate the involvement of disorder in E3 function, we applied normal modes and molecular dynamics analyses to show how a disordered and highly flexible linker in human CBL (an E3 that acts as a regulator of several tyrosine kinase-mediated signalling pathways) facilitates long-range conformational changes bringing substrate and E2-binding domains towards each other and thus assisting in ubiquitin transfer. E3s with multiple interaction partners (as evidenced by data in STRING) also possess elevated levels of disorder (hubs, 22.90% vs. non-hubs, 18.36%). Furthermore, a search in PDB uncovered 21 distinct human E3 interactions, in 7 of which the disordered region of E3s undergoes induced folding (or mutual induced folding) in the presence of the partner.
In conclusion, our data highlight the primary role of structural disorder in the functions of E3 ligases, which manifests itself in the substrate/adaptor binding functions as well as in the mechanism of ubiquitin transfer by long-range conformational transitions. © 2013 Bhowmick et al
Optimally splitting cases for training and testing high dimensional classifiers
Background: We consider the problem of designing a study to develop a predictive classifier from high dimensional data. A common study design is to split the sample into a training set and an independent test set, where the former is used to develop the classifier and the latter to evaluate its performance. In this paper we address the question of what proportion of the samples should be devoted to the training set. How does this proportion impact the mean squared error (MSE) of the prediction accuracy estimate? Results: We develop a non-parametric algorithm for determining an optimal splitting proportion that can be applied with a specific dataset and classifier algorithm. We also perform a broad simulation study for the purpose of better understanding the factors that determine the best split proportions and to evaluate commonly used splitting strategies (1/2 training or 2/3 training) under a wide variety of conditions. These methods are based on a decomposition of the MSE into three intuitive component parts. Conclusions: By applying these approaches to a number of synthetic and real microarray datasets we show that for linear classifiers the optimal proportion depends on the overall number of samples available and the degree of differential expression between the classes. The optimal proportion was found to depend on the full dataset size (n) and classification accuracy, with higher accuracy and smaller n resulting in more cases assigned to the training set. The commonly used strategy of allocating two-thirds of cases for training was close to optimal for reasonably sized datasets (n ≥ 100) with strong signals (i.e. 85% or greater full dataset accuracy). In general, we recommend use of our nonparametric resampling approach for determining the optimal split. This approach can be applied to any dataset, using any predictor development method, to determine the best split.
How do risk attitudes affect measured confidence?
We examine the relationship between confidence in own absolute performance and risk attitudes using two confidence elicitation procedures: self-reported (non-incentivised) confidence and an incentivised procedure that elicits the certainty equivalent of a bet based on performance. The former procedure reproduces the “hard-easy effect” (underconfidence in easy tasks and overconfidence in hard tasks) found in a large number of studies using non-incentivised self-reports. The latter procedure produces general underconfidence, which is significantly reduced, but not eliminated when we filter out the effects of risk attitudes. Finally, we find that self-reported confidence correlates significantly with features of individual risk attitudes including parameters of individual probability weighting
A random forest approach to the detection of epistatic interactions in case-control studies
Background: The key roles of epistatic interactions between multiple genetic variants in the pathogenesis of complex diseases notwithstanding, the detection of such interactions remains a great challenge in genome-wide association studies. Although some existing multi-locus approaches have succeeded on small-scale case-control data, the "combination explosion" curse prohibits their application to genome-wide analysis. It is therefore indispensable to develop new methods that are able to reduce the search space for epistatic interactions from an astronomical number of all possible combinations of genetic variants to a manageable set of candidates. Results: We studied case-control data from the viewpoint of binary classification. More precisely, we treated single nucleotide polymorphism (SNP) markers as categorical features and adopted the random forest to discriminate cases from controls. On the basis of the Gini importance given by the random forest, we designed a sliding window sequential forward feature selection (SWSFS) algorithm to select a small set of candidate SNPs that could minimize the classification error and then statistically tested up to three-way interactions of the candidates. We compared this approach with three existing methods on three simulated disease models and showed that our approach is comparable to, and sometimes more powerful than, the other methods. We applied our approach to a genome-wide case-control dataset for Age-related Macular Degeneration (AMD) and successfully identified two SNPs that were reported to be associated with this disease. Conclusion: Beyond existing purely statistical approaches, we demonstrated the feasibility of incorporating machine learning methods into genome-wide case-control studies. The Gini importance offers yet another measure for the associations between SNPs and complex diseases, thereby complementing existing statistical measures to facilitate the identification of epistatic interactions and the understanding of epistasis in the pathogenesis of complex diseases.
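As a rough illustration of the quantity behind Gini importance, the sketch below computes the decrease in Gini impurity obtained by splitting case-control labels on a single categorical SNP. It is an assumption-laden simplification, not the SWSFS algorithm from the abstract: the function names are hypothetical, the split is multiway on genotype, and a real random forest would instead accumulate such decreases over many binary splits and many trees.

```python
from collections import Counter


def gini_impurity(labels):
    """Gini impurity 1 - sum(p_k^2) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def gini_decrease(genotypes, labels):
    """Impurity decrease from one multiway split on a categorical SNP.

    Assumption: one branch per genotype value, which is simpler than the
    binary splits a random forest implementation typically uses.
    """
    n = len(labels)
    groups = {}
    for genotype, label in zip(genotypes, labels):
        groups.setdefault(genotype, []).append(label)
    weighted_child_impurity = sum(
        len(group) / n * gini_impurity(group) for group in groups.values()
    )
    return gini_impurity(labels) - weighted_child_impurity


if __name__ == "__main__":
    status = [1, 1, 0, 0]  # 1 = case, 0 = control (toy data)
    print(gini_decrease(["AA", "AA", "Aa", "Aa"], status))  # informative SNP
    print(gini_decrease(["AA", "Aa", "AA", "Aa"], status))  # uninformative SNP
```

A SNP whose genotypes separate cases from controls yields a large decrease, while one that is independent of disease status yields a decrease near zero.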
Observation of associated near-side and away-side long-range correlations in √s_NN = 5.02 TeV proton-lead collisions with the ATLAS detector
Two-particle correlations in relative azimuthal angle (Δϕ) and pseudorapidity (Δη) are measured in √s_NN = 5.02 TeV p+Pb collisions using the ATLAS detector at the LHC. The measurements are performed using approximately 1 μb⁻¹ of data as a function of transverse momentum (p_T) and the transverse energy (ΣE_T^Pb) summed over 3.1 < η < 4.9 in the direction of the Pb beam. The correlation function, constructed from charged particles, exhibits a long-range (2 < |Δη| < 5) “near-side” (Δϕ ∼ 0) correlation that grows rapidly with increasing ΣE_T^Pb. A long-range “away-side” (Δϕ ∼ π) correlation, obtained by subtracting the expected contributions from recoiling dijets and other sources estimated using events with small ΣE_T^Pb, is found to match the near-side correlation in magnitude, shape (in Δη and Δϕ) and ΣE_T^Pb dependence. The resultant Δϕ correlation is approximately symmetric about π/2, and is consistent with a dominant cos 2Δϕ modulation for all ΣE_T^Pb ranges and particle p_T
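The symmetry about π/2 mentioned above follows from cos 2(π − Δϕ) = cos 2Δϕ. The toy sketch below illustrates that point by folding azimuthal differences into [0, π] and averaging cos 2Δϕ over particle pairs. It is not the ATLAS analysis, which builds correlation functions and subtracts recoil contributions; the function names and the input pairs are hypothetical.

```python
import math


def wrap_delta_phi(phi_a, phi_b):
    """Relative azimuthal angle between two particles, folded into [0, pi]."""
    delta = abs(phi_a - phi_b) % (2 * math.pi)
    return 2 * math.pi - delta if delta > math.pi else delta


def mean_cos_2dphi(pairs):
    """Average of cos(2 * delta_phi) over (phi_a, phi_b) pairs.

    Assumption: every pair carries equal weight, with no acceptance or
    background correction applied.
    """
    return sum(math.cos(2 * wrap_delta_phi(a, b)) for a, b in pairs) / len(pairs)


if __name__ == "__main__":
    near_side = wrap_delta_phi(0.3, 0.3)          # delta_phi = 0
    away_side = wrap_delta_phi(0.0, math.pi)      # delta_phi = pi
    # Both sides contribute equally to a cos(2 * delta_phi) modulation.
    print(math.cos(2 * near_side), math.cos(2 * away_side))
```

Because near-side and away-side pairs enter cos 2Δϕ with the same sign, a dominant modulation of this form produces matching peaks at Δϕ ∼ 0 and Δϕ ∼ π.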
Search for pair-produced long-lived neutral particles decaying to jets in the ATLAS hadronic calorimeter in pp collisions at √s = 8 TeV
The ATLAS detector at the Large Hadron Collider at CERN is used to search for the decay of a scalar boson to a pair of long-lived particles, neutral under the Standard Model gauge group, in 20.3 fb⁻¹ of data collected in proton–proton collisions at √s = 8 TeV. This search is sensitive to long-lived particles that decay to Standard Model particles producing jets at the outer edge of the ATLAS electromagnetic calorimeter or inside the hadronic calorimeter. No significant excess of events is observed. Limits are reported on the product of the scalar boson production cross section times branching ratio into long-lived neutral particles as a function of the proper lifetime of the particles. Limits are reported for boson masses from 100 GeV to 900 GeV, and a long-lived neutral particle mass from 10 GeV to 150 GeV
Measurement of the cross-section of high transverse momentum vector bosons reconstructed as single jets and studies of jet substructure in pp collisions at √s = 7 TeV with the ATLAS detector
This paper presents a measurement of the cross-section for high transverse momentum W and Z bosons produced in pp collisions and decaying to all-hadronic final states. The data used in the analysis were recorded by the ATLAS detector at the CERN Large Hadron Collider at a centre-of-mass energy of √s = 7 TeV and correspond to an integrated luminosity of 4.6 fb⁻¹. For transverse momentum p_T > 320 GeV and pseudorapidity |η| < 1.9, the cross-section is measured to be σ_W+Z = 8.5 ± 1.7 pb and is compared to next-to-leading-order calculations. The selected events are further used to study jet grooming techniques
Search for R-parity-violating supersymmetry in events with four or more leptons in √s = 7 TeV pp collisions with the ATLAS detector
A search for new phenomena in final states with four or more leptons (electrons or muons) is presented. The analysis is based on 4.7 fb⁻¹ of proton-proton collisions delivered by the Large Hadron Collider and recorded with the ATLAS detector. Observations are consistent with Standard Model expectations in two signal regions: one that requires moderate values of missing transverse momentum and another that requires large effective mass. The results are interpreted in a simplified model of R-parity-violating supersymmetry in which a 95% CL exclusion region is set for charged wino masses up to 540 GeV. In an R-parity-violating MSUGRA/CMSSM model, values of m_1/2 up to 820 GeV are excluded for 10 < tan β < 40