Regression trees for longitudinal and multiresponse data
Previous algorithms for constructing regression tree models for longitudinal
and multiresponse data have mostly followed the CART approach. Consequently,
they inherit the same selection biases and computational difficulties as CART.
We propose an alternative, based on the GUIDE approach, that treats each
longitudinal data series as a curve and uses chi-squared tests of the residual
curve patterns to select a variable to split each node of the tree. Besides
being unbiased, the method is applicable to data with fixed and random time
points and with missing values in the response or predictor variables.
Simulation results comparing its mean squared prediction error with that of
MVPART are given, as well as examples comparing it with standard linear mixed
effects and generalized estimating equation models. Conditions for asymptotic
consistency of regression tree function estimates are also given.
Comment: Published at http://dx.doi.org/10.1214/12-AOAS596 in the Annals of
Applied Statistics (http://www.imstat.org/aoas/) by the Institute of
Mathematical Statistics (http://www.imstat.org
Optimal crossover designs for the proportional model
In crossover design experiments, the proportional model, where the carryover
effects are proportional to their direct treatment effects, has drawn attention
in recent years. We discover that the universally optimal design under the
traditional model is an E-optimal design under the proportional model. Moreover,
we establish equivalence theorems of the Kiefer-Wolfowitz type for four popular
optimality criteria, namely A, D, E and T (trace).
Comment: Published at http://dx.doi.org/10.1214/13-AOS1148 in the Annals of
Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical
Statistics (http://www.imstat.org
A Simple Regularization-based Algorithm for Learning Cross-Domain Word Embeddings
Learning word embeddings has received a significant amount of attention
recently. Often, word embeddings are learned in an unsupervised manner from a
large collection of text. The genre of the text typically plays an important
role in the effectiveness of the resulting embeddings. How to effectively train
word embedding models using data from different domains remains a problem that
is underexplored. In this paper, we present a simple yet effective method for
learning word embeddings based on text from different domains. We demonstrate
the effectiveness of our approach through extensive experiments on various
downstream NLP tasks.
Comment: 7 pages, accepted by EMNLP 201
Protein secondary structure prediction by combining hidden Markov models and sliding window scores
Instead of conformation states of single residues, refined conformation
states of quintuplets are proposed to reflect conformation correlation. Simple
hidden Markov models combined with sliding window scores are used for
predicting secondary structure of a protein from its amino acid sequence. Since
the length of protein conformation segments varies in a narrow range, we ignore
the duration effect of the length distribution. The window scores for residues
are a window version of the Chou-Fasman propensities estimated under an
approximation of conditional independence. Different window widths are
examined, and the optimal width is found to be 17. An accuracy of about 70% is
achieved.
Comment: 8 pages, 1 figure, 2 table
The factorization in exclusive B decays: a critical look
I review the theoretical ideas and concepts behind factorization
in exclusive B decays. In order to understand naive factorization,
effective field theories and the perturbative method of QCD are introduced and
developed. The discussion focuses on the large energy effective theory, the
QCD factorization approach and the soft-collinear effective theory.
Comment: Talk at (or Contribution to) the International Workshop on QCD:
QCD@Work 2003 - Conversano (Italy) 14-18 June 2003 (eConf C030614). 6 page
Entropic Approach for Reduction of Amino Acid Alphabets
The primitive data for deducing the Miyazawa-Jernigan contact energy or
BLOSUM score matrix are the pair frequency counts. Each amino acid corresponds
to a distribution. Taking the Kullback-Leibler distance between two probability
distributions as the resemblance coefficient and relating clusters to mixed
populations, we perform a cluster analysis of amino acids based on the frequency
count data. Furthermore, Ward's clustering is also obtained by adopting the
average score as an objective function. An ordinal cophenetic is introduced to
compare results from different clustering methods.
Comment: 6 pages, 1 figure, 6 table
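The Kullback-Leibler distance named in the abstract above can be illustrated with a minimal sketch. The counts below are toy values invented for illustration, not data from the paper, and the helper names `normalize` and `kl_divergence` are hypothetical. The sketch only shows how two rows of pair frequency counts become distributions and how their divergence is computed; it does not reproduce the clustering procedure itself.

```python
import math

def normalize(counts):
    """Turn a row of raw pair frequency counts into a probability distribution."""
    total = sum(counts)
    return [c / total for c in counts]

def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q) in nats.

    Assumes q[i] > 0 wherever p[i] > 0 (true for the toy counts below).
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Toy contact counts for two hypothetical amino acids against four partners.
counts_a = [40, 30, 20, 10]
counts_b = [10, 20, 30, 40]

p = normalize(counts_a)
q = normalize(counts_b)

d_pq = kl_divergence(p, q)
d_pp = kl_divergence(p, p)
```

Note that the divergence is not symmetric in general, so a clustering built on it has to fix a convention for the direction or symmetrize it; which choice the paper makes is not stated in the abstract.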
Sudakov form factor in effective field theory
We discuss the Sudakov form factor in the framework of the soft-collinear
effective theory. The running of the short distance coefficient function from
high to low scale gives the summation of Sudakov logarithms to all orders. Our
discussions concentrate on the factorization and derivation of the
renormalization group equation from the effective theory point of view. The
intuitive interpretation of the renormalization group method is discussed. We
compare our method with other resummation approaches in the literature.
Comment: 27 pages, 7 figures, revtex
The QCD factorization in decays
A study of the hadron pair production mechanism is motivated by the recently
observed decays . One novel phenomenon is threshold
enhancement of the kaon pair production. We show that these decays in the heavy
quark mass limit can be factorized into a generalized form. The new
non-perturbative quantity is the generalized distribution amplitude which
describes how a quark-antiquark pair transforms into the hadron pair. A proof of
factorization of decays to all orders is performed
by using the soft-collinear effective theory. The phenomenological application
is discussed in brief.
Comment: 14 pages, 2 figure
Multimodal Emotion Recognition Using Multimodal Deep Learning
To enhance the performance of affective models and reduce the cost of
acquiring physiological signals for real-world applications, we adopt a
multimodal deep learning approach to construct affective models from multiple
physiological signals. For the unimodal enhancement task, we show that the best
recognition accuracy of 82.11% on the SEED dataset is achieved with shared
representations generated by the Deep AutoEncoder (DAE) model. For the multimodal
facilitation tasks, we demonstrate that the Bimodal Deep AutoEncoder (BDAE)
achieves mean accuracies of 91.01% and 83.25% on the SEED and DEAP datasets,
respectively, which are substantially better than the state-of-the-art approaches.
For the cross-modal learning task, our experimental results demonstrate that a mean
accuracy of 66.34% is achieved on the SEED dataset by using shared representations
generated by the EEG-based DAE as training samples and shared representations
generated by the eye-based DAE as testing samples, and vice versa.
Core Influence Mechanism on Vertex-Cover Problem through Leaf-Removal-Core Breaking
The Leaf-Removal process has been widely studied and applied in many
mathematical and physical fields to help understand complex systems, and
many problems, including the minimal vertex-cover, are closely related to this
process and to the Leaf-Removal cores. In this paper, based on the structural
features of the Leaf-Removal cores, a method named Core Influence is proposed
to break graphs into No-Leaf-Removal-Core ones by
identifying significant nodes with a localized greedy strategy. By
decomposing the minimal vertex-cover problem into the Leaf-Removal core
breaking process and maximal matching of the remaining graphs, it is proved that
any minimal vertex-cover of the whole graph can be located through these two
processes, the latter of which is a P problem, and that the best boundary is
achieved at the transition point. Compared with other node importance indices,
the Core Influence method breaks down the Leaf-Removal cores much faster
and obtains no-core graphs by removing fewer nodes. Also, the
vertex-cover numbers obtained with this method are lower than those obtained with
existing node importance measures, and compared with the exact minimal vertex-cover
numbers, this method shows reasonable accuracy and stability at different
scales. This research provides a new localized greedy strategy to break the
hard Leaf-Removal cores efficiently, and heuristic methods could be built on it
to help understand some NP problems.
Comment: 11 pages, 6 figures, 2 table
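The Leaf-Removal process that the abstract above builds on can be sketched as follows. This is the standard textbook procedure (repeatedly take a degree-1 vertex, put its neighbor into the cover, delete both), not the Core Influence method of the paper. The function name `leaf_removal` and the example graphs are assumptions made for illustration.

```python
def leaf_removal(edges):
    """Run leaf removal on an undirected graph given as a list of edges.

    Returns (cover, core): the vertices put into the cover while leaves
    were being removed, and the adjacency of the leftover Leaf-Removal core.
    """
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    cover = set()
    leaves = [n for n in adj if len(adj[n]) == 1]
    while leaves:
        leaf = leaves.pop()
        if len(adj.get(leaf, ())) != 1:
            continue  # stale entry: vertex was removed or is now isolated
        (nbr,) = adj[leaf]
        cover.add(nbr)
        # Delete the neighbor together with all of its edges.
        for w in adj.pop(nbr):
            adj[w].discard(nbr)
            if len(adj[w]) == 1:
                leaves.append(w)  # w has just become a leaf
        adj.pop(leaf, None)

    # Whatever still has edges is the core; isolated vertices are dropped.
    core = {n: nbrs for n, nbrs in adj.items() if nbrs}
    return cover, core

# A path has no core; a triangle has no leaves, so it is all core.
path_cover, path_core = leaf_removal([(1, 2), (2, 3), (3, 4)])
tri_cover, tri_core = leaf_removal([(1, 2), (2, 3), (3, 1)])
```

When the core comes back empty, the returned cover is a minimal vertex-cover of the input graph; a non-empty core is the hard part that a core-breaking strategy such as the one in the abstract is meant to address.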