Auto-tail dependence coefficients for stationary solutions of linear stochastic recurrence equations and for GARCH(1,1)
We examine the auto-dependence structure of strictly stationary solutions of linear stochastic recurrence equations and of strictly stationary GARCH(1,1) processes from the point of view of ordinary and generalized tail dependence coefficients. Since such processes can easily be of infinite variance, a substitute for the usual auto-correlation function is needed.
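For orientation, the ordinary (upper) auto-tail dependence coefficient at lag h is conventionally written as below. This is the standard textbook form, given here as an assumption about notation rather than as the paper's own definition. The symbols X_t, h and \lambda(h) are introduced for illustration only, and the marginal distribution is assumed to have an unbounded right tail.

```latex
\lambda(h) \;=\; \lim_{x \to \infty}
  \mathbb{P}\bigl( X_{h} > x \,\big|\, X_{0} > x \bigr),
  \qquad h \ge 1 .
```

Unlike the auto-correlation function, this quantity is defined without any moment assumption, which is why it can serve as a substitute when the variance is infinite.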
Optimization of the Asymptotic Property of Mutual Learning Involving an Integration Mechanism of Ensemble Learning
We propose an optimization method of mutual learning which converges into the
identical state of optimum ensemble learning within the framework of on-line
learning, and have analyzed its asymptotic property through the statistical
mechanics method. The proposed model consists of two learning steps: two
students independently learn from a teacher, and then the students learn from
each other through the mutual learning. In mutual learning, students learn from
each other and the generalization error is improved even if the teacher has not
taken part in the mutual learning. However, when the initial
overlaps (direction cosines) between teacher and students differ, the student with the
larger initial overlap tends to have a larger generalization error than it had
before the mutual learning. To overcome this problem, our proposed optimization
method tunes the step sizes of the two students to minimize
the asymptotic generalization error. Consequently, the
optimized mutual learning converges to a generalization error identical to that
of the optimal ensemble learning. In addition, we show the relationship between
the optimum step size of the mutual learning and the integration mechanism of
the ensemble learning.
Comment: 13 pages, 3 figures, submitted to Journal of Physical Society of
Japa
Ensemble learning of linear perceptron; Online learning theory
Within the framework of on-line learning, we study the generalization error
of an ensemble learning machine learning from a linear teacher perceptron. The
generalization error achieved by an ensemble of linear perceptrons having
homogeneous or inhomogeneous initial weight vectors is precisely calculated at
the thermodynamic limit of a large number of input elements and shows rich
behavior. Our main findings are as follows. For learning with homogeneous
initial weight vectors, the generalization error using an infinite number of
linear student perceptrons is equal to only half that of a single linear
perceptron, and converges to that of the infinite case as O(1/K) for a
finite number of K linear perceptrons. For learning with inhomogeneous initial
weight vectors, it is advantageous to use an approach of weighted averaging
over the output of the linear perceptrons, and we show the conditions under
which the optimal weights are constant during the learning process. The optimal
weights depend only on the correlation of the initial weight vectors.
Comment: 14 pages, 3 figures, submitted to Physical Review
Learning Timbre Analogies from Unlabelled Data by Multivariate Tree Regression
This is the Author's Original Manuscript of an article whose final and definitive form, the Version of Record, has been published in the Journal of New Music Research, November 2011, copyright Taylor & Francis. The published article is available online at http://www.tandfonline.com/10.1080/09298215.2011.596938
A Non-Sequential Representation of Sequential Data for Churn Prediction
We investigate the length of the event sequence that gives the best predictions
when using a continuous HMM approach to churn prediction from sequential
data. Motivated by observations that predictions based on only the few most recent
events seem to be the most accurate, a non-sequential dataset is constructed
from customer event histories by averaging features of the last few events. A simple
K-nearest neighbor algorithm on this dataset is found to give significantly
improved performance. It is quite intuitive to think that most people will react
only to events in the fairly recent past. Events related to telecommunications occurring
months or years ago are unlikely to have a large impact on a customer’s
future behaviour, and these results bear this out. Methods that deal with sequential
data also tend to be much more complex than those dealing with simple nontemporal
data, giving an added benefit to expressing the recent information in a
non-sequential manner.
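The construction described in this abstract (average the features of the last few events, then classify with K-nearest neighbors) can be sketched as follows. This is a minimal illustration and not the authors' implementation: the feature names, the toy histories, the labels, and the helper names `last_k_mean` and `knn_predict` are all hypothetical, and Euclidean distance with majority voting is an assumed choice.

```python
from collections import Counter
from math import dist


def last_k_mean(events, k):
    """Average each feature over the last k events of one customer."""
    recent = events[-k:]
    n_features = len(recent[0])
    return [sum(e[j] for e in recent) / len(recent) for j in range(n_features)]


def knn_predict(train_x, train_y, query, n_neighbors=3):
    """Majority vote among the n_neighbors closest training points (Euclidean)."""
    order = sorted(range(len(train_x)), key=lambda i: dist(train_x[i], query))
    votes = Counter(train_y[i] for i in order[:n_neighbors])
    return votes.most_common(1)[0][0]


# Hypothetical toy histories: each event is [call_minutes, complaints].
histories = {
    "a": [[50, 0], [48, 0], [52, 0]],
    "b": [[40, 0], [10, 2], [8, 3]],
    "c": [[60, 0], [55, 0], [58, 1]],
    "d": [[30, 1], [12, 2], [6, 4]],
}
labels = {"a": 0, "b": 1, "c": 0, "d": 1}  # 1 = churned (made-up labels)

K_EVENTS = 2  # number of most recent events to average (assumed value)
train_x = [last_k_mean(histories[c], K_EVENTS) for c in histories]
train_y = [labels[c] for c in histories]

# A new customer's sequence is reduced to one fixed-length vector, then classified.
new_customer = [[45, 0], [11, 1], [9, 3]]
query = last_k_mean(new_customer, K_EVENTS)
prediction = knn_predict(train_x, train_y, query, n_neighbors=3)
print(prediction)
```

Because every history collapses to a vector of the same length, any non-temporal classifier could replace the nearest-neighbor step without changing the representation.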
Regret analysis for performance metrics in multi-label classification: the case of Hamming and subset zero-one loss
State Measurements with Short Laser Pulses and Lower-Efficiency Photon Detectors
It has been proposed by Cook (Phys. Scr. T 21, 49 (1988)) to use a short
probe laser pulse for state measurements of two-level systems. In previous work
we have investigated to what extent this proposal fulfills the projection
postulate if ideal photon detectors are considered. For detectors with overall
efficiency less than 1 complications arise for single systems, and for this
case we present a simple criterion for a laser pulse to act as a state
measurement and to cause an almost complete state reduction.
Comment: 13 pages, LaTeX; submitted to J. mod. Op
A model-based multithreshold method for subgroup identification
The thresholding variable plays a crucial role in subgroup identification for personalized medicine. Most existing partitioning methods split the sample based on one predictor variable. In this paper, we consider setting the splitting rule from a combination of multivariate predictors, such as latent factors, principal components, and weighted sums of predictors. Such a subgrouping method may lead to more meaningful partitioning of the population than using a single variable. In addition, our method is based on a change point regression model and thus yields straightforward model-based prediction results. After choosing a particular thresholding variable form, we apply a two-stage multiple change point detection method to determine the subgroups and estimate the regression parameters. We show that our approach can produce two or more subgroups from the multiple change points and identify the true grouping with high probability. In addition, our estimation results enjoy oracle properties. We design a simulation study to compare the performance of our proposed method and existing methods and apply them to analyze data sets from a Scleroderma trial and a breast cancer study.
Bagging ensemble selection for regression
Bagging ensemble selection (BES) is a relatively new ensemble learning strategy. The strategy can be seen as an ensemble of the ensemble selection from libraries of models (ES) strategy. Previous experimental results on binary classification problems have shown that, using random trees as base classifiers, BES-OOB (the most successful variant of BES) is competitive with (and in many cases superior to) other ensemble learning strategies, for instance the original ES algorithm, stacking with linear regression, random forests or boosting. Motivated by the promising results in classification, this paper examines the predictive performance of the BES-OOB strategy for regression problems. Our results show that the BES-OOB strategy outperforms Stochastic Gradient Boosting and Bagging when using regression trees as the base learners. Our results also suggest that the advantage of using a diverse model library becomes clear when the model library size is relatively large. We also present encouraging results indicating that the non-negative least squares algorithm is a viable approach for pruning an ensemble of ensembles.
Ensemble Sales Forecasting Study in Semiconductor Industry
Sales forecasting plays a prominent role in business planning and business
strategy. The value and importance of advance information is a cornerstone of
planning activity, and a well-set forecast goal can guide the sales force more
efficiently. In this paper, CPU sales forecasting at Intel Corporation, a
multinational semiconductor company, was considered. Past sales, future
bookings, exchange rates, gross domestic product (GDP) forecasts, seasonality
and other indicators were innovatively incorporated into the quantitative
modeling. Benefiting from recent advances in computation power and software
development, millions of models built upon multiple regression, time series
analysis, random forests and boosted trees were executed in parallel. The models
with smaller validation errors were selected to form the ensemble model. To
better capture the distinct characteristics, forecasting models were
implemented at the lead-time and line-of-business level. The moving-window
validation process automatically selected the models that closely represent
current market conditions. The weekly forecasting cadence allowed the
model to respond effectively to market fluctuations. Generic variable
importance analysis was also developed to increase the model interpretability.
Rather than assuming a fixed distribution, this non-parametric permutation
variable importance analysis provided a general framework across methods to
evaluate the variable importance. This variable importance framework can
be further extended to classification problems by replacing the mean absolute
percentage error (MAPE) with the misclassification error. The demo code is available at:
https://github.com/qx0731/ensemble_forecast_methods
Comment: 14 pages, Industrial Conference on Data Mining 2017 (ICDM 2017
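The permutation-based variable importance mentioned in this abstract can be illustrated with a short model-agnostic sketch. It is not the authors' demo code (that is at the URL above). The function names, the toy model, the data, and the choice to average the MAPE increase over repeated shuffles are all assumptions made for illustration.

```python
import random


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent (y_true must be non-zero)."""
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


def permutation_importance(predict, X, y, n_repeats=20, seed=0):
    """Mean increase in MAPE when one feature column is shuffled.

    `predict` maps one feature row to a prediction, so any fitted model
    can be plugged in without distributional assumptions.
    """
    rng = random.Random(seed)
    baseline = mape(y, [predict(row) for row in X])
    importances = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)  # break the link between feature j and the target
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            increases.append(mape(y, [predict(r) for r in X_perm]) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


# Hypothetical fitted model that depends on feature 0 only.
def model(row):
    return 10.0 + 2.0 * row[0]


X = [[float(i), float((7 * i) % 5)] for i in range(1, 11)]
y = [model(row) for row in X]

scores = permutation_importance(model, X, y)
print(scores)  # feature 0 gets a positive score, feature 1 gets exactly 0.0
```

Swapping `mape` for a misclassification rate gives the classification variant the abstract refers to; the rest of the procedure stays the same.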