6,464 research outputs found

    Statistical Mechanics of Linear and Nonlinear Time-Domain Ensemble Learning

    Full text link
    Conventional ensemble learning combines students in the space domain. In this paper, however, we combine students in the time domain and call it time-domain ensemble learning. We analyze, compare, and discuss the generalization performances regarding time-domain ensemble learning of both a linear model and a nonlinear model. Analyzing in the framework of online learning using a statistical mechanical method, we show the qualitatively different behaviors between the two models. In a linear model, the dynamical behaviors of the generalization error are monotonic. We analytically show that time-domain ensemble learning is twice as effective as conventional ensemble learning. Furthermore, the generalization error of a nonlinear model features nonmonotonic dynamical behaviors when the learning rate is small. We numerically show that the generalization performance can be improved remarkably by using this phenomenon and the divergence of students in the time domain.Comment: 11 pages, 7 figure

    Minimalist AdaBoost for blemish identification in potatoes

    Get PDF
    We present a multi-class solution based on minimalist Ad- aBoost for identifying blemishes present in visual images of potatoes. Using training examples we use Real AdaBoost to rst reduce the fea- ture set by selecting ve features for each class, then train binary clas- siers for each class, classifying each testing example according to the binary classier with the highest certainty. Against hand-drawn ground truth data we achieve a pixel match of 83% accuracy in white potatoes and 82% in red potatoes. For the task of identifying which blemishes are present in each potato within typical industry dened criteria (10% coverage) we achieve accuracy rates of 93% and 94%, respectively

    A Non-Sequential Representation of Sequential Data for Churn Prediction

    Get PDF
    We investigate the length of event sequence giving best predictions when using a continuous HMM approach to churn prediction from sequential data. Motivated by observations that predictions based on only the few most recent events seem to be the most accurate, a non-sequential dataset is constructed from customer event histories by averaging features of the last few events. A simple K-nearest neighbor algorithm on this dataset is found to give significantly improved performance. It is quite intuitive to think that most people will react only to events in the fairly recent past. Events related to telecommunications occurring months or years ago are unlikely to have a large impact on a customer’s future behaviour, and these results bear this out. Methods that deal with sequential data also tend to be much more complex than those dealing with simple nontemporal data, giving an added benefit to expressing the recent information in a non-sequential manner

    Optimization of the Asymptotic Property of Mutual Learning Involving an Integration Mechanism of Ensemble Learning

    Full text link
    We propose an optimization method of mutual learning which converges into the identical state of optimum ensemble learning within the framework of on-line learning, and have analyzed its asymptotic property through the statistical mechanics method.The proposed model consists of two learning steps: two students independently learn from a teacher, and then the students learn from each other through the mutual learning. In mutual learning, students learn from each other and the generalization error is improved even if the teacher has not taken part in the mutual learning. However, in the case of different initial overlaps(direction cosine) between teacher and students, a student with a larger initial overlap tends to have a larger generalization error than that of before the mutual learning. To overcome this problem, our proposed optimization method of mutual learning optimizes the step sizes of two students to minimize the asymptotic property of the generalization error. Consequently, the optimized mutual learning converges to a generalization error identical to that of the optimal ensemble learning. In addition, we show the relationship between the optimum step size of the mutual learning and the integration mechanism of the ensemble learning.Comment: 13 pages, 3 figures, submitted to Journal of Physical Society of Japa

    Ensemble learning of linear perceptron; Online learning theory

    Full text link
    Within the framework of on-line learning, we study the generalization error of an ensemble learning machine learning from a linear teacher perceptron. The generalization error achieved by an ensemble of linear perceptrons having homogeneous or inhomogeneous initial weight vectors is precisely calculated at the thermodynamic limit of a large number of input elements and shows rich behavior. Our main findings are as follows. For learning with homogeneous initial weight vectors, the generalization error using an infinite number of linear student perceptrons is equal to only half that of a single linear perceptron, and converges with that of the infinite case with O(1/K) for a finite number of K linear perceptrons. For learning with inhomogeneous initial weight vectors, it is advantageous to use an approach of weighted averaging over the output of the linear perceptrons, and we show the conditions under which the optimal weights are constant during the learning process. The optimal weights depend on only correlation of the initial weight vectors.Comment: 14 pages, 3 figures, submitted to Physical Review

    Learning Multi-label Alternating Decision Trees from Texts and Data

    Get PDF
    International audienceMulti-label decision procedures are the target of the supervised learning algorithm we propose in this paper. Multi-label decision procedures map examples to a finite set of labels. Our learning algorithm extends Schapire and Singer?s Adaboost.MH and produces sets of rules that can be viewed as trees like Alternating Decision Trees (invented by Freund and Mason). Experiments show that we take advantage of both performance and readability using boosting techniques as well as tree representations of large set of rules. Moreover, a key feature of our algorithm is the ability to handle heterogenous input data: discrete and continuous values and text data. Keywords boosting - alternating decision trees - text mining - multi-label problem

    How to Find More Supernovae with Less Work: Object Classification Techniques for Difference Imaging

    Get PDF
    We present the results of applying new object classification techniques to difference images in the context of the Nearby Supernova Factory supernova search. Most current supernova searches subtract reference images from new images, identify objects in these difference images, and apply simple threshold cuts on parameters such as statistical significance, shape, and motion to reject objects such as cosmic rays, asteroids, and subtraction artifacts. Although most static objects subtract cleanly, even a very low false positive detection rate can lead to hundreds of non-supernova candidates which must be vetted by human inspection before triggering additional followup. In comparison to simple threshold cuts, more sophisticated methods such as Boosted Decision Trees, Random Forests, and Support Vector Machines provide dramatically better object discrimination. At the Nearby Supernova Factory, we reduced the number of non-supernova candidates by a factor of 10 while increasing our supernova identification efficiency. Methods such as these will be crucial for maintaining a reasonable false positive rate in the automated transient alert pipelines of upcoming projects such as PanSTARRS and LSST.Comment: 25 pages; 6 figures; submitted to Ap

    Statistical Mechanics of Nonlinear On-line Learning for Ensemble Teachers

    Full text link
    We analyze the generalization performance of a student in a model composed of nonlinear perceptrons: a true teacher, ensemble teachers, and the student. We calculate the generalization error of the student analytically or numerically using statistical mechanics in the framework of on-line learning. We treat two well-known learning rules: Hebbian learning and perceptron learning. As a result, it is proven that the nonlinear model shows qualitatively different behaviors from the linear model. Moreover, it is clarified that Hebbian learning and perceptron learning show qualitatively different behaviors from each other. In Hebbian learning, we can analytically obtain the solutions. In this case, the generalization error monotonically decreases. The steady value of the generalization error is independent of the learning rate. The larger the number of teachers is and the more variety the ensemble teachers have, the smaller the generalization error is. In perceptron learning, we have to numerically obtain the solutions. In this case, the dynamical behaviors of the generalization error are non-monotonic. The smaller the learning rate is, the larger the number of teachers is; and the more variety the ensemble teachers have, the smaller the minimum value of the generalization error is.Comment: 13 pages, 9 figure
    • …
    corecore