6,248 research outputs found

Popular Ensemble Methods: An Empirical Study

Authors: Maclin R., Opitz D.
Publication venue: 'AI Access Foundation'
Publication date: 01/06/2011

An ensemble consists of a set of individually trained classifiers (such as neural networks or decision trees) whose predictions are combined when classifying novel instances. Previous research has shown that an ensemble is often more accurate than any of the single classifiers in the ensemble. Bagging (Breiman, 1996c) and Boosting (Freund and Schapire, 1996; Schapire, 1990) are two relatively new but popular methods for producing ensembles. In this paper we evaluate these methods on 23 data sets using both neural networks and decision trees as our classification algorithms. Our results clearly indicate a number of conclusions. First, while Bagging is almost always more accurate than a single classifier, it is sometimes much less accurate than Boosting. On the other hand, Boosting can create ensembles that are less accurate than a single classifier -- especially when using neural networks. Analysis indicates that the performance of the Boosting methods is dependent on the characteristics of the data set being examined. In fact, further results show that Boosting ensembles may overfit noisy data sets, thus decreasing their performance. Finally, consistent with previous studies, our work suggests that most of the gain in an ensemble's performance comes in the first few classifiers combined; however, relatively large gains can be seen up to 25 classifiers when Boosting decision trees.
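
To make the ensemble idea in the abstract above concrete, here is a minimal Bagging sketch in Python. It is an illustration only, not the paper's experimental setup: the one-feature decision stump, the toy data, and the names `train_stump`, `bagging`, and `predict` are all assumptions made for this example. Each classifier is trained on a bootstrap resample, and predictions are combined by majority vote.

```python
import random
from collections import Counter

def train_stump(samples):
    """Fit a one-feature threshold classifier (hypothetical base learner)."""
    best = None
    for threshold, _ in samples:
        # Try both label orientations around each candidate threshold.
        for left, right in ((0, 1), (1, 0)):
            errors = sum((left if x <= threshold else right) != y
                         for x, y in samples)
            if best is None or errors < best[0]:
                best = (errors, threshold, left, right)
    _, threshold, left, right = best
    return lambda x: left if x <= threshold else right

def bagging(samples, n_classifiers, rng):
    """Train each classifier on a bootstrap resample drawn with replacement."""
    return [train_stump([rng.choice(samples) for _ in samples])
            for _ in range(n_classifiers)]

def predict(ensemble, x):
    """Combine the individual predictions by majority vote."""
    votes = Counter(clf(x) for clf in ensemble)
    return votes.most_common(1)[0][0]

# Toy data (assumption): label is 1 when x >= 0.5, else 0.
rng = random.Random(0)
data = [(x / 10, int(x >= 5)) for x in range(10)]
ensemble = bagging(data, n_classifiers=25, rng=rng)
print(predict(ensemble, 0.0), predict(ensemble, 0.9))
```

Boosting differs in that later classifiers are trained with more emphasis on the examples earlier ones misclassified, which is one plausible reason it is more sensitive to noisy labels than the resampling shown here.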

arXiv.org e-Print Archive

Crossref

Feature and Region Selection for Visual Learning

Authors: Cabral Ricardo, De la Torre Fernando, Wang Liantao, Zhao Ji
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 18/01/2016

Visual learning problems such as object classification and action recognition are typically approached using extensions of the popular bag-of-words (BoW) model. Despite its great success, it is unclear what visual features the BoW model is learning: Which regions in the image or video are used to discriminate among classes? Which are the most discriminative visual words? Answering these questions is fundamental for understanding existing BoW models and inspiring better models for visual recognition. To answer these questions, this paper presents a method for feature selection and region selection in the visual BoW model. This allows for an intermediate visualization of the features and regions that are important for visual learning. The main idea is to assign latent weights to the features or regions, and jointly optimize these latent variables with the parameters of a classifier (e.g., support vector machine). There are four main benefits of our approach: (1) Our approach accommodates non-linear additive kernels such as the popular $\chi^2$ and intersection kernel; (2) our approach is able to handle both regions in images and spatio-temporal regions in videos in a unified way; (3) the feature selection problem is convex, and both problems can be solved using a scalable reduced gradient method; (4) we point out strong connections with multiple kernel learning and multiple instance learning approaches. Experimental results in the PASCAL VOC 2007, MSR Action Dataset II and YouTube illustrate the benefits of our approach.
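
The two additive kernels named in the abstract above can be sketched as follows. The abstract does not give formulas, so this assumes the forms commonly used for non-negative histograms: the additive chi-squared kernel, sum of 2ab/(a+b), and the intersection kernel, sum of min(a, b). The function names are hypothetical.

```python
def chi2_kernel(x, y):
    """Additive chi-squared kernel: sum over bins of 2ab / (a + b).

    Bins where both histograms are zero contribute nothing.
    """
    return sum(2.0 * a * b / (a + b) for a, b in zip(x, y) if a + b > 0)

def intersection_kernel(x, y):
    """Histogram intersection kernel: sum over bins of min(a, b)."""
    return sum(min(a, b) for a, b in zip(x, y))

# Two toy bag-of-words histograms (assumption), each summing to 1.
h1 = [0.5, 0.5, 0.0]
h2 = [0.0, 0.5, 0.5]
print(chi2_kernel(h1, h2), intersection_kernel(h1, h2))
```

Both kernels decompose into a sum of per-bin terms. That additive structure is what would let a per-feature weight scale each term independently, which is presumably what makes them compatible with the latent feature weights the abstract describes.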

arXiv.org e-Print Archive

FigShare

Transonic Flutter Suppression Control Law Design, Analysis and Wind Tunnel Results

Author: Mukhopadhyay Vivek

The benchmark active controls technology and wind tunnel test program at NASA Langley Research Center was started with the objective to investigate the nonlinear, unsteady aerodynamics and active flutter suppression of wings in transonic flow. The paper will present the flutter suppression control law design process, numerical nonlinear simulation and wind tunnel test results for the NACA 0012 benchmark active control wing model. The flutter suppression control law design processes using (1) classical, (2) linear quadratic Gaussian (LQG), and (3) minimax techniques are described. A unified general formulation and solution for the LQG and minimax approaches, based on the steady state differential game theory, is presented. Design considerations for improving the control law robustness and digital implementation are outlined. It was shown that simple control laws, when properly designed based on physical principles, can suppress flutter with limited control power even in the presence of transonic shocks and flow separation. In wind tunnel tests in air and heavy gas medium, the closed-loop flutter dynamic pressure was increased to the tunnel upper limit of 200 psf. The control law robustness and performance predictions were verified in highly nonlinear flow conditions, gain and phase perturbations, and spoiler deployment. A non-design plunge instability condition was also successfully suppressed.

NASA Technical Reports Server

Bayesian Modeling for Mental Health Surveys

Author: Williams Sharifa Zakiya
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/2018

Sample surveys are often used to collect data for obtaining estimates of finite population quantities, such as disease prevalence. However, non-response and sampling frame under-coverage can cause the survey sample to differ from the target population in important ways. To reduce bias in the survey estimates that can arise from these differences, auxiliary information about the target population from sources including administrative files or census data can be used. Survey weighting is one approach commonly used to reduce bias. Although weighted estimates are relatively easy to obtain, they can be inefficient in the presence of highly dispersed weights. Model-based estimation in survey research offers advantages of improved efficiency in the presence of sparse data and highly variable weights. However, these models can be subject to model misspecification. In this dissertation, we propose Bayesian penalized spline regression models for survey inference about proportions in the entire population as well as in sub-populations. The proposed methods incorporate survey weights as covariates using a penalized spline to protect against model misspecification. We show by simulations that the proposed methods perform well, yielding estimates of population proportion for binary survey data that are efficient in the presence of highly dispersed weights and robust to model misspecification for survey outcomes. We illustrate the use of the proposed methods to estimate the prevalence of lifetime temper dysregulation disorder among National Guard service members overall and in sub-populations defined by gender and race using the Ohio Army National Guard Mental Health Initiative 2008-2009 survey data.
We further extend the proposed framework to the setting where individual auxiliary data for the population are not available and utilize a Bayesian bootstrap approach to complete model-based estimation of current and undiagnosed depression in Hispanics/Latinos of different national backgrounds from the 2015 Washington Heights Community Survey.
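
The remark above that weighted estimates become inefficient under highly dispersed weights can be illustrated with a short sketch. This is not the dissertation's Bayesian spline model; it shows only the classical weighted proportion together with Kish's approximate effective sample size, (sum of w)^2 / (sum of w^2). The function names and toy numbers are assumptions.

```python
def weighted_proportion(outcomes, weights):
    """Weighted estimate of a population proportion for binary outcomes."""
    return sum(w * y for y, w in zip(outcomes, weights)) / sum(weights)

def kish_effective_sample_size(weights):
    """Kish's approximation: (sum of w)^2 / (sum of w^2)."""
    return sum(weights) ** 2 / sum(w * w for w in weights)

# Toy survey (assumption): four respondents, one with a very large weight.
outcomes = [1, 0, 0, 0]
equal_weights = [1.0, 1.0, 1.0, 1.0]
dispersed_weights = [10.0, 1.0, 1.0, 1.0]

print(weighted_proportion(outcomes, equal_weights),
      kish_effective_sample_size(equal_weights))
print(weighted_proportion(outcomes, dispersed_weights),
      kish_effective_sample_size(dispersed_weights))
```

With equal weights the effective sample size equals the number of respondents. With one dominant weight it drops well below that, so the estimate leans heavily on a single response.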

Columbia University Academic Commons

Adaptive Optimal Control: The Thinking Man's GPC

Authors: Bitmead Robert R, Gevers Michel, Wertz Vincent
Publication venue: eScholarship, University of California
Publication date: 01/01/1990

Exploring connections between adaptive control theory and practice, this book treats the techniques of linear quadratic optimal control and estimation (Kalman filtering), recursive identification, linear systems theory and robust arguments.

eScholarship - University of California

DIAL UCLouvain

From SMOTE to Mixup for Deep Imbalanced Classification

Authors: Cheng Wei-Chao, Lin Hsuan-Tien, Mai Tan-Ha
Publication date: 03/11/2023

Given imbalanced data, it is hard to train a good classifier using deep learning because of the poor generalization of minority classes. Traditionally, the well-known synthetic minority oversampling technique (SMOTE) for data augmentation, a data mining approach for imbalanced learning, has been used to improve this generalization. However, it is unclear whether SMOTE also benefits deep learning. In this work, we study why the original SMOTE is insufficient for deep learning, and enhance SMOTE using soft labels. Connecting the resulting soft SMOTE with Mixup, a modern data augmentation technique, leads to a unified framework that puts traditional and modern data augmentation techniques under the same umbrella. A careful study within this framework shows that Mixup improves generalization by implicitly achieving uneven margins between majority and minority classes. We then propose a novel margin-aware Mixup technique that more explicitly achieves uneven margins. Extensive experimental results demonstrate that our proposed technique yields state-of-the-art performance on deep imbalanced classification while achieving superior performance on extremely imbalanced data. The code is open-sourced in our developed package https://github.com/ntucllab/imbalanced-DL to foster future research in this direction.

Comment: 25 pages, 3 figures. The paper is accepted by TAAI 202
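
As a rough illustration of the Mixup step that the abstract above builds on, the sketch below interpolates two examples and their one-hot labels with a coefficient drawn from a Beta(alpha, alpha) distribution. This is plain Mixup as it is commonly described, not the authors' margin-aware variant and not code from their package. The function name `mixup`, the toy inputs, and the choice `alpha=0.2` are assumptions.

```python
import random

def mixup(x1, y1, x2, y2, alpha, rng):
    """Blend two (features, one-hot label) pairs with lam ~ Beta(alpha, alpha)."""
    lam = rng.betavariate(alpha, alpha)
    # Features and labels are interpolated with the same coefficient,
    # so the resulting label is soft rather than one-hot.
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x1, x2)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y1, y2)]
    return x, y, lam

# Toy two-class example (assumption): one sample from each class.
rng = random.Random(0)
x, y, lam = mixup([1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0],
                  alpha=0.2, rng=rng)
print(lam, x, y)
```

Classical SMOTE also interpolates between feature vectors but keeps a hard minority label. Interpolating the label as well is the soft-label change the abstract refers to when it connects SMOTE to Mixup.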

arXiv.org e-Print Archive