A Mathematical Programming Approach for Integrated Multiple Linear Regression Subset Selection and Validation
Subset selection for multiple linear regression aims to construct a
regression model that minimizes errors by selecting a small number of
explanatory variables. Once a model is built, various statistical tests and
diagnostics are conducted to validate the model and to determine whether the
regression assumptions are met. Most traditional approaches require human
decisions at this step; for example, the user may add or remove variables
until a satisfactory model is obtained. However, this trial-and-error strategy
cannot guarantee that a subset that minimizes the errors while satisfying all
regression assumptions will be found. In this paper, we propose a fully
automated model building procedure for multiple linear regression subset
selection that integrates model building and validation based on mathematical
programming. The proposed model minimizes mean squared errors while ensuring
that the majority of the important regression assumptions are met. We also
propose an efficient constraint to approximate the constraint for the
coefficient t-test. When no subset satisfies all of the considered regression
assumptions, our model provides an alternative subset that satisfies most of
these assumptions. Computational results show that our model yields better
solutions (i.e., satisfying more regression assumptions) compared to the
state-of-the-art benchmark models while maintaining similar explanatory power.
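The integrated selection-and-validation idea above can be illustrated with a small sketch: enumerate candidate subsets, fit each by ordinary least squares, and accept only subsets whose coefficients pass an approximate t-test, falling back to the best unrestricted subset when none qualifies. This is a toy stand-in for the paper's mathematical program, not the authors' formulation; the threshold `t_crit = 2.0` is an assumed approximation of the t-test critical value.

```python
import itertools
import numpy as np

def ols_fit(X, y):
    """Ordinary least squares: returns coefficients, residuals, and t-statistics."""
    n, p = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (n - p)                      # residual variance
    se = np.sqrt(sigma2 * np.diag(np.linalg.inv(X.T @ X)))  # coefficient std errors
    return beta, resid, beta / se

def select_subset(X, y, max_vars=2, t_crit=2.0):
    """Among subsets of up to max_vars columns, return the minimum-MSE subset
    whose coefficients all pass an approximate t-test (|t| >= t_crit).
    Falls back to the overall minimum-MSE subset if no subset passes,
    mirroring the paper's 'alternative subset' behavior."""
    best_valid, best_any = None, None
    for k in range(1, max_vars + 1):
        for cols in itertools.combinations(range(X.shape[1]), k):
            beta, resid, t = ols_fit(X[:, cols], y)
            mse = resid @ resid / len(y)
            if best_any is None or mse < best_any[0]:
                best_any = (mse, cols)
            if np.all(np.abs(t) >= t_crit):
                if best_valid is None or mse < best_valid[0]:
                    best_valid = (mse, cols)
    return best_valid if best_valid is not None else best_any

# Synthetic data: only the first two of five variables matter.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.1, size=200)
mse, cols = select_subset(X, y)
print(cols)  # recovers the two truly informative variables: (0, 1)
```

Exhaustive enumeration is only feasible for small problems; the mathematical-programming formulation in the paper is what makes the search tractable at scale.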
Towards an Iterative Algorithm for the Optimal Boundary Coverage of a 3D Environment
This paper presents a new optimal algorithm for locating a set of sensors in 3D that can see the boundaries of a polyhedral environment. Our approach is iterative and is based on a lower bound on the number of sensors and on a restriction of the original problem requiring each face to be observed in its entirety by at least one sensor. The lower bound allows evaluating the quality of the solution obtained at each step, and halting the algorithm if the solution is satisfactory. The algorithm asymptotically converges to the optimal solution of the unrestricted problem if the faces are subdivided into smaller parts.
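The restricted problem described above (each face fully visible from at least one chosen sensor) is a set-cover instance, and the lower-bound test can be sketched with a greedy cover. This is a simplified illustration under assumed inputs, not the paper's algorithm: the visibility data is hypothetical, and the bound used here is only the trivial one (faces divided by the largest single-sensor coverage).

```python
import math

def greedy_cover(candidates, faces):
    """Greedy set cover: repeatedly pick the sensor that sees the most
    still-uncovered faces, until every face is covered."""
    uncovered = set(faces)
    chosen = []
    while uncovered:
        best = max(candidates, key=lambda s: len(candidates[s] & uncovered))
        if not candidates[best] & uncovered:
            raise ValueError("some faces are not visible from any candidate")
        chosen.append(best)
        uncovered -= candidates[best]
    return chosen

def solve_with_bound(candidates, faces):
    """Solve the restricted problem and report a trivial lower bound on the
    optimum: if the cover size matches the bound, the solution is provably
    optimal and the iteration can halt."""
    lower = math.ceil(len(faces) / max(len(v) for v in candidates.values()))
    cover = greedy_cover(candidates, faces)
    return cover, lower

# Hypothetical visibility data: sensor position -> faces it sees in full.
candidates = {"s1": {1, 2, 3}, "s2": {3, 4}, "s3": {4, 5, 6}}
cover, lower = solve_with_bound(candidates, {1, 2, 3, 4, 5, 6})
print(len(cover), lower)  # 2 2 -> the greedy cover meets the lower bound
```

When the cover size exceeds the bound, the paper's scheme subdivides faces into smaller parts and repeats, which is what drives the asymptotic convergence to the unrestricted optimum.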
Positive Semidefinite Metric Learning Using Boosting-like Algorithms
The success of many machine learning and pattern recognition methods relies
heavily upon the identification of an appropriate distance metric on the input
data. It is often beneficial to learn such a metric from the input training
data, instead of using a default one such as the Euclidean distance. In this
work, we propose a boosting-based technique, termed BoostMetric, for learning a
quadratic Mahalanobis distance metric. Learning a valid Mahalanobis distance
metric requires enforcing the constraint that the matrix parameter to the
metric remains positive definite. Semidefinite programming is often used to
enforce this constraint, but scales poorly and is not easy to implement.
BoostMetric is instead based on the observation that any positive semidefinite
matrix can be decomposed into a linear combination of trace-one rank-one
matrices. BoostMetric thus uses rank-one positive semidefinite matrices as weak
learners within an efficient and scalable boosting-based learning process. The
resulting methods are easy to implement, efficient, and can accommodate various
types of constraints. We extend traditional boosting algorithms in that the
weak learner is a positive semidefinite matrix with trace one and rank one
rather than a classifier or regressor. Experiments on various datasets
demonstrate that the proposed algorithms compare favorably to
state-of-the-art methods in terms of classification accuracy and running time.
Comment: 30 pages, appearing in the Journal of Machine Learning Research