A Fisher consistent multiclass loss function with variable margin on positive examples
The concept of pointwise Fisher consistency (or classification calibration) states necessary and sufficient conditions to have Bayes consistency when a classifier minimizes a surrogate loss function instead of the 0-1 loss. We present a family of multiclass hinge loss functions defined by a continuous control parameter, lambda, representing the margin of the positive points of a given class. The parameter lambda allows shifting from classification-uncalibrated to classification-calibrated loss functions. Though previous results suggest that increasing the margin of positive points has positive effects on the classification model, other approaches have failed to give increasing weight to the positive examples without losing the classification calibration property. Our lambda-based loss function can give unlimited weight to the positive examples without breaking the classification calibration property. Moreover, when embedding these loss functions into the Support Vector Machine's framework (lambda-SVM), the parameter lambda defines different regions for the Karush-Kuhn-Tucker conditions. A large margin on positive points also facilitates faster convergence of the Sequential Minimal Optimization algorithm, leading to lower training times than other classification-calibrated methods. lambda-SVM allows easy implementation, and its practical use on different datasets not only supports our theoretical analysis, but also provides good classification performance and fast training times. The authors acknowledge the referees' comments and suggestions that helped to improve the manuscript. This research is based upon work supported in part by the Office of the Director of National Intelligence (ODNI), Intelligence Advanced Research Projects Activity (IARPA), via the Federal Bureau of Investigations, Finance Division. 
The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of ODNI, IARPA, or the U.S. Government. The U.S. Government is authorized to reproduce and distribute reprints for Governmental purposes notwithstanding any copyright annotation thereon. I.R-L acknowledges partial support by Spain's grants TIN2013-42351-P (MINECO) and S2013/ICE-2845 CASI-CAM-CM (Comunidad de Madrid). The authors gratefully acknowledge the use of the facilities of Centro de Computacion Cientifica (CCC) at Universidad Autonoma de Madrid
A practical view of large-scale classification: feature selection and real-time classification
Unpublished doctoral thesis, Universidad Autónoma de Madrid, Escuela Politécnica Superior, May 201
On the equivalence of Kernel Fisher discriminant analysis and Kernel Quadratic Programming Feature Selection
This is the author’s version of a work that was accepted for publication in Pattern Recognition Letters. Changes resulting from the publishing process, such as peer review, editing, corrections, structural formatting, and other quality control mechanisms, may not be reflected in this document. Changes may have been made to this work since it was submitted for publication. A definitive version was subsequently published in Pattern Recognition Letters, Vol. 32, Iss. 11, (2011) DOI: 10.1016/j.patrec.2011.04.007. We reformulate the Quadratic Programming Feature Selection (QPFS) method in a kernel space to obtain a vector which maximizes the quadratic objective function of QPFS. We demonstrate that the vector obtained by Kernel Quadratic Programming Feature Selection is equivalent to the Kernel Fisher vector and, therefore, a new interpretation of the Kernel Fisher discriminant analysis is given which provides some computational advantages for highly unbalanced datasets. I.R.-L. is supported by an FPU grant from Universidad Autónoma de Madrid, and partially supported by the Universidad Autónoma de Madrid-IIC Chair and TIN 2010-21575-C02-01. R.H. was partially supported by Grants ONR N00014-07-1-0741, and US Army Medical and Material Command under contract #W81XWH-10-C-0040 in collaboration with Elintrix, Inc.
Hierarchical linear support vector machine
This is the author’s version of a work that was accepted for publication in Pattern Recognition. Changes resulting from the publishing process, such as peer review, editing, corrections, structural formatting, and other quality control mechanisms, may not be reflected in this document. Changes may have been made to this work since it was submitted for publication. A definitive version was subsequently published in Pattern Recognition, Vol. 45, Iss. 12, (2012) DOI: 10.1016/j.patcog.2012.06.002. The increasing size and dimensionality of real-world datasets make it necessary to design algorithms that are efficient not only in the training process but also in the prediction phase. In applications such as credit card fraud detection, the classifier needs to predict an event in 10 ms at most. In such environments, constraints on prediction speed heavily outweigh training costs. We propose a new classification method, called a Hierarchical Linear Support Vector Machine (H-LSVM), based on the construction of an oblique decision tree in which the node split is obtained as a Linear Support Vector Machine. Although other methods have been proposed to break the data space down into subregions to speed up Support Vector Machines, the H-LSVM algorithm is a very simple model that is efficient in training and, above all, in prediction for large-scale datasets. Only a few hyperplanes need to be evaluated in the prediction step, no kernel computation is required, and the tree structure makes parallelization possible. In experiments with medium and large datasets, the H-LSVM reduces the prediction cost considerably while achieving classification results closer to those of the non-linear SVM than to those of the linear SVM. The authors would like to thank the anonymous reviewers for their comments that helped improve the manuscript. I.R.-L. is supported by an FPU Grant from Universidad Autónoma de Madrid, and partially supported by the Universidad Autónoma de Madrid-IIC Chair and TIN2010-21575-C02-01. R.H. acknowledges partial support by ONR N00014-07-1-0741, USARIEM-W81XWH-10-C-0040 (ELINTRIX) and JPL-2012-1455933.
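The prediction step described in the H-LSVM abstract (a few hyperplane evaluations along one root-to-leaf path, with no kernel computation) can be illustrated with a minimal sketch. This is not the authors' implementation: the `Node` and `Leaf` classes, the `predict` function, the branching convention, and the hand-set weights are all hypothetical, and the training procedure that would fit each node's linear SVM is omitted.

```python
from dataclasses import dataclass
from typing import Sequence, Union


@dataclass
class Leaf:
    label: int  # class assigned to every point that reaches this leaf


@dataclass
class Node:
    w: Sequence[float]           # hyperplane normal (assumed to come from a linear SVM)
    b: float                     # hyperplane offset
    left: "Union[Node, Leaf]"    # followed when w.x + b < 0 (assumed convention)
    right: "Union[Node, Leaf]"   # followed when w.x + b >= 0 (assumed convention)


def predict(tree, x):
    """Return (label, number of hyperplanes evaluated) for one sample x."""
    node = tree
    evaluated = 0
    while isinstance(node, Node):
        # One dot product per visited node; no kernel is evaluated.
        score = sum(wi * xi for wi, xi in zip(node.w, x)) + node.b
        evaluated += 1
        node = node.right if score >= 0 else node.left
    return node.label, evaluated


# Hand-built toy tree; the weights are illustrative, not trained.
tree = Node(
    w=[1.0, 0.0], b=0.0,
    left=Leaf(0),
    right=Node(w=[0.0, 1.0], b=-1.0, left=Leaf(1), right=Leaf(0)),
)

print(predict(tree, [-2.0, 5.0]))  # path of length 1
print(predict(tree, [3.0, 0.5]))   # path of length 2
```

In this sketch the prediction cost is bounded by the tree depth rather than by the number of support vectors, which is the property the abstract emphasizes.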
Dataset from chemical gas sensor array in turbulent wind tunnel
The dataset includes the acquired time series of a chemical detection platform exposed to different gas conditions in a turbulent wind tunnel. The chemo-sensory elements sampled the environment directly. In contrast to traditional approaches that include measurement chambers, open sampling systems are sensitive to dispersion mechanisms of gaseous chemical analytes, namely diffusion, turbulence, and advection, making the identification and monitoring of chemical substances more challenging. The sensing platform included 72 metal-oxide gas sensors that were positioned at 6 different locations of the wind tunnel. At each location, 10 distinct chemical gases were released in the wind tunnel, the sensors were evaluated at 5 different operating temperatures, and 3 different wind speeds were generated in the wind tunnel to induce different levels of turbulence. Moreover, each configuration was repeated 20 times, yielding a dataset of 18,000 measurements. The dataset was collected over a period of 16 months. The data is related to "On the performance of gas sensor arrays in open sampling systems using Inhibitory Support Vector Machines", by Vergara et al. [1]. The dataset can be accessed publicly at the UCI repository upon citation of [1]: http://archive.ics.uci.edu/ml/datasets/Gas+sensor+arrays+in+open+sampling+settings. This work has been supported by the California Institute for Telecommunications and Information Technology (CALIT2) under Grant number 2014 CSRO 136.
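As a quick check of the stated dataset size, the total can be reproduced by multiplying the experimental factors listed in the description. The dictionary keys below are illustrative names, not identifiers from the dataset itself.

```python
import math

# Factor counts taken from the dataset description; key names are illustrative.
factors = {
    "locations": 6,
    "gases": 10,
    "operating_temperatures": 5,
    "wind_speeds": 3,
    "repetitions": 20,
}

# Full factorial design: every combination of factors is one measurement.
total_measurements = math.prod(factors.values())
print(total_measurements)  # 18000
```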
Data set from chemical sensor array exposed to turbulent gas mixtures
A chemical detection platform composed of 8 chemo-resistive gas sensors was exposed to turbulent gas mixtures generated naturally in a wind tunnel. The acquired time series of the sensors are provided. The experimental setup was designed to test gas sensors in realistic environments. Traditionally, chemical detection systems based on chemo-resistive sensors include a gas chamber to control the sample air flow and minimize turbulence. Instead, we utilized a wind tunnel with two independent gas sources that generate two gas plumes. The plumes get naturally mixed along a turbulent flow and reproduce the gas concentration fluctuations observed in natural environments. Hence, the gas sensors can capture the spatio-temporal information contained in the gas plumes. The sensor array was exposed to binary mixtures of ethylene with either methane or carbon monoxide. Volatiles were released at four different rates to induce different concentration levels in the vicinity of the sensor array. Each configuration was repeated 6 times, for a total of 180 measurements. The data is related to "Chemical Discrimination in Turbulent Gas Mixtures with MOX Sensors Validated by Gas Chromatography-Mass Spectrometry", by Fonollosa et al. [1]. The dataset can be accessed publicly at the UCI repository upon citation of [1]: http://archive.ics.uci.edu/ml/datasets/Gas+senso+rarray+exposed+to+turbulent+gas+mixtures. This work has been supported by the California Institute for Telecommunications and Information Technology (CALIT2) under Grant Number 2014 CSRO 136.
Quadratic Programming Feature Selection
Identifying a subset of features that preserves classification accuracy is a problem of growing importance, because of the increasing size and dimensionality of real-world data sets. We propose a new feature selection method, named Quadratic Programming Feature Selection (QPFS), that reduces the task to a quadratic optimization problem. In order to limit the computational complexity of solving the optimization problem, QPFS uses the Nyström method for approximate matrix diagonalization. QPFS is thus capable of dealing with very large data sets, for which the use of other methods is computationally expensive. In experiments with small and medium data sets, the QPFS method leads to classification accuracy similar to that of other successful techniques. For large data sets, QPFS is superior in terms of computational efficiency. I.R.-L. is supported by an FPU grant from Universidad Autónoma de Madrid, and partially supported by the Universidad Autónoma de Madrid-IIC Chair. R.H. acknowledges partial support by ONR N00014-07-1-074
Analysis of pattern recognition and dimensionality reduction techniques for odor biometrics
In this paper, we analyze the performance of several well-known pattern recognition and dimensionality reduction techniques when applied to mass-spectrometry data for odor biometric identification. Motivated by the successful results of previous works capturing the odor from other parts of the body, this work attempts to evaluate the feasibility of identifying people by the odor emanated from the hands. By formulating this task according to a machine learning scheme, the problem is identified with a small-sample-size supervised classification problem in which the input data is formed by mass spectrograms from the hand odor of 13 subjects captured in different sessions. The high dimensionality of the data makes it necessary to apply feature selection and extraction techniques together with a simple classifier in order to improve the generalization capabilities of the model. Our experimental results achieve recognition rates over 85%, which reveals that there exists discriminatory information in the hand odor and points to body odor as a promising biometric identifier.
Socio-emotional variables and psychological well-being in older adults
The study of well-being is especially relevant in the case of older adults, who are at a stage of life in which the limitation of life in a quantitative sense is more evident, in which illness and dysfunction are more likely to increase, and in which the quality of the remaining years of life, and its promotion, is essential (Satorres, 2013). Psychological well-being is a broad concept that includes social, subjective, and psychological dimensions, as well as health-related behaviors in general, that lead people to function in a positive way. The term happiness is too ambitious; however, Individual Subjective Well-being (BIS) makes it possible to measure the degree of happiness or satisfaction that, in general terms, predominates in each person according to their own point of view. On the other hand, constructive thinking, in its different scales and facets, actually comprises socio-emotional variables that enable us to face the world and reality. The INDEPSI research group of the ULPGC has conducted a study to relate these two constructs (constructive thinking and individual subjective well-being) in a group aged between 57 and 87 years (n=96) enrolled in university programs for older adults, using the Emotional Constructive Thinking Inventory (Epstein, 2012) and the BIS-HERNAN questionnaire (Hernández, 1996 and 2000), which measures different aspects of happiness. The results indicate that the factors to which happiness and unhappiness are attributed are of a different nature; that there are significant differences (p < 0.05) between the appraisal of past and future happiness and that of present happiness; and that good emotional coping and low suspiciousness are significant predictors of happiness.