    Analysis of the Consistency of a Mixed Integer Programming-based Multi-Category Constrained Discriminant Model

    Classification is concerned with the development of rules for the allocation of observations to groups, and is a fundamental problem in machine learning. Much previous work on classification models investigates two-group discrimination. Multi-category classification is less often considered because generalizations of two-group models tend to produce misclassification rates that are higher than desirable. Indeed, producing “good” two-group classification rules is a challenging task for some applications, and producing good multi-category rules is generally more difficult. Additionally, even when the “optimal” classification rule is known, inter-group misclassification rates may be higher than tolerable for a given classification model. We investigate properties of a mixed-integer-programming-based multi-category classification model that allows for the pre-specification of limits on inter-group misclassification rates. The limits are satisfied through a reserved judgment region, an artificial category into which observations are placed when their attributes do not sufficiently indicate membership in any particular group. The method is shown to be a consistent estimator of a classification rule with misclassification limits, and performance on simulated and real-world data is demonstrated.
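
The reserved judgment idea in the abstract above can be illustrated with a much simpler rule than the paper's mixed-integer program. The sketch below is not the authors' model: it assumes group posterior probabilities are already available and uses a single hypothetical `threshold` parameter, placing an observation in the reserved judgment category (returned as `None`) when no group is sufficiently probable.

```python
def classify_with_reserved_judgment(posteriors, threshold):
    """Return the index of the most probable group, or None when no
    group's posterior reaches `threshold` (reserved judgment).

    `posteriors` is a list of group membership probabilities for one
    observation; `threshold` is an illustrative assumption, not a
    parameter taken from the paper.
    """
    # Find the group with the largest posterior probability.
    best = max(range(len(posteriors)), key=lambda g: posteriors[g])
    # Commit to that group only if the evidence is strong enough.
    return best if posteriors[best] >= threshold else None


# One clear-cut observation and one ambiguous observation.
clear = classify_with_reserved_judgment([0.7, 0.2, 0.1], threshold=0.6)
ambiguous = classify_with_reserved_judgment([0.4, 0.35, 0.25], threshold=0.6)
```

Raising the threshold sends more observations to reserved judgment and so lowers the number of outright misclassifications, which is the trade-off the abstract describes.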

    Machine Learning

    Machine learning can be defined in various ways, all relating to a scientific domain concerned with the design and development of theoretical and implementation tools for building systems with some human-like intelligent behavior. More specifically, machine learning addresses the ability to improve automatically through experience.

    Probabilistic Inference from Arbitrary Uncertainty using Mixtures of Factorized Generalized Gaussians

    This paper presents a general and efficient framework for probabilistic inference and learning from arbitrary uncertain information. It exploits the calculation properties of finite mixture models, conjugate families and factorization. Both the joint probability density of the variables and the likelihood function of the (objective or subjective) observation are approximated by a special mixture model, in such a way that any desired conditional distribution can be directly obtained without numerical integration. We have developed an extended version of the expectation maximization (EM) algorithm to estimate the parameters of mixture models from uncertain training examples (indirect observations). As a consequence, any piece of exact or uncertain information about both input and output values is consistently handled in the inference and learning stages. This ability, extremely useful in certain situations, is not found in most alternative methods. The proposed framework is formally justified from standard probabilistic principles and illustrative examples are provided in the fields of nonparametric pattern classification, nonlinear regression and pattern completion. Finally, experiments on a real application and comparative results over standard databases provide empirical evidence of the utility of the method in a wide range of applications

    Using CSW weight’s in UTASTAR method

    Several researchers have considered similarities between Multi-Criteria Decision Making (MCDM) and Data Envelopment Analysis (DEA) as tools for solving decision-making problems. Because the preferences of the decision maker (DM) on alternatives are not considered in classical DEA, some researchers have tried to incorporate them into DEA. The UTA-STAR method is one of the techniques widely used in Multi-Criteria Decision Analysis. In this technique, the preferences of the decision maker on alternatives are considered, and UTA-STAR tries to compute the most suitable weights for criteria and alternatives to obtain a utility function with minimum deviation from the preferences. The goal of this paper is to interpret the decision maker’s preferences in the UTA-STAR method in a new manner, using the common set of weights (CSW) in DEA.

    Eye detection using discriminatory features and an efficient support vector machine

    Accurate and efficient eye detection has broad applications in computer vision, machine learning, and pattern recognition. This dissertation presents a number of accurate and efficient eye detection methods using various discriminatory features and a new efficient Support Vector Machine (eSVM). This dissertation first introduces five popular image representation methods - the gray-scale image representation, the color image representation, the 2D Haar wavelet image representation, the Histograms of Oriented Gradients (HOG) image representation, and the Local Binary Patterns (LBP) image representation - and then applies these methods to derive five types of discriminatory features. Comparative assessments are then presented to evaluate the performance of these discriminatory features on the problem of eye detection. This dissertation further proposes two discriminatory feature extraction (DFE) methods for eye detection. The first DFE method, discriminant component analysis (DCA), improves upon the popular principal component analysis (PCA) method. The PCA method can derive the optimal features for data representation but not for classification. In contrast, the DCA method, which applies a new criterion vector that is defined on two novel measure vectors, derives the optimal discriminatory features in the whitened PCA space for two-class classification problems. The second DFE method, clustering-based discriminant analysis (CDA), improves upon the popular Fisher linear discriminant (FLD) method. A major disadvantage of the FLD is that it may not be able to extract adequate features in order to achieve satisfactory performance, especially for two-class problems. To address this problem, three CDA models (CDA-1, -2, and -3) are proposed by taking advantage of the clustering technique. For every CDA model, a new between-cluster scatter matrix is defined. The CDA method thus can derive adequate features to achieve satisfactory performance for eye detection. Furthermore, the clustering nature of the three CDA models and the nonparametric nature of the CDA-2 and -3 models can further improve detection performance over the conventional FLD method. This dissertation finally presents a new efficient Support Vector Machine (eSVM) for eye detection that improves the computational efficiency of the conventional Support Vector Machine (SVM). The eSVM first defines a Θ set that consists of the training samples on the wrong side of their margin derived from the conventional soft-margin SVM. The Θ set plays an important role in controlling the generalization performance of the eSVM. The eSVM then introduces only a single slack variable for all the training samples in the Θ set, and as a result, only a very small number of those samples in the Θ set become support vectors. The eSVM hence significantly reduces the number of support vectors and improves the computational efficiency without sacrificing the generalization performance. A modified Sequential Minimal Optimization (SMO) algorithm is then presented to solve the large Quadratic Programming (QP) problem defined in the optimization of the eSVM. Three large-scale face databases, the Face Recognition Grand Challenge (FRGC) version 2 database, the BioID database, and the FERET database, are used to evaluate the proposed eye detection methods. Experimental results show the effectiveness of the proposed methods, which improve upon some state-of-the-art eye detection methods.

    Identifying Mislabeled Training Data

    This paper presents a new approach to identifying and eliminating mislabeled training instances for supervised learning. The goal of this approach is to improve classification accuracies produced by learning algorithms by improving the quality of the training data. Our approach uses a set of learning algorithms to create classifiers that serve as noise filters for the training data. We evaluate single-algorithm, majority-vote, and consensus filters on five datasets that are prone to labeling errors. Our experiments illustrate that filtering significantly improves classification accuracy for noise levels up to 30 percent. An analytical and empirical evaluation of the precision of our approach shows that consensus filters are conservative at throwing away good data at the expense of retaining bad data and that majority filters are better at detecting bad data at the expense of throwing away good data. This suggests that for situations in which there is a paucity of data, consensus filters are preferable, whereas majority-vote filters are preferable for situations with an abundance of data.
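
The difference between the majority-vote and consensus filters described above can be sketched in a few lines. This is a minimal illustration rather than the paper's implementation: it assumes each filter classifier's predictions were already produced for instances it was not trained on, and the function name `filter_mislabeled` and its arguments are hypothetical.

```python
def filter_mislabeled(labels, predictions, mode="majority"):
    """Return indices of training instances flagged as mislabeled.

    labels:      the given (possibly noisy) label of each instance
    predictions: one list of predicted labels per filter classifier,
                 each aligned with `labels` (assumed held-out predictions)
    mode:        "majority" flags an instance when more than half of the
                 classifiers disagree with its label; "consensus" flags
                 it only when every classifier disagrees
    """
    n_classifiers = len(predictions)
    flagged = []
    for i, label in enumerate(labels):
        # Count how many classifiers contradict the given label.
        errors = sum(1 for preds in predictions if preds[i] != label)
        if mode == "consensus":
            is_noise = errors == n_classifiers
        elif mode == "majority":
            is_noise = errors > n_classifiers / 2
        else:
            raise ValueError(f"unknown mode: {mode!r}")
        if is_noise:
            flagged.append(i)
    return flagged


# Toy data: four instances, three filter classifiers.
labels = ["a", "b", "a", "b"]
predictions = [
    ["a", "b", "b", "a"],
    ["a", "b", "b", "b"],
    ["a", "a", "b", "a"],
]
majority_flags = filter_mislabeled(labels, predictions, mode="majority")
consensus_flags = filter_mislabeled(labels, predictions, mode="consensus")
```

On this toy data the consensus filter flags a subset of what the majority filter flags, which matches the abstract's observation that consensus filtering is the more conservative of the two.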

    Para-Proxemic Attributions: an Investigation Into the Relationship Between Close-Up and Extreme Close-Up Camera Shots and Audience Response.

    The purpose of this study was to determine if there were differences in para-proxemic attributions (effectations based upon the relative distance of a media source) to the extreme close-up as opposed to the close-up camera shots. Differences in audience response by sex of subject were found. Two stimuli were simultaneously videotaped of a man making an informative speech. The first tape was composed of establishing shots and extreme close-up shots. The second tape was composed of establishing shots and close-up shots. The establishing shots were constant in both tapes. In the first tape, a cut from the establishing shot to the extreme close-up shot would electronically trigger a cut in the second tape from the establishing shot to the close-up shot. Because of the baseline nature of research in para-proxemic attributions and the lack of a valid and reliable instrument for use as the dependent measure, a pilot study was run. After viewing one of the two treatments, subjects responded to a revised version of the McCroskey and Jenson instrument for the measurement of perceived image of mass media news sources. Subjects' responses were subjected to image factor analysis. This analysis yielded a three-factor structure for the male subjects and a four-factor structure for the female subjects. A subsequent treatment condition with a new subject population yielded a factor pattern almost identical to that in the pilot study. Three factors emerged for the male subjects and four factors emerged for the female respondents. It was determined that the different factor structures showed a difference in subjects' attributions toward the stimulus based upon the independent variable of sex of respondent. Multiple discriminant analysis was then run to determine if the sex-specific instruments could differentiate subjects' responses by treatment condition. Results of those analyses showed that the sex-specific instruments could correctly classify subjects by para-proxemic treatment condition at rates upwards of 63% in every condition except the male extreme close-up condition. The lack of linearity of responses in this condition was explained as a result of a response ambiguity for males in an invading situation. Further research was suggested to determine which specific items were responded to differently by treatment conditions. Additionally, a different stimulus needed to be designed specific to new situations, and other camera shots tested in varying combinations.

    Autonomous Eye Tracking in Octopus bimaculoides

    The importance of the position of cephalopods, and particularly octopuses, as the most intelligent group of invertebrates is becoming increasingly appreciated by the neuroscience research community. Cephalopods are the most distantly related species to humans that possess advanced cognitive abilities; as their intelligence evolved independently from vertebrates, comparative analyses reveal trends in the evolution of nervous systems and the foundations of intelligence itself. Vision is an especially important area of cephalopod cognition to research because cephalopods are predominantly visual creatures, like humans, and the rapid transduction of visual signals allows the inner workings of octopus cognition to be revealed in real time. While octopuses can be conditioned to indicate what they see through responses to conditioned visual stimuli, no system as of yet provides a non-invasive means of determining what an octopus is looking at without training. This thesis introduces an automated methodological framework to predict the direction of an octopus's gaze for use in visual cognition research. The system utilizes deep learning models to track the eyes of octopuses, then predicts where an octopus is looking based on the orientation of its eyes and known anatomical traits that constrain where its vision could be directed. Data could not be collected this spring to train a model and test the tool in the experimental setting the system utilizes; however, analyses conducted on data not intended for this project suggest the approach is feasible for estimating an octopus's gaze and offer insights into how to do so most effectively.