156 research outputs found

    Algorithms for Multiclass Classification and Regularized Regression

    Multiclass classification and regularized regression problems are very common in modern statistical and machine learning applications. On the one hand, multiclass classification problems require the prediction of class labels: given observations of objects that belong to certain classes, can we predict to which class a new object belongs? On the other hand, the reg

    Classification and Clustering Using Intelligent Techniques: Application to Microarray Cancer Data

    Analysis and interpretation of DNA microarray data is a fundamental task in bioinformatics. Feature extraction plays a critical role in improving classifier performance. We address the dimension reduction of DNA features, in which relevant features are extracted from among thousands of irrelevant ones. This enhances the speed and accuracy of the classifiers. Principal Component Analysis is a feature extraction technique that helps to retrieve intrinsic information from high-dimensional data in eigenspaces and so addresses the curse of dimensionality. Neural Networks and Support Vector Machines are implemented on the reduced data set, and their performance is measured in terms of predictive accuracy, specificity, and sensitivity. Next, we propose a Multiobjective Genetic Algorithm-based fuzzy clustering technique using real-coded encoding of cluster centers for clustering and classification. This technique is implemented on microarray cancer data to select training data using a multiobjective genetic algorithm with non-dominated sorting. The two objective functions of this multiobjective technique are cluster compactness and cluster separation, which are optimized simultaneously. This approach identifies the solution. A Support Vector Machine classifier is then trained on the selected training points, which have high confidence values. The remaining points are then classified by the trained SVM classifier. Finally, the four clustering label vectors are combined through a majority voting ensemble. The performance of the proposed MOGA-SVM classification and clustering method has been compared to MOGA-BP, SVM, and BP. Performance is measured in terms of the Silhouette Index and the Adjusted Rand Index (ARI). The experiments were carried out on three public-domain cancer data sets, viz., Ovarian, Colon and Leukemia cancer

    Kernel Methods for Machine Learning with Life Science Applications

    Extensions of Classification Method Based on Quantiles

    This thesis deals with the problem of classification in general, with a particular focus on heavy-tailed or skewed data. The classification problem is first formalized by statistical learning theory and several important classification methods are reviewed, where the distance-based classifiers, including the median-based classifier and the quantile-based classifier (QC), are especially useful for heavy-tailed or skewed inputs. However, QC is limited by its model capacity and the issue of high-dimensional accumulated errors. The objective of this study is to investigate more general methods while retaining the merits of QC. We present four extensions of QC, which appear in chronological order and preserve the ideas driving our research. The first extension, ensemble quantile classifier (EQC), treats QC as a base learner in ensemble learning to increase model capacity and introduces weight decay regularization to mitigate high-dimensional accumulated errors. The second extension, multiple quantile classifier (MQC), enhances the model capacity of EQC by allowing multiple quantile-difference transformations to be conducted for each variable. The third extension, factorized multiple quantile classifier (FMQC), adds higher-order interactions to MQC via a computationally efficient approach of adaptive factorization machines. The fourth extension, deep multiple quantile classifier (DeepMQC), embeds the MQC into the flexible framework of deep neural networks and opens more possibilities of applications to various tasks. We discuss the theoretical motivation for each method. Numerical studies on synthetic and real datasets are used to demonstrate the improvement of the proposed methods
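
The abstract above names the quantile-based classifier (QC) without spelling it out. As a rough illustration only, the sketch below assumes the common formulation in which each class is summarized by its per-variable θ-quantiles and a new point is assigned to the class with the smallest summed check-loss distance. The function names, the linear-interpolation quantile, and the toy data are all assumptions made for this example and are not taken from the thesis.

```python
import math


def check_loss(u, theta):
    """Quantile (check) loss: theta*u for u >= 0, (theta - 1)*u for u < 0."""
    return u * (theta - (1.0 if u < 0 else 0.0))


def empirical_quantile(values, theta):
    """Empirical theta-quantile, interpolating linearly between order statistics."""
    xs = sorted(values)
    pos = theta * (len(xs) - 1)
    lo, hi = math.floor(pos), math.ceil(pos)
    return xs[lo] + (pos - lo) * (xs[hi] - xs[lo])


def fit_quantile_classifier(X, y, theta):
    """Return {label: [theta-quantile of each variable within that class]}."""
    quantiles = {}
    for label in sorted(set(y)):
        rows = [x for x, lab in zip(X, y) if lab == label]
        quantiles[label] = [empirical_quantile(col, theta) for col in zip(*rows)]
    return quantiles


def predict(quantiles, x, theta):
    """Assign x to the class whose quantile vector is closest in check loss."""
    def distance(label):
        return sum(check_loss(xj - qj, theta)
                   for xj, qj in zip(x, quantiles[label]))
    return min(quantiles, key=distance)


# Toy one-dimensional data (hypothetical): class "a" near 1, class "b" near 11.
X_train = [[0.0], [1.0], [2.0], [10.0], [11.0], [12.0]]
y_train = ["a", "a", "a", "b", "b", "b"]
model = fit_quantile_classifier(X_train, y_train, theta=0.5)
label_low = predict(model, [3.0], theta=0.5)
label_high = predict(model, [9.0], theta=0.5)
```

With θ = 0.5 this reduces to a median-based classifier; other values of θ change which part of each class distribution is compared, which is one way such a rule can adapt to skewed inputs.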

    IST Austria Thesis

    The human ability to recognize objects in complex scenes has driven research in the computer vision field over the past couple of decades. This thesis focuses on the object recognition task in images. That is, given an image, we want the computer system to be able to predict the class of the object that appears in it. A recent successful attempt to bridge the semantic understanding of an image as perceived by humans and by computers uses attribute-based models. Attributes are semantic properties of objects shared across different categories, which humans and computers can decide on. To explore attribute-based models we take a statistical machine learning approach, and address two key learning challenges in view of the object recognition task: learning augmented attributes as a mid-level discriminative feature representation, and learning with attributes as privileged information. Our main contributions are parametric and non-parametric models and algorithms to solve these frameworks. In the parametric approach, we explore an autoencoder model combined with the large margin nearest neighbor principle for mid-level feature learning, and linear support vector machines for learning with privileged information. In the non-parametric approach, we propose a supervised Indian Buffet Process for automatic augmentation of semantic attributes, and explore the Gaussian Processes classification framework for learning with privileged information. A thorough experimental analysis shows the effectiveness of the proposed models in both parametric and non-parametric views

    MODELING OF PRIMARY REFORMER TUBE METAL TEMPERATURE (TMT) USING LS-SVM

    This report discusses the research done on the chosen topic, which is Modeling of Primary Reformer Tube Metal Temperature (TMT) using LS-SVM. The objective of the project is to develop a model that can predict the temperature of the reformer tubes. The scope of the study focused on the modeling of the primary reformer TMT of PETRONAS Ammonia hydrocarbons such as natural gas into its constituents which are carbon dioxide, carbon monoxide and hydrogen. Pressurized feed (300barg) of hydrocarbon and steam is fed into the reformer tubes and heated by the burners at about 800-1000°C to facilitate the hydrocarbon conversion. The temperature of the tubes is an important parameter in determining the lifetime of the tubes. Operating the reformer beyond the TMT design limits can cause premature tube failures, which lead to production losses and higher downtime. The literature survey shows that mathematical modeling and simulation approaches are used to determine the behavior of the reformer tubes. For this project, an empirical model developed by integrating the process variables will be used to predict the reformer tube temperature. The empirical model is developed based on real-time data obtained from the PASB plant. LS-SVM is used in developing the model, and a Back Propagation Neural Network is used to develop a model that serves as the benchmark for this project
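
For readers unfamiliar with LS-SVM regression, the sketch below shows the standard formulation, in which training reduces to solving one linear system for a bias and a vector of dual weights. It is not the report's model: the RBF kernel, the hyperparameters `gamma` and `sigma`, the small Gaussian-elimination solver, and the toy data are assumptions chosen only to keep the example self-contained.

```python
import math


def rbf_kernel(a, b, sigma):
    """Gaussian (RBF) kernel between two feature vectors."""
    sq = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-sq / (2.0 * sigma ** 2))


def solve_linear_system(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


def fit_lssvm(X, y, gamma, sigma):
    """Solve [[0, 1^T], [1, K + I/gamma]] [bias; alphas] = [0; y]."""
    n = len(X)
    A = [[0.0] + [1.0] * n]
    for i in range(n):
        A.append([1.0] + [
            rbf_kernel(X[i], X[j], sigma) + (1.0 / gamma if i == j else 0.0)
            for j in range(n)
        ])
    solution = solve_linear_system(A, [0.0] + list(y))
    return solution[0], solution[1:]


def predict_lssvm(X_train, bias, alphas, x, sigma):
    """Prediction: bias + sum_i alpha_i * k(x, x_i)."""
    return bias + sum(a * rbf_kernel(x, xi, sigma)
                      for a, xi in zip(alphas, X_train))


# Hypothetical toy data standing in for (process variable, temperature) pairs.
X_train = [[0.0], [1.0], [2.0], [3.0]]
y_train = [0.0, 0.8, 0.9, 0.1]
gamma, sigma = 10.0, 1.0
bias, alphas = fit_lssvm(X_train, y_train, gamma, sigma)
fitted = [predict_lssvm(X_train, bias, alphas, x, sigma) for x in X_train]
```

Because LS-SVM replaces the inequality constraints of a standard SVM with equality constraints, fitting needs no quadratic programming solver. A larger `gamma` penalizes training errors more heavily, so the fitted values track the training targets more closely.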