1,089 research outputs found

    Learning to Select Pre-Trained Deep Representations with Bayesian Evidence Framework

    Full text link
    We propose a Bayesian evidence framework to facilitate transfer learning from pre-trained deep convolutional neural networks (CNNs). Our framework is formulated on top of a least squares SVM (LS-SVM) classifier, which is simple and fast in both training and testing, and achieves competitive performance in practice. The regularization parameters of the LS-SVM are estimated automatically, without grid search or cross-validation, by maximizing the evidence, which is also a useful measure for selecting the best-performing CNN out of multiple candidates for transfer learning; the evidence is optimized efficiently by employing Aitken's delta-squared process, which accelerates the convergence of the fixed-point update. The proposed Bayesian evidence framework also provides a good solution for identifying the best ensemble of heterogeneous CNNs through a greedy algorithm. Our Bayesian evidence framework for transfer learning is tested on 12 visual recognition datasets and consistently demonstrates state-of-the-art performance in terms of both prediction accuracy and modeling efficiency. Comment: Appearing in CVPR-2016 (oral presentation).
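The evidence maximization above relies on Aitken's delta-squared process to accelerate a fixed-point update. A minimal sketch of that acceleration on a generic one-dimensional fixed-point iteration (the classic x = cos x toy problem is used here for illustration; it is not the paper's actual evidence update):

```python
import math

def aitken_accelerate(g, x0, tol=1e-10, max_iter=100):
    """Solve x = g(x) with Aitken's delta-squared (Steffensen) acceleration."""
    x = x0
    for _ in range(max_iter):
        x1 = g(x)
        x2 = g(x1)
        denom = x2 - 2.0 * x1 + x
        if abs(denom) < 1e-15:
            return x2  # iteration has effectively converged
        x_acc = x - (x1 - x) ** 2 / denom  # Aitken extrapolation step
        if abs(x_acc - x) < tol:
            return x_acc
        x = x_acc
    return x

# Plain iteration of x = cos(x) converges linearly; the accelerated
# version reaches the fixed point (~0.739085) in a handful of steps.
root = aitken_accelerate(math.cos, 1.0)
```

The same pattern applies to any contractive scalar update, which is why it is a natural fit for speeding up evidence-style hyperparameter re-estimation loops.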

    Hybrid Models with Deep and Invertible Features

    Full text link
    We propose a neural hybrid model consisting of a linear model defined on a set of features computed by a deep, invertible transformation (i.e., a normalizing flow). An attractive property of our model is that both p(features), the density of the features, and p(targets | features), the predictive distribution, can be computed exactly in a single feed-forward pass. We show that our hybrid model, despite the invertibility constraints, achieves accuracy similar to purely predictive models. Moreover, the generative component remains a good model of the input features despite the hybrid optimization objective. This offers additional capabilities such as detecting out-of-distribution inputs and enabling semi-supervised learning. The availability of the exact joint density p(targets, features) also allows many quantities to be computed readily, making our hybrid model a useful building block for downstream applications of probabilistic deep learning. Comment: ICML 2019.
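The single-pass joint density the abstract describes can be illustrated with a deliberately tiny stand-in: a 1-D affine "flow" with a standard-normal base density and a logistic head on the flow features. Every parameter and transform below is a hypothetical choice for illustration, not the paper's architecture:

```python
import numpy as np

# Hypothetical invertible 1-D "flow": f(x) = a*x + b, base density N(0, 1).
a, b = 2.0, -1.0

def log_p_features(x):
    """log p(x) via the change-of-variables formula for the affine flow."""
    z = a * x + b                                  # forward pass of the flow
    log_base = -0.5 * (z ** 2 + np.log(2 * np.pi))  # log N(z; 0, 1)
    return log_base + np.log(abs(a))                # log |det df/dx|

def log_p_targets_given_features(x, w=1.5, c=0.0):
    """Linear (logistic) predictive head on the flow features: log p(y=1 | x)."""
    logit = w * (a * x + b) + c
    return -np.log1p(np.exp(-logit))                # log sigmoid(logit)

def log_joint(x):
    """Exact joint: log p(y=1, x) = log p(y=1 | x) + log p(x), one pass."""
    return log_p_targets_given_features(x) + log_p_features(x)
```

Both terms reuse the same forward computation z = a*x + b, which is the point of the hybrid construction: density and prediction come out of one feed-forward pass.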

    Agnostic Bayes

    Get PDF
    Tableau d'honneur (Dean's honour list) of the Faculté des études supérieures et postdoctorales, 2014-2015. Machine learning is the science of learning from examples. Algorithms based on this approach are now ubiquitous. While there has been significant progress, the field still presents important challenges. For example, simply selecting the function that best fits the observed data was shown to offer no statistical guarantee on examples that have not yet been observed. A few learning theories suggest how to address this problem. Among these, we present the Bayesian modeling of machine learning and the PAC-Bayesian approach to machine learning in a unified view to highlight important similarities. The outcome of this analysis suggests that model averaging is one of the key elements for obtaining good generalization performance: one should base predictions on the outcome of every model instead of only the one that best fits the observed data. Unfortunately, this approach comes with a high computational cost, and finding good approximations is the subject of active research. In this thesis, we present a novel approach that can be applied with a low computational cost on a wide range of machine learning setups. To achieve this, we apply Bayes' theorem in a different way than is conventionally done in machine learning. Specifically, instead of searching for the true model at the origin of the observed data, we search for the best model according to a given metric.
While the difference seems subtle, in this approach we do not assume that the true model belongs to the set of explored models. Hence, we say that we are agnostic. An extensive experimental setup shows a significant generalization performance gain when using this model averaging approach during the cross-validation phase. Moreover, this algorithm is simple to implement and does not add a significant computational cost to the conventional search of hyperparameters. Finally, this probabilistic tool can also be used as a statistical significance test to evaluate the quality of learning algorithms on multiple datasets.
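One common realization of this "best model under a given metric" idea is to estimate, by bootstrap resampling of the validation set, the probability that each candidate model is the best one, and to use those probabilities as model averaging weights. A minimal sketch under that assumption (the loss matrix, shift, and resampling counts below are illustrative, not from the thesis):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical per-example validation losses: rows = examples, cols = models.
losses = rng.random((200, 3))
losses[:, 1] -= 0.15  # make model 1 clearly better on average

def agnostic_bayes_weights(losses, n_bootstrap=1000, rng=rng):
    """Estimate P(model m is the best) by bootstrapping the validation set."""
    n, m = losses.shape
    wins = np.zeros(m)
    for _ in range(n_bootstrap):
        idx = rng.integers(0, n, size=n)      # bootstrap resample of examples
        mean_loss = losses[idx].mean(axis=0)  # each model's loss on the resample
        wins[np.argmin(mean_loss)] += 1       # credit the winning model
    return wins / n_bootstrap                 # ensemble weights, summing to 1

weights = agnostic_bayes_weights(losses)
```

Predictions are then a weighted vote of all candidate models, which adds almost nothing on top of a hyperparameter search that already produced the per-example validation losses.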

    Land use/land cover classification of fused Sentinel-1 and Sentinel-2 imageries using ensembles of Random Forests

    Full text link
    The study explores the synergistic combination of Synthetic Aperture Radar (SAR) and Visible-Near Infrared-Short Wave Infrared (VNIR-SWIR) imagery for land use/land cover (LULC) classification. Bayesian image fusion merges SAR texture bands with VNIR-SWIR imagery, and the research investigates the impact of this fusion on LULC classification. Despite the popularity of random forests for supervised classification, they have limitations, such as suboptimal performance with fewer features and accuracy stagnation. To overcome these issues, ensembles of random forests (RFE) are created by introducing random rotations using the Forest-RC algorithm. Three rotation approaches are employed: principal component analysis (PCA), a sparse random rotation (SRP) matrix, and a complete random rotation (CRP) matrix. Sentinel-1 SAR data and Sentinel-2 VNIR-SWIR data from the IIT-Kanpur region constitute the training datasets: SAR, SAR with texture, VNIR-SWIR, VNIR-SWIR with texture, and fused VNIR-SWIR with texture. The study evaluates classifier efficacy, explores the impact of SAR and VNIR-SWIR fusion on classification, and significantly enhances the execution speed of the Bayesian fusion code. The SRP-based RFE outperforms the other ensembles on the first two datasets, yielding average overall kappa values of 61.80% and 68.18%, while the CRP-based RFE excels on the last three datasets with average overall kappa values of 95.99%, 96.93%, and 96.30%. The fourth dataset achieves the highest overall kappa of 96.93%. Furthermore, incorporating texture with the SAR bands results in a maximum overall kappa increment of 10.00%, while adding texture to the VNIR-SWIR bands yields a maximum increment of approximately 3.45%. Comment: Thesis for Master of Technology. Created: July 2018. Total pages 12
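The complete random rotation (CRP) variant can be sketched by drawing a random orthogonal matrix and rotating the feature space seen by each forest in the ensemble. A minimal NumPy sketch (the QR-based sampler, dimensions, and ensemble size are illustrative assumptions; the actual Forest-RC pipeline is not reproduced here):

```python
import numpy as np

rng = np.random.default_rng(42)

def complete_random_rotation(d, rng):
    """Draw a random d x d rotation matrix via QR of a Gaussian matrix."""
    q, r = np.linalg.qr(rng.normal(size=(d, d)))
    q *= np.sign(np.diag(r))   # sign fix so the orthogonal draw is uniform
    if np.linalg.det(q) < 0:   # flip one axis to get a proper rotation (det = +1)
        q[:, 0] = -q[:, 0]
    return q

# Each ensemble member would be trained on its own rotated view X @ R_i,
# decorrelating the axis-aligned splits of the individual forests.
d = 5
rotations = [complete_random_rotation(d, rng) for _ in range(10)]
X = rng.normal(size=(100, d))
rotated_views = [X @ R for R in rotations]
```

Because each R is orthogonal, distances and densities are preserved; only the split axes available to each forest change, which is what gives the rotated ensemble its diversity.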

    Evolving Ensembles with TPOT

    Get PDF
    Dissertation presented as a partial requirement for obtaining a Master's degree in Data Science and Advanced Analytics, specialization in Data Science. Machine learning has become popular in recent years as a solution to various problems such as fraud detection, weather prediction, improving diagnostic accuracy, and more. One of its goals is to find the model that best explains the problem. Among the several ways to accomplish that, significant attention has been paid to improving accuracy using stacking ensembles: the objective is to produce a more accurate prediction by combining the predictions of various estimators. This model has often exhibited performance superior to its single counterparts. Because choosing the best model for a given problem can be time-consuming, a need to automate the machine learning process has emerged. Different tools allow this, including TPOT, a Python library that uses genetic programming to optimize the machine learning process, evolving randomly created pipelines until the best one is found or a previously fixed maximum number of generations is reached. Genetic programming is a field of machine learning that uses evolutionary algorithms to generate new computer programs, and it has been shown successful in quite a few applications. TPOT uses several machine learning algorithms from the Sklearn Python library, including some ensembles such as Random Forest and AdaBoost. Stacking ensembles, however, are not yet implemented in TPOT, and, considering their accuracy rates, the objective of this thesis is to implement stacking ensembles in TPOT. After implementing stacking ensembles successfully in TPOT, we performed experiments with different datasets and observed that, for almost all of them, plain TPOT has performance comparable to TPOT with stacking ensembles. We also observed that, when using the light dictionary version of TPOT, the results of the Stacking configuration improved for two datasets, since it used weaker learners.
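Stacking as described above, level-0 base learners whose out-of-fold predictions feed a level-1 meta-learner, can be sketched without TPOT or Sklearn. The two toy base learners and the least-squares meta-learner below are hypothetical stand-ins for the estimators TPOT would normally evolve:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy binary task: the label is 1 when the feature sum is positive.
X = rng.normal(size=(300, 4))
y = (X.sum(axis=1) > 0).astype(float)

def base_a(Xtr, ytr, Xte):
    """Base learner 1: nearest class centroid."""
    c0, c1 = Xtr[ytr == 0].mean(axis=0), Xtr[ytr == 1].mean(axis=0)
    closer_to_c1 = np.linalg.norm(Xte - c1, axis=1) < np.linalg.norm(Xte - c0, axis=1)
    return closer_to_c1.astype(float)

def base_b(Xtr, ytr, Xte):
    """Base learner 2: sign of the first feature (a deliberately weak learner)."""
    return (Xte[:, 0] > 0).astype(float)

# Level 0: out-of-fold predictions, so the meta-learner never sees leakage.
k = 5
folds = np.array_split(np.arange(len(X)), k)
meta_X = np.zeros((len(X), 2))
for te in folds:
    tr = np.setdiff1d(np.arange(len(X)), te)
    meta_X[te, 0] = base_a(X[tr], y[tr], X[te])
    meta_X[te, 1] = base_b(X[tr], y[tr], X[te])

# Level 1: a least-squares meta-learner combines the stacked predictions.
A = np.column_stack([meta_X, np.ones(len(X))])
w, *_ = np.linalg.lstsq(A, y, rcond=None)
stacked_pred = (A @ w > 0.5).astype(float)
accuracy = (stacked_pred == y).mean()
```

The out-of-fold step is the part that distinguishes stacking from simply averaging fitted models: the meta-learner is trained only on predictions made for unseen examples.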

    Predicting software faults in large space systems using machine learning techniques

    Get PDF
    Recently, the use of machine learning (ML) algorithms has proven to be of great practical value in solving a variety of engineering problems, including the prediction of failure, fault, and defect-proneness, as space system software becomes more complex. One of the most active areas of recent research in ML has been the use of ensemble classifiers. We show how ML techniques (or classifiers) can be used to predict software faults in space systems, including many aerospace systems, and further how ensembling individual classifiers, by having them vote for the most popular class, improves the prediction of software fault-proneness. Benchmarking results on four NASA public datasets show the Naive Bayes classifier to be the more robust software fault predictor, while most ensembles with a decision tree classifier as one of their components achieve higher accuracy rates.
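The combination rule described, having individual classifiers vote for the most popular class, can be sketched in a few lines. The classifier names and label strings below are illustrative, not from the paper's benchmark:

```python
from collections import Counter

def majority_vote(predictions):
    """Combine class labels from several classifiers by majority vote.

    `predictions` is a list of per-classifier label lists; returns one
    label per example (ties broken by first-seen label).
    """
    combined = []
    for labels in zip(*predictions):          # one column per example
        combined.append(Counter(labels).most_common(1)[0][0])
    return combined

# Three hypothetical fault-proneness classifiers voting per software module:
votes = [
    ["faulty", "clean", "clean", "faulty"],   # e.g. Naive Bayes
    ["faulty", "clean", "faulty", "clean"],   # e.g. decision tree
    ["faulty", "faulty", "clean", "clean"],   # e.g. k-NN
]
print(majority_vote(votes))  # -> ['faulty', 'clean', 'clean', 'clean']
```

With an odd number of voters the ensemble's error drops whenever the members are better than chance and make sufficiently independent mistakes, which is the intuition behind the higher accuracy of the decision-tree ensembles reported above.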
    • 
