2,181 research outputs found

    A hybrid algorithm for Bayesian network structure learning with application to multi-label learning

    We present a novel hybrid algorithm for Bayesian network structure learning, called H2PC. It first reconstructs the skeleton of a Bayesian network and then performs a Bayesian-scoring greedy hill-climbing search to orient the edges. The algorithm is based on divide-and-conquer constraint-based subroutines to learn the local structure around a target variable. We conduct two series of experimental comparisons of H2PC against Max-Min Hill-Climbing (MMHC), which is currently the most powerful state-of-the-art algorithm for Bayesian network structure learning. First, we use eight well-known Bayesian network benchmarks with various data sizes to assess the quality of the learned structure returned by the algorithms. Our extensive experiments show that H2PC outperforms MMHC in terms of goodness of fit to new data and quality of the network structure with respect to the true dependence structure of the data. Second, we investigate H2PC's ability to solve the multi-label learning problem. We provide theoretical results to characterize and identify graphically the so-called minimal label powersets that appear as irreducible factors in the joint distribution under the faithfulness condition. The multi-label learning problem is then decomposed into a series of multi-class classification problems, where each multi-class variable encodes a label powerset. H2PC is shown to compare favorably to MMHC in terms of global classification accuracy over ten multi-label data sets covering different application domains. Overall, our experiments support the conclusions that local structural learning with H2PC in the form of local neighborhood induction is a theoretically well-motivated and empirically effective learning framework that is well suited to multi-label learning. The source code (in R) of H2PC as well as all data sets used for the empirical tests are publicly available. Comment: arXiv admin note: text overlap with arXiv:1101.5184 by other author
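The decomposition of a multi-label problem into multi-class problems, as described in the abstract above, can be illustrated with a small sketch. It is not the authors' H2PC code (which is in R): the function name `encode_label_powersets` is hypothetical, and the partition of labels into groups is assumed to be given rather than learned from a Bayesian network.

```python
from typing import Dict, List, Sequence, Tuple


def encode_label_powersets(
    Y: Sequence[Sequence[int]],
    groups: Sequence[Sequence[int]],
) -> Tuple[List[List[int]], List[Dict[Tuple[int, ...], int]]]:
    """Turn binary label rows into one multi-class variable per label group.

    Y      : rows of 0/1 labels, one row per sample.
    groups : label indices forming each powerset (assumed given here).
    Returns the encoded rows and, per group, the mapping from each observed
    label combination to its class index.
    """
    codebooks: List[Dict[Tuple[int, ...], int]] = [dict() for _ in groups]
    encoded: List[List[int]] = []
    for row in Y:
        classes = []
        for g, idxs in enumerate(groups):
            combo = tuple(row[j] for j in idxs)
            # A combination seen for the first time gets the next free class index.
            cls = codebooks[g].setdefault(combo, len(codebooks[g]))
            classes.append(cls)
        encoded.append(classes)
    return encoded, codebooks
```

Each column of the encoded output can then be passed to any multi-class classifier. Only label combinations that actually occur in the data receive a class index.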

    Two stage Decision Tree Learning from Multi-class Imbalanced Tweets for Knowledge Discovery

    Data mining is an efficient technique for knowledge discovery from existing databases. The performance of existing algorithms degrades when they are applied to multi-class imbalanced datasets. The imbalanced nature of Twitter datasets also hinders efficient knowledge discovery. In this paper, we propose an efficient learning approach for knowledge discovery from multi-class imbalanced datasets, specifically designed for opinion mining. The proposed Under Sampled Imbalance Decision tree Learning (USIDL) approach decomposes the multi-class data into a number of binary-class samples and then applies a unique technique for under-sampling the instances from the majority subset of each binary sample. The experimental results suggest that the proposed technique performs better than the existing C4.5 algorithm on six evaluation metrics.
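The two stages named in the abstract above (binary decomposition, then under-sampling of the majority subset) can be sketched as follows. The abstract does not specify the decomposition scheme or the under-sampling technique, so this sketch assumes a one-vs-rest split and plain random under-sampling as stand-ins. The function name `binary_undersampled_subsets` is hypothetical.

```python
import random
from typing import Dict, Hashable, List, Sequence, Tuple


def binary_undersampled_subsets(
    X: Sequence, y: Sequence[Hashable], seed: int = 0
) -> Dict[Hashable, Tuple[List, List[int]]]:
    """Build one balanced binary dataset per class (one-vs-rest).

    Assumption: random under-sampling replaces the paper's unspecified
    technique. Labels in each subset are 1 for the target class, 0 otherwise.
    """
    rng = random.Random(seed)
    subsets = {}
    for cls in sorted(set(y)):
        pos = [i for i, label in enumerate(y) if label == cls]
        neg = [i for i, label in enumerate(y) if label != cls]
        # Whichever side is larger is the majority and gets under-sampled.
        minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
        kept = rng.sample(majority, len(minority))
        idx = sorted(minority + kept)
        subsets[cls] = (
            [X[i] for i in idx],
            [1 if y[i] == cls else 0 for i in idx],
        )
    return subsets
```

A decision tree (or any other binary classifier) would then be trained on each balanced subset. How the per-class predictions are combined is left open here.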

    Development of Landsat-based Technology for Crop Inventories: Appendices

    There are no author-identified significant results in this report.

    Voice Biometrics under Mismatched Noise Conditions

    This thesis describes research into effective voice biometrics (speaker recognition) under mismatched noise conditions. Over the last two decades, this class of biometrics has been the subject of considerable research due to its applications in areas such as telephone banking, remote access control and surveillance. One of the main challenges in deploying voice biometrics in practice is the undesired variation in speech characteristics caused by environmental noise. Such variation can lead to a mismatch between the test and reference material from the same speaker, which is found to reduce the accuracy of speaker recognition. To address this problem, a novel approach is introduced and investigated. The proposed method is based on minimising the noise mismatch between the reference speaker models and the given test utterance, and involves a new form of Test-Normalisation (T-Norm) for further enhancing matching scores under these adverse operating conditions. Experimental investigations based on the two main classes of speaker recognition (verification and open-set identification) show that the proposed approach can significantly improve accuracy under mismatched noise conditions. To further improve recognition accuracy in severe mismatch conditions, an enhancement of this method is proposed. The enhancement, which adjusts the reference speaker models more closely to the noise condition in the test utterance, is shown to considerably increase accuracy in extreme cases of noisy test data. Moreover, to reduce the computational burden of using the enhanced approach with open-set identification, an efficient algorithm for its realisation in this context is introduced and evaluated.
The thesis presents a detailed description of the research undertaken, describes the experimental investigations and provides a thorough analysis of the outcomes.
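For readers unfamiliar with T-Norm, a minimal sketch of the standard form of the technique follows. It is not the new form proposed in the thesis, which the abstract does not describe. The function name `t_norm`, the use of the population standard deviation and the `eps` guard are assumptions made for illustration.

```python
from statistics import mean, pstdev
from typing import Sequence


def t_norm(raw_score: float, cohort_scores: Sequence[float], eps: float = 1e-12) -> float:
    """Standard test normalisation of a speaker-matching score.

    cohort_scores are the scores of the same test utterance against a set of
    impostor (cohort) models. The raw score is shifted and scaled by their
    mean and standard deviation.
    """
    mu = mean(cohort_scores)
    sigma = pstdev(cohort_scores)  # population std; an assumption of this sketch
    # eps avoids division by zero when all cohort scores are identical.
    return (raw_score - mu) / max(sigma, eps)
```

Because the cohort scores come from the same (possibly noisy) test utterance, the normalisation compensates for utterance-dependent shifts in the score distribution.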

    Audio-coupled video content understanding of unconstrained video sequences

    Unconstrained video understanding is a difficult task. The main aim of this thesis is to recognise the nature of the objects, activities and environment in a given video clip using both audio and video information. Traditionally, audio and video information have not been applied together to solve such complex tasks, and for the first time we propose, develop, implement and test a new framework of multi-modal (audio and video) data analysis for context understanding and labelling of unconstrained videos. The framework relies on feature selection techniques and introduces a novel algorithm (PCFS) that is faster than the well-established SFFS algorithm. We use the framework to study the benefits of combining audio and video information in a number of different problems. We begin by developing two independent content recognition modules. The first is based on image sequence analysis alone, and uses a range of colour, shape, texture and statistical features from image regions with a trained classifier to recognise the identity of the objects, activities and environment present. The second module uses audio information only, and recognises activities and environment. Both approaches are preceded by detailed pre-processing to ensure that correct video segments containing both audio and video content are present, and that the developed system is robust to changes in camera movement, illumination, random object behaviour, etc. For both audio and video analysis, we use a hierarchical approach of multi-stage classification so that difficult classification tasks can be decomposed into simpler and smaller tasks. When combining both modalities, we compare fusion techniques at different levels of integration and propose a novel algorithm that combines the advantages of both feature-level and decision-level fusion. The analysis is evaluated on a large amount of test data comprising unconstrained videos collected for this work.
Finally, we propose a decision correction algorithm which shows that further steps towards effectively combining multi-modal classification information with semantic knowledge generate the best possible results.

    Personality extraction through LinkedIn

    Personality extraction through social networks is a field that has only recently started to capture the attention of researchers. Starting from a corpus of user profiles on a particular social network, the task consists of classifying the users' personalities correctly according to a specific personality model as defined in psychology. This master's thesis presents three innovations to the domain. First, a corpus of LinkedIn users is collected. Second, personality is extracted according to two personality models, MBTI and DiSC; extraction with DiSC had not been done before in the field. Finally, the possibility of going from one personality model to the other is explored, with the idea that the results of multiple personality models could then be obtained from a single personality test.

    Deep Learning for Decision Making and Autonomous Complex Systems

    Deep learning comprises various machine learning algorithms that aim to learn multiple levels of abstraction from data in a hierarchical manner. It is a tool for constructing models from data that mimic a real-world process without exceedingly tedious modelling of the actual process. We show that deep learning is a viable solution to decision making in mechanical engineering problems and complex physical systems. In this work, we demonstrate the application of this data-driven method to the design of microfluidic devices, where it serves as a map between the user-defined cross-sectional shape of the flow and the corresponding arrangement of micropillars in the flow channel that contributes to the flow deformation. We also present how deep learning can be used for the early detection of combustion instability for prognostics and health monitoring of a combustion engine, so that appropriate measures can be taken to prevent the detrimental effects of unstable combustion. One of the applications in complex systems concerns robotic path planning via the systematic learning of policies and associated rewards. In this context, a deep architecture is implemented to infer the expected value of the information gained by performing an action, based on the states of the environment. We also apply deep learning-based methods to enhance natural low-light images in the context of a surveillance framework and autonomous robots. Further, we look at how machine learning methods can be used to perform root-cause analysis in cyber-physical systems subjected to a wide variety of operational anomalies. In all studies, the proposed frameworks demonstrate promising feasibility and provide credible results for large-scale implementation in industry.

    Hidden Markov Models

    Hidden Markov Models (HMMs), although known for decades, are now widely used and are still under active development. This book presents theoretical issues and a variety of HMM applications in speech recognition and synthesis, medicine, neurosciences, computational biology, bioinformatics, seismology, environment protection and engineering. I hope that readers will find this book useful and helpful for their own research.

    Adaptive sequential feature selection in visual perception and pattern recognition

    In the human visual system, one of the most prominent functions of the extensive feedback from the higher brain areas within and outside of the visual cortex is attentional modulation. The feedback helps the brain to concentrate its resources on visual features that are relevant for recognition, i.e. it iteratively selects certain aspects of the visual scene for refined processing by the lower areas until the inference process in the higher areas converges to a single hypothesis about the scene. To minimize the number of selection-refinement iterations required, one has to find a short sequence of maximally informative portions of the visual input. Since the feedback is not static, the selection process adapts to the scene to be recognized. To find a scene-specific subset of informative features, the adaptive selection process uses the results of previous processing on every iteration in order to reduce the remaining uncertainty about the visual scene. This phenomenon inspired us to develop a computational algorithm for visual classification that incorporates this principle of adaptive feature selection. It is especially interesting because feature selection methods are usually not adaptive: they define a single set of informative features for a task and use it for classifying all objects. An adaptive algorithm, by contrast, selects the features that are most informative for the particular input. Thus, the selection process should be driven by statistics of the environment concerning the current task and the object to be classified. Applied to a classification task, our adaptive feature selection algorithm favors features that maximally reduce the current class uncertainty, which is iteratively updated with the values of the previously selected features observed on the test sample.
In information-theoretical terms, the selection criterion is the mutual information of the class variable and a candidate feature, conditioned on the already selected features taking the values observed on the current test sample. The main question investigated in this thesis is whether, and in which situations, the proposed adaptive way of selecting features is advantageous over conventional feature selection. Further, we studied whether the proposed adaptive information-theoretical selection scheme, which is a computationally complex algorithm, is used by humans while they perform a visual classification task. For this, we constructed a psychophysical experiment in which people had to select the image parts that they considered relevant for classifying the images. We present an analysis of the behavioral data in which we investigate whether human strategies of task-dependent selective attention can be explained by a simple ranker based on mutual information, by a more complex feature selection algorithm based on conventional static mutual information, or by the adaptive feature selector proposed here, which mimics a mechanism of iterative hypothesis refinement. The main contribution of this work is the adaptive feature selection criterion based on conditional mutual information. It is also shown that such an adaptive selection strategy is indeed used by people while performing visual classification.
Contents: 1. Introduction; 2. Conventional feature selection; 3. Adaptive feature selection; 4. Experimental investigations of ACMIFS; 5. Information-theoretical strategies of selective attention; 6. Discussion; Appendix; Bibliography
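The selection criterion described in the abstract above can be illustrated with a small sketch for discrete features. It is not the thesis's ACMIFS implementation. It assumes that the conditional mutual information is estimated by plug-in counts on the training rows that match the test sample on the features already selected. The function names `mutual_information` and `adaptive_select` are hypothetical.

```python
from collections import Counter
from math import log2
from typing import Hashable, List, Sequence


def mutual_information(xs: Sequence[Hashable], cs: Sequence[Hashable]) -> float:
    """Plug-in estimate of I(X; C) in bits from paired discrete observations."""
    n = len(xs)
    if n == 0:
        return 0.0
    joint, px, pc = Counter(zip(xs, cs)), Counter(xs), Counter(cs)
    return sum((k / n) * log2(k * n / (px[x] * pc[c])) for (x, c), k in joint.items())


def adaptive_select(
    train_X: Sequence[Sequence[Hashable]],
    train_y: Sequence[Hashable],
    sample: Sequence[Hashable],
    n_select: int,
) -> List[int]:
    """Greedily pick features for one test sample.

    At each step the candidate with the highest mutual information with the
    class is chosen, where the estimate uses only the training rows that agree
    with the sample on every feature selected so far (the conditioning step).
    """
    rows = list(range(len(train_X)))
    remaining = set(range(len(sample)))
    chosen: List[int] = []
    for _ in range(n_select):
        if not remaining or not rows:
            break
        best = max(
            sorted(remaining),  # sorted so ties resolve to the lowest index
            key=lambda j: mutual_information(
                [train_X[i][j] for i in rows], [train_y[i] for i in rows]
            ),
        )
        chosen.append(best)
        remaining.remove(best)
        # Condition on the value observed for the chosen feature.
        rows = [i for i in rows if train_X[i][best] == sample[best]]
    return chosen
```

Because the conditioning depends on the values observed on the sample, two test samples can receive different feature sequences, which is the adaptive behaviour that a static ranking lacks.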