1,547 research outputs found

    Unsupervised Stream-Weights Computation in Classification and Recognition Tasks

    Get PDF
    International audienceIn this paper, we provide theoretical results on the problem of optimal stream weight selection for the multi-stream classi- fication problem. It is shown, that in the presence of estimation or modeling errors using stream weights can decrease the total classification error. Stream weight estimates are computed for various conditions. Then we turn our attention to the problem of unsupervised stream weights computation. Based on the theoretical results we propose to use models and “anti-models” (class- specific background models) to estimate stream weights. A non-linear function of the ratio of the inter- to intra-class distance is used for stream weight estimation. The proposed unsupervised stream weight estimation algorithm is evaluated on both artificial data and on the problem of audio-visual speech classification. Finally the proposed algorithm is extended to the problem of audio- visual speech recognition. It is shown that the proposed algorithms achieve results comparable to the supervised minimum-error training approach under most testing conditions

    Committee-Based Sample Selection for Probabilistic Classifiers

    Full text link
    In many real-world learning tasks, it is expensive to acquire a sufficient number of labeled examples for training. This paper investigates methods for reducing annotation cost by `sample selection'. In this approach, during training the learning program examines many unlabeled examples and selects for labeling only those that are most informative at each stage. This avoids redundantly labeling examples that contribute little new information. Our work follows on previous research on Query By Committee, extending the committee-based paradigm to the context of probabilistic classification. We describe a family of empirical methods for committee-based sample selection in probabilistic classification models, which evaluate the informativeness of an example by measuring the degree of disagreement between several model variants. These variants (the committee) are drawn randomly from a probability distribution conditioned by the training set labeled so far. The method was applied to the real-world natural language processing task of stochastic part-of-speech tagging. We find that all variants of the method achieve a significant reduction in annotation cost, although their computational efficiency differs. In particular, the simplest variant, a two member committee with no parameters to tune, gives excellent results. We also show that sample selection yields a significant reduction in the size of the model used by the tagger

    Assessing the effects of data selection and representation on the development of reliable E. coli sigma 70 promoter region predictors

    Get PDF
    As the number of sequenced bacterial genomes increases, the need for rapid and reliable tools for the annotation of functional elements (e.g., transcriptional regulatory elements) becomes more desirable. Promoters are the key regulatory elements, which recruit the transcriptional machinery through binding to a variety of regulatory proteins (known as sigma factors). The identification of the promoter regions is very challenging because these regions do not adhere to specific sequence patterns or motifs and are difficult to determine experimentally. Machine learning represents a promising and cost-effective approach for computational identification of prokaryotic promoter regions. However, the quality of the predictors depends on several factors including: i) training data; ii) data representation; iii) classification algorithms; iv) evaluation procedures. In this work, we create several variants of E. coli promoter data sets and utilize them to experimentally examine the effect of these factors on the predictive performance of E. coli σ70 promoter models. Our results suggest that under some combinations of the first three criteria, a prediction model might perform very well on cross-validation experiments while its performance on independent test data is drastically very poor. This emphasizes the importance of evaluating promoter region predictors using independent test data, which corrects for the over-optimistic performance that might be estimated using the cross-validation procedure. Our analysis of the tested models shows that good prediction models often perform well despite how the non-promoter data was obtained. On the other hand, poor prediction models seems to be more sensitive to the choice of non-promoter sequences. Interestingly, the best performing sequence-based classifiers outperform the best performing structure-based classifiers on both cross-validation and independent test performance evaluation experiments. Finally, we propose a meta-predictor method combining two top performing sequence-based and structure-based classifiers and compare its performance with some of the state-of-the-art E. coli σ70 promoter prediction methods.NPRP grant No. 4-1454-1-233 from the Qatar National Research Fund (a member of Qatar Foundation).Scopu

    Extraction and Classification of Drug-Drug Interaction from Biomedical Text Using a Two-Stage Classifier

    Get PDF
    One of the critical causes of medical errors is Drug-Drug interaction (DDI), which occurs when one drug increases or decreases the effect of another drug. We propose a machine learning system to extract and classify drug-drug interactions from the biomedical literature, using the annotated corpus from the DDIExtraction-2013 shared task challenge. Our approach applies a two-stage classifier to handle the highly unbalanced class distribution in the corpus. The first stage is designed for binary classification of drug pairs as interacting or non-interacting, and the second stage for further classification of interacting pairs into one of four interacting types: advise, effect, mechanism, and int. To find the set of best features for classification, we explored many features, including stemmed words, bigrams, part of speech tags, verb lists, parse tree information, mutual information, and similarity measures, among others. As the system faced two different classification tasks, binary and multi-class, we also explored various classifiers in each stage. Our results show that the best performing classifier in both stages was Support Vector Machines, and the best performing features were 1000 top informative words and part of speech tags between two main drugs. We obtained an F-Measure of 0.64, showing a 12% improvement over our submitted system to the DDIExtraction 2013 competition

    Analysis of pattern recognition techniques for in-air signature biometrics

    Full text link
    As a result of advances in mobile technology, new services which benefit from the ubiquity of these devices are appearing. Some of these services require the identification of the subject since they may access private user information. In this paper, we propose to identify each user by drawing his/her handwritten signature in the air (in-airsignature). In order to assess the feasibility of an in-airsignature as a biometric feature, we have analysed the performance of several well-known patternrecognitiontechniques—Hidden Markov Models, Bayes classifiers and dynamic time warping—to cope with this problem. Each technique has been tested in the identification of the signatures of 96 individuals. Furthermore, the robustness of each method against spoofing attacks has also been analysed using six impostors who attempted to emulate every signature. The best results in both experiments have been reached by using a technique based on dynamic time warping which carries out the recognition by calculating distances to an average template extracted from several training instances. Finally, a permanence analysis has been carried out in order to assess the stability of in-airsignature over time

    Word Sense Disambiguation: A Structured Learning Perspective

    Get PDF
    This paper explores the application of structured learning methods (SLMs) to word sense disambiguation (WSD). On one hand, the semantic dependencies between polysemous words in the sentence can be encoded in SLMs. On the other hand, SLMs obtained significant achievements in natural language processing, and so it is a natural idea to apply them to WSD. However, there are many theoretical and practical problems when SLMs are applied to WSD, due to characteristics of WSD. Beginning with the method based on hidden Markov model, this paper proposes for the first time a comprehensive and unified solution for WSD based on maximum entropy Markov model, conditional random field and tree-structured conditional random field, and reduces the time complexity and running time of the proposed methods to a reasonable level by beam search, approximate training, and parallel training. The update of models brings performance improvement, the introduction of one step dependency improves performance by 1--5 percent, the adoption of non-independent features improves performance by 2--3 percent, and the extension of underlying structure to dependency parsing tree improves performance by about 1 percent. On the English all-words WSD dataset of Senseval-2004, the method based on tree-structured conditional random field outperforms the best attendee system significantly. Nevertheless, almost all machine learning methods suffer from data sparseness due to the scarcity of sense tagged data, and so do SLMs. Besides improving structured learning methods according to the characteristics of WSD, another approach to improve disambiguation performance is to mine disambiguation knowledge from all kinds of sources, such as Wikipedia, parallel corpus, and to alleviate knowledge acquisition bottleneck of WSD
    • 

    corecore