1,245 research outputs found

    One-Class Classification: Taxonomy of Study and Review of Techniques

    Full text link
    One-class classification (OCC) algorithms aim to build classification models when the negative class is either absent, poorly sampled or not well defined. This unique situation constrains the learning of efficient classifiers by defining class boundary just with the knowledge of positive class. The OCC problem has been considered and applied under many research themes, such as outlier/novelty detection and concept learning. In this paper we present a unified view of the general problem of OCC by presenting a taxonomy of study for OCC problems, which is based on the availability of training data, algorithms used and the application domains applied. We further delve into each of the categories of the proposed taxonomy and present a comprehensive literature review of the OCC algorithms, techniques and methodologies with a focus on their significance, limitations and applications. We conclude our paper by discussing some open research problems in the field of OCC and present our vision for future research.Comment: 24 pages + 11 pages of references, 8 figure

    Integrating N-best SMT outputs into a TM system

    Get PDF
    In this paper, we propose a novel frame- work to enrich Translation Memory (TM) systems with Statistical Machine Translation (SMT) outputs using ranking. In order to offer the human translators multiple choices, instead of only using the top SMT output and top TM hit, we merge the N-best output from the SMT system and the k-best hits with highest fuzzy match scores from the TM system. The merged list is then ranked according to the prospective post-editing effort and provided to the translators to aid their work. Experiments show that our ranked output achieve 0.8747 precision at top 1 and 0.8134 precision at top 5. Our framework facilitates a tight integration between SMT and TM, where full advantage is taken of TM while high quality SMT output is availed of to improve the productivity of human translators

    Spectral and spatial methods for the classification of urban remote sensing data

    Get PDF
    Lors de ces travaux, nous nous sommes intéressés au problème de la classification supervisée d'images satellitaires de zones urbaines. Les données traitées sont des images optiques à très hautes résolutions spatiales: données panchromatiques à très haute résolution spatiale (IKONOS, QUICKBIRD, simulations PLEIADES) et des images hyperspectrales (DAIS, ROSIS). Deux stratégies ont été proposées. La première stratégie consiste en une phase d'extraction de caractéristiques spatiales et spectrales suivie d'une phase de classification. Ces caractéristiques sont extraites par filtrages morphologiques : ouvertures et fermetures géodésiques et filtrages surfaciques auto-complémentaires. La classification est réalisée avec les machines à vecteurs supports (SVM) non linéaires. Nous proposons la définition d'un noyau spatio-spectral utilisant de manière conjointe l'information spatiale et l'information spectrale extraites lors de la première phase. La seconde stratégie consiste en une phase de fusion de données pre- ou post-classification. Lors de la fusion postclassification, divers classifieurs sont appliqués, éventuellement sur plusieurs données issues d'une même scène (image panchromat ique, image multi-spectrale). Pour chaque pixel, l'appartenance à chaque classe est estimée à l'aide des classifieurs. Un schéma de fusion adaptatif permettant d'utiliser l'information sur la fiabilité locale de chaque classifieur, mais aussi l'information globale disponible a priori sur les performances de chaque algorithme pour les différentes classes, est proposé. Les différents résultats sont fusionnés à l'aide d'opérateurs flous. Les méthodes ont été validées sur des images réelles. Des améliorations significatives sont obtenues par rapport aux méthodes publiées dans la litterature

    A Review of Classification Problems and Algorithms in Renewable Energy Applications

    Get PDF
    Classification problems and their corresponding solving approaches constitute one of the fields of machine learning. The application of classification schemes in Renewable Energy (RE) has gained significant attention in the last few years, contributing to the deployment, management and optimization of RE systems. The main objective of this paper is to review the most important classification algorithms applied to RE problems, including both classical and novel algorithms. The paper also provides a comprehensive literature review and discussion on different classification techniques in specific RE problems, including wind speed/power prediction, fault diagnosis in RE systems, power quality disturbance classification and other applications in alternative RE systems. In this way, the paper describes classification techniques and metrics applied to RE problems, thus being useful both for researchers dealing with this kind of problem and for practitioners of the field

    Rails Quality Data Modelling via Machine Learning-Based Paradigms

    Get PDF

    Development of soft computing and applications in agricultural and biological engineering

    Get PDF
    Soft computing is a set of “inexact” computing techniques, which are able to model and analyze very complex problems. For these complex problems, more conventional methods have not been able to produce cost-effective, analytical, or complete solutions. Soft computing has been extensively studied and applied in the last three decades for scientific research and engineering computing. In agricultural and biological engineering, researchers and engineers have developed methods of fuzzy logic, artificial neural networks, genetic algorithms, decision trees, and support vector machines to study soil and water regimes related to crop growth, analyze the operation of food processing, and support decision-making in precision farming. This paper reviews the development of soft computing techniques. With the concepts and methods, applications of soft computing in the field of agricultural and biological engineering are presented, especially in the soil and water context for crop management and decision support in precision agriculture. The future of development and application of soft computing in agricultural and biological engineering is discussed

    The integration of machine translation and translation memory

    Get PDF
    We design and evaluate several models for integrating Machine Translation (MT) output into a Translation Memory (TM) environment to facilitate the adoption of MT technology in the localization industry. We begin with the integration on the segment level via translation recommendation and translation reranking. Given an input to be translated, our translation recommendation model compares the output from the MT and the TMsystems, and presents the better one to the post-editor. Our translation reranking model combines k-best lists from both systems, and generates a new list according to estimated post-editing effort. We perform both automatic and human evaluation on these models. When measured against the consensus of human judgement, the recommendation model obtains 0.91 precision at 0.93 recall, and the reranking model obtains 0.86 precision at 0.59 recall. The high precision of these models indicates that they can be integrated into TM environments without the risk of deteriorating the quality of the post-editing candidate, and can thereby preserve TM assets and established cost estimation methods associated with TMs. We then explore methods for a deeper integration of translation memory and machine translation on the sub-segment level. We predict whether phrase pairs derived from fuzzy matches could be used to constrain the translation of an input segment. Using a series of novel linguistically-motivated features, our constraints lead both to more consistent translation output, and to improved translation quality, reflected by a 1.2 improvement in BLEU score and a 0.72 reduction in TER score, both of statistical significance (p < 0.01). In sum, we present our work in three aspects: 1) translation recommendation and translation reranking models that can access high quality MT outputs in the TMenvironment, 2) a sub-segment translation memory and machine translation integration model that improves both translation consistency and translation quality, and 3) a human evaluation pipeline to validate the effectiveness of our models with human judgements
    corecore