Search CORE

487,610 research outputs found

Cost-sensitive active learning for computer-assisted translation

Author: Angluin
Barrachina
Brown
Casacuberta
Cohn
Dempster
Foster
Francisco Casacuberta
Isabelle
Jesús González-Rubio
Langlais
Lopez
Neal
Noreen
Ueffing
Zipf
Publication venue: 'Elsevier BV'
Publication date: 01/02/2014
Field of study

This is the author’s version of a work that was accepted for publication in Pattern Recognition Letters. Changes resulting from the publishing process, such as peer review, editing, corrections, structural formatting, and other quality control mechanisms may not be reflected in this document. Changes may have been made to this work since it was submitted for publication. A definitive version was subsequently published in Pattern Recognition Letters, [Volume 37, 1 February 2014, Pages 124–134] DOI: 10.1016/j.patrec.2013.06.007[EN] Machine translation technology is not perfect. To be successfully embedded in real-world applications, it must compensate for its imperfections by interacting intelligently with the user within a computer-assisted translation framework. The interactive¿predictive paradigm, where both a statistical translation model and a human expert collaborate to generate the translation, has been shown to be an effective computer-assisted translation approach. However, the exhaustive supervision of all translations and the use of non-incremental translation models penalizes the productivity of conventional interactive¿predictive systems. We propose a cost-sensitive active learning framework for computer-assisted translation whose goal is to make the translation process as painless as possible. In contrast to conventional active learning scenarios, the proposed active learning framework is designed to minimize not only how many translations the user must supervise but also how difficult each translation is to supervise. To do that, we address the two potential drawbacks of the interactive-predictive translation paradigm. On the one hand, user effort is focused to those translations whose user supervision is considered more ¿informative¿, thus, maximizing the utility of each user interaction. On the other hand, we use a dynamic machine translation model that is continually updated with user feedback after deployment. We empirically validated each of the technical components in simulation and quantify the user effort saved. We conclude that both selective translation supervision and translation model updating lead to important user-effort reductions, and consequently to improved translation productivity.Work supported by the European Union Seventh Framework Program (FP7/2007-2013) under the CasMaCat Project (Grants agreement No. 287576), by the Generalitat Valenciana under Grant ALMPR (Prometeo/2009/014), and by the Spanish Government under Grant TIN2012-31723. The authors thank Daniel Ortiz-Martinez for providing us with the log-linear SMT model with incremental features and the corresponding online learning algorithms. The authors also thank the anonymous reviewers for their criticisms and suggestions.González Rubio, J.; Casacuberta Nolla, F. (2014). Cost-sensitive active learning for computer-assisted translation. Pattern Recognition Letters. 37(1):124-134. https://doi.org/10.1016/j.patrec.2013.06.007S12413437

Crossref

RiuNet

Efficient techniques for cost-sensitive learning with multiple cost considerations

Author: Wang T
Publication venue
Publication date: 01/01/2013
Field of study

University of Technology, Sydney. Faculty of Engineering and Information Technology.Cost-sensitive learning is one of the active research topics in data mining and machine learning, designed for dealing with the non-uniform cost of misclassification errors. In the last ten to fifteen years, diverse learning methods and techniques were proposed to minimize the total cost of misclassification, test and other types. This thesis studies the up-to-date prevailing cost-sensitive learning methods and techniques, and proposes some new and efficient cost-sensitive learning methods and techniques in the following three areas: First, we focus on the data over-fitting issue. In an applied context of cost-sensitive learning, many existing data mining algorithms can generate good results on training data but normally do not produce an optimal model when applied to unseen data in real world applications. We deal with this issue by developing three simple and efficient strategies - feature selection, smoothing and threshold pruning to overcome data over-fitting in cost-sensitive learning. This work sets up a solid foundation for our further research and analysis in this thesis in the other areas of cost-sensitive learning. Second, we design and develop an innovative and practical objective-resource cost-sensitive learning framework for addressing a real world issue where multiple cost units are involved. A lazy cost-sensitive decision tree is built to minimize the objective cost subjecting to given budgets of other resource costs. Finally, we study semi-supervised learning approach in the context of cost-sensitive learning. Two new classification algorithms are proposed to learn cost-sensitive classifier from training datasets with a small amount of labelled data and plenty unlabelled data. We also analyse the impact of the different input parameters to the performance of our new algorithms

OPUS - University of Technology Sydney

A Cost-Sensitive Ensemble Method for Class-Imbalanced Datasets

Author: Dapeng Wang
Yong Zhang
Publication venue: 'Hindawi Limited'
Publication date: 01/01/2013
Field of study

In imbalanced learning methods, resampling methods modify an imbalanced dataset to form a balanced dataset. Balanced data sets perform better than imbalanced datasets for many base classifiers. This paper proposes a cost-sensitive ensemble method based on cost-sensitive support vector machine (SVM), and query-by-committee (QBC) to solve imbalanced data classification. The proposed method first divides the majority-class dataset into several subdatasets according to the proportion of imbalanced samples and trains subclassifiers using AdaBoost method. Then, the proposed method generates candidate training samples by QBC active learning method and uses cost-sensitive SVM to learn the training samples. By using 5 class-imbalanced datasets, experimental results show that the proposed method has higher area under ROC curve (AUC), F-measure, and G-mean than many existing class-imbalanced learning methods

Crossref

Directory of Open Access Journals

Learning Super-resolved Depth from Active Gated Imaging

Author: Dietmayer Klaus
Gruber Tobias
Haala Norbert
Kokhova Mariia
Ritter Werner
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 05/12/2019
Field of study

Environment perception for autonomous driving is doomed by the trade-off between range-accuracy and resolution: current sensors that deliver very precise depth information are usually restricted to low resolution because of technology or cost limitations. In this work, we exploit depth information from an active gated imaging system based on cost-sensitive diode and CMOS technology. Learning a mapping between pixel intensities of three gated slices and depth produces a super-resolved depth map image with respectable relative accuracy of 5% in between 25-80 m. By design, depth information is perfectly aligned with pixel intensity values

arXiv.org e-Print Archive

Crossref

Cost-Sensitive Online Active Learning with application to malicious URL detection

Author: Balcan M.-F.
Cavallanti G.
Crammer K.
Gentile C.
Hoi S. C. H.
Kivinen J.
Li Y.
Li Y.
Wang J.
Whittaker C.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2013
Field of study

Ministry of Education, Singapore under its Academic Research Funding Tier

CiteSeerX

Crossref

Institutional Knowledge at Singapore Management University

Active Multi-Field Learning for Spam Filtering

Author: Liu Wuying
Wang Lin
Xie Nan
Yi Mianzhu
Publication venue: Institute of Informatics, Slovak Academy of Sciences
Publication date: 11/02/2015
Field of study

Ubiquitous spam messages cause a serious waste of time and resources. This paper addresses the practical spam filtering problem, and proposes a universal approach to fight with various spam messages. The proposed active multi-field learning approach is based on: 1) It is cost-sensitive to obtain a label for a real-world spam filter, which suggests an active learning idea; and 2) Different messages often have a similar multi-field text structure, which suggests a multi-field learning idea. The multi-field learning framework combines multiple results predicted from field classifiers by a novel compound weight, and each field classifier calculates the arithmetical average of multiple conditional probabilities predicted from feature strings according to a data structure of string-frequency index. Comparing the current variance of field classifying results with the historical variance, the active learner evaluates the classifying confidence and regards the more uncertain message as the more informative sample for which to request a label. The experimental results show that the proposed approach can achieve the state-of-the-art performance at greatly reduced label requirements both in email spam filtering and short text spam filtering. Our active multi-field learning performance, the standard (1-ROCA) % measurement, even exceeds the full feedback performance of some advanced individual classifying algorithm

Computing and Informatics (E-Journal - Institute of Informatics, SAS, Bratislava)