
    Target Contrastive Pessimistic Discriminant Analysis

    Domain-adaptive classifiers learn from a source domain and aim to generalize to a target domain. If the classifier's assumptions on the relationship between domains (e.g. covariate shift) are valid, then it will usually outperform a non-adaptive source classifier. Unfortunately, it can perform substantially worse when its assumptions are invalid. Validating these assumptions requires labeled target samples, which are usually not available. We argue that, in order to make domain-adaptive classifiers more practical, it is necessary to focus on robust methods; robust in the sense that the model still achieves a particular level of performance without making strong assumptions on the relationship between domains. With this objective in mind, we formulate a conservative parameter estimator that only deviates from the source classifier when a lower or equal risk is guaranteed for all possible labellings of the given target samples. We derive the corresponding estimator for a discriminant analysis model, and show that its risk is actually strictly smaller than that of the source classifier. Experiments indicate that our classifier outperforms state-of-the-art classifiers for geographically biased samples.
    Comment: 9 pages, no figures, 2 tables. arXiv admin note: substantial text overlap with arXiv:1706.0808
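    The minimax construction in this abstract can be made concrete with a small sketch: minimize over the discriminant analysis parameters while maximizing over all soft labellings of the target samples, with the risk measured relative to the source classifier. The alternating scheme below (closed-form weighted MLE for the model, projected gradient ascent on the labels), the step size, and all function names are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def log_joint(X, pi, mu, S):
    """log p(x, y=k) under an LDA model with shared covariance S."""
    Sinv = np.linalg.inv(S)
    _, logdet = np.linalg.slogdet(S)
    L = np.empty((X.shape[0], len(pi)))
    for k in range(len(pi)):
        Xc = X - mu[k]
        L[:, k] = np.log(pi[k] + 1e-12) - 0.5 * (
            logdet + np.einsum('ij,jk,ik->i', Xc, Sinv, Xc))
    return L

def weighted_lda_fit(X, Q, reg=1e-6):
    """Closed-form weighted MLE; Q[i, k] is the soft label of sample i."""
    n, d = X.shape
    pi = Q.sum(axis=0) / n
    mu = (Q.T @ X) / Q.sum(axis=0)[:, None]
    S = sum((Q[:, k:k+1] * (X - mu[k])).T @ (X - mu[k])
            for k in range(Q.shape[1]))
    return pi, mu, S / n + reg * np.eye(d)

def project_rows_to_simplex(V):
    """Euclidean projection of each row of V onto the probability simplex."""
    U = np.sort(V, axis=1)[:, ::-1]
    css = np.cumsum(U, axis=1) - 1.0
    ks = np.arange(1, V.shape[1] + 1)
    rho = ((U - css / ks) > 0).sum(axis=1)
    tau = css[np.arange(V.shape[0]), rho - 1] / rho
    return np.maximum(V - tau[:, None], 0.0)

def tcp_lda(X_tgt, src_params, K, iters=500, lr=0.01):
    """Minimize over the target model and maximize over soft labels the
    contrastive risk R(theta, q) - R(theta_src, q)."""
    Q = np.full((X_tgt.shape[0], K), 1.0 / K)  # start from maximal uncertainty
    L_src = log_joint(X_tgt, *src_params)      # source terms are fixed
    for _ in range(iters):
        tgt_params = weighted_lda_fit(X_tgt, Q)        # min step: weighted MLE
        L_tgt = log_joint(X_tgt, *tgt_params)
        Q = project_rows_to_simplex(Q + lr * (L_src - L_tgt))  # max step
    return tgt_params
```

    Because the contrastive objective is zero at the source parameters under any labelling, the minimax solution can never incur a larger risk than the source classifier, whatever the true target labels turn out to be.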

    A review of domain adaptation without target labels

    Domain adaptation has become a prominent problem setting in machine learning and related fields. This review asks the question: how can a classifier learn from a source domain and generalize to a target domain? We present a categorization of approaches, divided into what we refer to as sample-based, feature-based and inference-based methods. Sample-based methods focus on weighting individual observations during training based on their importance to the target domain. Feature-based methods revolve around mapping, projecting and representing features such that a source classifier performs well on the target domain. Inference-based methods incorporate adaptation into the parameter estimation procedure, for instance through constraints on the optimization procedure. Additionally, we review a number of conditions that allow for formulating bounds on the cross-domain generalization error. Our categorization highlights recurring ideas and raises questions important to further research.
    Comment: 20 pages, 5 figures
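    To make the sample-based category concrete, here is a minimal sketch of importance weighting: source samples are reweighted by an estimate of the density ratio p_target(x)/p_source(x), obtained here with a probabilistic domain classifier. The clipping constant and helper names are illustrative choices, not prescribed by the review.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def importance_weights(X_src, X_tgt, clip=10.0):
    """Estimate w(x) = p_tgt(x) / p_src(x) via a domain classifier:
    w(x) is proportional to P(target | x) / P(source | x)."""
    X = np.vstack([X_src, X_tgt])
    d = np.r_[np.zeros(len(X_src)), np.ones(len(X_tgt))]  # 0 = source, 1 = target
    dom = LogisticRegression(max_iter=1000).fit(X, d)
    p = dom.predict_proba(X_src)[:, 1]
    w = (p / (1.0 - p)) * (len(X_src) / len(X_tgt))
    return np.clip(w, 0.0, clip)  # clipping tames the variance of the estimator

# Most scikit-learn estimators accept the result as per-sample weights:
# clf.fit(X_src, y_src, sample_weight=importance_weights(X_src, X_tgt))
```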

    Supervised Classification: Quite a Brief Overview

    The original problem of supervised classification considers the task of automatically assigning objects to their respective classes on the basis of numerical measurements derived from these objects. Classifiers are the tools that implement the actual functional mapping from these measurements---also called features or inputs---to the so-called class label---or output. The fields of pattern recognition and machine learning study ways of constructing such classifiers. The main idea behind supervised methods is that of learning from examples: given a number of example input-output relations, to what extent can the general mapping be learned that takes any new and unseen feature vector to its correct class? This chapter provides a basic introduction to the underlying ideas of how to approach a supervised classification problem. In addition, it provides an overview of some specific classification techniques, delves into the issues of object representation and classifier evaluation, and (very) briefly covers some variations on the basic supervised classification task that may also be of interest to the practitioner.
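    The learning-from-examples setup the chapter describes fits in a few lines of code. A minimal sketch with scikit-learn, where the dataset and classifier are arbitrary stand-ins:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score

# Example input-output pairs: feature vectors X and class labels y.
X, y = load_iris(return_X_y=True)

# A held-out split stands in for "new and unseen" feature vectors.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

clf = KNeighborsClassifier(n_neighbors=5).fit(X_tr, y_tr)  # learn the mapping
print(accuracy_score(y_te, clf.predict(X_te)))             # estimate generalization
```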

    Robust semi-supervised learning: projections, limits & constraints

    In many domains of science and society, the amount of data being gathered is increasing rapidly. To estimate input-output relationships that are often of interest, supervised learning techniques rely on a specific type of data: labeled examples for which we know both the input and an outcome. The problem of semi-supervised learning is how to use increasingly abundant unlabeled examples, with unknown outcomes, to improve supervised learning methods. This thesis is concerned with the question of whether and how these improvements are possible in a "robust", or safe, way: can we guarantee that these methods do not lead to worse performance than the supervised solution? We show that for some supervised classifiers, most notably the least squares classifier, semi-supervised adaptations can be constructed where this non-degradation in performance can indeed be guaranteed, in terms of the surrogate loss used by the classifier. Since these guarantees are given in terms of the surrogate loss, we explore why this is a useful criterion to evaluate performance. We then prove that semi-supervised versions with strict non-degradation guarantees are not possible for a large class of commonly used supervised classifiers. Other aspects covered in the thesis include optimistic learning, the peaking phenomenon and reproducibility.
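    The flavour of such a non-degradation guarantee for the least squares classifier can be sketched as follows: the semi-supervised solution is constrained to the set of solutions the supervised learner could have produced for some labelling of the unlabeled data. The naive numerical search below is only illustrative; it is not the analysis in the thesis, and the function names are assumptions.

```python
import numpy as np
from scipy.optimize import minimize

def icls(X_lab, y_lab, X_unl):
    """Implicitly constrained least squares (sketch). Labels are encoded
    in {0, 1}; soft labels u in [0, 1] are imputed for the unlabeled
    points, the model is refit on all data, and we keep the refit
    solution that minimizes the supervised squared loss."""
    X_all = np.vstack([X_lab, X_unl])

    def beta_of(u):                       # least squares on imputed data
        return np.linalg.lstsq(X_all, np.r_[y_lab, u], rcond=None)[0]

    def sup_loss(u):                      # surrogate loss on labeled data
        r = X_lab @ beta_of(u) - y_lab
        return r @ r

    u0 = np.full(len(X_unl), 0.5)
    res = minimize(sup_loss, u0, method='L-BFGS-B',
                   bounds=[(0.0, 1.0)] * len(X_unl))
    return beta_of(res.x)
```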

    Freeway traffic incident detection using large scale traffic data and cameras

    Automatic incident detection (AID) is crucial for reducing non-recurrent congestion caused by traffic incidents. In this paper, a data-driven AID framework is proposed that can leverage large-scale historical traffic data along with the inherent topology of the traffic networks to obtain robust traffic patterns. Such traffic patterns can be compared with the real-time traffic data to detect traffic incidents in the road network. Our AID framework consists of two basic steps for traffic pattern estimation. First, we estimate a robust univariate speed threshold using historical traffic information from individual sensors. This step can be parallelized using the MapReduce framework, thereby making it feasible to implement the framework over large networks. Our study shows that such robust thresholds can improve incident detection performance significantly compared to traditional threshold determination. Second, we leverage the knowledge of the topology of the road network to construct threshold heatmaps and perform image denoising to obtain spatio-temporally denoised thresholds. We used two image denoising techniques for this purpose: bilateral filtering and total variation denoising. Our study shows that overall AID performance can be improved significantly using bilateral filter denoising compared to the noisy thresholds or the thresholds obtained using total variation denoising.
    The second research objective involved detecting traffic congestion from camera images. Two modern deep learning techniques, the traditional deep convolutional neural network (DCNN) and the you-only-look-once (YOLO) model, were used to detect traffic congestion from camera images. A shallow model, the support vector machine (SVM), was also used for comparison and to determine the improvements that might be obtained using costly GPU techniques. The YOLO model achieved the highest accuracy of 91.2%, followed by the DCNN model with an accuracy of 90.2%; 85% of images were correctly classified by the SVM model. Congestion regions located far away from the camera, single-lane blockages, and glare issues were found to affect the accuracy of the models. Sensitivity analysis showed that all of the algorithms performed well in daytime conditions, but nighttime conditions reduced the accuracy of the vision system. However, for all conditions, the areas under the curve (AUCs) were found to be greater than 0.9 for the deep models, showing that these models performed well even in challenging conditions.
    The third and final part of this study aimed at detecting traffic incidents from CCTV videos. We approached the incident detection problem using a trajectory-based approach for non-congested conditions and a pixel-based approach for congested conditions. Typically, incident detection from cameras has been approached using either supervised or unsupervised algorithms. A major hindrance to the application of supervised techniques for incident detection is the lack of a sufficient number of incident videos and the labor-intensive, costly annotation tasks involved in preparing a labeled dataset. In this study, we approached the incident detection problem using semi-supervised techniques. Maximum likelihood estimation-based contrastive pessimistic likelihood estimation (CPLE) was used for trajectory classification and identification of incident trajectories. Vehicle detection was performed using the state-of-the-art deep learning-based YOLOv3, and simple online and realtime tracking (SORT) was used for tracking. Results showed that CPLE-based trajectory classification outperformed the traditional semi-supervised techniques (self-learning and label spreading) and its supervised counterpart by a significant margin. For pixel-based incident detection, we used a novel Histogram of Optical Flow Magnitude (HOFM) feature descriptor to detect incident vehicles with an SVM classifier, applied to all vehicles detected by the YOLOv3 object detector. We show in this study that this approach can handle both congested and non-congested conditions. However, the trajectory-based approach runs considerably faster (45 fps compared to 1.4 fps) and also achieves better accuracy than the pixel-based approach in non-congested conditions. Therefore, for optimal resource usage, the trajectory-based approach can be used for non-congested traffic conditions, while the pixel-based approach can be used for congested conditions.
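    As one concrete illustration of the threshold-denoising step, the sketch below applies OpenCV's bilateral filter to a sensor-by-time heatmap of speed thresholds. The heatmap layout, filter parameters, and the synthetic data are assumptions for illustration, not the dissertation's settings.

```python
import numpy as np
import cv2

def denoise_thresholds(T, d=9, sigma_color=25.0, sigma_space=25.0):
    """Spatio-temporally denoise a speed-threshold heatmap.
    Rows ~ sensor positions along the freeway, columns ~ time of day.
    Bilateral filtering smooths noise while preserving sharp edges,
    e.g. the boundaries of recurring bottlenecks."""
    return cv2.bilateralFilter(T.astype(np.float32), d, sigma_color, sigma_space)

# Hypothetical usage: synthetic thresholds (km/h), 64 sensors x 288 intervals.
raw = np.random.default_rng(0).normal(90.0, 5.0, size=(64, 288))
smoothed = denoise_thresholds(raw)
```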

    At the edge of intonation: the interplay of utterance-final F0 movements and voiceless fricative sounds

    The paper is concerned with the 'edge of intonation' in a twofold sense. It focuses on utterance-final F0 movements and crosses the traditional segment-prosody divide by investigating the interplay of F0 and voiceless fricatives in speech production. An experiment was performed for German with four types of voiceless fricatives: /f/, /s/, /ʃ/ and /x/. They were elicited with scripted dialogues in the contexts of terminal falling statement and high rising question intonations. Acoustic analyses show that fricatives concluding the high rising question intonations had higher mean centres of gravity (CoGs), larger CoG ranges and higher noise energy levels than fricatives concluding the terminal falling statement intonations. The different spectral-energy patterns are suitable to induce percepts of a high 'aperiodic pitch' at the end of the questions and of a low 'aperiodic pitch' at the end of the statements. The results are discussed with regard to the possible existence of 'segmental intonation' and its implications for F0 truncation and the segment-prosody dichotomy, in which segments are the alleged troublemakers for the production and perception of intonation.
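    The centre-of-gravity measure used in the acoustic analyses is the first spectral moment of the fricative noise. A minimal sketch, where the window choice, the power weighting, and the 500 Hz floor are illustrative assumptions rather than the paper's settings:

```python
import numpy as np

def spectral_cog(x, fs, fmin=500.0):
    """Spectral centre of gravity of a fricative segment x (sampled at
    fs Hz): CoG = sum(f * |X(f)|^2) / sum(|X(f)|^2), computed over a
    Hann-windowed FFT and restricted to f >= fmin to reduce the
    influence of low-frequency energy."""
    spec = np.abs(np.fft.rfft(x * np.hanning(len(x)))) ** 2
    f = np.fft.rfftfreq(len(x), d=1.0 / fs)
    m = f >= fmin
    return float((f[m] * spec[m]).sum() / spec[m].sum())
```

    On this measure, a rising question ending in a fricative would show a higher CoG than a falling statement, matching the 'aperiodic pitch' interpretation discussed above.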

    The Local Dominance Effect in Self-Evaluation: Evidence and Explanations

    The local dominance effect is the tendency for comparisons with a few, discrete individuals to have a greater influence on self-assessments than comparisons with larger aggregates. This review presents a series of recent studies that demonstrate the local dominance effect. The authors offer two primary explanations for the effect and consider alternatives, including social categorization and the abstract versus concrete nature of local versus general comparisons. They then discuss moderators of the effect, including physical proximity and self-enhancement. Finally, the theoretical and practical implications of the effect are discussed and potential future directions in this research line are proposed.