47 research outputs found

    Improve learning combining crowdsourced labels by weighting Areas Under the Margin

    In supervised learning -- for instance in image classification -- modern massive datasets are commonly labeled by a crowd of workers. The labels obtained in this crowdsourcing setting are then aggregated for training. The aggregation step generally leverages a per-worker trust score. Yet such worker-centric approaches disregard the ambiguity of each task. Some intrinsically ambiguous tasks might even fool expert workers, which could eventually harm the learning step. In a standard supervised learning setting -- with one label per task and balanced classes -- the Area Under the Margin (AUM) statistic is tailored to identify mislabeled data. We adapt the AUM to identify ambiguous tasks in crowdsourced learning scenarios, introducing the Weighted AUM (WAUM). The WAUM is an average of AUMs weighted by worker- and task-dependent scores. We show that the WAUM can help discard ambiguous tasks from the training set, leading to better generalization or calibration performance. We report improvements with respect to feature-blind aggregation strategies both for simulated settings and for the CIFAR-10H crowdsourced dataset.
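To make the "average of AUMs weighted by worker- and task-dependent scores" concrete, here is a minimal sketch, not the authors' implementation. It assumes the usual AUM definition (the score of the assigned label minus the largest other score, averaged over training epochs) and a single shared trajectory of class scores per task. The function names `aum` and `waum`, the `worker_labels` and `weights` dictionaries, and the way the weights are supplied are all hypothetical; the abstract does not say how the weights are computed.

```python
import numpy as np


def aum(scores, label):
    """Area Under the Margin for one task and one assigned label.

    scores: array of shape (n_epochs, n_classes), the class scores the
            classifier produced for this task at each epoch (assumed input).
    label:  integer class index assigned to the task.
    """
    scores = np.asarray(scores, dtype=float)
    assigned = scores[:, label]
    # Largest score among the classes other than the assigned one.
    best_other = np.delete(scores, label, axis=1).max(axis=1)
    return float((assigned - best_other).mean())


def waum(scores, worker_labels, weights):
    """Weighted average of AUMs over the workers who labeled one task.

    worker_labels: dict worker -> label given to this task (hypothetical layout).
    weights:       dict worker -> non-negative score for this (worker, task)
                   pair; how these are obtained is an assumption left open.
    """
    total = sum(weights[w] for w in worker_labels)
    weighted = sum(weights[w] * aum(scores, y) for w, y in worker_labels.items())
    return weighted / total


# Two epochs, three classes; worker "a" is trusted three times more than "b".
scores = [[2.0, 1.0, 0.0], [3.0, 1.0, 0.0]]
value = waum(scores, {"a": 0, "b": 1}, {"a": 3.0, "b": 1.0})
```

Under these assumptions, a low value flags a task whose answers are poorly supported by the classifier, which is the kind of task the abstract proposes discarding from the training set.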

    Savoir être à l'école: Project report 2008-2011


    Decree to pass to the order of the day on the motion of Thuriot, who had put forward an additional proposal concerning soldiers absent from their corps for legitimate cause, at the session of 1 Floréal Year II (20 April 1794)

    Thuriot Jacques Alexis, Charlier Louis Joseph. Decree to pass to the order of the day on the motion of Thuriot, who had put forward an additional proposal concerning soldiers absent from their corps for legitimate cause, at the session of 1 Floréal Year II (20 April 1794). In: Tome LXXXIX - Du 29 germinal au 13 floréal an II (18 avril au 2 mai 1794), p. 82

    Discussion concerning the false accusations made against citizen Dobsent, subsequently appointed president of the Revolutionary Tribunal, at the session of 24 Thermidor Year II (11 August 1794)

    Charlier Louis Joseph, Thuriot Jacques Alexis. Discussion concerning the false accusations made against citizen Dobsent, subsequently appointed president of the Revolutionary Tribunal, at the session of 24 Thermidor Year II (11 August 1794). In: Archives Parlementaires de 1787 à 1860 - Première série (1787-1799), Tome XCIV - Du 13 thermidor au 25 thermidor an II (31 juillet au 12 août 1794). Paris: Librairie Administrative P. Dupont, 1985. p. 481


    Peerannot: classification for crowdsourced image datasets with Python

    Crowdsourcing is a quick and easy way to collect labels for large datasets, involving many workers. However, workers often disagree with each other. Errors can arise from the workers’ skills, but also from the intrinsic difficulty of the task. We present peerannot: a Python library for managing and learning from crowdsourced labels for classification. Our library allows users to aggregate labels using common noise models or to train a deep learning-based classifier directly from crowdsourced labels. In addition, we provide an identification module to easily explore the task difficulty of datasets and worker capabilities.
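As an illustration of what aggregating disagreeing worker labels means, here is a minimal sketch of the simplest baseline, a majority vote. It is plain Python and does not use or reproduce peerannot's API; it is also not one of the noise models the abstract mentions. The function name `majority_vote`, the nested-dictionary layout of `votes`, and the tie-breaking rule are assumptions made for this example.

```python
from collections import Counter


def majority_vote(votes):
    """Aggregate crowdsourced labels by majority vote.

    votes: dict task -> dict worker -> label (hypothetical layout).
    Returns a dict task -> aggregated label. Ties are broken by taking
    the smallest label, an arbitrary but deterministic choice.
    """
    aggregated = {}
    for task, answers in votes.items():
        counts = Counter(answers.values())
        top = max(counts.values())
        aggregated[task] = min(lab for lab, c in counts.items() if c == top)
    return aggregated


votes = {
    "task_1": {"w1": 0, "w2": 0, "w3": 1},  # two of three workers answer 0
    "task_2": {"w1": 2, "w2": 1},           # tie between labels 1 and 2
}
labels = majority_vote(votes)
```

A majority vote treats every worker as equally reliable and every task as equally easy, which is the limitation that worker-skill and task-difficulty models aim to address.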

    Cross-Language Voice Conversion Based on Eigenvoices

    This paper presents a novel cross-language voice conversion (VC) method based on eigenvoice conversion (EVC). Cross-language VC is a technique for converting voice quality between two speakers who speak different languages. In general, parallel data consisting of utterance pairs from those two speakers are not available. To deal with this problem, we apply EVC to cross-language VC. First, we train an eigenvoice GMM (EV-GMM) using many parallel data sets from a source speaker and many pre-stored other speakers who speak the same language as the source speaker. Then, the conversion model between the source speaker and a target speaker who cannot speak the source speaker’s language is developed by adapting the EV-GMM using a few arbitrary sentences uttered by the target speaker in a different language. The experimental results demonstrate that the proposed method yields significant performance improvements in both speech quality and conversion accuracy for speaker individuality compared with a conventional cross-language VC method based on frame selection. Index Terms: speech synthesis, voice conversion, cross-language, eigenvoice conversion, unsupervised adaptation