ESRGAN+ : Further Improving Enhanced Super-Resolution Generative Adversarial Network
Enhanced Super-Resolution Generative Adversarial Network (ESRGAN) is a
perceptual-driven approach to single-image super-resolution that can produce
photorealistic images. Despite the visual quality of these generated images,
there is still room for improvement. To this end, we extend the model to
further improve the perceptual quality of the generated images. We design a
novel block to replace the one used in the original ESRGAN. Moreover, we
introduce noise inputs to the generator network in order to exploit stochastic
variation. The resulting images present more realistic textures.
The code is available at https://github.com/ncarraz/ESRGANplus.
Comment: ICASSP 202
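The "noise inputs to the generator" idea can be sketched as adding per-channel scaled Gaussian noise to intermediate feature maps, with the scale learned and initialised to zero so training starts noise-free. This is a minimal NumPy illustration of the general mechanism; the actual ESRGAN+ block design and where the noise enters may differ.

```python
import numpy as np

def inject_noise(features, scale, rng=None):
    """Add per-channel scaled Gaussian noise to a feature map.

    features: array of shape (batch, channels, height, width)
    scale:    per-channel noise scale, shape (channels,) -- in a real
              generator this would be a learnable parameter, typically
              initialised to zero
    """
    rng = rng or np.random.default_rng(0)
    # one noise map per sample, broadcast across all channels
    noise = rng.standard_normal((features.shape[0], 1) + features.shape[2:])
    return features + scale[None, :, None, None] * noise

feats = np.zeros((2, 64, 32, 32))
out = inject_noise(feats, scale=np.full(64, 0.1))
print(out.shape)  # (2, 64, 32, 32)
```

With the scale at zero the layer is an identity, so the network can learn how much stochastic variation each channel benefits from.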
Towards hate speech detection in low-resource languages: Comparing ASR to acoustic word embeddings on Wolof and Swahili
We consider hate speech detection through keyword spotting on radio
broadcasts. One approach is to build an automatic speech recognition (ASR)
system for the target low-resource language. We compare this to using acoustic
word embedding (AWE) models that map speech segments to a space where matching
words have similar vectors. We specifically use a multilingual AWE model
trained on labelled data from well-resourced languages to spot keywords in data
in the unseen target language. In contrast to ASR, the AWE approach only
requires a few keyword exemplars. In controlled experiments on Wolof and
Swahili where training and test data are from the same domain, an ASR model
trained on just five minutes of data outperforms the AWE approach. But in an
in-the-wild test on Swahili radio broadcasts with actual hate speech keywords,
the AWE model (using one minute of template data) is more robust, giving
similar performance to an ASR system trained on 30 hours of labelled data.Comment: Accepted to Interspeech 202
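The AWE-based keyword spotting described above can be sketched as a nearest-template search: embed the keyword exemplars and the candidate speech segments with the (multilingual) AWE model, then flag segments whose embedding lies close to any template. The embedding model itself is out of scope here; the threshold value and function names are illustrative, not the paper's.

```python
import numpy as np

def spot_keywords(segment_embs, template_embs, threshold=0.7):
    """Flag segments whose AWE embedding matches a keyword template.

    segment_embs:  (num_segments, dim) embeddings of candidate segments
    template_embs: (num_templates, dim) embeddings of keyword exemplars
    Returns a boolean mask over segments (True = keyword detected).
    """
    def normalise(x):
        return x / np.linalg.norm(x, axis=1, keepdims=True)

    # cosine similarity between every segment and every template
    sims = normalise(segment_embs) @ normalise(template_embs).T
    # a segment is a hit if it is close enough to its best-matching template
    return sims.max(axis=1) >= threshold

templates = np.array([[1.0, 0.0], [0.9, 0.1]])   # exemplars of one keyword
segments = np.array([[0.8, 0.1], [0.0, 1.0]])    # candidate speech segments
print(spot_keywords(segments, templates))  # [ True False]
```

Only the handful of keyword exemplars needs to come from the target language, which is why this approach requires so much less labelled data than training an ASR system.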
Many-Objective Optimization for Diverse Image Generation
In image generation, where diversity is critical, users can express their preferences by choosing among several proposals, so the image generation system can be refined to satisfy their needs. In this paper, we focus on multi-objective optimization as a tool for proposing diverse solutions. Multi-objective optimization is the area of research that deals with optimizing several objective functions simultaneously; in particular, it provides numerous solutions corresponding to trade-offs between the different objective functions. The goal is to have enough diversity and quality to satisfy the user. However, in computer vision, the choice of objective functions is part of the problem: typically, we have several criteria, and their mixture approximates what we need. We propose a criterion for quantifying performance in multi-objective optimization based on cross-validation: when optimizing n−1 of the n criteria, the Pareto front should include at least one good solution for the removed n-th criterion. After providing evidence for the validity and usefulness of the proposed criterion, we show that the diversity provided by multi-objective optimization is helpful in diverse image generation, namely super-resolution and inspirational generation.