Quality-Driven video analysis for the improvement of foreground segmentation
Unpublished doctoral thesis, defended at the Universidad Autónoma de Madrid, Escuela Politécnica Superior, Departamento de Tecnología Electrónica y de las Comunicaciones, on 15-06-2018. It was partially supported by the Spanish Government (TEC2014-53176-R, HAVideo).
LEA: Improving Sentence Similarity Robustness to Typos Using Lexical Attention Bias
Textual noise, such as typos or abbreviations, is a well-known issue that
penalizes vanilla Transformers for most downstream tasks. We show that this is
also the case for sentence similarity, a fundamental task in multiple domains,
e.g. matching, retrieval or paraphrasing. Sentence similarity can be approached
using cross-encoders, where the two sentences are concatenated in the input
allowing the model to exploit the inter-relations between them. Previous works
addressing the noise issue mainly rely on data augmentation strategies, showing
improved robustness when dealing with corrupted samples that are similar to the
ones used for training. However, all these methods still suffer from the token
distribution shift induced by typos. In this work, we propose to tackle textual
noise by equipping cross-encoders with a novel LExical-aware Attention module
(LEA) that incorporates lexical similarities between words in both sentences.
By using raw text similarities, our approach avoids the tokenization shift
problem and thereby improves robustness. We demonstrate that the attention bias
introduced by LEA helps cross-encoders to tackle complex scenarios with textual
noise, especially in domains with short-text descriptions and limited context.
Experiments using three popular Transformer encoders on five e-commerce
datasets for product matching show that LEA consistently boosts performance
under the presence of noise, while remaining competitive on the original
(clean) splits. We also evaluate our approach on two datasets for textual
entailment and paraphrasing, showing that LEA is robust to typos in domains with
longer sentences and more natural context. Additionally, we thoroughly analyze
several design choices in our approach, providing insights about the impact of
the decisions made and fostering future research in cross-encoders dealing with
typos. Comment: KDD'23 conference (main research track). (*) These authors
contributed equally.
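The lexical attention bias described in the abstract above can be sketched with a toy example. Here a raw-text word similarity (a character n-gram Jaccard measure, an illustrative stand-in for the paper's exact lexical similarity) is turned into a bias matrix that is added to the attention logits before the softmax:

```python
import numpy as np

def char_ngram_sim(w1, w2, n=3):
    # Jaccard similarity over character n-grams; a simple stand-in for
    # the raw-text lexical similarity the abstract describes.
    g1 = {w1[i:i + n] for i in range(max(1, len(w1) - n + 1))}
    g2 = {w2[i:i + n] for i in range(max(1, len(w2) - n + 1))}
    return len(g1 & g2) / len(g1 | g2)

def lexical_bias(sent_a, sent_b):
    # Bias matrix: entry (i, j) rewards attention between lexically
    # similar words across the two sentences of the cross-encoder input.
    return np.array([[char_ngram_sim(a, b) for b in sent_b] for a in sent_a])

def biased_attention(logits, bias, scale=1.0):
    # Add the lexical bias to the raw attention logits, then softmax.
    # `scale` is a hypothetical knob, not the paper's parameterization.
    z = logits + scale * bias
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)
```

Because the similarity is computed on raw text rather than on tokens, a typo such as "lapttop" still scores high against "laptop", which is the robustness property the abstract claims.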
Long-Term Stationary Object Detection Based on Spatio-Temporal Change Detection
Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. D. Ortego, J. C. SanMiguel and J. M. Martínez, "Long-Term Stationary Object Detection Based on Spatio-Temporal Change Detection," in IEEE Signal Processing Letters, vol. 22, no. 12, pp. 2368-2372, Dec. 2015. doi: 10.1109/LSP.2015.2482598

We present a block-wise approach to detect stationary objects based on spatio-temporal change detection. First, block candidates are extracted by filtering out consecutive blocks containing moving objects. Then, an online clustering approach groups similar blocks at each spatial location over time via the statistical variation of pixel ratios. Stability changes are identified by analyzing the relationships between the most repeated clusters at regular sampling instants. Finally, stationary objects are detected as those stability changes that exceed an alarm time and have not been visualized before. Unlike previous approaches that rely on background subtraction, the proposed approach does not require foreground segmentation and provides robustness to illumination changes, crowds and intermittent object motion. Experiments over a heterogeneous dataset demonstrate the ability of the proposed approach for short- and long-term operation while overcoming challenging issues. This work was partially supported by the Spanish Government (HAVideo, TEC2014-53176-R) and by the TEC department (UAM).
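The per-block stability analysis described above can be sketched as follows. The similarity statistic, thresholds, and alarm length here are illustrative stand-ins for the paper's pixel-ratio statistics and sampling scheme: each block appearance is assigned to an online cluster, and an alarm fires when a non-background cluster stays dominant beyond the alarm time:

```python
import numpy as np

class BlockStability:
    # Minimal sketch of one spatial location's analysis: cluster similar
    # block appearances over time and flag a stationary-object alarm when
    # a new cluster persists beyond `alarm_frames`. The similarity measure
    # and thresholds are illustrative choices, not the paper's exact ones.
    def __init__(self, sim_thresh=0.9, alarm_frames=50):
        self.sim_thresh = sim_thresh
        self.alarm_frames = alarm_frames
        self.clusters = []        # list of (centroid, count)
        self.background = None    # index of the initial dominant cluster
        self.current = None       # cluster observed in the latest frame
        self.run = 0              # consecutive frames on `current`

    def _similarity(self, a, b):
        # Fraction of pixels that agree between two blocks.
        return np.mean(np.abs(a - b) < 0.1)

    def update(self, block):
        # Assign the block to an existing cluster or start a new one.
        for i, (c, n) in enumerate(self.clusters):
            if self._similarity(block, c) >= self.sim_thresh:
                self.clusters[i] = ((c * n + block) / (n + 1), n + 1)
                idx = i
                break
        else:
            self.clusters.append((block.copy(), 1))
            idx = len(self.clusters) - 1
        if self.background is None:
            self.background = idx
        self.run = self.run + 1 if idx == self.current else 1
        self.current = idx
        # Alarm: a non-background cluster has been stable long enough.
        return idx != self.background and self.run >= self.alarm_frames
```

Note that, as in the abstract, no foreground segmentation is involved: the decision depends only on which appearance cluster dominates a block over time.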
Rejection based multipath reconstruction for background estimation in video sequences with stationary objects
This is the author's version of a work that was accepted for publication in Computer Vision and Image Understanding. Changes resulting from the publishing process, such as peer review, editing, corrections, structural formatting, and other quality control mechanisms may not be reflected in this document. Changes may have been made to this work since it was submitted for publication. A definitive version was subsequently published in Computer Vision and Image Understanding, vol. 147 (2016). DOI 10.1016/j.cviu.2016.03.012

Background estimation in video consists of extracting a foreground-free image from a set of training frames. Moving and stationary objects may affect the background visibility, invalidating the assumption, common in the related literature, that the background is the temporally dominant data. In this paper, we present a temporal-spatial block-level approach for background estimation in video that copes with moving and stationary objects. First, a Temporal Analysis module obtains a compact representation of the training data by motion filtering and dimensionality reduction. Then, a threshold-free hierarchical clustering determines a set of candidates to represent the background for each spatial location (block). Second, a Spatial Analysis module iteratively reconstructs the background using these candidates. For each spatial location, multiple reconstruction hypotheses (paths) are explored from its neighboring locations, enforcing inter-block similarity and intra-block homogeneity constraints in terms of color discontinuity, color dissimilarity and variability. The experimental results show that the proposed approach outperforms the related state of the art on challenging video sequences in the presence of moving and stationary objects. This work was partially supported by the Spanish Government (HAVideo, TEC2014-53176-R) and by the TEC department (Universidad Autónoma de Madrid).
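A minimal sketch of the spatial reconstruction step, assuming a single left-to-right path and mean-color dissimilarity as the inter-block constraint (both heavy simplifications of the multi-path, multi-constraint scheme described above):

```python
import numpy as np

def reconstruct_background(candidates, seed_idx=0):
    # Greedy sketch of spatial reconstruction: for each block (left to
    # right here, for simplicity), choose the candidate most similar in
    # mean color to the already-reconstructed neighbor, i.e. a crude
    # version of the inter-block similarity constraint in the abstract.
    # `candidates[b]` is a list of candidate block appearances for block b.
    chosen = [candidates[0][seed_idx]]
    for cands in candidates[1:]:
        prev = chosen[-1]
        # Color dissimilarity to the neighboring reconstructed block.
        dists = [abs(float(np.mean(c)) - float(np.mean(prev))) for c in cands]
        chosen.append(cands[int(np.argmin(dists))])
    return chosen
```

In this toy version a single seed block fixes the path; the method in the abstract instead explores several such paths and scores them with the full set of homogeneity constraints.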
Unsupervised Contrastive Learning of Sound Event Representations
Self-supervised representation learning can mitigate the limitations in
recognition tasks with few manually labeled data but abundant unlabeled
data---a common scenario in sound event research. In this work, we explore
unsupervised contrastive learning as a way to learn sound event
representations. To this end, we propose to use the pretext task of contrasting
differently augmented views of sound events. The views are computed primarily
via mixing of training examples with unrelated backgrounds, followed by other
data augmentations. We analyze the main components of our method via ablation
experiments. We evaluate the learned representations using linear evaluation,
and in two in-domain downstream sound event classification tasks, namely, using
limited manually labeled data, and using noisy labeled data. Our results
suggest that unsupervised contrastive pre-training can mitigate the impact of
data scarcity and increase robustness against noisy labels, outperforming
supervised baselines. Comment: A 4-page version is submitted to ICASSP 202
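The pretext task described above, contrasting differently augmented views where mixing with an unrelated background is the main augmentation, can be sketched as follows. The mixing weight and temperature are illustrative defaults, and the loss shown is the standard NT-Xent contrastive formulation, which may differ in detail from the paper's exact objective:

```python
import numpy as np

def mix_with_background(clip, background, weight=0.25):
    # Pretext-task augmentation: mix a training example with an unrelated
    # background clip. `weight` is an illustrative mixing coefficient.
    return (1.0 - weight) * clip + weight * background

def nt_xent(z1, z2, temperature=0.1):
    # Standard normalized-temperature cross-entropy contrastive loss over
    # two batches of embeddings whose matching rows are positive pairs.
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    z = np.concatenate([z1, z2], axis=0)       # (2N, d)
    sim = z @ z.T / temperature                # pairwise similarities
    np.fill_diagonal(sim, -np.inf)             # exclude self-pairs
    n = len(z1)
    targets = np.concatenate([np.arange(n, 2 * n), np.arange(n)])
    m = sim.max(axis=1, keepdims=True)
    lse = (np.log(np.exp(sim - m).sum(axis=1, keepdims=True)) + m).ravel()
    return float(np.mean(lse - sim[np.arange(2 * n), targets]))
```

Correctly matched views yield a lower loss than mismatched ones, which is what drives the encoder to produce augmentation-invariant sound event representations.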
Reliable Label Bootstrapping for Semi-Supervised Learning
Reducing the amount of labels required to train convolutional neural networks
without performance degradation is key to effectively reduce human annotation
efforts. We propose Reliable Label Bootstrapping (ReLaB), an unsupervised
preprocessing algorithm which improves the performance of semi-supervised
algorithms in extremely low supervision settings. Given a dataset with few
labeled samples, we first learn meaningful self-supervised latent features for
the data. Second, a label propagation algorithm propagates the known labels on
the unsupervised features, effectively labeling the full dataset in an
automatic fashion. Third, we select a subset of correctly labeled (reliable)
samples using a label noise detection algorithm. Finally, we train a
semi-supervised algorithm on the extended subset. We show that the selection of
the network architecture and the self-supervised algorithm are important
factors to achieve successful label propagation and demonstrate that ReLaB
substantially improves semi-supervised learning in scenarios of very limited
supervision on CIFAR-10, CIFAR-100 and mini-ImageNet. We reach low average
error rates with a single random labeled sample per class on CIFAR-10, and
lower this error further when the labeled sample in each class is highly
representative. Our work is fully reproducible:
https://github.com/PaulAlbert31/ReLaB. Comment: 10 pages, 3 figures
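The label-propagation stage described above (spreading the few known labels over self-supervised features) can be sketched as a simple graph diffusion. The k-NN graph construction and the hyperparameters below are illustrative assumptions, not ReLaB's exact configuration:

```python
import numpy as np

def propagate_labels(features, labeled_idx, labels, k=3, steps=20, alpha=0.9):
    # Sketch of label propagation: spread the few known labels over a
    # k-NN graph built on self-supervised features. `k`, `steps` and
    # `alpha` (diffusion vs. clamping trade-off) are illustrative defaults.
    n = len(features)
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    sim = f @ f.T
    np.fill_diagonal(sim, -np.inf)             # no self-edges
    # Symmetric affinity matrix from k-nearest neighbours.
    W = np.zeros((n, n))
    for i in range(n):
        nn = np.argsort(sim[i])[-k:]
        W[i, nn] = np.maximum(sim[i, nn], 0.0)
    W = np.maximum(W, W.T)
    P = W / (W.sum(axis=1, keepdims=True) + 1e-12)  # row-normalized transitions
    n_classes = int(labels.max()) + 1
    Y = np.zeros((n, n_classes))
    Y[labeled_idx, labels] = 1.0               # one-hot seeds
    F = Y.copy()
    for _ in range(steps):
        F = alpha * (P @ F) + (1 - alpha) * Y  # diffusion with clamped seeds
    return F.argmax(axis=1)
```

In the full pipeline the propagated labels are then filtered with a label-noise detector before training the semi-supervised learner, so only the reliable subset is kept.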