
    Automatic enhancement of noisy image sequences through local spatio-temporal spectrum analysis

    Contains: 13 illustrations, 2 tables, and formulas. A fully automatic method is proposed to produce an enhanced image from a very noisy sequence consisting of a translating object over a background with a different translation motion. The method is based on averaging registered versions of the frames in which the object has been motion compensated. Conventional techniques for displacement estimation are not adequate for these very noisy sequences, so a new strategy is used that takes advantage of the simple model of the sequences. First, the local spatio-temporal spectrum is estimated through a bank of multidirectional/multiscale third-order Gaussian derivative filters, yielding a representation of the sequence that facilitates further processing and analysis tasks. Then, energy-related measurements describing the local texture and motion are easily extracted from this representation. These descriptors are used to segment the sequence based on a local joint measure of motion and texture. Once the object of interest has been segmented, its velocity is estimated by applying the gradient constraint to the output of a directional band-pass filter for all pixels belonging to the object. Velocity estimates are then used to compensate the motion prior to averaging. The results obtained with real sequences of moving ships taken under very noisy conditions are highly satisfactory, demonstrating the robustness and usefulness of the proposed method. Supported by the Comisión Interministerial de Ciencia y Tecnología of Spain, grant TIC98-0925-C02-01. Peer reviewed.
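
    The motion-compensation step can be illustrated with a minimal sketch (not the authors' implementation): assuming a single translating object with a binary mask in the reference frame, the gradient constraint Ix·vx + Iy·vy + It = 0 is solved in the least-squares sense over the object pixels, and each frame is shifted back before averaging. The filter bank, segmentation, and directional band-pass prefiltering are omitted; all names below are illustrative.

        import numpy as np
        from scipy.ndimage import shift as nd_shift

        def estimate_velocity(prev, curr, mask):
            """Least-squares solution of the gradient (brightness-constancy)
            constraint Ix*vx + Iy*vy + It = 0 over the masked object pixels."""
            Iy, Ix = np.gradient(curr)          # spatial derivatives (rows = y, cols = x)
            It = curr - prev                    # temporal derivative (frame difference)
            A = np.stack([Ix[mask], Iy[mask]], axis=1)
            b = -It[mask]
            v, *_ = np.linalg.lstsq(A, b, rcond=None)
            return v                            # (vx, vy) in pixels/frame

        def motion_compensated_average(frames, mask):
            """Register every frame to the first one using the accumulated
            object velocity, then average the registered frames.
            mask: object pixels in the reference frame (assumed roughly valid
            for small inter-frame displacements)."""
            acc = frames[0].astype(float).copy()
            disp = np.zeros(2)
            for prev, curr in zip(frames[:-1], frames[1:]):
                disp += estimate_velocity(prev.astype(float), curr.astype(float), mask)
                # shift takes (rows, cols) = (-vy, -vx) to undo the object motion
                acc += nd_shift(curr.astype(float), (-disp[1], -disp[0]), order=1)
            return acc / len(frames)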

    Graph-Optimization based multi-sensor fusion for robust UAV pose estimation

    Achieving accurate, high-rate pose estimates from proprioceptive and/or exteroceptive measurements is the first step in the development of navigation algorithms for agile mobile robots such as Unmanned Aerial Vehicles (UAVs). In this paper, we propose a decoupled multi-sensor fusion approach that allows the combination of generic 6D visual-inertial (VI) odometry poses and 3D globally referenced positions to infer the global 6D pose of the robot in real-time. Our approach casts the fusion as a real-time alignment problem between the local base frame of the VI odometry and the global base frame. The quasi-constant alignment transformation that relates these coordinate systems is continuously updated by employing graph-based optimization with a sliding window. We evaluate the presented pose estimation method on both simulated data and large outdoor experiments using a small UAV that is capable of running our system onboard. Results are compared against different state-of-the-art sensor fusion frameworks, revealing that the proposed approach is substantially more accurate than other decoupled fusion strategies. We also demonstrate comparable results in relation to a finely tuned Extended Kalman Filter that fuses visual, inertial and GPS measurements in a coupled way, and show that our approach is generic enough to deal with different input sources, as well as able to run in real-time.
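
    The frame-alignment idea can be illustrated with a minimal sketch (a simplification, not the paper's sliding-window graph optimization): given corresponding VI-odometry positions and globally referenced positions collected over a window, a rigid transformation between the two frames can be recovered in closed form and used to map the latest local pose into the global frame. The function name and the toy data below are illustrative.

        import numpy as np

        def align_frames(local_pts, global_pts):
            """Closed-form rigid alignment (Kabsch/Umeyama without scale) between
            the local odometry frame and the global frame, using corresponding
            3D positions collected over a sliding window."""
            mu_l, mu_g = local_pts.mean(0), global_pts.mean(0)
            H = (local_pts - mu_l).T @ (global_pts - mu_g)
            U, _, Vt = np.linalg.svd(H)
            D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
            R = Vt.T @ D @ U.T                 # rotation: local -> global
            t = mu_g - R @ mu_l                # translation: local -> global
            return R, t

        # Sliding-window usage: keep the last N odometry positions together with
        # their globally referenced counterparts and refresh the alignment.
        window_local = np.random.rand(20, 3)                       # placeholder VI odometry positions
        window_global = window_local + np.array([5.0, -2.0, 1.0])  # placeholder global positions
        R, t = align_frames(window_local, window_global)
        latest_global_position = R @ window_local[-1] + t          # latest pose mapped to the global frame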

    Airborne forward pointing UV Rayleigh lidar for remote clear air turbulence (CAT) detection: system design and performance

    A high-performance airborne UV Rayleigh lidar system was developed within the European project DELICAT. With its forward-pointing architecture, it aims at demonstrating a novel detection scheme for clear air turbulence (CAT) for an aeronautics safety application. Because it occurs in clear and clean air at high altitudes (aviation cruise flight level), this type of turbulence evades microwave radar techniques and, in most cases, coherent Doppler lidar techniques. The present lidar detection technique relies on measuring air density fluctuations and is thus independent of backscatter from hydrometeors and aerosol particles. The subtle air density fluctuations caused by the turbulent air flow demand exceptionally high stability of the setup and, in particular, of the detection system. This paper describes an airborne test system for the purpose of demonstrating this technology and turbulence detection method: a high-power UV Rayleigh lidar system is installed on a research aircraft in a forward-looking configuration for use at cruise flight altitudes. Flight test measurements demonstrate that this unique lidar system is able to resolve air density fluctuations occurring in light-to-moderate CAT at 5 km or moderate CAT at 10 km distance. A scaling of the determined stability and noise characteristics shows that such performance is adequate for an application in commercial air transport. Comment: 17 pages, 19 figures. Pre-published to Applied Optics (OSA).
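
    As a rough illustration of why detection range and signal stability matter (a back-of-the-envelope sketch with assumed numbers, not the DELICAT system budget): under Poisson photon statistics, the relative density fluctuation resolvable in a range gate scales as 1/sqrt(N), and for a Rayleigh-only return the detected photon count N falls off with range roughly as 1/R².

        import numpy as np

        # Illustrative, assumed parameters (NOT the DELICAT instrument values).
        N_REF = 1.0e6          # photons detected per range gate at the reference range
        R_REF = 1.0            # reference range in km

        def detected_photons(r_km):
            """Simplified Rayleigh-lidar return: photon count per range gate,
            assuming uniform air density and a pure 1/R^2 geometric falloff."""
            return N_REF * (R_REF / r_km) ** 2

        def min_relative_fluctuation(r_km):
            """Shot-noise limit: the smallest relative density fluctuation
            distinguishable at SNR = 1 is 1/sqrt(N)."""
            return 1.0 / np.sqrt(detected_photons(r_km))

        for r in (5.0, 10.0):
            print(f"R = {r:4.1f} km  ->  detectable dn/n ~ {min_relative_fluctuation(r):.2e}")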

    A Survey of Monte Carlo Tree Search Methods

    Monte Carlo tree search (MCTS) is a recently proposed search method that combines the precision of tree search with the generality of random sampling. It has received considerable interest due to its spectacular success in the difficult problem of computer Go, but has also proved beneficial in a range of other domains. This paper is a survey of the literature to date, intended to provide a snapshot of the state of the art after the first five years of MCTS research. We outline the core algorithm's derivation, impart some structure on the many variations and enhancements that have been proposed, and summarize the results from the key game and non-game domains to which MCTS methods have been applied. A number of open research questions indicate that the field is ripe for future work.
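
    A minimal sketch of the core UCT loop surveyed here (selection, expansion, simulation, backpropagation). The Game/state API assumed below (legal_moves, apply, is_terminal, reward) is illustrative rather than taken from the survey, and player-to-move bookkeeping is omitted for brevity.

        import math, random

        class Node:
            def __init__(self, state, parent=None, move=None):
                self.state, self.parent, self.move = state, parent, move
                self.children, self.untried = [], list(state.legal_moves())
                self.visits, self.value = 0, 0.0

            def ucb_child(self, c=1.41):
                # UCB1: exploit average reward, explore rarely visited children
                return max(self.children, key=lambda n: n.value / n.visits
                           + c * math.sqrt(math.log(self.visits) / n.visits))

        def mcts(root_state, iterations=1000):
            root = Node(root_state)
            for _ in range(iterations):
                node = root
                # 1. Selection: descend while fully expanded and non-terminal
                while not node.untried and node.children:
                    node = node.ucb_child()
                # 2. Expansion: add one child for an untried move
                if node.untried:
                    move = node.untried.pop(random.randrange(len(node.untried)))
                    node.children.append(Node(node.state.apply(move), node, move))
                    node = node.children[-1]
                # 3. Simulation: random playout to a terminal state
                state = node.state
                while not state.is_terminal():
                    state = state.apply(random.choice(state.legal_moves()))
                reward = state.reward()
                # 4. Backpropagation: update statistics along the path
                while node is not None:
                    node.visits += 1
                    node.value += reward
                    node = node.parent
            return max(root.children, key=lambda n: n.visits).move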

    RTGEN: A Relative Temporal Graph GENerator

    Graph management systems are emerging as an efficient solution to store and query graph-oriented data. To assess the performance of such systems and compare them, practitioners often design benchmarks that use large-scale graphs. However, such graphs either do not fit the scale requirements or are not publicly available. This has motivated a number of graph generators that produce synthetic graphs whose characteristics mimic those of real-world graphs (degree distribution, community structure, diameter, etc.). Applications, however, need to deal with temporal graphs whose topology is in constant change. Although generating static graphs has been extensively studied in the literature, generating temporal graphs has received much less attention. In this work, we propose RTGEN, a relative temporal graph generator that allows the generation of temporal graphs by controlling the evolution of the degree distribution. In particular, we propose to generate new graphs with a desired degree distribution out of existing ones while minimizing the effort needed to transform the source graph into the target. Our relative graph generation method relies on optimal transport. We extend the method to also handle the community structure of the generated graphs, which is prevalent in a number of applications. Our generation model extends the concepts of the Chung-Lu model with temporal and community-aware support. We validate our generation procedure through experiments that confirm the reliability of the generated graphs with respect to the ground-truth parameters.
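
    A minimal sketch of the Chung-Lu construction that RTGEN builds on (the temporal evolution, optimal-transport, and community extensions are not shown): each pair of nodes is connected independently with probability proportional to the product of their target degrees, so expected degrees match the prescribed distribution.

        import random

        def chung_lu_graph(target_degrees):
            """Chung-Lu random graph: edge (i, j) appears with probability
            min(1, d_i * d_j / sum(d)), so expected degrees match the targets."""
            n, total = len(target_degrees), sum(target_degrees)
            edges = []
            for i in range(n):
                for j in range(i + 1, n):
                    p = min(1.0, target_degrees[i] * target_degrees[j] / total)
                    if random.random() < p:
                        edges.append((i, j))
            return edges

        # Example: a small graph with a heavy-tailed target degree sequence.
        degrees = [10, 8, 5, 3, 2, 2, 1, 1, 1, 1]
        print(chung_lu_graph(degrees))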

    Using Synthetic Data for 3D Hand Pose Estimation

    Doctoral dissertation -- Seoul National University, Graduate School of Convergence Science and Technology, Department of Convergence Science (Intelligent Convergence Systems major), 2021.8. 양한열. 3D hand pose estimation (HPE) based on RGB images has been studied for a long time. Relevant methods have focused mainly on optimization of neural frameworks for graphically connected finger joints. RGB-based HPE models have not been easy to train because of the scarcity of RGB hand pose datasets; unlike human body pose datasets, the finger joints that span hand postures are structured delicately and exquisitely. Such structure makes accurately annotating each joint with unique 3D world coordinates difficult, which is why many conventional methods rely on synthetic data samples to cover large variations of hand postures. A synthetic dataset provides very precise ground-truth annotations and further allows control over the variety of data samples, so a learning model can be trained on a large pose space. Most of the studies, however, have performed frame-by-frame estimation based on independent static images. Synthetic visual data can provide practically infinite diversity and rich labels, while avoiding ethical issues with privacy and bias. However, for many tasks, current models trained on synthetic data generalize poorly to real data. The task of 3D human hand pose estimation is a particularly interesting example of this synthetic-to-real problem, because learning-based approaches perform reasonably well given real training data, yet labeled 3D poses are extremely difficult to obtain in the wild, limiting scalability. In this dissertation, we attempt not only to consider the appearance of a hand but also to incorporate the temporal movement information of a hand in motion into the learning framework for better 3D hand pose estimation performance, which leads to the necessity of a large-scale dataset with sequential RGB hand images. We propose a novel method that generates a synthetic dataset mimicking natural human hand movements by re-engineering the annotations of an extant static hand pose dataset into pose-flows. With the generated dataset, we train a newly proposed recurrent framework, exploiting visuo-temporal features from sequential images of synthetic hands in motion and emphasizing temporal smoothness of the estimations with a temporal consistency constraint. Our novel training strategy of detaching the recurrent layer of the framework during domain finetuning from synthetic to real allows preservation of the visuo-temporal features learned from sequential synthetic hand images. The sequentially estimated hand poses consequently produce natural and smooth hand movements, which leads to more robust estimations. We show that utilizing temporal information for 3D hand pose estimation significantly enhances general pose estimation, outperforming state-of-the-art methods in experiments on hand pose estimation benchmarks. Since a fixed dataset provides only a finite distribution of data samples, the generalization of a learned pose estimation network is limited in terms of pose, RGB, and viewpoint spaces. We further propose to augment the data automatically such that the augmented pose sampling is performed in favor of the pose estimator's generalization performance. This auto-augmentation of poses is performed within a learned feature space in order to avoid the computational burden of generating a synthetic sample at every update iteration. The proposed effort can be considered as generating and utilizing synthetic samples for network training in the feature space. This improves training efficiency by requiring fewer real data samples, and enhances both generalization across multiple dataset domains and estimation performance through the efficient augmentation.
    Research on recognizing human hand shape and pose from 2D images aims to detect the 3D locations of the individual finger joints. A hand pose consists of the finger joints, from the wrist joint through the MCP, PIP, and DIP joints, i.e., the anatomical elements that make up the human hand. Hand pose information can be exploited in many fields, and it serves as an excellent input feature for hand gesture recognition. Applying hand pose estimation to real systems requires high accuracy, real-time operation, and models light enough to run on various devices, and training the neural networks that make this possible requires large amounts of data. However, the devices that measure hand poses are fairly unstable, and images in which such devices are worn differ considerably from bare hand skin, making them unsuitable for training. To address this problem, this dissertation re-engineers and augments synthetically generated data for training, aiming at better learning performance. Although synthetic hand images have skin colors similar to real hands, their detailed textures differ greatly, so a model trained on synthetic data performs markedly worse on real hand data. To reduce the gap between the two domains, first, in order to make the network learn the structure of the human hand, hand motions were re-engineered, and after removing the temporal component used to learn the motion structure, the remaining network was fine-tuned on real hand image data, which proved highly effective; a methodology for imitating real human hand motion was proposed for this step. Second, data from the two different domains were aligned in the network feature space. In addition, instead of augmenting synthetic poses from specific data samples, a probabilistic model was set up so that poses rarely seen by the network are generated, and a structure that samples from it was proposed. This dissertation thus proposes methods that not only produce synthetic data more effectively, without the effort of collecting additional real data that is hard to annotate, but also improve pose estimation more robustly by exploiting spatial and temporal features. An automatic data augmentation methodology that lets the network find and learn the data it needs is also proposed. Combining these proposed methods further improves hand pose estimation performance.
    Contents: 1. Introduction; 2. Related Works; 3. Preliminaries: 3D Hand Mesh Model; 4. SeqHAND: RGB-sequence-based 3D Hand Pose and Shape Estimation; 5. Hand Pose Auto-Augment; 6. Conclusion; Abstract (Korean); Acknowledgements.
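
    The temporal-consistency idea can be shown with a minimal sketch (a simplified stand-in, not the dissertation's SeqHAND training code): alongside the per-frame supervision, consecutive pose estimates of a sequence are penalized for abrupt changes, encouraging smooth hand motion. Shapes and weights below are illustrative assumptions.

        import numpy as np

        def pose_loss(pred, target):
            """Per-frame supervision: mean squared error over 3D joint positions.
            pred, target: arrays of shape (T, J, 3) for T frames and J joints."""
            return np.mean((pred - target) ** 2)

        def temporal_consistency(pred):
            """Temporal smoothness term: penalize large frame-to-frame changes
            of the estimated joint positions."""
            return np.mean((pred[1:] - pred[:-1]) ** 2)

        def total_loss(pred, target, lam=0.1):
            # lam weights the smoothness constraint against the supervised term
            return pose_loss(pred, target) + lam * temporal_consistency(pred)

        # Toy usage: an 8-frame sequence of 21 hand joints.
        pred = np.random.rand(8, 21, 3)
        target = np.random.rand(8, 21, 3)
        print(total_loss(pred, target))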