Learning Articulated Motions From Visual Demonstration
Many functional elements of human homes and workplaces consist of rigid
components which are connected through one or more sliding or rotating
linkages. Examples include doors and drawers of cabinets and appliances;
laptops; and swivel office chairs. A robotic mobile manipulator would benefit
from the ability to acquire kinematic models of such objects from observation.
This paper describes a method by which a robot can acquire an object model by
capturing depth imagery of the object as a human moves it through its range of
motion. We envision that, in the future, a machine newly introduced to an
environment could be shown by its human user the articulated objects particular
to that environment, inferring from these "visual demonstrations" enough
information to actuate each object independently of the user.
Our method employs sparse (markerless) feature tracking, motion segmentation,
component pose estimation, and articulation learning; it does not require prior
object models. Using the method, a robot can observe an object being exercised,
infer a kinematic model incorporating rigid, prismatic and revolute joints,
then use the model to predict the object's motion from a novel vantage point.
We evaluate the method's performance, and compare it to that of a previously
published technique, for a variety of household objects.
Comment: Published in Robotics: Science and Systems X, Berkeley, CA. ISBN: 978-0-9923747-0-
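The joint-type inference step can be sketched with a toy classifier: given the relative motion of one rigid part with respect to another, near-zero rotation with significant translation suggests a prismatic joint, while significant rotation suggests a revolute one. The input format and thresholds below are illustrative assumptions only, not the paper's actual estimator, which works from markerless feature tracks and full component pose estimates.

```python
import numpy as np

def classify_joint(translations, rotation_angles_deg, tol_t=0.01, tol_r=2.0):
    """Classify a linkage from the observed relative motion of one rigid part.

    translations: (N, 3) positions of the part relative to the base part.
    rotation_angles_deg: (N,) rotation of the part about its best-fit axis.
    tol_t (metres) and tol_r (degrees) are illustrative thresholds.
    """
    t_range = np.ptp(np.linalg.norm(translations, axis=1))
    r_range = np.ptp(rotation_angles_deg)
    if r_range > tol_r:        # orientation changes -> hinge
        return "revolute"
    if t_range > tol_t:        # pure translation -> slider
        return "prismatic"
    return "rigid"

# A drawer being pulled out: translation along one axis, no rotation.
drawer_t = np.array([[0.0, 0.0, 0.0], [0.1, 0.0, 0.0], [0.2, 0.0, 0.0]])
drawer_r = np.array([0.0, 0.0, 0.0])
print(classify_joint(drawer_t, drawer_r))  # prismatic
```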
Building with Drones: Accurate 3D Facade Reconstruction using MAVs
Automatic reconstruction of 3D models from images using multi-view
Structure-from-Motion methods has been one of the most fruitful outcomes of
computer vision. These advances, combined with the growing popularity of Micro
Aerial Vehicles as an autonomous imaging platform, have made 3D vision tools
ubiquitous for a large number of Architecture, Engineering and Construction
applications, among audiences mostly unskilled in computer vision. However, to
obtain high-resolution and accurate reconstructions from a large-scale object
using SfM, there are many critical constraints on the quality of image data,
which often become sources of inaccuracy, as current 3D reconstruction
pipelines do not allow users to assess the fidelity of the input data
during image acquisition. In this paper, we present and advocate a
closed-loop interactive approach that performs incremental reconstruction in
real-time and gives users online feedback on quality parameters, such as
Ground Sampling Distance (GSD) and image redundancy, on a surface mesh. We
also propose a novel multi-scale camera network design to prevent scene drift
caused by incremental map building, and release the first multi-scale image
sequence dataset as a benchmark. Further, we evaluate our system on real
outdoor scenes, and show that our interactive pipeline combined with a
multi-scale camera network approach provides compelling accuracy in multi-view
reconstruction tasks when compared against the state-of-the-art methods.
Comment: 8 Pages, 2015 IEEE International Conference on Robotics and Automation (ICRA '15), Seattle, WA, US
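The Ground Sampling Distance reported as feedback has a standard closed form under the pinhole model: the scene footprint of one pixel given the sensor geometry and the camera-to-surface distance. A minimal sketch; the camera parameter values in the example are hypothetical, not taken from the paper:

```python
def ground_sampling_distance(sensor_width_mm, focal_length_mm,
                             distance_m, image_width_px):
    """Metres of scene covered by one image pixel at the given
    camera-to-surface distance (pinhole model)."""
    pixel_size_m = (sensor_width_mm / 1000.0) / image_width_px
    return pixel_size_m * distance_m / (focal_length_mm / 1000.0)

# Hypothetical MAV camera: 6.17 mm sensor, 4.3 mm lens, 4000 px wide image,
# flying 10 m from the facade.
gsd = ground_sampling_distance(6.17, 4.3, 10.0, 4000)
print(f"{gsd * 1000:.2f} mm/px")  # ~3.59 mm/px
```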
MOMA: Visual Mobile Marker Odometry
In this paper, we present a cooperative odometry scheme based on the
detection of mobile markers in line with the idea of cooperative positioning
for multiple robots [1]. To this end, we introduce a simple optimization scheme
that realizes visual mobile marker odometry via accurate fixed marker-based
camera positioning and analyse the characteristics of errors inherent to the
method compared to classical fixed marker-based navigation and visual odometry.
In addition, we provide a specific UAV-UGV configuration that allows for
continuous movement of the UAV without intermediate stops, as well as a minimal
caterpillar-like configuration that works with a single UGV. Finally, we
present a real-world implementation and evaluation of the proposed UAV-UGV
configuration.
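The fixed marker-based camera positioning this scheme builds on can be sketched as a pose composition: if a marker's world pose is known and the detector reports the marker's pose in the camera frame, the camera's world pose follows by inverting and composing the transforms. A minimal sketch with 4x4 homogeneous transforms; the helper and example values are illustrative:

```python
import numpy as np

def pose(t):
    """4x4 homogeneous transform with identity rotation (for the example)."""
    T = np.eye(4)
    T[:3, 3] = t
    return T

def camera_pose_from_marker(T_world_marker, T_cam_marker):
    """World pose of the camera, given a marker with a known world pose
    and the marker's pose as detected in the camera frame."""
    return T_world_marker @ np.linalg.inv(T_cam_marker)

# Marker fixed at x = 1 m in the world; detector sees it 2 m ahead of the camera.
T_wc = camera_pose_from_marker(pose([1.0, 0.0, 0.0]), pose([0.0, 0.0, 2.0]))
print(T_wc[:3, 3])  # camera sits at [1, 0, -2] in world coordinates
```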
Improving the alignment of RGB-D images using fiducial markers
3D reconstruction is the creation of three-dimensional models from the captured
shape and appearance of real objects. It is a field that has its roots in
several areas within computer vision and graphics, and has gained high importance
in others, such as architecture, robotics, autonomous driving, medicine,
and archaeology. Most of the current model acquisition technologies are
based on LiDAR, RGB-D cameras, and image-based approaches such as visual
SLAM. Despite the improvements that have been achieved, methods that
rely on professional instruments and their operation result in high costs,
both capital and logistical. In this dissertation, we develop an optimization
procedure capable of enhancing 3D reconstructions created with a consumer-level
hand-held RGB-D camera, a product that is widely available, easily handled,
and whose interface is familiar to the average smartphone user, through the use
of fiducial markers placed in the environment. Additionally, a tool was
developed to allow the removal of said fiducial markers from the texture of the
scene, as a complement to mitigate a downside of the approach taken, but
that may prove useful in other contexts.
Master's dissertation in Computer and Telematics Engineering
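One common way to align two RGB-D views that share fiducial markers is to take the corresponding 3D marker corners seen in each view and solve for the rigid transform between them in closed form (the Kabsch method). The dissertation's actual optimization procedure is not detailed here, so the following is only a generic sketch of that building block, with made-up corner coordinates:

```python
import numpy as np

def rigid_transform(src, dst):
    """Least-squares R, t with dst ≈ R @ src + t (Kabsch, no scale)."""
    c_src, c_dst = src.mean(axis=0), dst.mean(axis=0)
    H = (src - c_src).T @ (dst - c_dst)       # cross-covariance matrix
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))    # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = c_dst - R @ c_src
    return R, t

# Marker corners seen from view A, and the same corners from view B,
# which is rotated 90 degrees about z and shifted.
corners_a = np.array([[0, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 1]], float)
Rz = np.array([[0, -1, 0], [1, 0, 0], [0, 0, 1]], float)
t_true = np.array([0.5, -0.2, 1.0])
corners_b = corners_a @ Rz.T + t_true
R, t = rigid_transform(corners_a, corners_b)
print(np.allclose(R, Rz) and np.allclose(t, t_true))  # True
```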
NeuralMarker: A Framework for Learning General Marker Correspondence
We tackle the problem of estimating correspondences from a general marker,
such as a movie poster, to an image that captures such a marker.
Conventionally, this problem is addressed by fitting a homography model based
on sparse feature matching. However, such methods can only handle plane-like
markers, and sparse features do not sufficiently exploit appearance
information. In this paper, we propose NeuralMarker, a novel framework that trains
a neural network estimating dense marker correspondences under various
challenging conditions, such as marker deformation, harsh lighting, etc.
Besides, we also propose a novel marker correspondence evaluation method that
circumvents the need for annotations on real marker-image pairs, and create a new
benchmark. We show that NeuralMarker significantly outperforms previous methods
and enables new interesting applications, including Augmented Reality (AR) and
video editing.
Comment: Accepted by ToG (SIGGRAPH Asia 2022). Project Page: https://drinkingcoder.github.io/publication/neuralmarker
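The conventional baseline mentioned above, fitting a homography from sparse matches, is commonly implemented with the Direct Linear Transform. A minimal sketch under exact correspondences (a practical pipeline would add coordinate normalization and RANSAC); the example matrix and points are made up:

```python
import numpy as np

def fit_homography(src, dst):
    """Direct Linear Transform: H mapping src -> dst in homogeneous
    coordinates, from at least 4 point correspondences."""
    A = []
    for (x, y), (u, v) in zip(src, dst):
        A.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        A.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    _, _, Vt = np.linalg.svd(np.asarray(A, float))
    H = Vt[-1].reshape(3, 3)          # null vector = stacked rows of H
    return H / H[2, 2]

def apply_h(H, pts):
    """Apply a homography to 2-D points."""
    q = np.c_[pts, np.ones(len(pts))] @ H.T
    return q[:, :2] / q[:, 2:3]

H_true = np.array([[1.0, 0.2, 3.0], [0.1, 1.1, -2.0], [0.001, 0.002, 1.0]])
src = np.array([[0, 0], [1, 0], [0, 1], [1, 1], [2, 3]], float)
H = fit_homography(src, apply_h(H_true, src))
print(np.allclose(H, H_true))  # True
```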
Mixed marker-based/marker-less visual odometry system for mobile robots
When moving in generic indoor environments, robotic platforms generally rely solely on information provided by onboard sensors to determine their position and orientation. However, the lack of absolute references often introduces severe drift into the computed estimates, making autonomous operations hard to accomplish. This paper proposes a solution that alleviates the impact of these issues by combining two vision-based pose estimation techniques, working in relative and absolute coordinate systems, respectively. In particular, unknown ground features in the images captured by the vertical camera of a mobile platform are processed by a vision-based odometry algorithm that estimates relative frame-to-frame movements. Errors accumulated in this step are then corrected using artificial markers placed at known positions in the environment. The markers are framed from time to time, which allows the robot to keep drift bounded while additionally providing it with the navigation commands needed for autonomous flight. The accuracy and robustness of the designed technique are demonstrated using an off-the-shelf quadrotor via extensive experimental tests.
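The relative/absolute combination can be caricatured in a few lines: integrate frame-to-frame odometry increments, and overwrite the accumulated estimate whenever a marker at a known position supplies an absolute fix. This 2-D sketch illustrates the idea only and is not the paper's estimator, which works with full pose estimates:

```python
class MarkerCorrectedOdometry:
    """Accumulate relative frame-to-frame motion in the absolute frame;
    snap to ground truth whenever a fixed marker at a known world
    position is observed, bounding the accumulated drift."""

    def __init__(self):
        self.x, self.y = 0.0, 0.0

    def integrate(self, dx, dy):
        # Relative update from the ground-facing camera (drifts over time).
        self.x += dx
        self.y += dy

    def absolute_fix(self, marker_x, marker_y):
        # Absolute update: the marker's known position overrides the estimate.
        self.x, self.y = marker_x, marker_y

odo = MarkerCorrectedOdometry()
for _ in range(10):
    odo.integrate(0.102, 0.001)   # ~2% drift per 10 cm step
odo.absolute_fix(1.0, 0.0)        # marker known to sit at (1, 0)
print(odo.x, odo.y)  # 1.0 0.0
```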
Development and characterization of methodology and technology for the alignment of fMRI time series
This dissertation has developed, implemented, and tested a novel computer-based system (AUTOALIGN) that incorporates an algorithm for the alignment of functional Magnetic Resonance Imaging (fMRI) time series. The algorithm assumes the human brain to be a rigid body and computes a head coordinate system on the basis of three reference points that lie along the directions of two of the volume's eigenvectors of inertia, at their intersections with the head boundary. The eigenvectors are found by weighting the inertia components with the voxel's intensity values taken as mass. The three reference points are found at the same position, relative to the origin of the head coordinate system, in both test and reference brain images. Intensity correction is performed at sub-voxel accuracy by trilinear interpolation. A test fMRI brain volume with controlled simulations of rigid-body transformations preliminarily assessed system performance. Further experimentation was conducted with real fMRI time series. Rigid-body transformations were retrieved automatically, and the values of the motion parameters were compared to those obtained with Statistical Parametric Mapping (SPM99) and Automatic Image Registration (AIR 3.08). Results indicated that AUTOALIGN offers sub-voxel accuracy in correcting both misalignment and intensity differences among time points in fMRI time series, and that its performance is comparable to that of SPM99 and AIR 3.08.
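The sub-voxel intensity correction mentioned above relies on trilinear interpolation, which blends the eight voxels surrounding a continuous coordinate with weights given by the fractional offsets. A minimal sketch:

```python
import numpy as np

def trilinear(vol, p):
    """Intensity at a continuous voxel coordinate p = (x, y, z),
    blending the 8 surrounding voxels."""
    x, y, z = p
    x0, y0, z0 = int(np.floor(x)), int(np.floor(y)), int(np.floor(z))
    dx, dy, dz = x - x0, y - y0, z - z0
    value = 0.0
    for i in (0, 1):
        for j in (0, 1):
            for k in (0, 1):
                w = ((dx if i else 1 - dx) *
                     (dy if j else 1 - dy) *
                     (dz if k else 1 - dz))
                value += w * vol[x0 + i, y0 + j, z0 + k]
    return value

# On a volume whose intensity is x + y + z, interpolation is exact.
vol = np.fromfunction(lambda x, y, z: x + y + z, (3, 3, 3))
print(trilinear(vol, (0.5, 0.5, 0.5)))  # 1.5
```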
Integration of motion-based localization into the EduPARK mobile application
More and more, mobile applications require precise localization solutions in a variety of environments. Although GPS is widely used as a localization solution, it may present accuracy problems in special conditions, such as unfavorable weather or spaces with multiple obstructions like public parks. For these scenarios, alternative solutions to GPS are of extreme relevance and have been widely studied recently. This dissertation studies the case of the EduPARK application, an augmented reality application deployed in the Infante D. Pedro park in Aveiro. Due to the poor accuracy of GPS in this park, implementing positioning and marker-less augmented reality functionalities is difficult. Existing relevant systems are analyzed, and an architecture based on pedestrian dead reckoning is proposed. The corresponding implementation is presented, consisting of a positioning solution that uses the sensors available in smartphones: a step detection algorithm, a distance-traveled estimator, an orientation estimator, and a position estimator. To validate this solution, test functionalities were implemented in the EduPARK application and usability tests were performed. The results obtained show that the proposed solution can be an alternative for accurate positioning within the Infante D. Pedro park, thus enabling the implementation of geocaching and marker-less augmented reality functionalities.
EduPARK is a project funded by FEDER funds through the Competitiveness and Internationalisation Operational Programme (COMPETE 2020) and by national funds through FCT - Fundação para a Ciência e a Tecnologia, under project POCI-01-0145-FEDER-016542. Master's dissertation in Informatics Engineering.
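The pedestrian dead reckoning pipeline described above (step detection, distance traveled, heading, position update) can be caricatured in a few lines; the threshold and stride values below are illustrative assumptions, not those of the dissertation:

```python
import math

def count_steps(accel_magnitudes, threshold=11.0):
    """Naive step detector: one step per upward crossing of the
    acceleration-magnitude threshold (m/s^2)."""
    steps, above = 0, False
    for a in accel_magnitudes:
        if a > threshold and not above:
            steps += 1
            above = True
        elif a <= threshold:
            above = False
    return steps

def pdr_update(x, y, heading_rad, stride_m):
    """Advance the 2-D position by one stride along the current heading."""
    return (x + stride_m * math.cos(heading_rad),
            y + stride_m * math.sin(heading_rad))

# Two peaks in the accelerometer magnitude -> two 0.7 m strides heading east.
n = count_steps([9.8, 12.3, 9.8, 12.1, 9.8])
x, y = 0.0, 0.0
for _ in range(n):
    x, y = pdr_update(x, y, 0.0, 0.7)
print(n, x, y)  # 2 1.4 0.0
```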