
    Learning Articulated Motions From Visual Demonstration

    Full text link
    Many functional elements of human homes and workplaces consist of rigid components which are connected through one or more sliding or rotating linkages. Examples include doors and drawers of cabinets and appliances; laptops; and swivel office chairs. A robotic mobile manipulator would benefit from the ability to acquire kinematic models of such objects from observation. This paper describes a method by which a robot can acquire an object model by capturing depth imagery of the object as a human moves it through its range of motion. We envision that in future, a machine newly introduced to an environment could be shown by its human user the articulated objects particular to that environment, inferring from these "visual demonstrations" enough information to actuate each object independently of the user. Our method employs sparse (markerless) feature tracking, motion segmentation, component pose estimation, and articulation learning; it does not require prior object models. Using the method, a robot can observe an object being exercised, infer a kinematic model incorporating rigid, prismatic and revolute joints, then use the model to predict the object's motion from a novel vantage point. We evaluate the method's performance, and compare it to that of a previously published technique, for a variety of household objects.
    Comment: Published in Robotics: Science and Systems X, Berkeley, CA. ISBN: 978-0-9923747-0-
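    As a loose illustration of the articulation-learning idea (not the authors' algorithm), a joint between two rigidly segmented parts can be labelled as rigid, revolute or prismatic by looking at the rotation and translation ranges of the observed relative transforms; the thresholds and function names below are our own assumptions.

```python
# Minimal sketch: classify the joint between two segmented rigid parts from
# observed relative poses. Each observation is a 4x4 homogeneous transform of
# part B expressed in part A's frame. Thresholds are illustrative.
import numpy as np

def rotation_angle(T):
    """Angle (radians) of the rotation block of a 4x4 homogeneous transform."""
    R = T[:3, :3]
    c = np.clip((np.trace(R) - 1.0) / 2.0, -1.0, 1.0)
    return np.arccos(c)

def classify_joint(transforms, rot_thresh=np.deg2rad(5.0), trans_thresh=0.02):
    """Label a joint as 'rigid', 'revolute', or 'prismatic' from pose samples."""
    angles = np.array([rotation_angle(T) for T in transforms])
    shifts = np.array([np.linalg.norm(T[:3, 3]) for T in transforms])
    rot_range = angles.max() - angles.min()
    trans_range = shifts.max() - shifts.min()
    if rot_range < rot_thresh and trans_range < trans_thresh:
        return "rigid"
    return "revolute" if rot_range >= rot_thresh else "prismatic"
```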

    Building with Drones: Accurate 3D Facade Reconstruction using MAVs

    Full text link
    Automatic reconstruction of 3D models from images using multi-view Structure-from-Motion methods has been one of the most fruitful outcomes of computer vision. These advances, combined with the growing popularity of Micro Aerial Vehicles as an autonomous imaging platform, have made 3D vision tools ubiquitous for a large number of Architecture, Engineering and Construction applications, among audiences mostly unskilled in computer vision. However, obtaining high-resolution and accurate reconstructions of a large-scale object with SfM imposes many critical constraints on the quality of the image data, which often become sources of inaccuracy because current 3D reconstruction pipelines do not let users assess the fidelity of the input data during image acquisition. In this paper, we present and advocate a closed-loop interactive approach that performs incremental reconstruction in real time and gives users online feedback about quality parameters, such as Ground Sampling Distance (GSD) and image redundancy, on a surface mesh. We also propose a novel multi-scale camera network design to prevent scene drift caused by incremental map building, and release the first multi-scale image sequence dataset as a benchmark. Further, we evaluate our system on real outdoor scenes, and show that our interactive pipeline combined with a multi-scale camera network approach provides compelling accuracy in multi-view reconstruction tasks when compared against the state-of-the-art methods.
    Comment: 8 pages, 2015 IEEE International Conference on Robotics and Automation (ICRA '15), Seattle, WA, USA
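    For context, the Ground Sampling Distance that such a system reports back to the user follows from simple pinhole-camera geometry; the sketch below uses illustrative camera parameters, not values from the paper.

```python
# Back-of-the-envelope sketch: Ground Sampling Distance for a roughly
# fronto-parallel view, i.e. the surface footprint of one pixel.
def ground_sampling_distance(sensor_width_mm, focal_length_mm,
                             distance_to_surface_m, image_width_px):
    """GSD in centimetres per pixel; all camera parameters are illustrative."""
    return (sensor_width_mm * distance_to_surface_m * 100.0) / (
        focal_length_mm * image_width_px)

# Example: a 13.2 mm wide sensor, 8.8 mm lens, 4000 px wide image, 10 m from
# the facade gives roughly 0.375 cm per pixel.
print(ground_sampling_distance(13.2, 8.8, 10.0, 4000))
```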

    MOMA: Visual Mobile Marker Odometry

    Full text link
    In this paper, we present a cooperative odometry scheme based on the detection of mobile markers, in line with the idea of cooperative positioning for multiple robots [1]. To this end, we introduce a simple optimization scheme that realizes visual mobile marker odometry via accurate fixed-marker-based camera positioning, and we analyse the characteristics of the errors inherent to the method compared to classical fixed-marker-based navigation and visual odometry. In addition, we provide a specific UAV-UGV configuration that allows for continuous movement of the UAV without stops, and a minimal caterpillar-like configuration that works with one UGV alone. Finally, we present a real-world implementation and evaluation of the proposed UAV-UGV configuration.
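    At its core, the scheme rests on ordinary chaining of rigid-body transforms: a camera that sees a fixed marker can localize itself, and the pose of a mobile marker seen in the same view follows by composition. The sketch below uses our own notation and is not the paper's implementation.

```python
# Sketch of the transform chaining behind mobile-marker odometry.
import numpy as np

def invert(T):
    """Invert a 4x4 rigid-body transform."""
    R, t = T[:3, :3], T[:3, 3]
    Ti = np.eye(4)
    Ti[:3, :3] = R.T
    Ti[:3, 3] = -R.T @ t
    return Ti

def mobile_marker_in_world(T_world_fixed, T_cam_fixed, T_cam_mobile):
    """World pose of a mobile marker from one view that sees both markers."""
    T_world_cam = T_world_fixed @ invert(T_cam_fixed)   # localize the camera
    return T_world_cam @ T_cam_mobile                   # then the mobile marker
```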

    Improving the alignment of RGB-D images using fiducial markers

    Get PDF
    3D reconstruction is the creation of three-dimensional models from the captured shape and appearance of real objects. It is a field that has its roots in several areas of computer vision and graphics, and it has gained high importance in others, such as architecture, robotics, autonomous driving, medicine, and archaeology. Most current model acquisition technologies are based on LiDAR, RGB-D cameras, and image-based approaches such as visual SLAM. Despite the improvements that have been achieved, methods that rely on professional instruments and operation result in high costs, both capital and logistical. In this dissertation, we develop an optimization procedure capable of enhancing the 3D reconstructions created with a consumer-level hand-held RGB-D camera (a widely available, easily handled product whose interface is familiar to the average smartphone user) through the use of fiducial markers placed in the environment. Additionally, a tool was developed to remove said fiducial markers from the texture of the scene; it mitigates a downside of the chosen approach but may also prove useful in other contexts.
    Master's in Computer and Telematics Engineering
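    A minimal sketch of the marker-detection step such a pipeline could start from, assuming ArUco markers and OpenCV's contrib aruco module (pre-4.7 function names; newer releases wrap the same steps in cv2.aruco.ArucoDetector); the dissertation's actual implementation details are not given in the abstract.

```python
# Illustrative only: detect fiducial (ArUco) markers in the colour image of an
# RGB-D frame, so their known world positions can anchor the camera poses used
# for reconstruction.
import cv2

def detect_markers(bgr_image):
    gray = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2GRAY)
    dictionary = cv2.aruco.getPredefinedDictionary(cv2.aruco.DICT_4X4_50)
    corners, ids, _ = cv2.aruco.detectMarkers(gray, dictionary)
    return corners, ids  # pixel corners and IDs of every visible marker
```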

    NeuralMarker: A Framework for Learning General Marker Correspondence

    Full text link
    We tackle the problem of estimating correspondences from a general marker, such as a movie poster, to an image that captures such a marker. Conventionally, this problem is addressed by fitting a homography model based on sparse feature matching. However, such methods can only handle plane-like markers, and the sparse features do not sufficiently exploit appearance information. In this paper, we propose NeuralMarker, a novel framework that trains a neural network to estimate dense marker correspondences under various challenging conditions, such as marker deformation and harsh lighting. Besides, we also propose a novel marker correspondence evaluation method that circumvents the need for annotations on real marker-image pairs, and we create a new benchmark. We show that NeuralMarker significantly outperforms previous methods and enables new interesting applications, including Augmented Reality (AR) and video editing.
    Comment: Accepted by ToG (SIGGRAPH Asia 2022). Project Page: https://drinkingcoder.github.io/publication/neuralmarker
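    For reference, the conventional baseline the abstract contrasts against, fitting a homography from sparse feature matches, can be sketched as follows; the detector choice (ORB) and RANSAC threshold are our assumptions, not the paper's.

```python
# Sketch of the sparse-feature homography baseline, valid only for planar markers.
import cv2
import numpy as np

def homography_from_sparse_features(marker_img, scene_img):
    orb = cv2.ORB_create(2000)
    k1, d1 = orb.detectAndCompute(marker_img, None)
    k2, d2 = orb.detectAndCompute(scene_img, None)
    matches = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True).match(d1, d2)
    src = np.float32([k1[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
    dst = np.float32([k2[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
    H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
    return H  # maps marker pixels to scene pixels
```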

    Mixed marker-based/marker-less visual odometry system for mobile robots

    Get PDF
    When moving in generic indoor environments, robotic platforms generally rely solely on information provided by onboard sensors to determine their position and orientation. However, the lack of absolute references often introduces severe drift into the computed estimates, making autonomous operations really hard to accomplish. This paper proposes a solution to alleviate the impact of the above issues by combining two vision-based pose estimation techniques working on relative and absolute coordinate systems, respectively. In particular, the unknown ground features in the images captured by the vertical camera of a mobile platform are processed by a vision-based odometry algorithm, which estimates the relative frame-to-frame movements. Errors accumulated in the above step are then corrected using artificial markers placed at known positions in the environment. The markers are framed from time to time, which keeps the drift bounded and additionally provides the robot with the navigation commands needed for autonomous flight. The accuracy and robustness of the designed technique are demonstrated through extensive experimental tests with an off-the-shelf quadrotor.
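    The combination of the two pose sources can be pictured as follows: frame-to-frame visual odometry accumulates a drifting pose, and whenever a marker with a known world pose is framed, the estimate is reset to the absolute reference. This is a toy sketch in our own notation, not the authors' code.

```python
# Toy sketch of mixed marker-based / marker-less odometry.
import numpy as np

class MixedOdometry:
    def __init__(self, T_world_start):
        self.T_world_cam = T_world_start.copy()

    def update_relative(self, T_prev_curr):
        """Apply a frame-to-frame motion estimate from the marker-less front end."""
        self.T_world_cam = self.T_world_cam @ T_prev_curr

    def correct_with_marker(self, T_world_marker, T_cam_marker):
        """Replace the drifting estimate using a marker placed at a known pose."""
        R, t = T_cam_marker[:3, :3], T_cam_marker[:3, 3]
        T_marker_cam = np.eye(4)
        T_marker_cam[:3, :3] = R.T
        T_marker_cam[:3, 3] = -R.T @ t
        self.T_world_cam = T_world_marker @ T_marker_cam
```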

    Development and characterization of methodology and technology for the alignment of fMRI time series

    Get PDF
    This dissertation developed, implemented and tested a novel computer-based system (AUTOALIGN) that incorporates an algorithm for the alignment of functional Magnetic Resonance Image (fMRI) time series. The algorithm assumes the human brain to be a rigid body and computes a head coordinate system on the basis of three reference points that lie along the directions corresponding to two of the volume's inertia eigenvectors, at their intersections with the head boundary. The eigenvectors are found by weighting the inertia components with the voxels' intensity values, which are treated as mass. The three reference points are found in the same position, relative to the origin of the head coordinate system, in both test and reference brain images. Intensity correction is performed at sub-voxel accuracy by tri-linear interpolation. System performance was preliminarily assessed on a test fMR brain volume into which controlled, simulated rigid-body transformations had been introduced. Further experimentation was conducted with real fMRI time series. Rigid-body transformations were retrieved automatically, and the values of the motion parameters were compared to those obtained by Statistical Parametric Mapping (SPM99) and Automatic Image Registration (AIR 3.08). Results indicated that AUTOALIGN offers sub-voxel accuracy in correcting both misalignment and intensity among time points in fMRI time series, and that its performance is comparable to that of SPM99 and AIR 3.08.
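    The intensity-weighted inertia idea can be sketched as follows: treating voxel intensities as mass, the weighted centroid and the eigenvectors of the resulting inertia tensor define the axes of an intensity-based coordinate system. The code below is an illustrative reading of that step, not AUTOALIGN itself.

```python
# Sketch: principal axes of a 3D volume with voxel intensity used as mass.
import numpy as np

def intensity_inertia_axes(volume):
    """Return the weighted centroid and inertia eigenvectors of a 3D array."""
    coords = np.argwhere(volume > 0).astype(float)
    w = volume[volume > 0].astype(float)
    centroid = (coords * w[:, None]).sum(axis=0) / w.sum()
    d = coords - centroid
    # Inertia tensor: I = sum_n w_n * (|d_n|^2 * I3 - d_n d_n^T)
    sq = (d ** 2).sum(axis=1)
    I = np.eye(3) * (w * sq).sum() - np.einsum('n,ni,nj->ij', w, d, d)
    eigvals, eigvecs = np.linalg.eigh(I)
    return centroid, eigvecs  # columns are the principal axes
```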

    Integration of motion-based localization into the EduPARK mobile application

    Get PDF
    More and more, mobile applications require precise localization solutions in a variety of environments. Although GPS is widely used as a localization solution, it may present accuracy problems in special conditions, such as unfavorable weather or spaces with multiple obstructions like public parks. For these scenarios, alternatives to GPS are of extreme relevance and have been widely studied in recent years. This dissertation studies the case of the EduPARK application, an augmented reality application deployed in the Infante D. Pedro park in Aveiro. Due to the poor accuracy of GPS in this park, implementing positioning and marker-less augmented reality functionalities is difficult. Existing relevant systems are analyzed, and an architecture based on pedestrian dead reckoning is proposed. The corresponding implementation is presented, consisting of a positioning solution that uses the sensors available in smartphones and comprises a step detection algorithm, a traveled-distance estimator, an orientation estimator and a position estimator. To validate this solution, functionalities were implemented in the EduPARK application for testing purposes, and usability tests were performed with users. The results obtained show that the proposed solution can be an alternative for accurate positioning within the Infante D. Pedro park, thus enabling the implementation of geocaching and marker-less augmented reality functionalities.
    EduPARK is a project funded by FEDER funds through the Operational Programme Competitiveness and Internationalisation - COMPETE 2020 and by national funds through FCT - Fundação para a Ciência e a Tecnologia, under project POCI-01-0145-FEDER-016542.
    Master's in Informatics Engineering
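    The pedestrian-dead-reckoning update at the heart of the proposed architecture can be sketched as follows: each detected step advances the 2D position by an estimated step length along the current heading. Names and values are illustrative, not taken from the dissertation.

```python
# Minimal pedestrian-dead-reckoning position update.
import math

def pdr_update(x, y, heading_rad, step_length_m):
    """Advance the 2D position estimate by one detected step."""
    return (x + step_length_m * math.cos(heading_rad),
            y + step_length_m * math.sin(heading_rad))

# Example: three 0.7 m steps heading due east (heading = 0) move the user 2.1 m in x.
```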