5,580 research outputs found

    Pose estimation system based on monocular cameras

    Our world is full of wonders. It is filled with mysteries and challenges which, through the ages, have inspired and driven human civilization to grow, both philosophically and sociologically. In time, humans reached their own physical limitations; nevertheless, we created technology to help us overcome them. Like the undiscovered lands of ancient times, we are drawn to the discovery and innovation of our own era. All of this is possible due to a very human characteristic: our imagination. The world that surrounds us is mostly already discovered, but with the power of computer vision (CV) and augmented reality (AR), we are able to live in multiple hidden universes alongside our own. With the increasing performance and capabilities of current mobile devices, AR can be what we dream it to be. There are still many obstacles, but this future is already our reality, and with evolving technologies closing the gap between the real and the virtual world, it will soon be possible to surround ourselves with other dimensions, or fuse them with our own. This thesis focuses on the development of a system to estimate the camera's pose in the real world with respect to the virtual-world axes. The work was developed as a sub-module integrated into the M5SAR project (Mobile Five Senses Augmented Reality System for Museums), aiming at a more immersive experience through the total or partial replacement of the surrounding environment. It is based mainly on the interiors of man-made buildings and their typical rectangular cuboid shape. Knowing the direction of the user's camera, we can then superimpose dynamic AR content, inviting the user to explore the hidden worlds. The M5SAR project introduced a new way to explore existing historical museums by engaging the five human senses: hearing, smell, taste, touch, and vision. 
With this innovative technology, users can enhance their visit and immerse themselves in a virtual world blended with our reality. A mobile application was built around an innovative framework, MIRAR (Mobile Image Recognition based Augmented Reality), which provides object recognition, navigation, and projection of additional AR information to enrich the user's visit, offering intuitive and compelling information about the available artworks through the senses of hearing and vision. A specially designed device was built to address the remaining three senses (smell, taste, and touch); when attached to a mobile device, either a smartphone or a tablet, it pairs with it and reacts automatically to the narrative related to the artwork, immersing the user in a sensory experience. As mentioned above, the work presented in this thesis concerns a sub-module of MIRAR for environment detection and the superimposition of AR content. The main goal was the full replacement of the walls' contents, with the option of keeping the artworks visible or not; restricting the system to monocular cameras posed an additional challenge. Without depth information, a 2D image of an environment does not give a computer the three-dimensional layout of the real world. Nevertheless, man-made buildings tend to follow a rectangular layout in the construction of their rooms, which makes it possible to predict where the vanishing point of any image of the environment may lie, and thus to reconstruct the environment's layout from a 2D image. 
Furthermore, combining this information with an initial localization, obtained through improved image recognition, to retrieve the camera's spatial position with respect to real-world and virtual-world coordinates (that is, pose estimation) made it possible to superimpose localized AR content over the frames of the user's mobile device, in order to immerse, for example, a museum visitor in another era related to the historical period of the artworks on display. The work developed for this thesis also presents improved rectification and retrieval of planar surfaces in space, a hybrid and scalable multiple-image matching system, a more stable outlier filter applied to the camera's axis, and a continuous tracking system that works with uncalibrated cameras and maintains the surface superimposition even at particularly obtuse angles. Furthermore, a novel method using deep learning models for semantic segmentation was introduced for indoor layout estimation from monocular images. Unlike previously developed methods, it requires no geometric calculations and achieves near state-of-the-art performance with a fraction of the parameters required by similar methods. Unlike the earlier work presented in this thesis, this method performs well even in unseen and cluttered rooms, provided they follow the Manhattan assumption. An additional lightweight application that retrieves the camera pose using the proposed method is also presented.
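
The abstract above relies on vanishing points of rectangular rooms. As a minimal sketch of that idea, and not the thesis's actual implementation, the following Python snippet intersects the image lines supporting two segments that are parallel in the scene (for example, two wall edges) using homogeneous coordinates. The function names and the segment format are assumptions made for illustration.

```python
def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def line_through(p, q):
    """Homogeneous image line through two pixel points (x, y)."""
    return cross((p[0], p[1], 1.0), (q[0], q[1], 1.0))


def vanishing_point(seg_a, seg_b, eps=1e-9):
    """Intersect the lines supporting two segments.

    Each segment is a pair of (x, y) pixel points. Returns the intersection
    in pixels, or None when the two lines are also parallel in the image
    (the vanishing point is then at infinity).
    """
    x, y, w = cross(line_through(*seg_a), line_through(*seg_b))
    if abs(w) < eps:
        return None
    return (x / w, y / w)
```

In practice many noisy segments would be clustered and the vanishing point fitted robustly; two segments are the smallest case that shows the geometry.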

    On-board Obstacle Avoidance in the Teleoperation of Unmanned Aerial Vehicles

    The teleoperation of unmanned aerial vehicles (UAVs), especially in cramped, GPS-restricted environments, poses many challenges. The presence of obstacles in an unfamiliar environment requires reliable state estimation and active algorithms to prevent collisions. In this dissertation, we present a collision-free indoor navigation system for a teleoperated quadrotor UAV. The platform is equipped with an on-board miniature computer and a minimal set of sensors for this task, and it does not depend on external tracking systems or off-board computation. The platform is capable of highly accurate state estimation, tracking of the velocity commanded by the user, and collision-free navigation. The robot estimates its state in a cascade architecture: the attitude of the platform is computed with a complementary filter, and its linear velocity with a Kalman filter that fuses inertial and optical flow measurements. An RGB-D camera provides visual feedback to the operator and depth measurements used to build a probabilistic, robot-centric obstacle state with a bin-occupancy filter. The algorithm keeps tracking obstacles after they leave the sensor's field of view by updating their positions with the estimate of the robot's motion. The avoidance part of our navigation system is based on the Model Predictive Control approach. By predicting the possible future obstacle states, the UAV filters the operator's commands, altering them to prevent collisions. 
Experiments in obstacle-rich indoor and outdoor environments validate the efficiency of the proposed setup. Flying robots are highly prone to damage in the event of control errors, as these will most likely cause them to fall to the ground. Therefore, the development of algorithms for UAVs requires a considerable amount of time and resources. In this dissertation we present two simulation methods, software-in-the-loop and hardware-in-the-loop simulation, to facilitate this process. Software-in-the-loop testing was used to develop and tune the state estimator for our robot, using both simulated sensors and pre-recorded datasets of sensor measurements, e.g., from real robotic experiments. With hardware-in-the-loop simulation, we are able to command the robot simulated in Gazebo, a popular open-source, ROS-enabled physics simulator, using the computational units embedded on our quadrotor UAVs. Hence, we can test in simulation not only the correct execution of the algorithms but also their computational feasibility directly on the robot's hardware. Lastly, we analyze the influence of the robot's motion on the visual feedback provided to the operator. While some UAVs can carry mechanically stabilized camera equipment, weight limits or other constraints may make mechanical stabilization impractical. With a fixed camera, the video stream is often unsteady due to the multirotor's movement, which can impair the operator's situational awareness. There has been significant research on stabilizing video using feature tracking to determine camera movement, which in turn is used to manipulate frames and stabilize the camera stream. However, we believe that this process can be greatly simplified by using data from the UAV's on-board inertial measurement unit to stabilize the camera feed. Our results show that our algorithm successfully stabilizes the camera stream with the added benefit of requiring less computational power. 
We also propose a novel quadrotor design concept that decouples the vehicle's orientation from its lateral motion. In our design, the tilt angles of the propellers with respect to the quadrotor body are controlled simultaneously by two additional actuators using the parallelogram principle. After deriving the dynamic model of this design, we propose a controller for the platform based on feedback linearization. Simulation results confirm our theoretical findings, highlighting the improved motion capabilities of this novel design with respect to standard quadrotors.
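
The abstract above mentions that attitude is computed with a complementary filter. The following is a minimal single-axis sketch of that general technique in Python, not the dissertation's estimator: the gyroscope rate is integrated for short-term accuracy and blended with the tilt implied by the accelerometer for long-term stability. The blending weight `alpha`, the axis convention, and the function name are assumptions for illustration.

```python
import math


def complementary_filter(gyro_rates, accels, dt, alpha=0.98, angle0=0.0):
    """Estimate a pitch angle (rad) from paired gyro and accelerometer samples.

    gyro_rates: angular rates about the pitch axis, in rad/s.
    accels: (ax, az) accelerometer readings; only their ratio matters here.
    alpha: weight of the integrated gyro path (assumed value, tune per sensor).
    """
    angle = angle0
    history = []
    for rate, (ax, az) in zip(gyro_rates, accels):
        # Tilt implied by the measured gravity direction (assumed convention).
        accel_angle = math.atan2(ax, az)
        # High-pass the gyro integration, low-pass the accelerometer tilt.
        angle = alpha * (angle + rate * dt) + (1.0 - alpha) * accel_angle
        history.append(angle)
    return history
```

With a stationary platform and a small constant gyro bias, the estimate settles near the accelerometer tilt instead of drifting, which is the property that makes this filter attractive on small on-board computers.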

    Web-based indoor positioning system using QR-codes as markers

    Location tracking has become an important tool in daily life. Outdoor location tracking is readily supported by GPS. However, technology for tracking the indoor position of smart-device users is not at the same level of maturity. AR technology could enable indoor tracking by having users scan an AR marker with their smart devices. However, due to several limitations (capacity, error tolerance, etc.), AR markers are not widely adopted and are therefore not good candidates for tracking markers. This paper addresses the research question of whether a QR code can replace the AR marker as the tracking marker for detecting a smart-device user's indoor position. The question is examined by reviewing the background of QR codes and AR technology. According to this review, the QR code is a suitable choice of tracking marker: compared to the AR marker, it has greater capacity and higher error tolerance, and it is widely adopted. Moreover, a web application was implemented as an experiment to support the research question. It uses a QR code as the tracking marker for AR technology, which builds a 3D model on the QR code; the position of the user can then be estimated from the 3D model. The paper discusses the experimental results by comparing a predefined target user position with the position measured in the experiment, using three different QR code samples. The limitations of the experiment and ideas for improvement are also discussed. According to the experiment, the answer to the research question is that a combination of QR codes and AR technology can deliver satisfactory indoor location results for smart-device users.

    Real-time human body detection and tracking for augmented reality mobile applications

    Nowadays, more and more cultural experiences are enhanced by mobile applications, including those that use Augmented Reality (AR). These applications have grown in number of users, supported largely by the increased computing power of recent processors, the popularity of mobile devices (with high-definition cameras and global positioning systems, GPS), and the widespread availability of internet connections. With this context in mind, the Mobile Five Senses Augmented Reality System for Museums (M5SAR) project aims to develop an AR system to serve as a guide at cultural and historical events and in museums, complementing or replacing the traditional guidance given by guides or maps. The work described in this thesis is part of the M5SAR project. The complete system consists of a mobile application and a physical device, to be attached to the mobile device, which together aim to explore the five human senses: sight, hearing, touch, smell, and taste. The main goals of the M5SAR project are (a) to detect museum pieces (for example, paintings and statues (Pereira et al., 2017)), (b) to detect museum walls and environments (Veiga et al., 2017), and (c) to detect human shapes in order to superimpose Augmented Reality content (?). This thesis presents an approach to the last goal, combining information about human body joints with clothes-overlapping methods. Current clothes-overlapping systems that allow the user to move freely are based on three-dimensional (3D) sensors, e.g., the Kinect sensor (Erra et al., 2018), and are therefore not portable. The contribution of this thesis is a portable solution based on the phone's RGB camera that allows the user to move freely while clothes are overlaid on the full body. 
In recent years, the capability of Convolutional Neural Networks (CNNs) has been demonstrated in a wide variety of computer vision tasks, such as object classification and detection and face and text recognition (Amos et al., 2016; Ren et al., 2015a). One area of application of CNNs is human pose estimation in real environments (Insafutdinov et al., 2017; Pishchulin et al., 2016). Recently, two popular CNN frameworks for human shape detection and segmentation have stood out: OpenPose (Cao et al., 2017; Wei et al., 2016) and Mask R-CNN (He et al., 2017). However, experimental tests showed that the original implementations are not suitable for mobile devices. Even so, these frameworks are the basis for more recent implementations that can run on mobile devices. One approach that achieves full-body pose estimation and segmentation is Mask R-CNN2Go (Jindal, 2018), based on the original Mask R-CNN structure. The main reason for its reduced processing time is the optimization of the number of convolution layers and the width of each layer. Another approach to human pose estimation on mobile devices modifies the original OpenPose architecture for mobile (Kim, 2018; Solano, 2018) and combines it with MobileNets (Howard et al., 2017). MobileNets, as the name suggests, is designed for mobile applications and makes use of depthwise separable convolution layers. This modification reduces processing time but also reduces pose estimation accuracy compared to the original architecture. It is worth noting that, although person detection with clothes overlapping is a current research topic, applications are already available on the market, such as Pozus (GENTLEMINDS, 2018). 
Pozus is available as a beta version that runs on the iOS operating system; it uses the phone's camera as input for human pose estimation and applies texture segments over the human body. However, Pozus does not fit the textures (clothes) to the person's shape. In this thesis, the OpenPose model was used to determine the body joints, and different approaches were used for clothes overlapping while a person moves in real environments. The first approach uses the GrabCut algorithm (Rother et al., 2004) for person segmentation, allowing clothes segments to be fitted. A second approach uses a two-dimensional (2D) skeletal animation tool to deform 2D textures according to the estimated poses. The third approach is similar to the previous one but uses 3D models (volumes) to obtain a more realistic simulation of the clothes-overlapping process. Results and a proof of concept are shown; the results are consistent with a proof of concept. The tests revealed that, as future work, optimizations to improve the accuracy of the pose estimation model and the execution time are still needed for mobile devices. The final method used to overlay clothes on the body showed positive results, as it enabled a more realistic simulation of the clothes-overlapping process. When it comes to visitors at museums and heritage places, objects speak for themselves. Nevertheless, it is important to give visitors the best experience possible, as this will increase the number of visits and enhance the perception and value of the organization. With the aim of enhancing the traditional museum visit, a mobile Augmented Reality (AR) framework is being developed as part of the Mobile Five Senses Augmented Reality (M5SAR) project. 
This thesis presents an initial approach to human shape detection and AR content superimposition in a mobile environment, achieved by combining information about human body joints with clothes-overlapping methods. Existing clothes-overlapping systems that allow the user to move freely are based mainly on three-dimensional (3D) sensors (e.g., the Kinect sensor (Erra et al., 2018)), making them far from portable. The contribution of this thesis is a portable system that allows the user to move freely and performs full-body clothes overlapping. The OpenPose model (Kim, 2018; Solano, 2018) was used to compute the body joints, and different approaches were used for clothes overlapping while a person moves in real environments. The first approach uses the GrabCut algorithm (Rother et al., 2004) for person segmentation, allowing clothes segments to be fitted. A second approach uses a two-dimensional (2D) skeletal animation tool to deform 2D textures according to the estimated poses. The third approach is similar to the previous one but uses 3D clothes models (volumes) to achieve a more realistic simulation of clothes superimposition. Results and a proof of concept are shown.

    Electronic Image Stabilization for Mobile Robotic Vision Systems

    When a camera is affixed to a dynamic mobile robot, image stabilization is the first step towards more complex analysis of the video feed. This thesis presents a novel electronic image stabilization (EIS) algorithm for small, inexpensive, highly dynamic mobile robotic platforms with onboard camera systems. The algorithm combines optical flow motion parameter estimation with angular rate data provided by a strapdown inertial measurement unit (IMU). A discrete Kalman filter in feedforward configuration is used for optimal fusion of the two data sources. Performance is evaluated using a simulated video truth model (capturing the effects of image translation, rotation, blurring, and moving objects) and live test data. The live data were collected from a camera and IMU affixed to the DAGSI Whegs™ mobile robotic platform as it navigated through a hallway. Template matching, feature detection, optical flow, and inertial measurement techniques are compared and analyzed to determine the most suitable algorithm for this specific type of image stabilization. Pyramidal Lucas-Kanade optical flow using Shi-Tomasi good features, combined with inertial measurement, is found to be the superior EIS algorithm. In the presence of moving objects, fusion with inertial measurements reduces the root-mean-squared (RMS) error of the optical flow motion parameter estimates by 40%. No previous image stabilization algorithm directly fuses optical flow estimation with inertial measurement by way of Kalman filtering.
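
To illustrate the kind of fusion the abstract above describes, here is a deliberately simplified scalar Kalman filter in Python. It is a sketch under stated assumptions and not the thesis's feedforward design: the state is a single in-plane rotation angle, the IMU angular rate drives the prediction, and an optical-flow angle estimate (when available) corrects it. The noise variances `q` and `r` and the function name are hypothetical.

```python
def fuse_rotation(gyro_rates, flow_angles, dt, q=1e-4, r=1e-2):
    """Fuse gyro rates (rad/s) with optical-flow angle measurements (rad).

    flow_angles may contain None for frames where optical flow failed;
    those frames fall back to pure gyro integration.
    q: assumed process noise variance added per step.
    r: assumed variance of the optical-flow angle measurement.
    """
    angle, p = 0.0, 1.0  # state estimate and its variance
    estimates = []
    for rate, z in zip(gyro_rates, flow_angles):
        # Predict: propagate the angle with the measured angular rate.
        angle += rate * dt
        p += q
        # Update: correct with the optical-flow measurement if present.
        if z is not None:
            k = p / (p + r)          # Kalman gain
            angle += k * (z - angle)
            p *= (1.0 - k)
        estimates.append(angle)
    return estimates
```

The estimated rotation would then be inverted and applied to each frame to stabilize it; that warping step is omitted here.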

    Critical Technologies in the Cluster of Virtual and Augmented Reality

    Technologies for creating new products in the field of virtual reality have not only been widely developed but have already reached the payback stage, primarily in computer games and in simulators for drivers and operators of complex equipment, including spacecraft, airplanes, helicopters, cars, etc. As a rule, discussions of these technologies also include so-called augmented reality technologies. This is logical, but the problem is that when, for example, government funding supports the development of both technologies in a single cluster of programs, there is a danger that all actual projects will be directed toward commercialization in the field of virtual reality, where such support matters less, since that field can develop through self-financing. There is already a tendency to replace the broader concept with only its simplest component, i.e., the term “virtual reality” is used as a synonym for “virtual and augmented reality”, which is completely erroneous. This article aims to distinguish between these terms. To this end, a list of critical subtechnologies has been developed, divided into two subsections, one of which relates only to augmented reality technologies. The article may be useful in refining the state support program designed to develop this critical end-to-end digital technology.

    A Study on Real-Time Video Mosaicking and Stabilization Using High-Speed Vision

    Hiroshima University, Doctor of Engineering

    Motion planning algorithm for a mobile robot with an intelligent machine vision system

    This study is devoted to the challenges of motion planning for mobile robots with intelligent machine vision systems. Motion planning for mobile robots in an environment with obstacles is a problem that must be addressed when building robots suited to operating in real-world conditions. The solutions available today are predominantly private and highly specialized, which makes it impossible to judge how successfully they solve the problem of effective motion planning. Solutions with a narrow field of application already exist and have been under development for a long time; however, no major breakthroughs have yet been observed, only a systematic improvement in the characteristics of such systems. The purpose of this study is to develop and investigate a motion planning algorithm for a mobile robot with an intelligent machine vision system, which is also the research subject of this article. The study reviews domestic and foreign mobile robots that solve the motion planning problem in a known environment with unknown obstacles. The following navigation methods for mobile robots are considered: local, global, and individual. In the course of the work, a prototype mobile robot was built that can recognize obstacles of regular geometric shapes and plan and correct its motion trajectory. Objects in the environment are identified and classified as obstacles using digital image processing methods and algorithms. 
The distance to the obstacle and the relative angle are calculated using photogrammetric methods; image quality is improved through linear contrast enhancement and optimal linear filtering using the Wiener-Hopf equation. Virtual tools for testing mobile robot motion algorithms were reviewed, which led us to select the Webots software package for prototype testing. The test results allowed us to draw the following conclusions: the mobile robot successfully identified the obstacle, planned a path according to the obstacle avoidance algorithm, and continued moving toward the destination. Conclusions have been drawn regarding the completed research.

    Estimating Epipolar Geometry With The Use of a Camera Mounted Orientation Sensor

    Context: Image processing and computer vision are rapidly becoming commonplace, and the amount of information about a scene, such as its 3D geometry, that can be obtained from one or more images is steadily increasing, owing to the rising resolution and availability of imaging sensors and to an active research community. In parallel, advances in hardware design and manufacturing allow devices such as gyroscopes, accelerometers, magnetometers, and GPS receivers to be included alongside imaging devices at the consumer level. Aims: This work investigates the use of orientation sensors in computer vision as sources of data to aid image processing and the determination of a scene's geometry, in particular the epipolar geometry of a pair of images. It devises a hybrid methodology from two sets of previous works in order to exploit the information available from orientation sensors alongside data gathered by image processing techniques. Method: A readily available consumer-level orientation sensor was used alongside a digital camera to capture images of a set of scenes and record the orientation of the camera. The fundamental matrix of each pair of images was calculated using a variety of techniques, both with and without data from the orientation sensor. Results: Some methodologies could not produce an acceptable fundamental matrix for certain image pairs. A method described in the literature that uses an orientation sensor always produced a result; however, in cases where the hybrid or purely computer-vision methods also produced a result, the sensor-based method was found to be the least accurate. 
Conclusion: The results of this work show that using an orientation sensor to capture information alongside an imaging device can improve both the accuracy and the reliability of calculations of the scene's geometry. However, noise from the orientation sensor can limit this accuracy, and further research would be needed to determine the magnitude of this problem and methods of mitigation.
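
One way to see why an orientation sensor helps with epipolar geometry: if the relative rotation R between the two views is known, the epipolar constraint x2^T [t]x R x1 = 0 becomes linear in the translation t, since it can be rewritten as t · ((R x1) × x2) = 0. The Python sketch below recovers the translation direction from two correspondences under that assumption. It is an illustration of this standard relation, not necessarily any of the methods compared in the work above; the function names are hypothetical, and points are assumed to be in normalized (calibrated) homogeneous coordinates.

```python
import math


def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def matvec(R, v):
    """Multiply a 3x3 matrix (list of rows) by a 3-vector."""
    return tuple(sum(R[i][j] * v[j] for j in range(3)) for i in range(3))


def translation_direction(R, pairs):
    """Recover the unit translation direction (up to sign) given rotation R.

    pairs: at least two (x1, x2) correspondences, each a normalized
    homogeneous image point (x, y, 1), with X2 = R X1 + t in 3D.
    Only the first two pairs are used in this minimal sketch.
    """
    (a1, b1), (a2, b2) = pairs[0], pairs[1]
    # Each correspondence gives a normal n with t . n = 0.
    n1 = cross(matvec(R, a1), b1)
    n2 = cross(matvec(R, a2), b2)
    t = cross(n1, n2)
    norm = math.sqrt(sum(c * c for c in t))
    if norm < 1e-12:
        raise ValueError("degenerate configuration: constraints are parallel")
    return tuple(c / norm for c in t)
```

With noisy sensor rotations and many correspondences, the same constraints would be stacked and solved in a least-squares sense; the fundamental matrix then follows from the rotation, the translation direction, and the camera intrinsics.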