3,691 research outputs found

    Real-Time GPS-Alternative Navigation Using Commodity Hardware

    Get PDF
    Modern navigation systems can use the Global Positioning System (GPS) to accurately determine position with precision in some cases bordering on millimeters. Unfortunately, GPS technology is susceptible to jamming, interception, and unavailability indoors or underground. There are several navigation techniques that can be used to navigate during times of GPS unavailability, but there are very few that result in GPS-level precision. One method of achieving high precision navigation without GPS is to fuse data obtained from multiple sensors. This thesis explores the fusion of imaging and inertial sensors and implements them in a real-time system that mimics human navigation. In addition, programmable graphics processing unit technology is leveraged to perform stream-based image processing using a computer\u27s video card. The resulting system can perform complex mathematical computations in a fraction of the time those same operations would take on a CPU-based platform. The resulting system is an adaptable, portable, inexpensive and self-contained software and hardware platform, which paves the way for advances in autonomous navigation, mobile cartography, and artificial intelligence

    A high performance vector rendering pipeline

    Get PDF
    Vector images are images which encode visible surfaces of a 3D scene, in a resolution independent format. Prior to this work generation of such an image was not real time. As such the benefits of using them in the graphics pipeline were not fully expressed. In this thesis we propose methods for addressing the following questions. How can we introduce vector images into the graphics pipeline, namingly, how can we produce them in real time. How can we take advantage of resolution independence, and how can we render vector images to a pixel display as efficiently as possible and with the highest quality. There are three main contributions of this work. We have designed a real time vector rendering system. That is, we present a GPU accelerated pipeline which takes as an input a scene with 3D geometry, and outputs a vector image. We call this system SVGPU: Scalable Vector Graphics on the GPU. As mentioned vector images are resolution independent. We have designed a cloud pipeline for streaming vector images. That is, we present system design and optimizations for streaming vector images across interconnection networks, which reduces the bandwidth required for transporting real time 3D content from server to client. Lastly, in this thesis we introduce another added benefit of vector images. We have created a method for rendering them with the highest possible quality. That is, we have designed a new set of operations on vector images, which allows us to anti-alias them during rendering to a canonical 2D image. Our contributions provide the system design, optimizations, and algorithms required to bring vector image utilization and benefits much closer to the real time graphics pipeline. Together they form an end to end pipeline to this purpose, i.e. "A High Performance Vector Rendering Pipeline.

    Real-time Visual Flow Algorithms for Robotic Applications

    Get PDF
    Vision offers important sensor cues to modern robotic platforms. Applications such as control of aerial vehicles, visual servoing, simultaneous localization and mapping, navigation and more recently, learning, are examples where visual information is fundamental to accomplish tasks. However, the use of computer vision algorithms carries the computational cost of extracting useful information from the stream of raw pixel data. The most sophisticated algorithms use complex mathematical formulations leading typically to computationally expensive, and consequently, slow implementations. Even with modern computing resources, high-speed and high-resolution video feed can only be used for basic image processing operations. For a vision algorithm to be integrated on a robotic system, the output of the algorithm should be provided in real time, that is, at least at the same frequency as the control logic of the robot. With robotic vehicles becoming more dynamic and ubiquitous, this places higher requirements to the vision processing pipeline. This thesis addresses the problem of estimating dense visual flow information in real time. The contributions of this work are threefold. First, it introduces a new filtering algorithm for the estimation of dense optical flow at frame rates as fast as 800 Hz for 640x480 image resolution. The algorithm follows a update-prediction architecture to estimate dense optical flow fields incrementally over time. A fundamental component of the algorithm is the modeling of the spatio-temporal evolution of the optical flow field by means of partial differential equations. Numerical predictors can implement such PDEs to propagate current estimation of flow forward in time. Experimental validation of the algorithm is provided using high-speed ground truth image dataset as well as real-life video data at 300 Hz. The second contribution is a new type of visual flow named structure flow. Mathematically, structure flow is the three-dimensional scene flow scaled by the inverse depth at each pixel in the image. Intuitively, it is the complete velocity field associated with image motion, including both optical flow and scale-change or apparent divergence of the image. Analogously to optic flow, structure flow provides a robotic vehicle with perception of the motion of the environment as seen by the camera. However, structure flow encodes the full 3D image motion of the scene whereas optic flow only encodes the component on the image plane. An algorithm to estimate structure flow from image and depth measurements is proposed based on the same filtering idea used to estimate optical flow. The final contribution is the spherepix data structure for processing spherical images. This data structure is the numerical back-end used for the real-time implementation of the structure flow filter. It consists of a set of overlapping patches covering the surface of the sphere. Each individual patch approximately holds properties such as orthogonality and equidistance of points, thus allowing efficient implementations of low-level classical 2D convolution based image processing routines such as Gaussian filters and numerical derivatives. These algorithms are implemented on GPU hardware and can be integrated to future Robotic Embedded Vision systems to provide fast visual information to robotic vehicles

    Map-based localization for urban service mobile robotics

    Get PDF
    Mobile robotics research is currently interested on exporting autonomous navigation results achieved in indoor environments, to more challenging environments, such as, for instance, urban pedestrian areas. Developing mobile robots with autonomous navigation capabilities in such urban environments supposes a basic requirement for a upperlevel service set that could be provided to an users community. However, exporting indoor techniques to outdoor urban pedestrian scenarios is not evident due to the larger size of the environment, the dynamism of the scene due to pedestrians and other moving obstacles, the sunlight conditions, and the high presence of three dimensional elements such as ramps, steps, curbs or holes. Moreover, GPS-based mobile robot localization has demonstrated insufficient performance for robust long-term navigation in urban environments. One of the key modules within autonomous navigation is localization. If localization supposes an a priori map, even if it is not a complete model of the environment, localization is called map-based. This assumption is realistic since current trends of city councils are on building precise maps of their cities, specially of the most interesting places such as city downtowns. Having robots localized within a map allows for a high-level planning and monitoring, so that robots can achieve goal points expressed on the map, by following in a deliberative way a previously planned route. This thesis deals with the mobile robot map-based localization issue in urban pedestrian areas. The thesis approach uses the particle filter algorithm, a well-known and widely used probabilistic and recursive method for data fusion and state estimation. The main contributions of the thesis are divided on four aspects: (1) long-term experiments of mobile robot 2D and 3D position tracking in real urban pedestrian scenarios within a full autonomous navigation framework, (2) developing a fast and accurate technique to compute on-line range observation models in 3D environments, a basic step required by the real-time performance of the developed particle filter, (3) formulation of a particle filter that integrates asynchronous data streams and (4) a theoretical proposal to solve the global localization problem in an active and cooperative way, defining cooperation as either information sharing among the robots or planning joint actions to solve a common goal.Actualment, la recerca en robòtica mòbil té un interés creixent en exportar els resultats de navegació autònoma aconseguits en entorns interiors cap a d'altres tipus d'entorns més exigents, com, per exemple, les àrees urbanes peatonals. Desenvolupar capacitats de navegació autònoma en aquests entorns urbans és un requisit bàsic per poder proporcionar un conjunt de serveis de més alt nivell a una comunitat d'usuaris. Malgrat tot, exportar les tècniques d'interiors cap a entorns exteriors peatonals no és evident, a causa de la major dimensió de l'entorn, del dinamisme de l'escena provocada pels peatons i per altres obstacles en moviment, de la resposta de certs sensors a la il.luminació natural, i de la constant presència d'elements tridimensionals tals com rampes, escales, voreres o forats. D'altra banda, la localització de robots mòbils basada en GPS ha demostrat uns resultats insuficients de cara a una navegació robusta i de llarga durada en entorns urbans. Una de les peces clau en la navegació autònoma és la localització. En el cas que la localització consideri un mapa conegut a priori, encara que no sigui un model complet de l'entorn, parlem d'una localització basada en un mapa. Aquesta assumpció és realista ja que la tendència actual de les administracions locals és de construir mapes precisos de les ciutats, especialment dels llocs d'interés tals com les zones més cèntriques. El fet de tenir els robots localitzats en un mapa permet una planificació i una monitorització d'alt nivell, i així els robots poden arribar a destinacions indicades sobre el mapa, tot seguint de forma deliberativa una ruta prèviament planificada. Aquesta tesi tracta el tema de la localització de robots mòbils, basada en un mapa i per entorns urbans peatonals. La proposta de la tesi utilitza el filtre de partícules, un mètode probabilístic i recursiu, ben conegut i àmpliament utilitzat per la fusió de dades i l'estimació d'estats. Les principals contribucions de la tesi queden dividides en quatre aspectes: (1) experimentació de llarga durada del seguiment de la posició, tant en 2D com en 3D, d'un robot mòbil en entorns urbans reals, en el context de la navegació autònoma, (2) desenvolupament d'una tècnica ràpida i precisa per calcular en temps d'execució els models d'observació de distàncies en entorns 3D, un requisit bàsic pel rendiment del filtre de partícules a temps real, (3) formulació d'un filtre de partícules que integra conjunts de dades asíncrones i (4) proposta teòrica per solucionar la localització global d'una manera activa i cooperativa, entenent la cooperació com el fet de compartir informació, o bé com el de planificar accions conjuntes per solucionar un objectiu comú

    Dynamic Volume Rendering of Functional Medical Data on Dissimilar Hardware Platforms

    Get PDF
    In the last 30 years, medical imaging has become one of the most used diagnostic tools in the medical profession. Computed Tomography (CT) and Magnetic Resonance Imaging (MRI) technologies have become widely adopted because of their ability to capture the human body in a non-invasive manner. A volumetric dataset is a series of orthogonal 2D slices captured at a regular interval, typically along the axis of the body from the head to the feet. Volume rendering is a computer graphics technique that allows volumetric data to be visualized and manipulated as a single 3D object. Iso-surface rendering, image splatting, shear warp, texture slicing, and raycasting are volume rendering methods, each with associated advantages and disadvantages. Raycasting is widely regarded as the highest quality renderer of these methods. Originally, CT and MRI hardware was limited to providing a single 3D scan of the human body. The technology has improved to allow a set of scans capable of capturing anatomical movements like a beating heart. The capturing of anatomical data over time is referred to as functional imaging. Functional MRI (fMRI) is used to capture changes in the human body over time. While fMRI’s can be used to capture any anatomical data over time, one of the more common uses of fMRI is to capture brain activity. The fMRI scanning process is typically broken up into a time consuming high resolution anatomical scan and a series of quick low resolution scans capturing activity. The low resolution activity data is mapped onto the high resolution anatomical data to show changes over time. Academic research has advanced volume rendering and specifically fMRI volume rendering. Unfortunately, academic research is typically a one-off solution to a singular medical case or set of data, causing any advances to be problem specific as opposed to a general capability. Additionally, academic volume renderers are often designed to work on a specific device and operating system under controlled conditions. This prevents volume rendering from being used across the ever expanding number of different computing devices, such as desktops, laptops, immersive virtual reality systems, and mobile computers like phones or tablets. This research will investigate the feasibility of creating a generic software capability to perform real-time 4D volume rendering, via raycasting, on desktop, mobile, and immersive virtual reality platforms. Implementing a GPU-based 4D volume raycasting method for mobile devices will harness the power of the increasing number of mobile computational devices being used by medical professionals. Developing support for immersive virtual reality can enhance medical professionals’ interpretation of 3D physiology with the additional depth information provided by stereoscopic 3D. The results of this research will help expand the use of 4D volume rendering beyond the traditional desktop computer in the medical field. Developing the same 4D volume rendering capabilities across dissimilar platforms has many challenges. Each platform relies on their own coding languages, libraries, and hardware support. There are tradeoffs between using languages and libraries native to each platform and using a generic cross-platform system, such as a game engine. Native libraries will generally be more efficient during application run-time, but they require different coding implementations for each platform. The decision was made to use platform native languages and libraries in this research, whenever practical, in an attempt to achieve the best possible frame rates. 4D volume raycasting provides unique challenges independent of the platform. Specifically, fMRI data loading, volume animation, and multiple volume rendering. Additionally, real-time raycasting has never been successfully performed on a mobile device. Previous research relied on less computationally expensive methods, such as orthogonal texture slicing, to achieve real-time frame rates. These challenges will be addressed as the contributions of this research. The first contribution was exploring the feasibility of generic functional data input across desktop, mobile, and immersive virtual reality. To visualize 4D fMRI data it was necessary to build in the capability to read Neuroimaging Informatics Technology Initiative (NIfTI) files. The NIfTI format was designed to overcome limitations of 3D file formats like DICOM and store functional imagery with a single high-resolution anatomical scan and a set of low-resolution anatomical scans. Allowing input of the NIfTI binary data required creating custom C++ routines, as no object oriented APIs freely available for use existed. The NIfTI input code was built using C++ and the C++ Standard Library to be both light weight and cross-platform. Multi-volume rendering is another challenge of fMRI data visualization and a contribution of this work. fMRI data is typically broken into a single high-resolution anatomical volume and a series of low-resolution volumes that capture anatomical changes. Visualizing two volumes at the same time is known as multi-volume visualization. Therefore, the ability to correctly align and scale the volumes relative to each other was necessary. It was also necessary to develop a compositing method to combine data from both volumes into a single cohesive representation. Three prototype applications were built for the different platforms to test the feasibility of 4D volume raycasting. One each for desktop, mobile, and virtual reality. Although the backend implementations were required to be different between the three platforms, the raycasting functionality and features were identical. Therefore, the same fMRI dataset resulted in the same 3D visualization independent of the platform itself. Each platform uses the same NIfTI data loader and provides support for dataset coloring and windowing (tissue density manipulation). The fMRI data can be viewed changing over time by either animation through the time steps, like a movie, or using an interface slider to “scrub” through the different time steps of the data. The prototype applications data load times and frame rates were tested to determine if they achieved the real-time interaction goal. Real-time interaction was defined by achieving 10 frames per second (fps) or better, based on the work of Miller [1]. The desktop version was evaluated on a 2013 MacBook Pro running OS X 10.12 with a 2.6 GHz Intel Core i7 processor, 16 GB of RAM, and a NVIDIA GeForce GT 750M graphics card. The immersive application was tested in the C6 CAVE™, a 96 graphics node computer cluster comprised of NVIDIA Quadro 6000 graphics cards running Red Hat Enterprise Linux. The mobile application was evaluated on a 2016 9.7” iPad Pro running iOS 9.3.4. The iPad had a 64-bit Apple A9X dual core processor with 2 GB of built in memory. Two different fMRI brain activity datasets with different voxel resolutions were used as test datasets. Datasets were tested using both the 3D structural data, the 4D functional data, and a combination of the two. Frame rates for the desktop implementation were consistently above 10 fps, indicating that real-time 4D volume raycasting is possible on desktop hardware. The mobile and virtual reality platforms were able to perform real-time 3D volume raycasting consistently. This is a marked improvement for 3D mobile volume raycasting that was previously only able to achieve under one frame per second [2]. Both VR and mobile platforms were able to raycast the 4D only data at real-time frame rates, but did not consistently meet 10 fps when rendering both the 3D structural and 4D functional data simultaneously. However, 7 frames per second was the lowest frame rate recorded, indicating that hardware advances will allow consistent real-time raycasting of 4D fMRI data in the near future

    Variable Resolution & Dimensional Mapping For 3d Model Optimization

    Get PDF
    Three-dimensional computer models, especially geospatial architectural data sets, can be visualized in the same way humans experience the world, providing a realistic, interactive experience. Scene familiarization, architectural analysis, scientific visualization, and many other applications would benefit from finely detailed, high resolution, 3D models. Automated methods to construct these 3D models traditionally has produced data sets that are often low fidelity or inaccurate; otherwise, they are initially highly detailed, but are very labor and time intensive to construct. Such data sets are often not practical for common real-time usage and are not easily updated. This thesis proposes Variable Resolution & Dimensional Mapping (VRDM), a methodology that has been developed to address some of the limitations of existing approaches to model construction from images. Key components of VRDM are texture palettes, which enable variable and ultra-high resolution images to be easily composited; texture features, which allow image features to integrated as image or geometry, and have the ability to modify the geometric model structure to add detail. These components support a primary VRDM objective of facilitating model refinement with additional data. This can be done until the desired fidelity is achieved as practical limits of infinite detail are approached. Texture Levels, the third component, enable real-time interaction with a very detailed model, along with the flexibility of having alternate pixel data for a given area of the model and this is achieved through extra dimensions. Together these techniques have been used to construct models that can contain GBs of imagery data

    Event-based Vision: A Survey

    Get PDF
    Event cameras are bio-inspired sensors that differ from conventional frame cameras: Instead of capturing images at a fixed rate, they asynchronously measure per-pixel brightness changes, and output a stream of events that encode the time, location and sign of the brightness changes. Event cameras offer attractive properties compared to traditional cameras: high temporal resolution (in the order of microseconds), very high dynamic range (140 dB vs. 60 dB), low power consumption, and high pixel bandwidth (on the order of kHz) resulting in reduced motion blur. Hence, event cameras have a large potential for robotics and computer vision in challenging scenarios for traditional cameras, such as low-latency, high speed, and high dynamic range. However, novel methods are required to process the unconventional output of these sensors in order to unlock their potential. This paper provides a comprehensive overview of the emerging field of event-based vision, with a focus on the applications and the algorithms developed to unlock the outstanding properties of event cameras. We present event cameras from their working principle, the actual sensors that are available and the tasks that they have been used for, from low-level vision (feature detection and tracking, optic flow, etc.) to high-level vision (reconstruction, segmentation, recognition). We also discuss the techniques developed to process events, including learning-based techniques, as well as specialized processors for these novel sensors, such as spiking neural networks. Additionally, we highlight the challenges that remain to be tackled and the opportunities that lie ahead in the search for a more efficient, bio-inspired way for machines to perceive and interact with the world
    corecore