435 research outputs found

    FPGA-based High-Performance Collision Detection: An Enabling Technique for Image-Guided Robotic Surgery

    Get PDF
    Collision detection, which refers to the computational problem of finding the relative placement or con-figuration of two or more objects, is an essential component of many applications in computer graphics and robotics. In image-guided robotic surgery, real-time collision detection is critical for preserving healthy anatomical structures during the surgical procedure. However, the computational complexity of the problem usually results in algorithms that operate at low speed. In this paper, we present a fast and accurate algorithm for collision detection between Oriented-Bounding-Boxes (OBBs) that is suitable for real-time implementation. Our proposed Sweep and Prune algorithm can perform a preliminary filtering to reduce the number of objects that need to be tested by the classical Separating Axis Test algorithm, while the OBB pairs of interest are preserved. These OBB pairs are re-checked by the Separating Axis Test algorithm to obtain accurate overlapping status between them. To accelerate the execution, our Sweep and Prune algorithm is tailor-made for the proposed method. Meanwhile, a high performance scalable hardware architecture is proposed by analyzing the intrinsic parallelism of our algorithm, and is implemented on FPGA platform. Results show that our hardware design on the FPGA platform can achieve around 8X higher running speed than the software design on a CPU platform. As a result, the proposed algorithm can achieve a collision frame rate of 1 KHz, and fulfill the requirement for the medical surgery scenario of Robot Assisted Laparoscopy.published_or_final_versio

    Analysis domain model for shared virtual environments

    Get PDF
    The field of shared virtual environments, which also encompasses online games and social 3D environments, has a system landscape consisting of multiple solutions that share great functional overlap. However, there is little system interoperability between the different solutions. A shared virtual environment has an associated problem domain that is highly complex raising difficult challenges to the development process, starting with the architectural design of the underlying system. This paper has two main contributions. The first contribution is a broad domain analysis of shared virtual environments, which enables developers to have a better understanding of the whole rather than the part(s). The second contribution is a reference domain model for discussing and describing solutions - the Analysis Domain Model

    Reducing redundancy of real time computer graphics in mobile systems

    Get PDF
    The goal of this thesis is to propose novel and effective techniques to eliminate redundant computations that waste energy and are performed in real-time computer graphics applications, with special focus on mobile GPU micro-architecture. Improving the energy-efficiency of CPU/GPU systems is not only key to enlarge their battery life, but also allows to increase their performance because, to avoid overheating above thermal limits, SoCs tend to be throttled when the load is high for a large period of time. Prior studies pointed out that the CPU and especially the GPU are the principal energy consumers in the graphics subsystem, being the off-chip main memory accesses and the processors inside the GPU the primary energy consumers of the graphics subsystem. First, we focus on reducing redundant fragment processing computations by means of improving the culling of hidden surfaces. During real-time graphics rendering, objects are processed by the GPU in the order they are submitted by the CPU, and occluded surfaces are often processed even though they will end up not being part of the final image. When the GPU realizes that an object or part of it is not going to be visible, all activity required to compute its color and store it has already been performed. We propose a novel architectural technique for mobile GPUs, Visibility Rendering Order (VRO), which reorders objects front-to-back entirely in hardware to maximize the culling effectiveness of the GPU and minimize overshading, hence reducing execution time and energy consumption. VRO exploits the fact that the objects in graphics animated applications tend to keep its relative depth order across consecutive frames (temporal coherence) to provide the feeling of smooth transition. VRO keeps visibility information of a frame, and uses it to reorder the objects of the following frame. VRO just requires adding a small hardware to capture the visibility information and use it later to guide the rendering of the following frame. Moreover, VRO works in parallel with the graphics pipeline, so negligible performance overheads are incurred. We illustrate the benefits of VRO using various unmodified commercial 3D applications for which VRO achieves 27% speed-up and 14.8% energy reduction on average. Then, we focus on avoiding redundant computations related to CPU Collision Detection (CD). Graphics applications such as 3D games represent a large percentage of downloaded applications for mobile devices and the trend is towards more complex and realistic scenes with accurate 3D physics simulations. CD is one of the most important algorithms in any physics kernel since it identifies the contact points between the objects of a scene and determines when they collide. However, real-time accurate CD is very expensive in terms of energy consumption. We propose Render Based Collision Detection (RBCD), a novel energy-efficient high-fidelity CD scheme that leverages some intermediate results of the rendering pipeline to perform CD, so that redundant tasks are done just once. Comparing RBCD with a conventional CD completely executed in the CPU, we show that its execution time is reduced by almost three orders of magnitude (600x speedup), because most of the CD task of our model comes for free by reusing the image rendering intermediate results. Although not necessarily, such a dramatic time improvement may result in better frames per second if physics simulation stays in the critical path. However, the most important advantage of our technique is the enormous energy savings that result from eliminating a long and costly CPU computation and converting it into a few simple operations executed by a specialized hardware within the GPU. Our results show that the energy consumed by CD is reduced on average by a factor of 448x (i.e., by 99.8\%). These dramatic benefits are accompanied by a higher fidelity CD analysis (i.e., with finer granularity), which improves the quality and realism of the application.El objetivo de esta tesis es proponer tĂ©cnicas efectivas y originales para eliminar computaciones inĂștiles que aparecen en aplicaciones grĂĄficas, con especial Ă©nfasis en micro-arquitectura de GPUs. Mejorar la eficiencia energĂ©tica de los sistemas CPU/GPU no es solo clave para alargar la vida de la baterĂ­a, sino tambiĂ©n incrementar su rendimiento. Estudios previos han apuntado que la CPU y especialmente la GPU son los principales consumidores de energĂ­a en el sub-sistema grĂĄfico, siendo los accesos a memoria off-chip y los procesadores dentro de la GPU los principales consumidores de energĂ­a del sub-sistema grĂĄfico. Primero, nos hemos centrado en reducir computaciones redundantes de la fase de fragment processing mediante la mejora en la eliminaciĂłn de superficies ocultas. Durante el renderizado de grĂĄficos en tiempo real, los objetos son procesados por la GPU en el orden en el que son enviados por la CPU, y las superficies ocultas son a menudo procesadas incluso si no no acaban formando parte de la imagen final. Cuando la GPU averigua que el objeto o parte de Ă©l no es visible, toda la actividad requerida para computar su color y guardarlo ha sido realizada. Proponemos una tĂ©cnica arquitectĂłnica original para GPUs mĂłviles, Visibility Rendering Order (VRO), la cual reordena los objetos de delante hacia atrĂĄs por completo en hardware para maximizar la efectividad del culling de la GPU y asĂ­ minimizar el overshading, y por lo tanto reducir el tiempo de ejecuciĂłn y el consumo de energĂ­a. VRO explota el hecho de que los objetos de las aplicaciones grĂĄficas animadas tienden a mantener su orden relativo en profundidad a travĂ©s de frames consecutivos (coherencia temporal) para proveer animaciones con transiciones suaves. Dado que las relaciones de orden en profundidad entre objetos son testeadas en la GPU, VRO introduce costes mĂ­nimos en energĂ­a. Solo requiere añadir una pequeña unidad hardware para capturar la informaciĂłn de visibilidad. AdemĂĄs, VRO trabaja en paralelo con el pipeline grĂĄfico, por lo que introduce costes insignificantes en tiempo. Ilustramos los beneficios de VRO usango varias aplicaciones 3D comerciales para las cuales VRO consigue un 27% de speed-up y un 14.8% de reducciĂłn de energĂ­a en media. En segundo lugar, evitamos computaciones redundantes relacionadas con la DetecciĂłn de Colisiones (CD) en la CPU. Las aplicaciones grĂĄficas animadas como los juegos 3D representan un alto porcentaje de las aplicaciones descargadas en dispositivos mĂłviles y la tendencia es hacia escenas mĂĄs complejas y realistas con simulaciones fĂ­sicas 3D precisas. La CD es uno de los algoritmos mĂĄs importantes entre los kernel de fĂ­sicas dado que identifica los puntos de contacto entre los objetos de una escena. Sin embargo, una CD en tiempo real y precisa es muy costosa en tĂ©rminos de consumo energĂ©tico. Proponemos Render Based Collision Detection (RBCD), una tĂ©cnica energĂ©ticamente eficiente y preciso de CD que utiliza resultados intermedios del rendering pipeline para realizar la CD. Comparando RBCD con una CD convencional completamente ejecutada en la CPU, mostramos que el tiempo de ejecuciĂłn es reducido casi tres Ăłrdenes de magnitud (600x speedup), porque la mayorĂ­a de la CD de nuestro modelo reusa resultados intermedios del renderizado de la imagen. Aunque no es asĂ­ necesariamente, esta espectacular en tiempo puede resultar en mejores frames por segundo si la simulaciĂłn de fĂ­sicas estĂĄ en el camino crĂ­tico. Sin embargo, la ventaja mĂĄs importante de nuestra tĂ©cnica es el enorme ahorro de energĂ­a que resulta de eliminar las largas y costosas computaciones en la CPU, sustituyĂ©ndolas por unas pocas operaciones ejecutadas en un hardware especializado dentro de la GPU. Nuestros resultados muestran que la energĂ­a consumida por la CD es reducidad en media por un factor de 448x. Estos dramĂĄticos beneficios vienen acompañados de una mayor fidelidad en la CD (i.e. con granularidad mĂĄs fina)Postprint (published version

    Efficient computation of discrete Voronoi diagram and homotopy-preserving simplified medial axis of a 3d polyhedron

    Get PDF
    The Voronoi diagram is a fundamental geometric data structure and has been well studied in computational geometry and related areas. A Voronoi diagram defined using the Euclidean distance metric is also closely related to the Blum medial axis, a well known skeletal representation. Voronoi diagrams and medial axes have been shown useful for many 3D computations and operations, including proximity queries, motion planning, mesh generation, finite element analysis, and shape analysis. However, their application to complex 3D polyhedral and deformable models has been limited. This is due to the difficulty of computing exact Voronoi diagrams in an efficient and reliable manner. In this dissertation, we bridge this gap by presenting efficient algorithms to compute discrete Voronoi diagrams and simplified medial axes of 3D polyhedral models with geometric and topological guarantees. We apply these algorithms to complex 3D models and use them to perform interactive proximity queries, motion planning and skeletal computations. We present three new results. First, we describe an algorithm to compute 3D distance fields of geometric models by using a linear factorization of Euclidean distance vectors. This formulation maps directly to the linearly interpolating graphics rasterization hardware and enables us to compute distance fields of complex 3D models at interactive rates. We also use clamping and culling algorithms based on properties of Voronoi diagrams to accelerate this computation. We introduce surface distance maps, which are a compact distance vector field representation based on a mesh parameterization of triangulated two-manifolds, and use them to perform proximity computations. Our second main result is an adaptive sampling algorithm to compute an approximate Voronoi diagram that is homotopy equivalent to the exact Voronoi diagram and preserves topological features. We use this algorithm to compute a homotopy-preserving simplified medial axis of complex 3D models. Our third result is a unified approach to perform different proximity queries among multiple deformable models using second order discrete Voronoi diagrams. We introduce a new query called N-body distance query and show that different proximity queries, including collision detection, separation distance and penetration depth can be performed based on Nbody distance query. We compute the second order discrete Voronoi diagram using graphics hardware and use distance bounds to overcome the sampling errors and perform conservative computations. We have applied these queries to various deformable simulations and observed up to an order of magnitude improvement over prior algorithms

    CGAMES'2009

    Get PDF

    A Motion Planning Processor on Reconfigurable Hardware

    Get PDF
    Motion planning algorithms enable us to ïŹnd feasible paths for moving objects. These algorithms utilize feasibility checks to differentiate valid paths from invalid ones. Unfortunately, the computationally expensive nature of such checks reduces the effectiveness of motion planning algorithms. However, by using hardware acceleration to speed up the feasibility checks, we can greatly enhance the performance of the motion planning algorithms. Of course, such acceleration is not limited to feasibility checks; other components of motion planning algorithms can also be accelerated using specially designed hardware. A Field Programmable Gate Array (FPGA) is a great platform to support such an acceleration. An FPGA is a collection of digital gates which can be reprogrammed at run time, i.e., it can be used as a CPU that reconïŹgures itself for a given task. In this paper, we study the feasibility of an FPGA based motion planning processor and evaluate its performance. In order to leverage its highly parallel nature and its modular structure, our processor utilizes the probabilistic roadmap method at its core. The modularity enables us to replace the feasibility criteria with other ones. The reconïŹgurability lets us run our processor in different roles, such as a motion planning co-processor, an autonomous motion planning processor or dedicated collision detection chip. Our experiments show that such a processor is not only feasible but also can greatly increase the performance of current algorithms

    Fast and exact continuous collision detection with Bernstein sign classification

    Get PDF
    We present fast algorithms to perform accurate CCD queries between triangulated models. Our formulation uses properties of the Bernstein basis and BĂ©zier curves and reduces the problem to evaluating signs of polynomials. We present a geometrically exact CCD algorithm based on the exact geometric computation paradigm to perform reliable Boolean collision queries. Our algorithm is more than an order of magnitude faster than prior exact algorithms. We evaluate its performance for cloth and FEM simulations on CPUs and GPUs, and highlight the benefits

    Towards a High Quality Real-Time Graphics Pipeline

    Get PDF
    Modern graphics hardware pipelines create photorealistic images with high geometric complexity in real time. The quality is constantly improving and advanced techniques from feature film visual effects, such as high dynamic range images and support for higher-order surface primitives, have recently been adopted. Visual effect techniques have large computational costs and significant memory bandwidth usage. In this thesis, we identify three problem areas and propose new algorithms that increase the performance of a set of computer graphics techniques. Our main focus is on efficient algorithms for the real-time graphics pipeline, but parts of our research are equally applicable to offline rendering. Our first focus is texture compression, which is a technique to reduce the memory bandwidth usage. The core idea is to store images in small compressed blocks which are sent over the memory bus and are decompressed on-the-fly when accessed. We present compression algorithms for two types of texture formats. High dynamic range images capture environment lighting with luminance differences over a wide intensity range. Normal maps store perturbation vectors for local surface normals, and give the illusion of high geometric surface detail. Our compression formats are tailored to these texture types and have compression ratios of 6:1, high visual fidelity, and low-cost decompression logic. Our second focus is tessellation culling. Culling is a commonly used technique in computer graphics for removing work that does not contribute to the final image, such as completely hidden geometry. By discarding rendering primitives from further processing, substantial arithmetic computations and memory bandwidth can be saved. Modern graphics processing units include flexible tessellation stages, where rendering primitives are subdivided for increased geometric detail. Images with highly detailed models can be synthesized, but the incurred cost is significant. We have devised a simple remapping technique that allowsfor better tessellation distribution in screen space. Furthermore, we present programmable tessellation culling, where bounding volumes for displaced geometry are computed and used to conservatively test if a primitive can be discarded before tessellation. We introduce a general tessellation culling framework, and an optimized algorithm for rendering of displaced BĂ©zier patches, which is expected to be a common use case for graphics hardware tessellation. Our third and final focus is forward-looking, and relates to efficient algorithms for stochastic rasterization, a rendering technique where camera effects such as depth of field and motion blur can be faithfully simulated. We extend a graphics pipeline with stochastic rasterization in spatio-temporal space and show that stochastic motion blur can be rendered with rather modest pipeline modifications. Furthermore, backface culling algorithms for motion blur and depth of field rendering are presented, which are directly applicable to stochastic rasterization. Hopefully, our work in this field brings us closer to high quality real-time stochastic rendering

    WebGL-Based Simulation of Bone Removal in Surgical Orthopeadic Procedures

    Get PDF
    The effective role of virtual reality simulators in surgical operations has been demonstrated during the last decades. The proposed work has been done to give a perspective of the actual orthopeadic surgeries such as a total shoulder arthroplasty with low incidence and visibility of the operation to the surgeon. The research in this thesis is focused on the design and implementation of a web-based graphical feedback for a total shoulder arthroplasty (TSA) surgery. For portability of the simulation and powerful 3D programming features, WebGL is being applied. To simulate the reaming process of the shoulder bone, multiple steps has been passed to be able to remove the volumetric amount of bone which was touched by the reamer tool. A fast and accurate collision detection algorithm utilizing Möller –Trumbore ray-triangle method was implemented to detect the first collision of the bone and the tool in order to accelerate the computations for the bone removal process. Once the collision detected, a mesh Boolean operation using CSG method is being invoked to calculate the volumetric amount of bone which is intersected with the tool and should be removed. This work involves the user interaction to transform the tool in a Three.js scene for the simulated operation
    • 

    corecore