607 research outputs found

    Performance Analysis of a Novel GPU Computation-to-core Mapping Scheme for Robust Facet Image Modeling

    Get PDF
    Though the GPGPU concept is well-known in image processing, much more work remains to be done to fully exploit GPUs as an alternative computation engine. This paper investigates the computation-to-core mapping strategies to probe the efficiency and scalability of the robust facet image modeling algorithm on GPUs. Our fine-grained computation-to-core mapping scheme shows a significant performance gain over the standard pixel-wise mapping scheme. With in-depth performance comparisons across the two different mapping schemes, we analyze the impact of the level of parallelism on the GPU computation and suggest two principles for optimizing future image processing applications on the GPU platform

    ImMesh: An Immediate LiDAR Localization and Meshing Framework

    Full text link
    In this paper, we propose a novel LiDAR(-inertial) odometry and mapping framework to achieve the goal of simultaneous localization and meshing in real-time. This proposed framework termed ImMesh comprises four tightly-coupled modules: receiver, localization, meshing, and broadcaster. The localization module utilizes the prepossessed sensor data from the receiver, estimates the sensor pose online by registering LiDAR scans to maps, and dynamically grows the map. Then, our meshing module takes the registered LiDAR scan for incrementally reconstructing the triangle mesh on the fly. Finally, the real-time odometry, map, and mesh are published via our broadcaster. The key contribution of this work is the meshing module, which represents a scene by an efficient hierarchical voxels structure, performs fast finding of voxels observed by new scans, and reconstructs triangle facets in each voxel in an incremental manner. This voxel-wise meshing operation is delicately designed for the purpose of efficiency; it first performs a dimension reduction by projecting 3D points to a 2D local plane contained in the voxel, and then executes the meshing operation with pull, commit and push steps for incremental reconstruction of triangle facets. To the best of our knowledge, this is the first work in literature that can reconstruct online the triangle mesh of large-scale scenes, just relying on a standard CPU without GPU acceleration. To share our findings and make contributions to the community, we make our code publicly available on our GitHub: https://github.com/hku-mars/ImMesh

    Geometric Feature Learning for 3D Meshes

    Full text link
    Geometric feature learning for 3D meshes is central to computer graphics and highly important for numerous vision applications. However, deep learning currently lags in hierarchical modeling of heterogeneous 3D meshes due to the lack of required operations and/or their efficient implementations. In this paper, we propose a series of modular operations for effective geometric deep learning over heterogeneous 3D meshes. These operations include mesh convolutions, (un)pooling and efficient mesh decimation. We provide open source implementation of these operations, collectively termed \textit{Picasso}. The mesh decimation module of Picasso is GPU-accelerated, which can process a batch of meshes on-the-fly for deep learning. Our (un)pooling operations compute features for newly-created neurons across network layers of varying resolution. Our mesh convolutions include facet2vertex, vertex2facet, and facet2facet convolutions that exploit vMF mixture and Barycentric interpolation to incorporate fuzzy modelling. Leveraging the modular operations of Picasso, we contribute a novel hierarchical neural network, PicassoNet-II, to learn highly discriminative features from 3D meshes. PicassoNet-II accepts primitive geometrics and fine textures of mesh facets as input features, while processing full scene meshes. Our network achieves highly competitive performance for shape analysis and scene parsing on a variety of benchmarks. We release Picasso and PicassoNet-II on Github https://github.com/EnyaHermite/Picasso.Comment: Submitted to TPAM

    Improving digital correlation algorithm for real time use.

    Get PDF
    V této práci je prezentována sada vylepšení algoritmu DIC. Výsledkem těchto vylepšení by měla větší uživatelská přívětivost algoritmu DIC a použitelnost pro běžného uživatele.První sada vylepšení se zaměřuje na implementaci algoritmu DIC pomocí programovacího jazyka OpenCL. To umožňuje spustit algoritmus na širokém spektru dostupného hardwaru, zejména na GPU. Jak ukazují testy, výpočet DIC na GPU vede k významnému zrychlení (až 30x oproti základní variantě a 10x v porovnání s paralelní variantou). Další vylepšení se zaměřují na optimalizaci velikosti dat s cílem snížit režii přenosu dat z RAM na GPU a studii o tom, jak implementace OpenCL funguje na integrovaných GPU a procesorech.Další vylepšení se snaží předzpracovat vstupní data tak, aby zlepšila strukturu vzorků, čímž aby se zlepšila kvalita výsledné korelace. Výsledky ukazují zlepšení kvality výsledků, jsou ale vykoupeny zvýšenou dobou výpočtu. Poslední vylepšení je návrh plně automatického algoritmu, který vybírá nejlepší velikost okna pro dosažení co nejlepšího výsledku. Algoritmus se pokusí nalézt optimální velikost okna pro vyvážení systematických a náhodných chyb sledováním funkční závislosti kvality korelace a velikosti okna.In this work, a set of improvements to DIC algorithm is presented. The result of these improvements should make the DIC algorithm more user friendly and better usable for common users.First set of improvements focuses on implementing the DIC algorithm using OpenCL programming language. This allows to run the algorithm on wide range of available hardware, most notably on GPUs. As tests show, running DIC on GPU leads to significant speedup (reaching 30x compared to basic variant and 10x compared to threaded variant). Further improvements focus on optimizing the data size in order to lower the overhead of RAM to GPU transfers and a study on how the OpenCL implementation performs on integrated GPUs and CPUs.Next improvement processes the input data in order to enhance the specimens texture to improve the quality of the correlations. The experiments show improvement of the quality of the results, but they are redeemed in increased computation time.Last improvement is a design of an fully automatic algorithm that selects the best subset size to get the best results possible. The algorithm tries to find the optimal subset size to balance the systematic and random errors by monitoring the function of correlation quality versus subset size

    Compression, Modeling, and Real-Time Rendering of Realistic Materials and Objects

    Get PDF
    The realism of a scene basically depends on the quality of the geometry, the illumination and the materials that are used. Whereas many sources for the creation of three-dimensional geometry exist and numerous algorithms for the approximation of global illumination were presented, the acquisition and rendering of realistic materials remains a challenging problem. Realistic materials are very important in computer graphics, because they describe the reflectance properties of surfaces, which are based on the interaction of light and matter. In the real world, an enormous diversity of materials can be found, comprising very different properties. One important objective in computer graphics is to understand these processes, to formalize them and to finally simulate them. For this purpose various analytical models do already exist, but their parameterization remains difficult as the number of parameters is usually very high. Also, they fail for very complex materials that occur in the real world. Measured materials, on the other hand, are prone to long acquisition time and to huge input data size. Although very efficient statistical compression algorithms were presented, most of them do not allow for editability, such as altering the diffuse color or mesostructure. In this thesis, a material representation is introduced that makes it possible to edit these features. This makes it possible to re-use the acquisition results in order to easily and quickly create deviations of the original material. These deviations may be subtle, but also substantial, allowing for a wide spectrum of material appearances. The approach presented in this thesis is not based on compression, but on a decomposition of the surface into several materials with different reflection properties. Based on a microfacette model, the light-matter interaction is represented by a function that can be stored in an ordinary two-dimensional texture. Additionally, depth information, local rotations, and the diffuse color are stored in these textures. As a result of the decomposition, some of the original information is inevitably lost, therefore an algorithm for the efficient simulation of subsurface scattering is presented as well. Another contribution of this work is a novel perception-based simplification metric that includes the material of an object. This metric comprises features of the human visual system, for example trichromatic color perception or reduced resolution. The proposed metric allows for a more aggressive simplification in regions where geometric metrics do not simplif

    Meshless Mechanics and Point-Based Visualization Methods for Surgical Simulations

    Get PDF
    Computer-based modeling and simulation practices have become an integral part of the medical education field. For surgical simulation applications, realistic constitutive modeling of soft tissue is considered to be one of the most challenging aspects of the problem, because biomechanical soft-tissue models need to reflect the correct elastic response, have to be efficient in order to run at interactive simulation rates, and be able to support operations such as cuts and sutures. Mesh-based solutions, where the connections between the individual degrees of freedom (DoF) are defined explicitly, have been the traditional choice to approach these problems. However, when the problem under investigation contains a discontinuity that disrupts the connectivity between the DoFs, the underlying mesh structure has to be reconfigured in order to handle the newly introduced discontinuity correctly. This reconfiguration for mesh-based techniques is typically called dynamic remeshing, and most of the time it causes the performance bottleneck in the simulation. In this dissertation, the efficiency of point-based meshless methods is investigated for both constitutive modeling of elastic soft tissues and visualization of simulation objects, where arbitrary discontinuities/cuts are applied to the objects in the context of surgical simulation. The point-based deformable object modeling problem is examined in three functional aspects: modeling continuous elastic deformations with, handling discontinuities in, and visualizing a point-based object. Algorithmic and implementation details of the presented techniques are discussed in the dissertation. The presented point-based techniques are implemented as separate components and integrated into the open-source software framework SOFA. The presented meshless continuum mechanics model of elastic tissue were verified by comparing it to the Hertzian non-adhesive frictionless contact theory. Virtual experiments were setup with a point-based deformable block and a rigid indenter, and force-displacement curves obtained from the virtual experiments were compared to the theoretical solutions. The meshless mechanics model of soft tissue and the integrated novel discontinuity treatment technique discussed in this dissertation allows handling cuts of arbitrary shape. The implemented enrichment technique not only modifies the internal mechanics of the soft tissue model, but also updates the point-based visual representation in an efficient way preventing the use of costly dynamic remeshing operations

    General Purpose Computing on Graphics Processing Units for Accelerated Deep Learning in Neural Networks

    Get PDF
    Graphics processing units (GPUs) contain a significant number of cores relative to central processing units (CPUs), allowing them to handle high levels of parallelization in multithreading. A general-purpose GPU (GPGPU) is a GPU that has its threads and memory repurposed on a software level to leverage the multithreading made possible by the GPU’s hardware, and thus is an extremely strong platform for intense computing – there is no hardware difference between GPUs and GPGPUs. Deep learning is one such example of intense computing that is best implemented on a GPGPU, as its hardware structure of a grid of blocks, each containing processing threads, can handle the immense number of necessary calculations in parallel. A convolutional neural network (CNN) created for financial data analysis shows this advantage in the runtime of the training and testing of a neural network
    corecore