Performance Analysis of a Novel GPU Computation-to-core Mapping Scheme for Robust Facet Image Modeling
Though the GPGPU concept is well-known
in image processing, much more work remains to be done
to fully exploit GPUs as an alternative computation
engine. This paper investigates the computation-to-core
mapping strategies to probe the efficiency and scalability
of the robust facet image modeling algorithm on GPUs.
Our fine-grained computation-to-core mapping scheme
shows a significant performance gain over the standard
pixel-wise mapping scheme. With in-depth performance
comparisons across the two different mapping schemes,
we analyze the impact of the level of parallelism on
the GPU computation and suggest two principles for
optimizing future image processing applications on the
GPU platform.
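The facet model at the heart of the algorithm fits a local polynomial surface to each pixel's neighborhood; this per-pixel work is what the mapping schemes distribute across GPU cores. A minimal NumPy sketch of a first-order facet fit, for illustration only (the paper's robust estimator and its actual GPU mapping are not reproduced; function names and the window size are illustrative):

```python
import numpy as np

def facet_fit(patch):
    """Fit a first-order facet z = a + b*x + c*y to one pixel neighborhood
    via least squares -- the per-pixel work that one GPU thread (pixel-wise
    mapping) or a group of cooperating threads (fine-grained mapping) would
    perform."""
    k = patch.shape[0]
    half = k // 2
    ys, xs = np.mgrid[-half:half + 1, -half:half + 1]
    A = np.stack([np.ones(k * k), xs.ravel(), ys.ravel()], axis=1)
    coeffs, *_ = np.linalg.lstsq(A, patch.ravel(), rcond=None)
    return coeffs  # (a, b, c): value and slopes at the center pixel

def facet_smooth(image, k=3):
    """Pixel-wise analogue on the CPU: one facet fit per interior pixel."""
    h, w = image.shape
    half = k // 2
    out = np.zeros((h - 2 * half, w - 2 * half))
    for i in range(half, h - half):
        for j in range(half, w - half):
            out[i - half, j - half] = facet_fit(
                image[i - half:i + half + 1, j - half:j + half + 1])[0]
    return out
```

On a linear ramp image the first-order facet is exact, so the smoothed interior reproduces the input.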
ImMesh: An Immediate LiDAR Localization and Meshing Framework
In this paper, we propose a novel LiDAR(-inertial) odometry and mapping
framework to achieve the goal of simultaneous localization and meshing in
real-time. The proposed framework, termed ImMesh, comprises four tightly coupled
modules: receiver, localization, meshing, and broadcaster. The localization
module utilizes the preprocessed sensor data from the receiver, estimates the
sensor pose online by registering LiDAR scans to maps, and dynamically grows
the map. Then, our meshing module takes the registered LiDAR scan for
incrementally reconstructing the triangle mesh on the fly. Finally, the
real-time odometry, map, and mesh are published via our broadcaster. The key
contribution of this work is the meshing module, which represents a scene by an
efficient hierarchical voxel structure, performs fast finding of voxels
observed by new scans, and reconstructs triangle facets in each voxel in an
incremental manner. This voxel-wise meshing operation is carefully designed
for efficiency: it first performs a dimension reduction by
projecting 3D points to a 2D local plane contained in the voxel, and then
executes the meshing operation with pull, commit and push steps for incremental
reconstruction of triangle facets. To the best of our knowledge, this is the
first work in the literature that can reconstruct the triangle mesh of
large-scale scenes online, relying only on a standard CPU without GPU acceleration. To
share our findings and make contributions to the community, we make our code
publicly available on our GitHub: https://github.com/hku-mars/ImMesh
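The dimension-reduction step described above, projecting a voxel's 3D points onto a local 2D plane before meshing, can be sketched with a PCA plane fit. This is an illustrative reconstruction of the idea only, not ImMesh's actual code, and the function name is invented:

```python
import numpy as np

def project_to_local_plane(points):
    """Per-voxel dimension reduction: fit a local plane to the voxel's 3D
    points via PCA/SVD and express the points in the plane's 2D frame,
    ready for 2D incremental triangulation."""
    centroid = points.mean(axis=0)
    centered = points - centroid
    # Right-singular vectors: the two largest span the plane,
    # the smallest is the plane normal.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    basis = vt[:2]            # 2x3 in-plane axes
    normal = vt[2]            # plane normal
    uv = centered @ basis.T   # Nx2 coordinates for 2D meshing
    return uv, centroid, normal
```

Because the basis is orthonormal, distances between points lying in the plane are preserved in the 2D coordinates.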
Geometric Feature Learning for 3D Meshes
Geometric feature learning for 3D meshes is central to computer graphics and
highly important for numerous vision applications. However, deep learning
currently lags in hierarchical modeling of heterogeneous 3D meshes due to the
lack of required operations and/or their efficient implementations. In this
paper, we propose a series of modular operations for effective geometric deep
learning over heterogeneous 3D meshes. These operations include mesh
convolutions, (un)pooling and efficient mesh decimation. We provide an open-source
implementation of these operations, collectively termed Picasso. The
mesh decimation module of Picasso is GPU-accelerated, which can process a batch
of meshes on-the-fly for deep learning. Our (un)pooling operations compute
features for newly-created neurons across network layers of varying resolution.
Our mesh convolutions include facet2vertex, vertex2facet, and facet2facet
convolutions that exploit vMF mixture and Barycentric interpolation to
incorporate fuzzy modelling. Leveraging the modular operations of Picasso, we
contribute a novel hierarchical neural network, PicassoNet-II, to learn highly
discriminative features from 3D meshes. PicassoNet-II accepts primitive
geometrics and fine textures of mesh facets as input features, while processing
full scene meshes. Our network achieves highly competitive performance for
shape analysis and scene parsing on a variety of benchmarks. We release Picasso
and PicassoNet-II on GitHub: https://github.com/EnyaHermite/Picasso.
Comment: Submitted to TPAMI
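The fuzzy modelling behind the facet2vertex convolution can be illustrated as soft assignments of facet normals to a von Mises-Fisher (vMF) mixture. This sketch shows only the weighting idea; the concentration `kappa` and the mixture centers here are arbitrary, not Picasso's learned parameters:

```python
import numpy as np

def vmf_weights(normals, centers, kappa=8.0):
    """Soft responsibilities of unit facet normals (F x 3) under a vMF
    mixture with unit-vector component means (K x 3). The unnormalized
    vMF density is exp(kappa * <mu, n>); rows are normalized to sum to 1."""
    logits = kappa * normals @ centers.T            # (F, K)
    w = np.exp(logits - logits.max(axis=1, keepdims=True))
    return w / w.sum(axis=1, keepdims=True)         # fuzzy assignment
```

A facet whose normal coincides with a component mean receives that component's largest weight, while oblique normals are shared smoothly across components.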
Improving digital correlation algorithm for real time use.
In this work, a set of improvements to the digital image correlation (DIC)
algorithm is presented. These improvements should make the DIC algorithm more
user-friendly and more usable for common users. The first set of improvements
focuses on implementing the DIC algorithm in OpenCL, which allows the algorithm
to run on a wide range of available hardware, most notably on GPUs. As the tests
show, running DIC on the GPU leads to a significant speedup (up to 30x compared
to the basic variant and 10x compared to the threaded variant).
Further improvements focus on optimizing the data size in order to lower the
overhead of RAM-to-GPU transfers, together with a study of how the OpenCL
implementation performs on integrated GPUs and CPUs. The next improvement
preprocesses the input data to enhance the specimen's texture and thereby
improve the quality of the correlation. The experiments show an improvement in
result quality, but at the cost of increased computation time. The last
improvement is a fully automatic algorithm that selects the subset size yielding
the best possible results. The algorithm tries to find the optimal subset size,
balancing systematic and random errors by monitoring correlation quality as a
function of subset size.
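At the core of DIC is subset matching: a reference subset is compared against candidate regions of the deformed image with a correlation criterion. A minimal NumPy sketch of integer-pixel matching with zero-normalized cross-correlation (ZNCC), leaving out the subpixel refinement and the OpenCL port discussed above (function names are illustrative):

```python
import numpy as np

def zncc(a, b):
    """Zero-normalized cross-correlation between two equal-size subsets;
    1.0 means a perfect match, invariant to brightness offset and scale."""
    a = a - a.mean()
    b = b - b.mean()
    return (a * b).sum() / np.sqrt((a * a).sum() * (b * b).sum())

def match_subset(ref, deformed, top_left, size, search=5):
    """Brute-force integer-pixel search for the displacement of one subset.
    A practical DIC solver refines this to subpixel accuracy."""
    y0, x0 = top_left
    template = ref[y0:y0 + size, x0:x0 + size]
    best, best_d = -2.0, (0, 0)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            cand = deformed[y0 + dy:y0 + dy + size, x0 + dx:x0 + dx + size]
            if cand.shape != template.shape:
                continue  # candidate fell off the image border
            s = zncc(template, cand)
            if s > best:
                best, best_d = s, (dy, dx)
    return best_d, best
```

Each subset's search is independent of every other subset's, which is why the algorithm maps so well to OpenCL work-items on a GPU.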
Compression, Modeling, and Real-Time Rendering of Realistic Materials and Objects
The realism of a scene depends primarily on the quality of the geometry, the
illumination, and the materials that are used. Whereas many sources for
the creation of three-dimensional geometry exist and numerous algorithms
for the approximation of global illumination were presented, the acquisition
and rendering of realistic materials remains a challenging problem.
Realistic materials are very important in computer graphics, because
they describe the reflectance properties of surfaces, which are based on the
interaction of light and matter. In the real world, an enormous diversity of
materials can be found, comprising very different properties. One important
objective in computer graphics is to understand these processes, to formalize
them and to finally simulate them.
For this purpose, various analytical models already exist, but their
parameterization remains difficult as the number of parameters is usually
very high. Also, they fail for very complex materials that occur in the real
world. Measured materials, on the other hand, suffer from long acquisition
times and huge input data sizes. Although very efficient statistical
compression algorithms were presented, most of them do not allow for editability,
such as altering the diffuse color or mesostructure. In this thesis,
a material representation is introduced that makes these features editable.
The acquisition results can thus be re-used to
easily and quickly create variations of the original material. These variations
may be subtle or substantial, allowing for a wide spectrum of
material appearances.
The approach presented in this thesis is not based on compression, but on
a decomposition of the surface into several materials with different reflection
properties. Based on a microfacet model, the light-matter interaction is
represented by a function that can be stored in an ordinary two-dimensional
texture. Additionally, depth information, local rotations, and the diffuse
color are stored in these textures. As a result of the decomposition, some
of the original information is inevitably lost; therefore, an algorithm for the
efficient simulation of subsurface scattering is presented as well.
Another contribution of this work is a novel perception-based simplification
metric that includes the material of an object. This metric comprises
features of the human visual system, for example trichromatic color
perception or reduced resolution. The proposed metric allows for a more
aggressive simplification in regions where purely geometric metrics do not simplify.
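For context on the microfacet light-matter interaction mentioned above: microfacet models combine a normal distribution function (NDF) with shadowing and Fresnel terms. As a generic example, not the thesis's own decomposition, the widely used GGX/Trowbridge-Reitz NDF can be sketched as:

```python
import math

def ggx_ndf(cos_theta_h, alpha):
    """GGX/Trowbridge-Reitz normal distribution D(h): the density of
    microfacets oriented along the half vector h, parameterized by the
    roughness alpha. Normalized so that the hemispherical integral of
    D(theta) * cos(theta) equals 1."""
    a2 = alpha * alpha
    c2 = cos_theta_h * cos_theta_h
    denom = c2 * (a2 - 1.0) + 1.0
    return a2 / (math.pi * denom * denom)
```

For alpha = 1 the distribution is constant at 1/pi, i.e. perfectly rough; small alpha concentrates the density around the surface normal.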
Meshless Mechanics and Point-Based Visualization Methods for Surgical Simulations
Computer-based modeling and simulation practices have become an integral part of the medical education field. For surgical simulation applications, realistic constitutive modeling of soft tissue is considered to be one of the most challenging aspects of the problem, because biomechanical soft-tissue models need to reflect the correct elastic response, have to be efficient in order to run at interactive simulation rates, and be able to support operations such as cuts and sutures.
Mesh-based solutions, where the connections between the individual degrees of freedom (DoF) are defined explicitly, have been the traditional choice to approach these problems. However, when the problem under investigation contains a discontinuity that disrupts the connectivity between the DoFs, the underlying mesh structure has to be reconfigured in order to handle the newly introduced discontinuity correctly. This reconfiguration for mesh-based techniques is typically called dynamic remeshing, and most of the time it causes the performance bottleneck in the simulation.
In this dissertation, the efficiency of point-based meshless methods is investigated for both constitutive modeling of elastic soft tissues and visualization of simulation objects, where arbitrary discontinuities/cuts are applied to the objects in the context of surgical simulation. The point-based deformable object modeling problem is examined in three functional aspects: modeling continuous elastic deformations with, handling discontinuities in, and visualizing a point-based object. Algorithmic and implementation details of the presented techniques are discussed in the dissertation. The presented point-based techniques are implemented as separate components and integrated into the open-source software framework SOFA.
The presented meshless continuum mechanics model of elastic tissue was verified by comparing it to the Hertzian non-adhesive frictionless contact theory. Virtual experiments were set up with a point-based deformable block and a rigid indenter, and force-displacement curves obtained from the virtual experiments were compared to the theoretical solutions.
The meshless mechanics model of soft tissue and the integrated novel discontinuity treatment technique discussed in this dissertation allow handling cuts of arbitrary shape. The implemented enrichment technique not only modifies the internal mechanics of the soft-tissue model, but also updates the point-based visual representation in an efficient way, avoiding costly dynamic remeshing operations.
General Purpose Computing on Graphics Processing Units for Accelerated Deep Learning in Neural Networks
Graphics processing units (GPUs) contain a significantly larger number of cores than central processing units (CPUs), allowing them to handle high levels of parallelization through multithreading. A general-purpose GPU (GPGPU) is a GPU whose threads and memory are repurposed on a software level to leverage the multithreading made possible by the GPU's hardware; there is no hardware difference between a GPU and a GPGPU, which makes it an extremely strong platform for compute-intensive work. Deep learning is one example of such computing that is well suited to a GPGPU, as the hardware structure of a grid of blocks, each containing processing threads, can handle the immense number of necessary calculations in parallel. A convolutional neural network (CNN) created for financial data analysis demonstrates this advantage in the runtime of training and testing the network.
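The per-output independence that makes CNN layers GPU-friendly can be seen in a "valid" 2D cross-correlation (what CNN frameworks call convolution): every output element is an independent dot product, so each can be assigned to one GPU thread. A minimal NumPy sketch of the sequential analogue:

```python
import numpy as np

def conv2d_valid(image, kernel):
    """'Valid' 2D cross-correlation (CNN-style convolution). Each output
    pixel is computed independently of all others; on a GPGPU, the pair
    (i, j) would come from the block and thread indices instead of loops."""
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.empty((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = (image[i:i + kh, j:j + kw] * kernel).sum()
    return out
```

Because no output element depends on another, all (h-kh+1)*(w-kw+1) dot products can run concurrently, which is where the GPU's core count pays off.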