145 research outputs found

    Compressed Coverage Masks for Path Rendering on Mobile GPUs

    We present an algorithm to accelerate resolution-independent curve rendering on mobile GPUs. Recent trends in graphics hardware have created a plethora of compressed texture formats specific to GPU manufacturers. However, certain implementations of platform-independent path rendering require generating grayscale textures on the CPU that record the extent to which each pixel is covered by the curve. In this paper, we demonstrate that compressing the grayscale texture before uploading it to the GPU yields faster rendering in addition to the memory savings. We implement a real-time compression technique for coverage masks and compare our results against the GPU-based implementation of the highly optimized Skia rendering library. We also analyze the worst-case properties of our compression algorithms. We observe up to a 2× speed improvement over the existing GPU-based methods, along with up to a 9:1 improvement in GPU memory usage. We demonstrate the performance on multiple mobile platforms.

    GPU voxelization

    Given a triangulated model, we want to identify which voxels of a voxel grid are intersected by the boundary of this model. Other voxelization approaches detect not only the boundary but also the interior of the model. Often these voxels are cubes, but this is not a restriction: in other techniques the voxel grid is the view frustum and the voxels are prisms. There are different kinds of voxelization depending on the rasterization behavior. Approximate rasterization is the standard way of rasterizing fragments on the GPU: only those fragments whose center lies inside the projection of the primitive are identified. Conservative rasterization (Hasselgren et al., 2005) involves a dilation operation over the primitive. This is done on the GPU to ensure that, in the rasterization stage, every intersected fragment has its center inside the dilated primitive. However, this can produce spurious fragments, that is, pixels that are not intersected. Exact voxelization detects only those voxels that we need.

    Fast and robust generation of city scale urban ground plan

    Since the introduction of the concept of Digital Earth, almost every major international city has been reconstructed in the virtual world. A large volume of geometric models describing urban objects has become freely available in the public domain via software like Google Earth. Although mostly created for visualization, these urban models can benefit many applications beyond visualization, including video games, city-scale evacuation planning, traffic simulation and earth phenomenon simulations. However, these urban models are mostly loosely structured and implicitly defined, and they require tedious manual preparation that usually takes weeks, if not months, before they can be used. In this paper, we present a framework that produces well-defined ground plans from these urban models, an important step in the preparation process. Designing algorithms that can robustly and efficiently handle unstructured urban models at city scale is the main technical challenge. In this work, we show both theoretically and empirically that our method is resolution complete, efficient and numerically stable. Based on our review of the related work, we believe this is the first work that attempts to create urban ground plans automatically from 3D architectural meshes at city level. With the goal of providing greater benefit beyond visualization from this large volume of urban models, our initial results are encouraging.

    WebGL-Based Simulation of Bone Removal in Surgical Orthopeadic Procedures

    The effective role of virtual reality simulators in surgical operations has been demonstrated over the last decades. The proposed work gives the surgeon a perspective on actual orthopaedic surgeries, such as total shoulder arthroplasty, that have low incidence and limited visibility of the operation. The research in this thesis focuses on the design and implementation of web-based graphical feedback for a total shoulder arthroplasty (TSA) surgery. WebGL is used for the portability of the simulation and for its powerful 3D programming features. Simulating the reaming of the shoulder bone takes multiple steps, which together remove the volume of bone touched by the reamer tool. A fast and accurate collision detection algorithm using the Möller–Trumbore ray-triangle method was implemented to detect the first collision between the bone and the tool, in order to accelerate the computations for the bone removal process. Once a collision is detected, a mesh Boolean operation using the CSG method is invoked to calculate the volume of bone that intersects the tool and should be removed. The user interacts with the simulated operation by transforming the tool in a Three.js scene.
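The ray-triangle test named in this abstract can be illustrated with a minimal sketch. The code below is a plain Python rendering of the standard Möller–Trumbore formulation, not the thesis's WebGL implementation; the function name `ray_triangle` and the tolerance `eps` are assumptions chosen for illustration.

```python
from typing import Optional, Tuple

Vec3 = Tuple[float, float, float]


def sub(a: Vec3, b: Vec3) -> Vec3:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def dot(a: Vec3, b: Vec3) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def cross(a: Vec3, b: Vec3) -> Vec3:
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def ray_triangle(origin: Vec3, direction: Vec3,
                 v0: Vec3, v1: Vec3, v2: Vec3,
                 eps: float = 1e-9) -> Optional[float]:
    """Return the ray parameter t of the hit point, or None if there is no hit.

    Hypothetical helper: the hit point is origin + t * direction.
    """
    edge1 = sub(v1, v0)
    edge2 = sub(v2, v0)
    pvec = cross(direction, edge2)
    det = dot(edge1, pvec)
    if abs(det) < eps:           # ray is parallel to the triangle plane
        return None
    inv_det = 1.0 / det
    tvec = sub(origin, v0)
    u = dot(tvec, pvec) * inv_det     # first barycentric coordinate
    if u < 0.0 or u > 1.0:
        return None
    qvec = cross(tvec, edge1)
    v = dot(direction, qvec) * inv_det  # second barycentric coordinate
    if v < 0.0 or u + v > 1.0:
        return None
    t = dot(edge2, qvec) * inv_det
    return t if t > eps else None     # ignore hits behind the ray origin
```

To find the first collision between a tool ray and a mesh, one would take the smallest `t` over all triangles; an acceleration structure would be needed to make that practical for large meshes.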

    Efficient Algorithms for Coastal Geographic Problems

    The increasing performance of computers has made it possible to solve algorithmically problems for which manual and possibly inaccurate methods have been previously used. Nevertheless, one must still pay attention to the performance of an algorithm if huge datasets are used or if the problem is computationally difficult. Two geographic problems are studied in the articles included in this thesis. In the first problem the goal is to determine distances from points, called study points, to shorelines in predefined directions. Together with other information, mainly related to wind, these distances can be used to estimate wave exposure at different areas. In the second problem the input consists of a set of sites where water quality observations have been made and of the results of the measurements at the different sites. The goal is to select a subset of the observational sites in such a manner that water quality is still measured with sufficient accuracy when monitoring at the other sites is stopped to reduce economic cost. Most of the thesis concentrates on the first problem, known as the fetch length problem. The main challenge is that the two-dimensional map is represented as a set of polygons with millions of vertices in total and the distances may also be computed for millions of study points in several directions. Efficient algorithms are developed for the problem, one of them approximate and the others exact except for rounding errors. The solutions also differ in that three of them are targeted for serial operation or for a small number of CPU cores whereas one, together with its further developments, is suitable also for parallel machines such as GPUs. In the water quality problem, the given set of sites has a large number of possible subsets. The task also uses time-consuming operations such as linear regression, which further limits how many subsets can be examined. The solution therefore uses heuristics, which do not necessarily produce an optimal result.
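To make the fetch length problem concrete, here is a naive baseline sketch: it casts a ray from a study point and tests it against every polygon edge. This brute-force approach is not one of the thesis's efficient algorithms; it only shows the quantity being computed. The function name `fetch_length`, the convention that the angle is measured counter-clockwise from the +x axis, and the `max_dist` cap are all assumptions.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def fetch_length(point: Point, angle_deg: float,
                 polygons: Sequence[List[Point]],
                 max_dist: float = math.inf) -> float:
    """Distance from `point` to the nearest shoreline edge along one direction.

    Brute force: O(total number of edges) per study point and direction.
    Returns `max_dist` when no shoreline is hit.
    """
    px, py = point
    theta = math.radians(angle_deg)
    dx, dy = math.cos(theta), math.sin(theta)
    best = max_dist
    for poly in polygons:
        n = len(poly)
        for i in range(n):
            ax, ay = poly[i]
            bx, by = poly[(i + 1) % n]      # polygons are closed implicitly
            ex, ey = bx - ax, by - ay
            denom = dx * ey - dy * ex       # 2D cross product d x e
            if abs(denom) < 1e-12:          # ray parallel to this edge
                continue
            wx, wy = ax - px, ay - py
            # Solve point + t * d = a + s * e for t (along ray), s (along edge).
            t = (wx * ey - wy * ex) / denom
            s = (wx * dy - wy * dx) / denom
            if t >= 0.0 and 0.0 <= s <= 1.0:
                best = min(best, t)
    return best
```

With millions of vertices and millions of study points, this per-query scan over all edges is exactly the cost that motivates more efficient algorithms.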

    OPTIMIZATION APPROACHES TO MPI AND AREA MERGING-BASED PARALLEL BUFFER ALGORITHM

    On buffer zone construction, the rasterization-based dilation method inevitably introduces errors, and the double-sided parallel line method involves a series of complex operations. In this paper, we proposed a parallel buffer algorithm based on area merging and MPI (Message Passing Interface) to improve the performances of buffer analyses on processing large datasets. Experimental results reveal that there are three major performance bottlenecks which significantly impact the serial and parallel buffer construction efficiencies, including the area merging strategy, the task load balance method and the MPI inter-process results merging strategy. Corresponding optimization approaches involving tree-like area merging strategy, the vertex number oriented parallel task partition method and the inter-process results merging strategy were suggested to overcome these bottlenecks. Experiments were carried out to examine the performance efficiency of the optimized parallel algorithm. The estimation results suggested that the optimization approaches could provide high performance and processing ability for buffer construction in a cluster parallel environment. Our method could provide insights into the parallelization of spatial analysis algorithm.

    Unlimited object instancing in real-time

    In this paper, we propose a novel approach to efficient rendering of an unlimited number of 3D objects in real-time. We present a rendering pipeline that is based on a new computer graphics programming paradigm implementing a holistic approach to the virtual scene definition. Using Signed Distance Functions (SDF) for a virtual scene representation, we managed to control the content and complexity of the virtual scene with the use of mathematical equations. To address hardware limits, especially the limited capacity of GPU memory, we propose a scene element repository which extends the idea of data-based amplification. The content of the repository strongly depends on the 3D object visualization method. One of the most important requirements of the developed pipeline is the possibility to render 3D objects created by artists. To achieve that, the object visualization method uses Sparse Voxel Octree (SVO) ray casting. The developed rendering pipeline is fully compatible with the available SVO algorithms. We show how to avoid occlusion errors which can occur in a single-pass rendering pipeline that integrates SDF and SVO. Finally, in order to control the content and complexity of the virtual scenes in an unlimited way, we propose a collection of global operators applicable to the virtual scene distance function. The developed Unlimited Object Instancing rendering pipeline can be easily integrated with traditional visualization methods, e.g., triangle rasterization. The only hardware requirement for our approach is support for compute shaders or any GPGPU API.
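As a rough illustration of how an operator on a scene distance function can produce unbounded instancing, the sketch below uses domain repetition, a widely known SDF technique, together with basic sphere tracing. It is not the paper's pipeline or its set of global operators; the names `sd_sphere`, `repeat`, `scene` and `sphere_trace`, the cell size, and the CPU-side Python form are assumptions for illustration only.

```python
import math
from typing import Optional, Tuple

Vec3 = Tuple[float, float, float]


def sd_sphere(p: Vec3, radius: float) -> float:
    """Signed distance from p to a sphere centred at the origin."""
    return math.sqrt(p[0] ** 2 + p[1] ** 2 + p[2] ** 2) - radius


def repeat(p: Vec3, cell: float) -> Vec3:
    """Fold each coordinate into [-cell/2, cell/2), tiling space with copies."""
    return tuple(((c + 0.5 * cell) % cell) - 0.5 * cell for c in p)


def scene(p: Vec3) -> float:
    """An infinite lattice of unit spheres, one per 4x4x4 cell (assumed sizes)."""
    return sd_sphere(repeat(p, 4.0), 1.0)


def sphere_trace(origin: Vec3, direction: Vec3,
                 max_steps: int = 128, max_dist: float = 100.0,
                 eps: float = 1e-4) -> Optional[float]:
    """March along a unit-length direction; return hit distance or None."""
    t = 0.0
    for _ in range(max_steps):
        p = tuple(o + t * d for o, d in zip(origin, direction))
        dist = scene(p)
        if dist < eps:       # close enough to a surface: report a hit
            return t
        t += dist            # safe step: no surface is nearer than dist
        if t > max_dist:
            break
    return None
```

The memory cost here is that of a single sphere definition regardless of how many copies are visible, which is the general appeal of defining instances through the distance function rather than storing them.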

    Hierarchical N-Body problem on graphics processor unit

    Galactic simulation is an important cosmological computation, and represents a classical N-body problem suitable for implementation on vector processors. The Barnes-Hut algorithm is a hierarchical N-body method used to simulate such galactic evolution systems. Stream processing architectures expose the data locality and concurrency available in multimedia applications. There are also numerous compute-intensive scientific and engineering applications that can potentially benefit from such computational and communication models. These applications are traditionally implemented on vector processors. Stream-architecture-based graphics processor units (GPUs) present a novel computational alternative for efficiently implementing such high-performance applications. Rendering on a stream architecture sustains high performance, while user-programmable modules allow implementing complex algorithms efficiently. GPUs have evolved over the years from fixed-function pipelines to user-programmable processors. In this thesis, we focus on the implementation of the Barnes-Hut algorithm on typical current-generation programmable GPUs. We exploit the computation and communication requirements of the Barnes-Hut algorithm to expose their suitability for user-programmable GPUs. Our implementation of the Barnes-Hut algorithm is formulated as a fragment shader targeting the selected GPU. We discuss implementation details, design issues, results, and challenges encountered in programming the fragment shader.