Hydra: An Accelerator for Real-Time Edge-Aware Permeability Filtering in 65nm CMOS

Author: Benini Luca
Cavigelli Lukas
Eggimann Manuel
Gloor Christelle
Schaffner Michael
Scheidegger Florian
Smolic Aljosa
Publication date: 08/11/2017

Many modern video processing pipelines rely on edge-aware (EA) filtering methods. However, recent high-quality methods are challenging to run in real-time on embedded hardware due to their computational load. To this end, we propose an area-efficient and real-time capable hardware implementation of a high-quality EA method. In particular, we focus on the recently proposed permeability filter (PF) that delivers promising quality and performance in the domains of HDR tone mapping, disparity and optical flow estimation. We present an efficient hardware accelerator that implements a tiled variant of the PF with low on-chip memory requirements and a significantly reduced external memory bandwidth (6.4x w.r.t. the non-tiled PF). The design has been taped out in 65 nm CMOS technology, is able to filter 720p grayscale video at 24.8 Hz and achieves a high compute density of 6.7 GFLOPS/mm² (12x higher than embedded GPUs when scaled to the same technology node). The low area and bandwidth requirements make the accelerator highly suitable for integration into SoCs where silicon area budget is constrained and external memory is typically a heavily contended resource.

arXiv.org e-Print Archive

Crossref

Archivio istituzionale della ricerca - Alma Mater Studiorum Università di Bologna

Scalable wavelet-based coding of irregular meshes with interactive region-of-interest support

Author: Khalil Jonas El Sayeh
Lambert Peter
Munteanu Adrian
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2019

This paper proposes a novel functionality in wavelet-based irregular mesh coding: interactive region-of-interest (ROI) support. The proposed approach enables the user to define arbitrary ROIs at the decoder side and to prioritize and decode these regions at arbitrarily high granularity levels. In this context, a novel adaptive wavelet transform for irregular meshes is proposed, which enables: 1) varying the resolution across the surface at arbitrarily fine granularity levels and 2) dynamic tiling, which adapts the tile sizes to the local sampling densities at each resolution level. The proposed tiling approach enables a rate-distortion-optimal distribution of rate across spatial regions. When limiting the highest resolution ROI to the visible regions, the fine granularity of the proposed adaptive wavelet transform reduces the required amount of graphics memory by up to 50%. Furthermore, the required graphics memory for an arbitrarily small ROI becomes negligible compared to rendering without ROI support, independent of any tiling decisions. Random access is provided by a novel dynamic tiling approach, which proves to be particularly beneficial for large models of over 10^6 to 10^7 vertices. The experiments show that the dynamic tiling introduces a limited lossless rate penalty compared to an equivalent codec without ROI support. Additionally, rate savings of up to 85% are observed while decoding ROIs of tens of thousands of vertices.

Ghent University Academic Bibliography

Scalable Interactive Volume Rendering Using Off-the-shelf Components

Author: Breen David
Heirich Alan
Lombeyda Santiago
Moll Laurent
Shand Mark
Publication venue: California Institute of Technology Library
Publication date: 01/01/2001

This paper describes an application of a second-generation implementation of the Sepia architecture (Sepia-2) to interactive volumetric visualization of large rectilinear scalar fields. By employing pipelined associative blending operators in a sort-last configuration, a demonstration system with 8 rendering computers sustains 24 to 28 frames per second while interactively rendering large data volumes (1024x256x256 voxels, and 512x512x512 voxels). We believe interactive performance at these frame rates and data sizes is unprecedented. We also believe these results can be extended to other types of structured and unstructured grids and a variety of GL rendering techniques including surface rendering and shadow mapping. We show how to extend our single-stage crossbar demonstration system to multi-stage networks in order to support much larger data sizes and higher image resolutions. This requires solving a dynamic mapping problem for a class of blending operators that includes Porter-Duff compositing operators.

CiteSeerX

Caltech Authors

Developing efficient web-based GIS applications

Author: Adnan M.
Longley P.
Singleton A.
Publication venue: Centre for Advanced Spatial Analysis (UCL)
Publication date: 01/02/2010

The number of web-based GIS applications has increased in recent years. This paper describes different mapping technologies, database standards, and web application development standards that are relevant to the development of web-based GIS applications. Different mapping technologies for displaying geo-referenced data are available and can be used in different situations. This paper also explains why Oracle is the system of choice for geospatial applications that need to handle large amounts of data. Wireframing and design patterns have been shown to be useful in making GIS web applications efficient, scalable and usable, and should be an important part of every web-based GIS application. A range of different development technologies are available, and their use in different operating environments is discussed here in some detail.

UCL Discovery

Memory architecture for efficient utilization of SDRAM: a case study of the computation/memory access trade-off

Author: Gleerup Thomas Møller
Holten-Lund Hans Erik
Madsen Jan
Pedersen Steen
Publication date: 01/01/2000

This paper discusses the trade-off between calculations and memory accesses in a 3D graphics tile renderer for visualization of data from medical scanners. The performance requirement of this application is a frame rate of 25 frames per second when rendering 3D models with 2 million triangles, i.e. 50 million triangles per second, sustained (not peak). At present, a software implementation is capable of 3-4 frames per second for a 1 million triangle model.

CiteSeerX

Online Research Database In Technology

A Parallel Rendering Algorithm for MIMD Architectures

Author: Crockett Thomas W.
Orloff Tobias

Applications such as animation and scientific visualization demand high-performance rendering of complex three-dimensional scenes. To deliver the necessary rendering rates, highly parallel hardware architectures are required. The challenge is then to design algorithms and software which effectively use the hardware parallelism. A rendering algorithm targeted to distributed memory MIMD architectures is described. For maximum performance, the algorithm exploits both object-level and pixel-level parallelism. The behavior of the algorithm is examined both analytically and experimentally. Its performance for large numbers of processors is found to be limited primarily by communication overheads. An experimental implementation for the Intel iPSC/860 shows increasing performance from 1 to 128 processors across a wide range of scene complexities. It is shown that minimal modifications to the algorithm will adapt it for use on shared memory architectures as well.

NASA Technical Reports Server

Embedded 3D Graphics Core for FPGA-based System-on-Chip Applications

Author: Holten-Lund Hans Erik
Publication venue: Electrum-Kista
Publication date: 01/01/2005

Online Research Database In Technology

Design for scalability in 3D computer graphics architectures

Author: Holten-Lund Hans Erik
Publication date: 01/03/2002

Online Research Database In Technology