High Performance Multiview Video Coding
Following the standardization of High Efficiency Video Coding (HEVC) in 2013, the multiview extension of HEVC (MV-HEVC) was published in 2014. It brought significantly better compression performance, around 50%, for multiview and 3D videos compared to coding each view independently with single-view HEVC. However, the extremely high computational complexity of MV-HEVC demands significant optimization of the encoder. To tackle this problem, this work investigates modern parallel computing platforms and tools, namely single-instruction-multiple-data (SIMD) instructions, multi-core CPUs, massively parallel GPUs, and computer clusters, as means of significantly enhancing multiview video coding (MVC) encoder performance. These tools have very different computing characteristics, and misusing them may yield little performance improvement or even a reduction. To achieve the best possible encoding performance from modern computing tools, different levels of parallelism inside a typical MVC encoder are identified and analyzed. Novel optimization techniques are proposed at various levels of abstraction: non-aggregation massively parallel motion estimation (ME) and disparity estimation (DE) at the prediction unit (PU) level; fractional and bi-directional ME/DE acceleration through SIMD; quantization parameter (QP)-based early termination for the coding tree unit (CTU); optimized resource-scheduled wave-front parallel processing for CTUs; and workload-balanced, cluster-based multiple-view parallelism. The results show that the proposed parallel optimization techniques significantly improve execution time with insignificant loss of coding efficiency. This, in turn, shows that modern parallel computing platforms, with appropriate platform-specific algorithm design, are valuable tools for improving the performance of computationally intensive applications
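The wave-front parallel processing mentioned in this abstract can be illustrated with a minimal sketch. It models only the standard wavefront dependency (a CTU waits for its left neighbour and for the CTU above and to the right) with an assumed unit cost per CTU and unlimited workers; it is not the authors' optimized resource-scheduled variant, and the function names are hypothetical.

```python
def wavefront_schedule(rows, cols):
    """Earliest start step of each CTU under plain wavefront dependencies.

    Assumptions (illustrative only): every CTU takes one time step and
    there are enough workers to run every ready CTU at once.
    """
    start = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            deps = []
            if c > 0:
                deps.append(start[r][c - 1])  # left neighbour in the same row
            if r > 0:
                # top-right neighbour (top neighbour at the last column)
                deps.append(start[r - 1][min(c + 1, cols - 1)])
            # a CTU may start one step after its latest dependency started
            start[r][c] = max(deps) + 1 if deps else 0
    return start


def wavefront_steps(rows, cols):
    """Total number of time steps needed to finish the whole picture."""
    schedule = wavefront_schedule(rows, cols)
    return schedule[-1][-1] + 1


if __name__ == "__main__":
    for row in wavefront_schedule(3, 4):
        print(row)
    print("wavefront steps:", wavefront_steps(3, 4), "serial steps:", 3 * 4)
```

Under these assumptions each CTU row starts two steps after the row above it, so a picture of 3 by 4 CTUs finishes in 8 steps instead of 12. The gain grows with picture size but is bounded by the ramp-up and ramp-down of the wavefront, which is one reason scheduling and load balancing matter in practice.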
Thermal Characterization of Next-Generation Workloads on Heterogeneous MPSoCs
Next-generation High-Performance Computing (HPC) applications need to tackle outstanding computational complexity while meeting latency and Quality-of-Service constraints. Heterogeneous Multi-Processor Systems-on-Chip (MPSoCs), equipped with a mix of general-purpose cores and reconfigurable fabric for custom acceleration of computational blocks, are key in providing the flexibility to meet the requirements of next-generation HPC. However, heterogeneity brings new challenges to efficient chip thermal management. In this context, accurate and fast thermal simulators are becoming crucial to understand and exploit the trade-offs brought by heterogeneous MPSoCs. In this paper, we first thermally characterize a next-generation HPC workload, the online video transcoding application, using a highly-accurate Infra-Red (IR) microscope. Second, we extend the 3D-ICE thermal simulation tool with a new generic heat spreader model capable of accurately reproducing package surface temperature, with an average error of 6.8% for the hot spots of the chip. Our model is used to characterize the thermal behaviour of the online transcoding application when running on a heterogeneous MPSoC. Moreover, by using our detailed thermal system characterization we are able to explore different application mappings as well as the thermal limits of such heterogeneous platforms
Analysis of HEVC Video Encoder Using ARM Cortex-A8 with NEON Technology
This work presents an implementation of the latest software version of the High Efficiency Video Coding (HEVC) encoder on a single low-cost mobile processor, the ARM Cortex-A8, using its NEON architecture, a Single Instruction, Multiple Data (SIMD) extension. Optimizing the encoder with this technology greatly reduced the execution time
Design of an architecture for fractional motion estimation according to the HEVC coding standard for real-time high-resolution video
The work of specialized organizations such as the ITU-T Video Coding Experts
Group and the ISO/IEC Moving Picture Experts Group has enabled the development
of video coding over the years. During the first decade of this century, the
work of these organizations centered on the H.264/AVC standard; however, the
growth of services such as video streaming over the Internet and mobile
networks, together with the emergence of higher resolutions such as 4k and 8k,
led to the development of a new coding standard called HEVC or H.265, which
seeks to represent video frames with less information without affecting image
quality.
This thesis focuses on the Fractional Motion Estimation module, which is part
of the HEVC encoder and has high computational complexity. The work takes into
account the improvements introduced by the HEVC standard, which lie in the
interpolation filters used to compute the fractional samples.
To verify the algorithm, it was implemented in the MATLAB® programming
environment. This program also made it possible to cross-check the results
obtained from simulating the architecture.
The architecture was then designed with processing frequency and minimizing
the amount of logic resources required as the main criteria. It was described
in the VHDL hardware description language and synthesized for Xilinx®
Virtex-family FPGA devices. Functional verification was carried out with the
ModelSim tool using testbenches.
The maximum operating frequency results were obtained from synthesis of the
architecture; in addition, simulations were used to verify the number of clock
cycles needed to run the algorithm. These data support the claim that the
designed architecture can process HDTV video sequences (1920x1080 pixels) at a
rate of at least 30 frames per second. Tesi
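The interpolation filters that this abstract refers to can be sketched in software. The snippet below applies the HEVC 8-tap luma filters for quarter, half, and three-quarter sample positions along a single row. It is a simplified one-dimensional illustration, not the thesis architecture: the function name, the edge replication, and the single-stage rounding are assumptions made for the example, whereas a real encoder filters in two dimensions and keeps higher intermediate precision.

```python
# HEVC luma interpolation taps, indexed by fractional position in
# quarter-sample units. Each set of taps sums to 64.
HEVC_LUMA_TAPS = {
    1: (-1, 4, -10, 58, 17, -5, 1, 0),    # quarter-sample position
    2: (-1, 4, -11, 40, 40, -11, 4, -1),  # half-sample position
    3: (0, 1, -5, 17, 58, -10, 4, -1),    # three-quarter-sample position
}


def interpolate_row(samples, frac, bit_depth=8):
    """Interpolate the sample lying frac/4 to the right of each integer sample.

    Hypothetical helper: edges are handled by replicating the border
    sample, and the result is rounded and clipped in a single stage.
    """
    taps = HEVC_LUMA_TAPS[frac]
    n = len(samples)
    max_val = (1 << bit_depth) - 1
    out = []
    for x in range(n):
        acc = 0
        for k, tap in enumerate(taps):
            # the taps cover integer positions x-3 .. x+4
            idx = min(max(x + k - 3, 0), n - 1)
            acc += tap * samples[idx]
        # normalize by 64 with rounding, then clip to the valid sample range
        out.append(min(max((acc + 32) >> 6, 0), max_val))
    return out


if __name__ == "__main__":
    ramp = list(range(0, 100, 10))
    print(interpolate_row(ramp, 2))
```

On a linear ramp the half-sample filter returns the midpoint between neighbouring samples away from the borders (for example 45 between 40 and 50), and a flat row is left unchanged because the taps sum to 64. In hardware, the eight multiplications per output sample are commonly replaced by shifts and additions, which is where much of the resource optimization in such designs takes place.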
The Effective Transmission and Processing of Mobile Multimedia
Ph.D. DOCTOR OF PHILOSOPH
Runtime methods for energy-efficient, image processing using significance driven learning.
Ph. D. Thesis. Image and video processing applications are opening up a whole
range of opportunities for processing at the "edge" or in IoT applications
as the demand for high-accuracy processing of high-resolution images
increases. However, this comes with an increase in the quantity of data
to be processed and stored, thereby causing a significant increase in
the computational challenges. There is a growing interest in developing
hardware systems that provide energy-efficient solutions to this
challenge. The challenges in image processing are unique because an
increase in resolution increases not only the data to be processed but
also the amount of information detail scavenged from the data.
This thesis addresses the concept of extracting the
significant image information to enable processing the data intelligently
within a heterogeneous system.
We propose a unique way of defining image significance, based on
what causes us to react when something "catches our eye", whether it
be static or dynamic, whether it be in our central field of focus or our
peripheral vision. This significance technique proves to be a relatively
economical process in terms of energy and computational effort.
We investigate opportunities for further computational and energy
efficiency that are available by elective use of heterogeneous system
elements.
We utilise significance to adaptively select regions of interest for selective
levels of processing dependent on their relative significance.
We further demonstrate that, by exploiting the computational slack time
released by this process, we can throttle the processor speed to
achieve greater energy savings. This demonstrates a reduction
in computational effort and an improvement in energy efficiency, a
process that we term adaptive approximate computing.
We demonstrate that our approach reduces energy consumption by 50 to
75%, dependent on user quality demand, for a real-time performance
requirement of 10 fps for a WQXGA image, when compared with the
existing approach that is agnostic of significance. We further hypothesise
that, by use of heterogeneous elements, savings of up to 90%
could be achievable in both performance and energy when compared
with running OpenCV on the CPU alone
Proceedings of the First International Workshop on Sustainable Ultrascale Computing Systems (NESUS 2014): Porto, Portugal
Proceedings of: First International Workshop on Sustainable Ultrascale Computing Systems (NESUS 2014). Porto (Portugal), August 27-28, 2014
Deep Learning Methods for Remote Sensing
Remote sensing is a field where important physical characteristics of an area are extracted using emitted radiation, generally captured by satellite cameras, sensors onboard aerial vehicles, etc. Captured data help researchers develop solutions to sense and detect various characteristics such as forest fires, flooding, changes in urban areas, crop diseases, soil moisture, etc. The recent impressive progress in artificial intelligence (AI) and deep learning has sparked innovations in technologies, algorithms, and approaches and led to results that were unachievable until recently in multiple areas, among them remote sensing. This book consists of sixteen peer-reviewed papers covering new advances in the use of AI for remote sensing