
    Sample-Parallel Execution of EBCOT in Fast Mode

    JPEG 2000’s most computationally expensive building block is the Embedded Block Coder with Optimized Truncation (EBCOT). This paper evaluates how encoders targeting a parallel architecture such as a GPU can increase their throughput in use cases where very high data rates are used. The compression efficiency in the less significant bit-planes is then often poor, and it is beneficial to enable the Selective Arithmetic Coding Bypass style (fast mode) in order to trade a small loss in compression efficiency for a reduction in computational complexity. More importantly, this style exposes a more finely grained parallelism that can be exploited to execute the raw coding passes, including bit-stuffing, in a sample-parallel fashion. For a latency- or memory-critical application that encodes one frame at a time, EBCOT’s tier-1 is sped up by between 1.1x and 2.4x compared to an optimized GPU-based implementation. When low GPU occupancy has already been addressed by encoding multiple frames in parallel, the throughput can still be improved by 5% for high-entropy images and 27% for low-entropy images. Best results are obtained when enabling the fast mode after the fourth significant bit-plane. For most of the test images, the compression rate is within 1% of the original.

    Evaluation of GPU/CPU Co-Processing Models for JPEG 2000 Packetization

    With the bottom-line goal of increasing the throughput of a GPU-accelerated JPEG 2000 encoder, this paper evaluates whether the post-compression rate control and packetization routines should be carried out on the CPU or on the GPU. Three co-processing models that differ in how the workload is split between the CPU and GPU are introduced. Both routines are discussed and algorithms for executing them in parallel are presented. Experimental results for compressing a detail-rich UHD sequence to 4 bits/sample indicate speed-ups of 200x for the rate control and 100x for the packetization compared to the single-threaded implementation in the commercial Kakadu library. When executed on the CPU, these two routines take 4x as long as all remaining coding steps on the GPU and therefore present a bottleneck. Even if the CPU bottleneck could be avoided with multi-threading, it is still beneficial to execute all coding steps on the GPU, as this minimizes the required device-to-host transfer and thereby speeds up the critical path from 17.2 fps to 19.5 fps for 4 bits/sample and to 22.4 fps for 0.16 bits/sample.
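
    To make the post-compression rate control step concrete, here is a minimal sequential sketch of slope-threshold truncation in the spirit of PCRD optimization. It is not the parallel algorithm from the paper. The function names, the representation of a code-block as a list of (rate, distortion) truncation points, and the sample numbers are all assumptions made for illustration, and the sketch assumes each block's points already lie on a convex hull (slopes strictly decreasing).

```python
def slopes(block):
    """Distortion-rate slope between consecutive truncation points of one code-block."""
    return [
        (block[k - 1][1] - block[k][1]) / (block[k][0] - block[k - 1][0])
        for k in range(1, len(block))
    ]


def truncation_index(block, lam):
    """Index of the last truncation point whose slope is still at least lam."""
    chosen = 0
    for k, slope in enumerate(slopes(block), start=1):
        if slope < lam:
            break
        chosen = k
    return chosen


def rate_control(blocks, budget):
    """Lower the slope threshold step by step while the total rate fits the budget."""
    candidates = sorted({s for b in blocks for s in slopes(b)}, reverse=True)
    best = [0] * len(blocks)
    for lam in candidates:
        choice = [truncation_index(b, lam) for b in blocks]
        total = sum(b[k][0] for b, k in zip(blocks, choice))
        if total > budget:
            break
        best = choice
    return best


# Hypothetical (rate in bytes, distortion) points for two code-blocks.
block_a = [(0, 100), (10, 40), (20, 20), (30, 15)]
block_b = [(0, 50), (5, 20), (15, 10)]
print(rate_control([block_a, block_b], budget=30))  # [2, 1]
```

    Every block is evaluated independently for a given threshold, which is the property a parallel implementation would be expected to exploit.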

    A Simulation Workflow for Membrane Computing: From MeCoSim to PMCGPU Through P-Lingua

    P system simulators are of high importance in Membrane Computing, since they provide tools to assist in model validation and verification. Keeping a balance between generality and flexibility, on the one hand, and efficiency, on the other, is always challenging, but it is worth the effort. Besides, in order to prove the feasibility of P system models as practical tools for solving problems and aiding decision making, it is essential to provide functional mechanisms that put all the required elements at the disposal of potential users, smoothly integrated in a robust workflow. The aim of this paper is to describe the main components and connections within the approach followed in this pipeline. Ministerio de Industria, Economía y Competitividad TIN2017-89842-

    Accelerated Simulation of P Systems on the GPU: A Survey

    The acceleration of P system simulations is increasingly required, since they are at the core of model verification and validation processes. For this purpose, GPU computing is an alternative to more classic approaches in Parallel Computing. It provides a manycore platform with a high level of parallelism at a low cost. In this paper, we survey the developments of P system simulators using the GPU, and analyze some performance considerations. Ministerio de Economía y Competitividad TIN2012-37434; Junta de Andalucía P08-TIC-0420

    A New Strategy to Improve the Performance of PDP-Systems Simulators

    One of the major challenges that current P system simulators have to deal with is to be as efficient as possible. A P system is syntactically described as a membrane structure delimiting regions where multisets of objects evolve by means of evolution rules. Accordingly, on each computation step, the applicability of the rules for the current P system configuration must be calculated. In this paper we extend previous works that use a Rete-based simulation algorithm in order to reduce the time consumed during the checking phase in the selection of rules. A new approach is presented, oriented to the acceleration of Population Dynamics P Systems simulations. Ministerio de Economía y Competitividad TIN2012- 3743
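
    The per-step applicability check described in this abstract can be illustrated with a naive baseline. This sketch is not the Rete-based algorithm; it recomputes applicability from scratch, which is the cost such an approach aims to reduce. The rule names, the use of `Counter` for multisets, and the sample region are assumptions for illustration only.

```python
from collections import Counter


def max_applications(lhs: Counter, region: Counter) -> int:
    """How many times a rule with left-hand side `lhs` could fire in `region`."""
    if not lhs:
        return 0  # this sketch treats an empty left-hand side as not applicable
    return min(region[obj] // count for obj, count in lhs.items())


def applicable_rules(rules: dict, region: Counter) -> dict:
    """Map each applicable rule name to its maximal number of applications."""
    result = {}
    for name, lhs in rules.items():
        n = max_applications(lhs, region)
        if n > 0:
            result[name] = n
    return result


# Hypothetical region contents and rule left-hand sides.
region = Counter({"a": 5, "b": 2})
rules = {
    "r1": Counter({"a": 2}),          # a a -> ...
    "r2": Counter({"a": 1, "b": 1}),  # a b -> ...
    "r3": Counter({"c": 1}),          # c -> ...
}
print(applicable_rules(rules, region))  # {'r1': 2, 'r2': 2}
```

    Note that `r1` and `r2` compete for the same `a` objects, so these counts are upper bounds per rule; resolving that competition belongs to the selection phase, which the sketch does not cover.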

    An Improved GPU Simulator For Spiking Neural P Systems

    Spiking Neural P (SNP) systems, variants of P systems (under Membrane and Natural computing), are computing models that acquire abstraction and inspiration from the way neurons 'compute' or process information. Similar to other P system variants, SNP systems are Turing complete models that by nature compute non-deterministically and in a maximally parallel manner. P systems usually trade (often exponential) space for (polynomial to constant) time. Due to this nature, P system variants are currently limited to parallel simulations, and several variants have already been simulated on parallel devices. In this paper we present an improved SNP system simulator based on graphics processing units (GPUs). Among other reasons, current GPUs are designed for massively parallel computations, thus making GPUs very suitable for SNP system simulation. The computing model, hardware/software considerations, and simulation algorithm are presented, as well as comparisons of the CPU-only and CPU-GPU-based simulators. Ministerio de Ciencia e Innovación TIN2009–13192; Junta de Andalucía P08-TIC-0420
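
    One formulation that lends itself to parallel hardware is a matrix representation of an SNP system without delays, where one computation step is a vector-matrix product. The abstract does not state which algorithm the simulator uses, so the sketch below is an assumption for illustration: the two-neuron system, the rule set, and the function name `step` are hypothetical. Each row of the matrix describes one rule (spikes consumed in its own neuron as a negative entry, spikes sent along synapses as positive entries), and the spiking vector marks which rules fire.

```python
def step(config, spiking, matrix):
    """Return the next configuration: config + spiking . matrix."""
    return [
        config[j] + sum(spiking[i] * matrix[i][j] for i in range(len(spiking)))
        for j in range(len(config))
    ]


# Hypothetical system with two neurons and one rule each:
#   r1 in neuron 1 consumes 1 spike and sends 1 spike to neuron 2
#   r2 in neuron 2 consumes 2 spikes and sends 1 spike to neuron 1
matrix = [
    [-1, 1],   # r1
    [1, -2],   # r2
]
config = [1, 2]    # spikes currently held by neuron 1 and neuron 2
spiking = [1, 1]   # both rules fire in this step

print(step(config, spiking, matrix))  # [1, 1]
```

    Every entry of the result can be computed independently of the others, which is why this form maps naturally onto one GPU thread per neuron or per matrix element.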

    Movies Tags Extraction Using Deep Learning

    Retrieving information from movies is becoming increasingly demanding due to the enormous amount of multimedia data generated each day. Not only does it help in efficient search, archiving and classification of movies, but it is also instrumental in content censorship and recommendation systems. Extracting key information from a movie and summarizing it in a few tags that best describe the movie presents a dedicated challenge and requires an intelligent approach to automatically analyze the movie. In this paper, we formulate the movie tag extraction problem as a machine learning classification problem and train a Convolutional Neural Network (CNN) on a carefully constructed tag vocabulary. Our proposed technique first extracts key frames from a movie and applies the trained classifier to the key frames. The predictions from the classifier are assigned scores and are filtered based on their relative strengths to generate a compact set of the most relevant key tags. We performed a rigorous subjective evaluation of our proposed technique for a wide variety of movies with different experiments. The evaluation results presented in this paper demonstrate that our proposed approach can efficiently extract the key tags of a movie with good accuracy.
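
    The scoring and filtering stage described in this abstract could look roughly like the sketch below. The abstract does not specify how scores are assigned or what "relative strength" means, so summing per-frame confidences and keeping tags within a fraction of the best score are assumptions, as are the function names, the `relative_threshold` parameter, and the sample predictions.

```python
from collections import defaultdict


def aggregate_scores(frame_predictions):
    """Sum classifier confidences per tag over all key frames."""
    totals = defaultdict(float)
    for predictions in frame_predictions:
        for tag, confidence in predictions.items():
            totals[tag] += confidence
    return dict(totals)


def select_tags(frame_predictions, relative_threshold=0.5):
    """Keep tags scoring at least `relative_threshold` times the best tag, strongest first."""
    totals = aggregate_scores(frame_predictions)
    if not totals:
        return []
    best = max(totals.values())
    kept = [tag for tag, score in totals.items() if score >= relative_threshold * best]
    return sorted(kept, key=lambda tag: -totals[tag])


# Hypothetical classifier output for three key frames.
frames = [
    {"car": 0.9, "beach": 0.2},
    {"car": 0.8, "gun": 0.6},
    {"gun": 0.7, "beach": 0.1},
]
print(select_tags(frames))  # ['car', 'gun']
```

    Using a threshold relative to the strongest tag, rather than an absolute cutoff, keeps the tag set compact regardless of how many key frames a movie contributes.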

    Probabilistic Guarded P Systems, A New Formal Modelling Framework

    Multienvironment P systems constitute a general, formal framework for modelling the dynamics of population biology, which consists of two main approaches: stochastic and probabilistic. The framework has been successfully used to model biological systems at both micro (e.g. bacteria colony) and macro (e.g. real ecosystems) levels. In this paper, we extend the general framework in order to include a new case study related to P. Oleracea species. The extension is made by a new variant within the probabilistic approach, called Probabilistic Guarded P systems (in short, PGP systems). We provide a formal definition, a simulation algorithm to capture the dynamics, and a survey of the associated software. Ministerio de Economía y Competitividad TIN2012-37434; Junta de Andalucía P08-TIC-0420

    Spiking Neural P Systems with Structural Plasticity: Attacking the Subset Sum Problem

    Spiking neural P systems with structural plasticity (in short, SNPSP systems) are models of computation inspired by the function and structure of biological neurons. In SNPSP systems, neurons can create or delete synapses using plasticity rules. We report two families of solutions, a non-uniform and a uniform one, to the NP-complete problem Subset Sum using SNPSP systems. Instead of the usual rule-level nondeterminism (choosing which rule to apply) we use synapse-level nondeterminism (choosing which synapses to create or delete). The nondeterminism due to plasticity rules yields the following improvements over a previous solution: in our non-uniform solution, plasticity rules allow a normal form to be used (i.e. without forgetting rules or rules with delays, the system is simple, with only synapse-level nondeterminism); in our uniform solution, the number of neurons and the number of computation steps are reduced. Ministerio de Economía y Competitividad TIN2012-3743

    Simulating FRSN P Systems with Real Numbers in P-Lingua on sequential and CUDA platforms

    Fuzzy Reasoning Spiking Neural P systems (FRSN P systems, for short) are a variant of Spiking Neural P systems incorporating fuzzy logic elements that make them suitable for modelling the fuzzy diagnosis knowledge and reasoning required for fault diagnosis applications. In this sense, several FRSN P system variants have been proposed, dealing with real numbers, trapezoidal numbers, weights, etc. The model incorporating real numbers was the first introduced [13], presenting promising applications in the field of fault diagnosis of electrical systems. For this variant, a matrix-based algorithm was provided which, when executed on parallel computing platforms, fully exploits the model's maximally parallel capabilities. In this paper we introduce a P-Lingua framework extension to parse and simulate FRSN P systems with real numbers. Two simulators, implementing a variant of the original matrix-based simulation algorithm, are provided: a sequential one (written in Java), intended to run on traditional CPUs, and a parallel one, intended to run on CUDA-enabled devices. Ministerio de Economía y Competitividad TIN2012-3743