627 research outputs found

    Sample-Parallel Execution of EBCOT in Fast Mode

    JPEG 2000’s most computationally expensive building block is the Embedded Block Coder with Optimized Truncation (EBCOT). This paper evaluates how encoders targeting a parallel architecture such as a GPU can increase their throughput in use cases that call for very high data rates. The compression efficiency in the less significant bit-planes is then often poor, and it is beneficial to enable the Selective Arithmetic Coding Bypass style (fast mode) in order to trade a small loss in compression efficiency for a reduction in computational complexity. More importantly, this style exposes a more finely grained parallelism that can be exploited to execute the raw coding passes, including bit-stuffing, in a sample-parallel fashion. For a latency- or memory-critical application that encodes one frame at a time, EBCOT’s tier-1 is sped up by between 1.1x and 2.4x compared to an optimized GPU-based implementation. When low GPU occupancy has already been addressed by encoding multiple frames in parallel, the throughput can still be improved by 5% for high-entropy images and 27% for low-entropy images. Best results are obtained when enabling the fast mode after the fourth significant bit-plane. For most of the test images the compression rate is within 1% of the original.
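
    The bit-stuffing mentioned in this abstract can be illustrated with a small sequential reference, which is the result a sample-parallel version would have to reproduce. This is only a sketch under stated assumptions: it applies the JPEG 2000 rule that a byte following 0xFF carries only seven payload bits, it pads the final byte with zeros (a simplification, not the standard's termination procedure), and the function name `pack_raw_bits` is hypothetical rather than taken from the paper.

```python
def pack_raw_bits(bits):
    """Pack raw bits MSB-first, leaving the top bit of any byte after 0xFF at 0."""
    out = bytearray()
    acc = 0      # bits collected for the current byte
    count = 0    # number of payload bits collected so far
    width = 8    # payload bits the current byte may hold (7 after an 0xFF byte)
    for bit in bits:
        acc = (acc << 1) | (bit & 1)
        count += 1
        if count == width:
            out.append(acc)
            # After 0xFF the next byte gets a stuffed 0 as its MSB,
            # so it only has room for 7 payload bits.
            width = 7 if acc == 0xFF else 8
            acc, count = 0, 0
    if count:
        # Assumption: flush a partial byte by zero-padding on the right.
        out.append(acc << (width - count))
    return bytes(out)


if __name__ == "__main__":
    print(pack_raw_bits([1] * 15).hex())
```

    Because the position of each output bit depends on how many 0xFF bytes precede it, a parallel implementation needs some form of prefix computation over the samples; the sketch above deliberately leaves that part out.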

    Simulating Spiking Neural P systems without delays using GPUs

    In this paper we present our work on simulating a type of P system known as a spiking neural P system (SNP system) using graphics processing units (GPUs). GPUs, because of their architectural optimization for parallel computations, are well-suited for highly parallelizable problems. Due to the advent of general-purpose GPU computing in recent years, GPUs are not limited to graphics and video processing alone, but also serve computationally intensive scientific and mathematical applications. Moreover, P systems, including SNP systems, are inherently and maximally parallel computing models whose inspiration is taken from the functioning and dynamics of a living cell. In particular, SNP systems try to give a modest but formal representation of a special type of cell known as the neuron and of the interactions between neurons. The nature of SNP systems allows their representation as matrices, which is a crucial step in simulating them on highly parallel devices such as GPUs. The highly parallel nature of SNP systems necessitates the use of hardware intended for parallel computations. The simulation algorithms, design considerations, and implementation are presented. Finally, simulation results, observations, and analyses using an SNP system that generates all numbers in $\mathbb{N} - \{1\}$ are discussed, as well as recommendations for future work. Comment: 19 pages in total, 4 figures, listings/algorithms, submitted at the 9th Brainstorming Week in Membrane Computing, University of Seville, Spain
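
    To make the matrix representation concrete, here is a minimal sketch of one simulation step, assuming the commonly used formulation for SNP systems without delays in which the next configuration is the current configuration plus the spiking vector times a transition matrix. The function name `snp_step` and the example matrix are illustrative assumptions, not data taken from the paper; each matrix row gives the spikes one rule consumes from its own neuron (negative entry) and sends to other neurons (positive entries).

```python
def snp_step(config, spiking, matrix):
    """Return config + spiking * matrix for one step (row vector times matrix)."""
    if len(spiking) != len(matrix):
        raise ValueError("one spiking entry is needed per rule (matrix row)")
    nxt = list(config)
    for fired, row in zip(spiking, matrix):
        if fired:
            # Add the net effect of the fired rule to every neuron.
            for neuron, delta in enumerate(row):
                nxt[neuron] += delta
    return nxt


# Hypothetical system with 3 neurons (columns) and 5 rules (rows).
M = [
    [-1, 1, 1],
    [-2, 1, 1],
    [1, -1, 1],
    [0, 0, -1],
    [0, 0, -2],
]

if __name__ == "__main__":
    # Rules 1, 3 and 4 fire on the configuration (2, 1, 1).
    print(snp_step([2, 1, 1], [1, 0, 1, 1, 0], M))
```

    Every row update is independent of the others, which is why the same computation maps naturally onto one GPU thread per matrix element followed by a column-wise reduction.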

    Evaluation of GPU/CPU Co-Processing Models for JPEG 2000 Packetization

    With the bottom-line goal of increasing the throughput of a GPU-accelerated JPEG 2000 encoder, this paper evaluates whether the post-compression rate control and packetization routines should be carried out on the CPU or on the GPU. Three co-processing models that differ in how the workload is split between the CPU and GPU are introduced. Both routines are discussed and algorithms for executing them in parallel are presented. Experimental results for compressing a detail-rich UHD sequence to 4 bits/sample indicate speed-ups of 200x for the rate control and 100x for the packetization compared to the single-threaded implementation in the commercial Kakadu library. These two routines executed on the CPU take 4x as long as all remaining coding steps on the GPU and therefore present a bottleneck. Even if the CPU bottleneck could be avoided with multi-threading, it is still beneficial to execute all coding steps on the GPU, as this minimizes the required device-to-host transfer and thereby speeds up the critical path from 17.2 fps to 19.5 fps for 4 bits/sample and to 22.4 fps for 0.16 bits/sample.

    A New Strategy to Improve the Performance of PDP-Systems Simulators

    One of the major challenges that current P system simulators have to deal with is to be as efficient as possible. A P system is syntactically described as a membrane structure delimiting regions where multisets of objects evolve by means of evolution rules. Accordingly, on each computation step, the applicability of the rules to the current P system configuration must be calculated. In this paper we extend previous work that uses a Rete-based simulation algorithm in order to reduce the time consumed during the checking phase in the selection of rules. A new approach is presented, oriented to the acceleration of Population Dynamics P Systems simulations. Ministerio de Economía y Competitividad TIN2012-3743
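
    The applicability check described in this abstract can be sketched as a multiset inclusion test. This is a naive baseline written under assumptions, not the Rete-based algorithm of the paper: a rule is treated as applicable when the region holds at least as many copies of each object as the rule's left-hand side requires, and the names `is_applicable` and `applicable_rules` are hypothetical.

```python
from collections import Counter


def is_applicable(lhs, region):
    """True if the multiset `lhs` is contained in the multiset `region`."""
    # Counter returns 0 for objects that are absent from the region.
    return all(region[obj] >= needed for obj, needed in lhs.items())


def applicable_rules(rules, region):
    """Names of all rules whose left-hand side fits the region's contents."""
    return [name for name, lhs in rules.items() if is_applicable(lhs, region)]


if __name__ == "__main__":
    region = Counter({"a": 2, "b": 1})
    rules = {
        "r1": Counter({"a": 1}),
        "r2": Counter({"a": 3}),
        "r3": Counter({"a": 2, "b": 1}),
    }
    print(applicable_rules(rules, region))
```

    This baseline re-examines every rule on every step; avoiding that repeated work is the kind of cost a Rete-style network is meant to reduce.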

    An Improved GPU Simulator For Spiking Neural P Systems

    Spiking Neural P (SNP) systems, variants of P systems (under Membrane and Natural computing), are computing models that acquire abstraction and inspiration from the way neurons 'compute' or process information. Similar to other P system variants, SNP systems are Turing complete models that by nature compute non-deterministically and in a maximally parallel manner. P systems usually trade (often exponential) space for (polynomial to constant) time. Due to this nature, P system variants are currently limited to parallel simulations, and several variants have already been simulated on parallel devices. In this paper we present an improved SNP system simulator based on graphics processing units (GPUs). Among other reasons, current GPUs are architected for massively parallel computations, which makes them very suitable for SNP system simulation. The computing model, hardware/software considerations, and simulation algorithm are presented, as well as comparisons of the CPU-only and CPU-GPU based simulators. Ministerio de Ciencia e Innovación TIN2009–13192; Junta de Andalucía P08-TIC-0420

    Model Driven Evolution of an Agent-Based Home Energy Management System

    Advanced smart home appliances and new models of energy tariffs imposed by energy providers pose new challenges in the automation of home energy management. Users need an assistant tool that helps them make complex decisions with different goals, depending on the current situation. Multi-agent systems have proved to be a suitable technology for developing self-management systems that are able to take the most adequate decision under different context-dependent situations, such as home energy management. The heterogeneity of home appliances and the changes in the energy policies of providers make it necessary to model this variability explicitly. However, multi-agent systems lack mechanisms to deal effectively with the different degrees of variability required by these kinds of systems. Software Product Line technologies, including variability models, have been successfully applied to different domains to explicitly model any kind of variability. We have defined a software product line development process that performs a model driven generation of agents embedded in heterogeneous smart objects with different degrees of self-management. However, once deployed, the home energy assistant system has to be able to evolve to self-adapt its decision making or devices to new requirements. So, in this paper we propose a model driven mechanism to automatically manage the evolution of multi-agent systems distributed among several devices. Universidad de Málaga. Campus de Excelencia Internacional Andalucía Tech

    Movies Tags Extraction Using Deep Learning

    Retrieving information from movies is becoming increasingly demanding due to the enormous amount of multimedia data generated each day. Not only does it help in efficient search, archiving and classification of movies, but it is also instrumental in content censorship and recommendation systems. Extracting key information from a movie and summarizing it in a few tags which best describe the movie presents a dedicated challenge and requires an intelligent approach to automatically analyze the movie. In this paper, we formulate the movie tag extraction problem as a machine learning classification problem and train a Convolutional Neural Network (CNN) on a carefully constructed tag vocabulary. Our proposed technique first extracts key frames from a movie and applies the trained classifier to the key frames. The predictions from the classifier are assigned scores and are filtered based on their relative strengths to generate a compact set of the most relevant key tags. We performed a rigorous subjective evaluation of our proposed technique for a wide variety of movies with different experiments. The evaluation results presented in this paper demonstrate that our proposed approach can efficiently extract the key tags of a movie with good accuracy.

    Solving Sudoku with Membrane Computing

    Sudoku is a very popular puzzle which consists of placing several numbers in a square grid according to some simple rules. In this paper we present an efficient family of P systems which solve sudokus of any order verifying a specific property. The solution is searched for by using a simple human-style method. If the sudoku cannot be solved by using this strategy, the P system detects this drawback, and the computation then stops and returns No. Otherwise, the P system encodes the solution and returns Yes in the last computation step. Ministerio de Ciencia e Innovación TIN2008-04487-E; Ministerio de Ciencia e Innovación TIN2009–13192; Junta de Andalucía P08-TIC-0420
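
    The abstract does not say which human-style method is used, so the following sequential sketch assumes one common choice, the "single candidate" strategy: repeatedly fill any empty cell for which exactly one value remains possible. It mirrors the Yes/No behaviour described above for sudokus of order n (an n² × n² grid, with 0 marking an empty cell); the function name `solve_by_singles` is hypothetical and the code is not a P system.

```python
def solve_by_singles(grid, n):
    """Fill cells that have exactly one candidate; return ("Yes", grid) or ("No", None)."""
    size = n * n
    grid = [row[:] for row in grid]  # work on a copy
    progress = True
    while progress:
        progress = False
        for r in range(size):
            for c in range(size):
                if grid[r][c] != 0:
                    continue
                # Values already used in the same row, column and n x n box.
                used = set(grid[r]) | {grid[i][c] for i in range(size)}
                br, bc = r - r % n, c - c % n
                used |= {grid[i][j]
                         for i in range(br, br + n)
                         for j in range(bc, bc + n)}
                candidates = set(range(1, size + 1)) - used
                if len(candidates) == 1:
                    grid[r][c] = candidates.pop()
                    progress = True
    solved = all(value != 0 for row in grid for value in row)
    return ("Yes", grid) if solved else ("No", None)


if __name__ == "__main__":
    puzzle = [
        [1, 0, 3, 4],
        [3, 4, 0, 2],
        [0, 1, 4, 3],
        [4, 3, 2, 0],
    ]
    print(solve_by_singles(puzzle, 2))
```

    When no cell has a single candidate and the grid is still incomplete, the loop stops without progress and the function answers No, which corresponds to the case where the strategy alone is insufficient.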

    Spiking Neural P Systems with Structural Plasticity: Attacking the Subset Sum Problem

    Spiking neural P systems with structural plasticity (in short, SNPSP systems) are models of computation inspired by the function and structure of biological neurons. In SNPSP systems, neurons can create or delete synapses using plasticity rules. We report two families of solutions, a non-uniform and a uniform one, to the NP-complete problem Subset Sum using SNPSP systems. Instead of the usual rule-level nondeterminism (choosing which rule to apply) we use synapse-level nondeterminism (choosing which synapses to create or delete). The nondeterminism due to plasticity rules brings the following improvements over a previous solution: in our non-uniform solution, plasticity rules allow a normal form to be used (i.e. without forgetting rules or rules with delays, the system is simple, with only synapse-level nondeterminism); in our uniform solution, the number of neurons and the number of computation steps are reduced. Ministerio de Economía y Competitividad TIN2012-3743

    Simulating FRSN P Systems with Real Numbers in P-Lingua on sequential and CUDA platforms

    Fuzzy Reasoning Spiking Neural P systems (FRSN P systems, for short) are a variant of Spiking Neural P systems incorporating fuzzy logic elements that make them suitable for modeling the fuzzy diagnosis knowledge and reasoning required for fault diagnosis applications. In this sense, several FRSN P system variants have been proposed, dealing with real numbers, trapezoidal numbers, weights, etc. The model incorporating real numbers was the first introduced [13], presenting promising applications in the field of fault diagnosis of electrical systems. For this variant, a matrix-based algorithm was provided which, when executed on parallel computing platforms, fully exploits the model's maximally parallel capacities. In this paper we introduce a P-Lingua framework extension to parse and simulate FRSN P systems with real numbers. Two simulators, implementing a variant of the original matrix-based simulation algorithm, are provided: a sequential one (written in Java), intended to run on traditional CPUs, and a parallel one, intended to run on CUDA-enabled devices. Ministerio de Economía y Competitividad TIN2012-3743