232 research outputs found

    Dataflow Computing with Polymorphic Registers

    Heterogeneous systems are becoming increasingly popular for data processing. They improve the performance of simple kernels applied to large amounts of data. However, sequential data loads may have a negative impact on performance. Data-parallel solutions such as Polymorphic Register Files (PRFs) can potentially accelerate applications by facilitating high-speed, parallel access to performance-critical data. Furthermore, PRF customization conveniently exposes specific data-path features to the programmer. PRFs allow additional control over the register dimensions and the number of elements which can be simultaneously accessed by computational units. This paper shows how PRFs can be integrated in dataflow computational platforms. In particular, starting from an annotated source code, we present a compiler-based methodology that automatically generates the customized PRFs and the enhanced computational kernels that efficiently exploit them.
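    The core idea of a PRF, as described above, can be illustrated with a minimal sketch (this is an illustrative model, not the paper's hardware design; all class and method names are hypothetical): logical registers of programmer-chosen dimensions are defined at runtime as windows over one physical storage array, and each window's elements can be read or written as a unit.

```python
class PolymorphicRegisterFile:
    """Conceptual model of a PRF: logical registers of arbitrary
    dimensions are runtime-defined windows over one 2D physical store."""

    def __init__(self, rows, cols):
        self.store = [[0] * cols for _ in range(rows)]
        self.logical = {}  # name -> (row, col, height, width)

    def define(self, name, row, col, height, width):
        # A logical register is a rectangular window over the store;
        # its elements would be accessed in parallel by the hardware.
        self.logical[name] = (row, col, height, width)

    def read(self, name):
        r, c, h, w = self.logical[name]
        return [row[c:c + w] for row in self.store[r:r + h]]

    def write(self, name, values):
        r, c, h, w = self.logical[name]
        for i in range(h):
            for j in range(w):
                self.store[r + i][c + j] = values[i][j]


prf = PolymorphicRegisterFile(8, 8)
prf.define("vec", 0, 0, 1, 8)   # a 1x8 vector register
prf.define("blk", 2, 2, 4, 4)   # a 4x4 matrix register over the same storage
prf.write("blk", [[i * 4 + j for j in range(4)] for i in range(4)])
```

    In the real PRF the dimensions would be chosen by the compiler from the source-code annotations, so each kernel gets registers shaped to its access pattern.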

    An Expressive Language and Efficient Execution System for Software Agents

    Software agents can be used to automate many of the tedious, time-consuming information processing tasks that humans currently have to complete manually. However, to do so, agent plans must be capable of representing the myriad of actions and control flows required to perform those tasks. In addition, since these tasks can require integrating multiple sources of remote information (typically a slow, I/O-bound process), it is desirable to make execution as efficient as possible. To address both of these needs, we present a flexible software agent plan language and a highly parallel execution system that enable the efficient execution of expressive agent plans. The plan language allows complex tasks to be more easily expressed by providing a variety of operators for flexibly processing the data as well as supporting subplans (for modularity) and recursion (for indeterminate looping). The executor is based on a streaming dataflow model of execution to maximize the amount of operator and data parallelism possible at runtime. We have implemented both the language and executor in a system called THESEUS. Our results from testing THESEUS show that streaming dataflow execution can yield significant speedups over both traditional serial (von Neumann) execution and the non-streaming dataflow-style execution that existing software and robot agent execution systems currently support. In addition, we show how plans written in the language we present can represent certain types of subtasks that cannot be accomplished using the languages supported by network query engines. Finally, we demonstrate that the increased expressivity of our plan language does not hamper performance; specifically, we show how data can be integrated from multiple remote sources just as efficiently using our architecture as is possible with a state-of-the-art streaming-dataflow network query engine.
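    The streaming dataflow model the abstract describes can be sketched with generators (a simplified stand-in, not THESEUS itself; the operator names are hypothetical): each operator consumes and emits tuples incrementally, so downstream work overlaps with upstream I/O instead of waiting for a complete intermediate result set.

```python
# Each operator is a generator: it forwards every tuple as soon as it
# is produced, which is what allows operator and data parallelism when
# the stages are run concurrently (e.g. on threads) over slow sources.

def source(records):
    for rec in records:            # e.g. rows arriving from a remote site
        yield rec

def select(pred, upstream):
    for rec in upstream:
        if pred(rec):
            yield rec              # streamed onward immediately

def project(fields, upstream):
    for rec in upstream:
        yield {f: rec[f] for f in fields}


rows = [{"city": "LA", "temp": 31}, {"city": "NY", "temp": 12}]
plan = project(["city"], select(lambda r: r["temp"] > 20, source(rows)))
result = list(plan)
```

    A serial (von Neumann) evaluator would instead materialize each operator's full output before starting the next one, which is exactly the overhead the streaming model avoids.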

    Configurable computer systems can support dataflow computing

    This work presents a practical implementation of a uni-processor system design. This design, named D2-CPU, satisfies the pure data-driven paradigm, which is a radical alternative to the conventional von Neumann paradigm and exploits instruction-level parallelism to its full extent. The D2-CPU uses the natural flow of the program, dataflow, by minimizing redundant instructions like fetch, store, and write back. This leads to a design with better performance, lower power consumption, and efficient use of the on-chip resources. This performance is the result of a simple, pipelined, and superscalar architecture with a very wide data bus and completely out-of-order execution of instructions. The result is a program-counter-less, distributed-control system design built on intelligent memories. Upon the availability of data, the instructions advance by themselves through the memory hierarchy and ultimately to the execution units, instead of having the CPU fetch the required instructions from memory as in control-flow processors. This data-oriented execution process is in contrast to the application-ignorant CPUs in conventional machines. The D2-CPU solves current architectural challenges and puts into practice a pure data-driven microprocessor. This work employs an FPGA implementation of the D2-CPU to prove the practicability of the data-driven computer paradigm using configurable logic. A comparative analysis at the end confirms its superiority in performance, resource utilization, and ease of programming over conventional CPUs.
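    The data-driven firing rule at the heart of such a design can be shown in a few lines (a behavioral sketch only, not the D2-CPU microarchitecture; the instruction encoding here is invented for illustration): an instruction executes as soon as all of its input operands have arrived as tokens, with no program counter sequencing the instructions.

```python
# Each instruction is (operation, input token names, output token name).
# There is no program counter: any instruction whose operands are all
# present "fires", and its result becomes a new token for consumers.

def run_dataflow(instructions, initial_tokens):
    tokens = dict(initial_tokens)       # token name -> value
    pending = list(instructions)
    while pending:
        fired = False
        for instr in list(pending):
            op, inputs, output = instr
            if all(name in tokens for name in inputs):
                tokens[output] = op(*[tokens[n] for n in inputs])
                pending.remove(instr)
                fired = True
        if not fired:
            raise RuntimeError("deadlock: some operand never arrives")
    return tokens


program = [
    (lambda a, b: a + b, ["x", "y"], "s"),
    (lambda s, c: s * c, ["s", "z"], "out"),
]
tokens = run_dataflow(program, {"x": 2, "y": 3, "z": 10})
```

    Note that instruction order in `program` is irrelevant: data availability, not position, decides when each instruction runs, which is what exposes the instruction-level parallelism the abstract refers to.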

    The Case for Polymorphic Registers in Dataflow Computing

    Heterogeneous systems are becoming increasingly popular, delivering high performance through hardware specialization. However, sequential data accesses may have a negative impact on performance. Data-parallel solutions such as Polymorphic Register Files (PRFs) can potentially accelerate applications by facilitating high-speed, parallel access to performance-critical data. This article shows how PRFs can be integrated into dataflow computational platforms. Our semi-automatic, compiler-based methodology generates customized PRFs and modifies the computational kernels to efficiently exploit them. We use a separable 2D convolution case study to evaluate the impact of memory latency and bandwidth on performance compared to a state-of-the-art NVIDIA Tesla C2050 GPU. We improve the throughput by up to 56.17X and show that the PRF-augmented system outperforms the GPU for 9×9 or larger mask sizes, even in bandwidth-constrained systems.
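    The separable 2D convolution used as the case study above exploits a standard factorization: when an n×n kernel is an outer product of two 1D kernels, the 2D convolution can be applied as a row pass followed by a column pass, cutting per-pixel work from n² multiplies to 2n. A minimal zero-padded sketch (not the paper's kernel code):

```python
def conv1d_rows(image, kernel):
    # 1D convolution along each row, zero-padded at the borders.
    h = len(kernel) // 2
    out = []
    for row in image:
        padded = [0] * h + row + [0] * h
        out.append([sum(kernel[k] * padded[j + k] for k in range(len(kernel)))
                    for j in range(len(row))])
    return out

def conv1d_cols(image, kernel):
    # Column pass = transpose, row pass, transpose back.
    transposed = [list(col) for col in zip(*image)]
    return [list(col) for col in zip(*conv1d_rows(transposed, kernel))]

def separable_conv2d(image, row_kernel, col_kernel):
    return conv1d_cols(conv1d_rows(image, row_kernel), col_kernel)


# A 3x3 box filter applied to a unit impulse spreads it over all 9 pixels.
impulse = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
blurred = separable_conv2d(impulse, [1, 1, 1], [1, 1, 1])
```

    For the 9×9 masks mentioned in the abstract, the factorization replaces 81 multiplies per pixel with 18, which is why mask size is the crossover point in the GPU comparison.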

    Implementation of a DNA compression algorithm using dataflow computing

    The number of DNA sequence databases has grown considerably in recent years, and the space required to store the sequences is increasing faster than the space available to store them. This raises the cost of storing DNA sequences and also the reads, which are fragments of the whole sequence, and has led to the use of compression algorithms for storing DNA files. The main objective of the project is to increase the efficiency of DNA sequence compression, because the process requires a lot of computation. An FPGA with a dataflow architecture has been used to develop the project, with the aim of exploiting the parallelism available in the chosen algorithm. The compression method has been developed to process sequence reads with a fixed number of mutations per read, and it has been tested with 4, 8, 12, and 16 mutations per read using an architecture that allows up to 160 reads to be processed in a single tick. Experimental results showed that even with a small number of processing units, performance increases substantially using the DFE architecture; the only disadvantage is the store/read time. Keywords: Compression, Dataflow Engine (DFE), FPGA, CPU, DNA, Maxeler
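    The fixed-mutations-per-read scheme the abstract mentions can be sketched with reference-based compression (an illustrative format of my own, not the paper's exact encoding): each read is stored as its alignment position, its length, and the small list of (offset, base) differences against the reference, instead of the full string.

```python
def compress_read(reference, pos, read):
    # Record only where the read differs from the reference.
    mutations = [(i, b) for i, b in enumerate(read)
                 if reference[pos + i] != b]
    return (pos, len(read), mutations)

def decompress_read(reference, record):
    # Rebuild the read by copying the reference and patching mutations.
    pos, length, mutations = record
    bases = list(reference[pos:pos + length])
    for offset, base in mutations:
        bases[offset] = base
    return "".join(bases)


ref = "ACGTACGTACGT"
read = "ACCTA"                      # one mutation at offset 2 (G -> C)
record = compress_read(ref, 0, read)
```

    Bounding the mutation count per read (4, 8, 12, or 16, as tested above) fixes the record size, which is what lets a dataflow engine lay out many such decoders side by side and process a batch of reads per tick.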

    An AppGallery for dataflow computing


    A SURVEY ON TRADITIONAL PLATFORMS AND NEW TRENDS IN PARALLEL COMPUTATION

    Information processing is in continuous progress, and High Performance Computing is now a major trend, with Parallel Computing often used as a synonym for it. Parallel Computing is the main field of information processing in our age. Both hardware systems and software platforms are developing very fast to support simple, rational, and easy parallel data processing and programming. This paper gives an overview of issues and improvements in parallel processing, and discusses the qualities and advantages of different platforms for parallelism. Different architectures and innovations for parallelism are covered, and results of parallel processing are compared. The main goal is to introduce the dataflow model together with some concrete examples of dataflow arrangements, and to contrast the traditional control-flow parallel platforms with the dataflow approach.


    Acceleration of Image Segmentation Algorithm for (Breast) Mammogram Images Using High-Performance Reconfigurable Dataflow Computers

    Image segmentation is one of the most common procedures in medical imaging applications. It is also a very important task in breast cancer detection. A breast cancer detection procedure based on mammography can be divided into several stages. The first stage is the extraction of the region of interest from a breast image, followed by the identification of suspicious mass regions, their classification, and comparison with the existing image database. It is often the case that existing image databases hold large sets of data whose processing requires a lot of time, so accelerating each of the processing stages in breast cancer detection is a very important issue. In this paper, an implementation of an existing algorithm for region-of-interest-based image segmentation of mammogram images on High-Performance Reconfigurable Dataflow Computers (HPRDCs) is proposed. As the dataflow engine (DFE) of such an HPRDC, Maxeler's acceleration card is used. The experiments examining the acceleration of the algorithm on Reconfigurable Dataflow Computers (RDCs) are performed with two types of mammogram images of different resolutions. Several DFE configurations were also tested, each giving a different acceleration of the algorithm's execution; those acceleration values are presented, and the experimental results showed good acceleration.
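    The region-of-interest extraction stage described above can be illustrated with a deliberately simple threshold-and-bounding-box sketch (a stand-in for the paper's segmentation algorithm, not its actual method; the function name is hypothetical): pixels above an intensity threshold are located and the enclosing rectangle is cropped out for the later stages.

```python
def extract_roi(image, threshold):
    # Find all pixels brighter than the threshold...
    coords = [(r, c) for r, row in enumerate(image)
              for c, v in enumerate(row) if v > threshold]
    if not coords:
        return []
    # ...and crop the image to their bounding box.
    rows = [r for r, _ in coords]
    cols = [c for _, c in coords]
    r0, r1 = min(rows), max(rows)
    c0, c1 = min(cols), max(cols)
    return [row[c0:c1 + 1] for row in image[r0:r1 + 1]]


img = [
    [0, 0, 0, 0],
    [0, 9, 8, 0],
    [0, 7, 9, 0],
    [0, 0, 0, 0],
]
roi = extract_roi(img, 5)
```

    Because each pixel's threshold test is independent, this kind of stage maps naturally onto a DFE, where many pixels are tested per clock tick.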