
    A general framework for efficient FPGA implementation of matrix product

    Original article can be found at: http://www.medjcn.com/ Copyright Softmotor Limited.
    High performance systems are required by developers for fast processing of computationally intensive applications. Reconfigurable hardware devices in the form of Field-Programmable Gate Arrays (FPGAs) have been proposed as viable system building blocks for constructing high performance systems at an economical price. Given the importance and widespread use of matrix algorithms in scientific computing applications, they seem ideal candidates to harness and exploit the advantages offered by FPGAs. In this paper, a system for generating matrix algorithm cores is described. The system provides a catalog of efficient, user-customizable cores designed for FPGA implementation, spanning three matrix algorithm categories: (i) matrix operations, (ii) matrix transforms, and (iii) matrix decomposition. The generated core can be either a general-purpose or an application-specific core. The methodology used in the design and implementation of two specific image processing application cores is presented. The first core is a fully pipelined matrix multiplier for colour space conversion based on distributed arithmetic principles, while the second is a parallel floating-point matrix multiplier designed for 3D affine transformations. Peer reviewed.
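
    The colour space conversion core described above amounts to multiplying each pixel by a constant 3 × 3 matrix. As a point of reference only, the sketch below models such a conversion in software (RGB to YCbCr with BT.601 coefficients, which are an assumption here since the abstract does not name the target colour space); the FPGA core would realise the same arithmetic with distributed arithmetic and pipelining rather than explicit multipliers.

```cpp
// Minimal software reference model of a 3x3 constant-matrix colour space
// conversion (RGB -> YCbCr, BT.601 coefficients assumed for illustration).
// The FPGA core described above would realise this with distributed
// arithmetic and pipelining rather than explicit floating-point multiplies.
#include <array>
#include <cstdio>

using Vec3 = std::array<float, 3>;

Vec3 convert(const Vec3& rgb) {
    // Assumed conversion matrix; each output channel is a dot product
    // of one matrix row with the input pixel.
    static constexpr float M[3][3] = {
        { 0.299f,  0.587f,  0.114f},   // Y
        {-0.169f, -0.331f,  0.500f},   // Cb
        { 0.500f, -0.419f, -0.081f}    // Cr
    };
    Vec3 out{};
    for (int r = 0; r < 3; ++r)
        for (int c = 0; c < 3; ++c)
            out[r] += M[r][c] * rgb[c];
    out[1] += 128.0f;  // offset chroma channels into the [0, 255] range
    out[2] += 128.0f;
    return out;
}

int main() {
    Vec3 ycbcr = convert({255.0f, 0.0f, 0.0f});  // pure red
    std::printf("Y=%.1f Cb=%.1f Cr=%.1f\n", ycbcr[0], ycbcr[1], ycbcr[2]);
}
```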

    JPEG steganography with particle swarm optimization accelerated by AVX

    Digital steganography aims at hiding secret messages in digital data transmitted over insecure channels. The JPEG format is prevalent in digital communication, and images are often used as cover objects in digital steganography. Optimization methods can improve the properties of images with an embedded secret, but they introduce additional computational complexity to the processing. In this work, AVX instructions available in modern CPUs are used to accelerate the data-parallel operations that are part of image steganography with advanced optimizations.
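
    As an illustration of the kind of data-parallel kernel AVX can accelerate in this setting, the sketch below sums squared differences between two coefficient arrays (for example, cover versus candidate stego DCT coefficients) using 256-bit AVX intrinsics. The choice of kernel is an assumption made for illustration; the concrete operations used in the paper are not reproduced here.

```cpp
// Hedged sketch: one data-parallel operation of the kind AVX can accelerate
// in image steganography, here a sum of squared differences between two
// float arrays (e.g. cover vs. candidate stego DCT coefficients).
// Requires AVX support (e.g. compile with -mavx).
#include <immintrin.h>
#include <cstddef>

float squared_difference_avx(const float* a, const float* b, std::size_t n) {
    __m256 acc = _mm256_setzero_ps();
    std::size_t i = 0;
    for (; i + 8 <= n; i += 8) {               // 8 floats per 256-bit lane
        __m256 va = _mm256_loadu_ps(a + i);
        __m256 vb = _mm256_loadu_ps(b + i);
        __m256 d  = _mm256_sub_ps(va, vb);
        acc = _mm256_add_ps(acc, _mm256_mul_ps(d, d));
    }
    alignas(32) float lanes[8];
    _mm256_store_ps(lanes, acc);
    float sum = lanes[0] + lanes[1] + lanes[2] + lanes[3]
              + lanes[4] + lanes[5] + lanes[6] + lanes[7];
    for (; i < n; ++i) {                       // scalar tail
        float d = a[i] - b[i];
        sum += d * d;
    }
    return sum;
}
```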

    A Digital Neuromorphic Architecture Efficiently Facilitating Complex Synaptic Response Functions Applied to Liquid State Machines

    Information in neural networks is represented as weighted connections, or synapses, between neurons. This poses a problem because the primary computational bottleneck for neural networks is the vector-matrix multiply that occurs when inputs are multiplied by the network weights. Conventional processing architectures are not well suited to simulating neural networks, often requiring large amounts of energy and time. Additionally, synapses in biological neural networks are not binary connections but exhibit a nonlinear response function as neurotransmitters are emitted and diffuse between neurons. Inspired by neuroscience principles, we present a digital neuromorphic architecture, the Spiking Temporal Processing Unit (STPU), capable of modeling arbitrarily complex synaptic response functions without requiring additional hardware components. We consider the paradigm of spiking neurons with temporally coded information, as opposed to the non-spiking, rate-coded neurons used in most neural networks. In this paradigm we examine liquid state machines applied to speech recognition and show how a liquid state machine with temporal dynamics maps onto the STPU, demonstrating the flexibility and efficiency of the STPU for instantiating neural algorithms.
    Comment: 8 pages, 4 figures, preprint of 2017 IJCNN.
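
    To make the notion of a nonlinear synaptic response function concrete, the sketch below implements a generic discrete-time double-exponential post-synaptic current driven by an input spike train. This is a textbook-style reference model chosen for illustration only; it is not the STPU's hardware mechanism, and the time constants and weight are arbitrary assumptions.

```cpp
// Hedged sketch: a discrete-time double-exponential synaptic response,
// i.e. a nonlinear post-synaptic current kernel driven by input spikes.
// Generic reference model, not the STPU's hardware implementation.
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <cstdio>
#include <vector>

std::vector<double> synaptic_response(const std::vector<int>& spikes,
                                      double dt, double tau_rise,
                                      double tau_fall, double weight) {
    // Two leaky state variables; their difference gives a smooth
    // rise-and-decay current after each spike.
    double rise = 0.0, fall = 0.0;
    const double a_rise = std::exp(-dt / tau_rise);
    const double a_fall = std::exp(-dt / tau_fall);
    std::vector<double> current(spikes.size());
    for (std::size_t t = 0; t < spikes.size(); ++t) {
        if (spikes[t]) { rise += weight; fall += weight; }
        rise *= a_rise;
        fall *= a_fall;
        current[t] = fall - rise;   // double-exponential PSC shape
    }
    return current;
}

int main() {
    std::vector<int> spikes(100, 0);
    spikes[10] = 1;                  // single input spike at step 10
    auto psc = synaptic_response(spikes, /*dt=*/1e-3, /*tau_rise=*/2e-3,
                                 /*tau_fall=*/10e-3, /*weight=*/1.0);
    std::printf("peak response = %f\n",
                *std::max_element(psc.begin(), psc.end()));
}
```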

    Algorithms for sketching surfaces

    CISRG discussion paper ; 1

    A mixed-signal early vision chip with embedded image and programming memories and digital I/O

    From a system level perspective, this paper presents a 128 × 128 flexible and reconfigurable Focal-Plane Analog Programmable Array Processor, which has been designed as a single chip in a 0.35 µm standard digital 1P-5M CMOS technology. The core processing array has been designed to achieve high speed of operation and large-enough accuracy (~7 bit) with low power consumption. The chip includes on-chip program memory to allow for the execution of complex, sequential and/or bifurcation-flow image processing algorithms. It also includes the structures and circuits needed to guarantee its embedding into conventional digital hosting systems: external data interchange and control are completely digital. The chip contains close to four million transistors, 90% of them working in analog mode. The chip features up to 330 GOPS (giga-operations per second) and uses power (180 GOP/Joule) and silicon area (3.8 GOPS/mm²) efficiently, as it is able to maintain VGA processing throughputs of 100 frames/s with about 15 basic image processing tasks on each frame.
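
    As a rough sanity check of how the quoted figures relate, the snippet below works out how many operations per pixel and per task the array could spend at its quoted peak rate, assuming VGA means 640 × 480 pixels (an assumption on my part) and taking the 100 frames/s and 15 tasks per frame figures from the abstract.

```cpp
// Back-of-envelope check relating the quoted throughput figures.
// Assumptions (mine): VGA = 640x480 pixels, and the array runs at its quoted
// peak rate of 330 GOPS while sustaining 100 frames/s and 15 tasks per frame.
#include <cstdio>

int main() {
    const double pixels_per_frame = 640.0 * 480.0;   // assumed VGA resolution
    const double frames_per_s     = 100.0;           // quoted frame rate
    const double tasks_per_frame  = 15.0;            // quoted basic tasks/frame
    const double peak_ops_per_s   = 330.0e9;         // quoted peak throughput

    const double pixel_tasks_per_s =
        pixels_per_frame * frames_per_s * tasks_per_frame;   // ~4.6e8
    std::printf("operations available per pixel per task: %.0f\n",
                peak_ops_per_s / pixel_tasks_per_s);          // ~716
}
```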

    Inviwo -- A Visualization System with Usage Abstraction Levels

    The complexity of today's visualization applications demands specific visualization systems tailored for the development of these applications. Frequently, such systems utilize levels of abstraction to improve the application development process, for instance by providing a data flow network editor. Unfortunately, these abstractions result in several issues, which need to be circumvented through an abstraction-centered system design. Often, a high level of abstraction hides low-level details, making it difficult to directly access the underlying computing platform, which would be important for achieving optimal performance. Therefore, we propose a layer structure developed for modern and sustainable visualization systems that allows developers to interact with all contained abstraction levels. We refer to these interaction capabilities as usage abstraction levels, since we target application developers with various levels of experience. We formulate the requirements for such a system, derive the desired architecture, and present how the concepts have been realized, by way of example, within the Inviwo visualization system. Furthermore, we address several specific challenges that arise during the realization of such a layered architecture, such as communication between different computing platforms, performance-centered encapsulation, as well as layer-independent development supported by cross-layer documentation and debugging capabilities.