25,622 research outputs found
Analysing Astronomy Algorithms for GPUs and Beyond
Astronomy depends on ever increasing computing power. Processor clock-rates
have plateaued, and increased performance is now appearing in the form of
additional processor cores on a single chip. This poses significant challenges
to the astronomy software community. Graphics Processing Units (GPUs), now
capable of general-purpose computation, exemplify both the difficult
learning-curve and the significant speedups exhibited by massively-parallel
hardware architectures. We present a generalised approach to tackling this
paradigm shift, based on the analysis of algorithms. We describe a small
collection of foundation algorithms relevant to astronomy and explain how they
may be used to ease the transition to massively-parallel computing
architectures. We demonstrate the effectiveness of our approach by applying it
to four well-known astronomy problems: Hogbom CLEAN, inverse ray-shooting for
gravitational lensing, pulsar dedispersion and volume rendering. Algorithms
with well-defined memory access patterns and high arithmetic intensity stand to
receive the greatest performance boost from massively-parallel architectures,
while those that involve a significant amount of decision-making may struggle
to take advantage of the available processing power.Comment: 10 pages, 3 figures, accepted for publication in MNRA
Selective Decoding in Associative Memories Based on Sparse-Clustered Networks
Associative memories are structures that can retrieve previously stored
information given a partial input pattern instead of an explicit address as in
indexed memories. A few hardware approaches have recently been introduced for a
new family of associative memories based on Sparse-Clustered Networks (SCN)
that show attractive features. These architectures are suitable for
implementations with low retrieval latency, but are limited to small networks
that store a few hundred data entries. In this paper, a new hardware
architecture of SCNs is proposed that features a new data-storage technique as
well as a method we refer to as Selective Decoding (SD-SCN). The SD-SCN has
been implemented using a similar FPGA used in the previous efforts and achieves
two orders of magnitude higher capacity, with no error-performance penalty but
with the cost of few extra clock cycles per data access.Comment: 4 pages, Accepted in IEEE Global SIP 2013 conferenc
High volume colour image processing with massively parallel embedded processors
Currently Oc´e uses FPGA technology for implementing colour image processing for their high volume colour printers. Although FPGA technology provides enough performance it, however, has a rather tedious development process. This paper describes the research conducted on an alternative implementation technology: software defined massively parallel processing. It is shown that this technology not only leads to a reduction in development time but also adds flexibility to the design
- …