6 research outputs found

Fast decimal floating-point division

Author: Lim C.
Nikmehr H.
Phillips B.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2006

A new implementation for decimal floating-point (DFP) division is introduced. The algorithm is based on high-radix SRT division with the recurrence in a new decimal signed-digit format. (The SRT division algorithm is named after D. Sweeney, J. E. Robertson, and T. D. Tocher.) Quotient digits are selected using comparison multiples, where the magnitude of the quotient digit is calculated by comparing the truncated partial remainder with limited-precision multiples of the divisor. The sign is determined concurrently by investigating the polarity of the truncated partial remainder. A timing evaluation using logic synthesis shows a significant decrease in the division execution time in contrast with one of the fastest DFP dividers reported in the open literature.
Hooman Nikmehr, Braden Phillips and Cheng-Chew Li
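The digit-selection idea in this abstract can be illustrated with a minimal software sketch. It is not the authors' design: the function name `srt_like_decimal_divide` is hypothetical, the comparisons use full-precision integers rather than a truncated partial remainder and limited-precision multiples, and the digit set is the plain signed range -9..9 rather than the paper's new decimal signed-digit format. It only shows the magnitude being found by comparison against divisor multiples while the sign is taken from the partial remainder.

```python
def srt_like_decimal_divide(x, d, digits):
    """Radix-10 digit-recurrence division sketch for 0 <= x < d.

    Returns (quotient, signed_digits, remainder) such that
    x * 10**digits == quotient * d + remainder and 0 <= remainder < d.
    """
    if d <= 0 or not (0 <= x < d):
        raise ValueError("require 0 <= x < d")

    w = x                # partial remainder, kept within |w| < d
    quotient = 0
    signed_digits = []

    for _ in range(digits):
        shifted = 10 * w
        # Magnitude: count how many comparison multiples (k - 1/2) * d
        # the shifted remainder reaches, for k = 1..9. Both sides are
        # doubled so that everything stays in integers.
        magnitude = sum(
            1 for k in range(1, 10) if 2 * abs(shifted) >= (2 * k - 1) * d
        )
        # Sign: taken directly from the polarity of the partial remainder.
        digit = magnitude if shifted >= 0 else -magnitude

        w = shifted - digit * d            # recurrence step
        quotient = 10 * quotient + digit   # accumulate signed digits
        signed_digits.append(digit)

    # A negative final remainder means the quotient is one too large.
    if w < 0:
        quotient -= 1
        w += d
    return quotient, signed_digits, w


q, sd, r = srt_like_decimal_divide(1, 7, 6)
print(q, sd, r)
```

Because the digits are signed, an overestimate in one step is corrected by a negative digit in a later step, which is what allows hardware implementations to select digits from an approximate remainder.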

Crossref

Adelaide Research & Scholarship

Think! Interactive Systems Need Safety Locks

Author: Harold Thimbleby
Publication venue: 'University of Zagreb - University Computing Centre'
Publication date: 01/01/2010

This paper uses a simple analogy. A gun is designed to shoot bullets, but it is obvious that accidental shooting is a danger one should avoid if at all possible. Thus guns have safety locks, which aim to protect users and bystanders. Interactive computer systems sometimes accidentally do bad things too, but something like “safety locks” is not implemented often enough to help protect users or bystanders from harm. Worse, user interfaces often behave quite unpredictably with erroneous input, rather than blocking errors and requiring the user to correct them. This is a bit like guns that misbehave. Computers, including those embedded in everyday devices, are not always as dangerous as guns, although there are many cases where they can be. Medical devices may give patients undetected overdoses. In-car entertainment devices, like radios, may, through their badly designed user interfaces, cause a driver to have an accident. A slip in a spreadsheet may be the first step towards an organisation going bankrupt. And so on. The solution should include better design, including the concept of safety locks that block some forms of user error.
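As a minimal sketch of what blocking erroneous input might look like in software, the example below rejects malformed or out-of-range number entry instead of silently guessing a value. The function name `parse_dose`, the accepted number format, and the `max_dose` limit are all assumptions made for illustration; none of them come from the paper.

```python
import re

# Assumed format: plain decimal digits with an optional fractional part.
_DECIMAL = re.compile(r"[0-9]+(\.[0-9]+)?")


def parse_dose(text, max_dose):
    """Return the entered dose as a float, or raise ValueError.

    Malformed input such as "1.2.3" or ".5." is blocked rather than
    being reinterpreted, and values above max_dose are refused.
    """
    if not _DECIMAL.fullmatch(text):
        raise ValueError(f"not a valid number: {text!r}")
    value = float(text)
    if value > max_dose:
        raise ValueError(f"{value} exceeds the limit of {max_dose}")
    return value


dose = parse_dose("2.5", max_dose=10.0)
print(dose)
```

The point of the design is that the caller must handle the error explicitly, so a keying slip cannot pass through as a plausible-looking value.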

CiteSeerX

Crossref

HRČAK - Portal of Croatian Scientific and Professional Journals

Hrčak - Portal of scientific journals of Croatia

Area and performance tradeoffs in floating-point divide and square-root implementations

Author: ANDERSON S. F.
ATKINS D.E.
BANNON P.
BERRY M.
BRIGGS W. S.
CASE B.
COL T.
CONTE T.M.
DAWALLU K.
ERCEGOVAC M.D.
FOWLER D. L.
FRANTZESKAKIS E. N.
GREENLY D.
GWENNAP L.
HENNESSY J.L.
HUNT D.
LANG T.
MATS BARA
Miriam Leeser
MOLER C.
PENG V.
Peter Soderquist
QUILLAN S. E.
SARMA D. D.
STEARNS C.C.
TAYLOR G.S.
WHITE S.W.
Publication venue: 'Association for Computing Machinery (ACM)'

Crossref

Area And Performance Tradeoffs In Floating-Point Divide And Square Root Implementations

Author: Miriam Leeser
Peter Soderquist

Peter Soderquist (School of Electrical Engineering, Cornell University, Ithaca, New York 14853) and Miriam Leeser (Dept. of Electrical and Computer Engineering, Northeastern University, Boston, Massachusetts 02115).

Abstract: Floating-point divide and square root operations are essential to many scientific and engineering applications, and are required in all computer systems that support the IEEE floating-point standard. Yet many current microprocessors provide only weak support for these operations. Th..

CiteSeerX

Acceleration for the many, not the few

Author: Woodruff Jackson
Publication venue: The University of Edinburgh
Publication date: 06/08/2024

Although specialized hardware accelerators promise orders-of-magnitude performance gains, their uptake has been limited by how challenging they are to program. Hardware accelerators present challenges programmers are not used to, exposing details of the hardware that are often hidden and requiring new programming styles to use them effectively. Existing programming models often involve learning complex and hardware-specific APIs, using Domain Specific Languages (DSLs), or programming in customized assembly languages. These programming models for hardware accelerators present a significant challenge to uptake: a steep, unforgiving, and untransferable learning curve. However, programming hardware accelerators using traditional programming models presents its own challenge: mapping code not written with hardware accelerators in mind to accelerators with restricted behaviour. This thesis presents these challenges in the context of the acceleration equation, and it presents solutions to them in three different contexts: for regular expression accelerators, for API-programmable accelerators (with Fourier Transforms as a key case study) and for heterogeneous coarse-grained reconfigurable arrays (CGRAs). This thesis shows that automatically morphing software written in traditional manners to fit hardware accelerators is possible with no programmer effort and that huge potential speedups are available.

Edinburgh Research Archive

Investigating ray tracing algorithms and data structures in the context of visibility.

Author: RAVI KAMMAJE
Publication date: 01/01/2009

Ray tracing is a popular rendering method with built-in visibility determination. However, the computational costs are significant. To reduce them, there has been extensive research leading to innovative data structures and algorithms that optimally utilize both object and image coherence. Investigating these from a visibility determination context without considering further optical effects is the main motivation of the research. Three methods - one structure and two coherent tree traversal algorithms - are discussed. While the structure aims to increase coherence, the algorithms aim to optimise utilization of coherence provided by ray tracing structures (kd-trees, octrees). RBSP trees - Restricted Binary Space Partitioning Trees - build upon the research in ray tracing with kd-trees. A higher degree of freedom for split plane selection increases object coherence, implying a reduction in the number of node traversals and triangle intersections for most scenes. Consequently, reduced ray casting times are observed for scenes with predominantly non-axis-aligned triangles. Coherent Rendering is a rendering method that shows improved complexity, but at an absolute performance that is much slower than packet ray tracing. However, since it led to the creation of the Row Tracing algorithm, it is described briefly. Row Tracing can be considered as an adaptation of Coherent Rendering, scanline rendering or packet ray tracing. One row of the image is considered and its pixels are determined. Similar to Coherent Rendering, an adapted version of Hierarchical Occlusion Maps is used to identify and skip occluded nodes. To maximize utilisation of coherence, the method is extended so that several adjacent rows are traversed through the tree. The two versions of Row Tracing demonstrate excellent performance, exceeding that of packet ray tracing. Further, it is shown that for larger models (2 million+ triangles), Row Tracing and Packet Row Tracing significantly outperform Z-buffer based methods (OpenGL). Row Tracing shows scalability over scene sizes, leading to a rendering method that has fast rendering times for both large and small models. In addition, it has excellent parallelisation properties, allowing utilisation of multiple cores with ease. Thus, the Row Tracing and Packet Row Tracing algorithms can be considered as the significant contributions of the Ph.D. These data structures and algorithms demonstrate that ray tracing data structures and adaptations of ray tracing algorithms exhibit excellent potential in a visibility context.

Cronfa at Swansea University