Computer arithmetic is a field that encompasses the definition and standardization of arithmetic systems for computers. The field also deals with issues of hardware and software implementations and their subsequent testing and verification. Many practitioners of the field also focus on the art and science of using computer arithmetic to carry out scientific and engineering computations. Computer arithmetic is therefore an interdisciplinary field that draws upon mathematics, computer science and electrical engineering. Advances in this field range from the highly theoretical (for instance, new exotic number systems) to the highly practical (for instance, new floating-point units for microprocessors).
Computer arithmetic has been an active field since the advent of computers. As in many long-lived fields, its focus evolves along with broader technology advances and trends. Prominent recent technology themes include massive parallelism, heterogeneous computing, power efficiency and human-computer interaction. These trends spawn many activities, such as the design of special-purpose arithmetic units and accelerators for high performance and low energy consumption; the development of algorithms that yield reproducible floating-point results independent of a specific parallel or distributed execution schedule; and the definition, implementation and algorithms of human-friendly decimal arithmetic. Indeed, the field of computer arithmetic is as active as ever!
Since 1969, the IEEE Symposium on Computer Arithmetic has been the premier international event for computer arithmetic research. The most recent edition, the 21st (the conference has been held every two years since 1981), took place in Austin in April 2013. After the conference, an open call for papers (for extended versions of the conference papers and for new papers) was released for this special section on computer arithmetic. A total of 44 full manuscripts were submitted. The papers were reviewed by 67 experts, with each paper receiving at least three reviews. Two rounds of reviews led to the selection of just six papers for this special section. Three papers consider special-purpose hardware arithmetic units, ranging from special instruction implementation to special application domains. One paper considers the issue of numerical reproducibility in interval arithmetic. Two papers consider different aspects of decimal arithmetic.
Linear algebra is a widely applicable building block for scientific and engineering computations. Among previously proposed linear algebra accelerators, a common approach is to make them highly efficient on matrix-matrix product and let the general-purpose engine handle most of the algorithmic details. The paper "Algorithm, Architecture, and Floating-Point Unit Codesign of a Matrix Factorization Accelerator" by Ardavan Pedram, Andreas Gerstlauer and Robert A. van de Geijn proposes a different approach. The accelerator is built upon an enhanced instruction set and control flow logic to handle a good portion of several important factorization algorithms: Cholesky, LU, and QR. A strong case is made that this is a moderate increase of an accelerator's complexity that results in a significant increase in performance per watt.
Finding the maximum number in an unsorted list of binary numbers is an important task in many applications, such as bioinformatics, video processing, and sorting networks. The paper "Fast and Efficient Circuit Topologies for Finding the Maximum of n k-bit Numbers" by Bilgiday Yuce, H. Fatih Ugurdag, Sezer Gören, and Gunhan Dundar provides a detailed survey of existing Maximum Finder topologies and proposes a number of new parallel topologies offering greater efficiency than previous work in timing (latency), area, and energy consumption.
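To make the notion of a maximum-finder topology concrete, the following is a minimal software sketch of one generic arrangement, a binary comparator tree, in which each level halves the number of candidates. It is an illustration only and is not taken from the paper; the function name `tree_max` and the returned level count are assumptions introduced here.

```python
def tree_max(values):
    """Find the maximum with a binary comparator tree (illustrative sketch).

    Returns (maximum, levels), where `levels` counts the comparator stages,
    a rough software stand-in for the latency of such a tree in hardware.
    """
    if not values:
        raise ValueError("tree_max() requires at least one value")
    level = list(values)
    levels = 0
    while len(level) > 1:
        # Compare neighbouring pairs; in hardware these comparisons run in parallel.
        next_level = [max(level[i], level[i + 1])
                      for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            # An unpaired last element passes through to the next stage.
            next_level.append(level[-1])
        level = next_level
        levels += 1
    return level[0], levels


if __name__ == "__main__":
    print(tree_max([3, 9, 2, 7, 5]))
```

For n inputs the tree needs about log2(n) stages, which is why tree-like arrangements are a natural baseline when latency matters.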
For visual applications, GPUs provide high degrees of parallelism for numerical calculations. To accommodate the increasing number of arithmetic units in GPUs, each individual operation needs to be energy efficient, so that the whole unit can stay within its allotted energy budget. The paper "Energy-Efficient Pixel-Arithmetic" by Nam Sung Kim, Syed Gilani, and Michael Schulte proposes energy-efficient hardware support for multiplications for the special, but surprisingly common, operand types of power-of-two values or sums of power-of-two values. For other types of operands, approximate calculations are proposed that trade numerical accuracy for energy savings. The paper describes and evaluates an implementation of the proposed architecture and compares it to a standard implementation.
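The reason such operands are cheap can be seen in a short integer sketch: multiplying by a power of two is a shift, and multiplying by a sum of powers of two is a few shifts followed by additions. This is only an illustration of the arithmetic identity, not the hardware design proposed in the paper; the function name and the exponent-list interface are assumptions made here.

```python
def multiply_by_powers_of_two(x, exponents):
    """Multiply integer x by sum(2**e for e in exponents) using shifts and adds.

    Each term x << e equals x * 2**e, so no general multiplier is needed.
    """
    result = 0
    for e in exponents:
        if e < 0:
            raise ValueError("exponents must be non-negative in this integer sketch")
        result += x << e
    return result


if __name__ == "__main__":
    # 7 * 8, with 8 = 2**3: a single shift.
    print(multiply_by_powers_of_two(7, [3]))
    # 7 * 10, with 10 = 2**3 + 2**1: two shifts and one addition.
    print(multiply_by_powers_of_two(7, [3, 1]))
```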
Binary-to-decimal conversion is a crucial functionality needed for any reasonable human-computer interaction. The classical algorithm that converts a binary integer to decimal uses repeated division. This algorithm is slow in general and can be prohibitively so in the context of arbitrary-precision arithmetic. The paper "Division Free Binary-to-Decimal Conversion" by Cyril Bouvier and Paul Zimmermann introduces several algorithms that are purely multiplication based. Their complexities range from quadratic to subquadratic. Correctness proofs together with rigorous complexity analysis are provided.
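For reference, the classical repeated-division method mentioned above can be sketched as follows. This is the textbook baseline, not one of the division-free algorithms from the paper; the function name is an assumption made here. Each step divides the remaining value by 10, so for an arbitrary-precision input with n digits the method performs n divisions on long operands.

```python
def to_decimal_by_division(n):
    """Convert a non-negative integer to its decimal string by repeated division."""
    if n < 0:
        raise ValueError("this sketch handles non-negative integers only")
    if n == 0:
        return "0"
    digits = []
    while n > 0:
        # The remainder is the next least-significant decimal digit.
        n, remainder = divmod(n, 10)
        digits.append(chr(ord("0") + remainder))
    # Digits were produced least-significant first, so reverse them.
    return "".join(reversed(digits))


if __name__ == "__main__":
    print(to_decimal_by_division(0b1111101000))
    print(to_decimal_by_division(2**100))
```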
The hardware implementation of decimal fixed-point and floating-point arithmetic has regained importance in recent years for financial and accounting applications. The paper "Fast Radix-10 Multiplication Using Redundant BCD Codes" by Alvaro Vazquez, Elisardo Antelo and Javier D. Bruguera introduces a general Binary-Coded Decimal (BCD) redundant representation which includes the overloaded BCD and BCD representations as special cases. The proposed representation is applied to the design of BCD parallel multipliers. Thanks to the redundant BCD representation, binary carry-save adder trees can be used to reduce the latency of the partial product reduction, and hence of the entire multiplier.
The main way to speed up numerical computations is the exploitation of parallelism. The broad availability of processors with multiple cores facilitates the scaling of the performance of numerical algorithms in this way, but also raises the issue of the reproducibility of the numerical results when running the algorithms on different platforms. The paper "Numerical Reproducibility and Parallel Computations: Issues for Interval Algorithms" by Nathalie Revol and Philippe Théveny addresses the specific issues of reproducibility for interval computations on parallel platforms. This paper categorizes the difficulties and explains various compensating strategies for achieving numerical reproducibility for parallel interval computations.
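The root of the reproducibility problem is that floating-point addition is not associative, so the way a sum is split across workers can change the result. The sketch below illustrates this with plain double-precision numbers rather than intervals; it is not the method of the paper, and `chunked_sum`, its `chunks` parameter and the test data are assumptions chosen for illustration. An explicit accumulation loop is used so that the behavior does not depend on how a library summation routine is implemented.

```python
def sequential_sum(values):
    """Add values left to right, rounding after every addition."""
    total = 0.0
    for v in values:
        total += v
    return total


def chunked_sum(values, chunks):
    """Mimic a parallel reduction: sum each chunk, then sum the partial results."""
    size = -(-len(values) // chunks)  # ceiling division
    partials = [sequential_sum(values[i:i + size])
                for i in range(0, len(values), size)]
    return sequential_sum(partials)


if __name__ == "__main__":
    data = [1e16, 1.0, -1e16, 1.0]  # the exact sum is 2.0
    print(chunked_sum(data, 1))  # one worker
    print(chunked_sum(data, 2))  # two workers
```

With this data the two schedules return different values, and neither equals the exact sum of 2.0, because the small terms are absorbed by rounding at different points.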
On behalf of every reader of our published papers, the guest editors would like to thank all the authors who submitted papers and labored to respond to reviewers' suggestions, all the anonymous reviewers who evaluated submissions and suggested improvements, the Editor-in-Chief, Professor Albert Zomaya, and the entire staff of IEEE Transactions on Computers who oversaw the whole process.
Alberto Nannarelli
Peter-Michael Seidel
Ping Tak Peter Tang
Guest Editors
Alberto Nannarelli graduated in electrical engineering from the University of Roma "La Sapienza," Italy, in 1988.
Ping Tak Peter Tang received the BS degree in computer science from the University of Hawaii, Manoa, in 1980 and the PhD degree in mathematics from the University of California, Berkeley, in 1987. He is a senior principal engineer at Intel Corporation. After working as a research scientist at several U.S. Department of Energy laboratories, he transitioned to industry and has been working at Intel Corporation since 1999, except for a three-year period at D. E. Shaw Research where he helped design the supercomputer Anton 2. His interests include computer arithmetic, numerical analysis and scientific computing. He is a member of the IEEE.
