The Electronic Controls Used in a Search For Fractional Charges in Mercury Drops

Author: Abrams Peter C.
Joyce David C.
Koburn K. R.
Walters William
Young Betty A.
Publication venue: Scholar Commons
Publication date: 01/01/1982

At San Francisco State University, we have developed an Automatic Millikan Device (AMD) for measuring the charge on small drops of mercury. The device uses a standard atomic physics laboratory Millikan chamber, a piezoelectric-driven ink-jet glass dropper, and a laser-photomultiplier system for tracking the motion of the drop. This paper describes the electronic control and error detection system used with the AMD. Signals from this system are sent to a microprocessor which controls the experiment. To this date (Dec 7, 1981), we have measured 175 micrograms of Hg and found no fractional charges in 1.05 x 10^20 nucleons.
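
The nucleon count quoted in the abstract can be cross-checked from the measured mass. The sketch below assumes each nucleon weighs about one atomic mass unit, so one gram holds roughly Avogadro's number of nucleons; the function name is hypothetical and not taken from the paper.

```python
# Rough cross-check: nucleons contained in a given mass of material.
# Assumption: one nucleon has a mass of about 1 u, so 1 gram holds
# approximately Avogadro's number of nucleons.
AVOGADRO = 6.02214076e23  # nucleons per gram under the assumption above


def nucleon_count(mass_grams):
    """Return the approximate number of nucleons in mass_grams of matter."""
    return mass_grams * AVOGADRO


if __name__ == "__main__":
    # 175 micrograms of mercury, as reported in the abstract
    print(f"{nucleon_count(175e-6):.3g}")  # about 1.05e+20
```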

Scholar Commons - Santa Clara University

How to square floats accurately and efficiently on the ST231 integer processor

Author: Jeannerod Claude-Pierre
Jourdan-Lu Jingyan
Monat Christophe
Revy Guillaume
Publication venue: HAL CCSD
Publication date: 19/11/2010

We consider the problem of computing IEEE floating-point squares by means of integer arithmetic. We show how the specific properties of squaring can be exploited in order to design and implement algorithms that have much lower latency than those for general multiplication, while still guaranteeing correct rounding. Our algorithm descriptions are parameterized by the floating-point format, aim at high instruction-level parallelism (ILP) exposure, and cover all rounding modes. We show further that their C implementation for the binary32 format yields efficient codes for targets like the ST231 VLIW integer processor from STMicroelectronics, with a latency at least 1.75x smaller than that of general multiplication in the same context
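
As a rough illustration of squaring a binary32 value with integer arithmetic only, the sketch below squares the 24-bit significand exactly and rounds to nearest, ties to even. It is a naive baseline under stated assumptions, not the authors' low-latency algorithm: it handles only normal inputs with normal results and a single rounding mode, and all function names are hypothetical.

```python
import struct


def float_to_bits(x):
    """Return the binary32 encoding of x as an unsigned 32-bit integer."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def bits_to_float(bits):
    """Decode an unsigned 32-bit integer as a binary32 value."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def square_binary32(bits):
    """Square a normal binary32 number using integer operations only.

    Rounds to nearest, ties to even. Zeros, subnormals, infinities, NaNs
    and results outside the normal range are out of scope for this sketch.
    """
    exponent = (bits >> 23) & 0xFF
    if exponent == 0 or exponent == 0xFF:
        raise ValueError("only normal inputs are handled")
    # 24-bit significand with the implicit leading 1 made explicit.
    significand = (bits & 0x7FFFFF) | 0x800000
    # Exact product, always in [2**46, 2**48). The sign is dropped because
    # a square is never negative.
    product = significand * significand
    # Normalise: shift by one extra bit when the product reaches 2**47.
    shift = 24 if product >> 47 else 23
    result_exp = 2 * exponent - 127 + (shift - 23)
    quotient = product >> shift
    remainder = product & ((1 << shift) - 1)
    half = 1 << (shift - 1)
    # Round to nearest, ties to even.
    if remainder > half or (remainder == half and quotient & 1):
        quotient += 1
        if quotient == 1 << 24:  # rounding carried into a new leading bit
            quotient >>= 1
            result_exp += 1
    if not 1 <= result_exp <= 254:
        raise ValueError("result is not a normal binary32 number")
    return (result_exp << 23) | (quotient & 0x7FFFFF)
```

For example, `bits_to_float(square_binary32(float_to_bits(1.5)))` evaluates to 2.25. Since the sign bit never matters and the operand appears twice, a squarer has fewer cases to handle than a general multiplier, which is the kind of property the abstract refers to.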

HAL-ENS-LYON

INRIA a CCSD electronic archive server

Hal-Diderot

Truncated Binary Multipliers with minimum Mean Square Error: analytical characterization, circuit implementation and applications

Author: Garofalo Valeria
Publication date: 30/11/2009

In the wireless multimedia world, DSP systems are ubiquitous. DSP algorithms are computationally intensive and test the limits of battery life in portable devices such as cell phones, hearing aids, MP3 players, digital video recorders and so on. Multiplication and squaring are the main operations in many signal processing algorithms (filtering, convolution, FFT, DCT, Euclidean distance, etc.), hence efficient parallel multipliers are desirable. A full-width digital n x n bit multiplier computes the 2n-bit output as a weighted sum of partial products. A multiplier whose output is represented on n bits is useful, for example, in DSP datapaths that save the output in the same n-bit registers as the input. Note that truncated multipliers are useful not only for DSP but also for digital, computationally intensive ASICs where the bit widths at the output of the arithmetic blocks are chosen on the basis of system-related accuracy issues. Hence 2n bits of precision at the multiplier output are very often more than required. A truncated multiplier is an n x n multiplier with an n-bit output. Since in a truncated multiplier the n least-significant bits of the full-width product are discarded, some of the partial products are removed and replaced by a suitable compensation function, to trade off accuracy against hardware cost. Several techniques have been proposed in the literature following this basic idea. The difference between the various circuits lies in the choice and implementation of the compensation circuit. The correction techniques proposed in the literature are obtained through exhaustive search. This means that the results are only available for small values of n and that the proposed approaches cannot be extended to greater bit widths. Furthermore, an analytical characterization of the error is not possible. In this dissertation an innovative solution for the design and characterization of truncated multipliers is presented.
The proposed circuits are based on the analytical calculation of the error of the truncated multiplier. This approach yields a description of a multiplier characterized by a minimum mean square error, which gives a fast and low-power VLSI implementation. Furthermore, the analytical approach yields a closed-form expression of the mean square error and maximum absolute error for the proposed truncated multipliers. In this way, a priori knowledge of the output error is available. The errors are known for every bit width of the multiplier, and it is also possible to decide, for a given bit width, which correction circuit has to be used in order to obtain a certain error. This analytical relation between the error and the parameters of the hardware implementation is extremely important for the digital designer, since it is now possible to select the suitable implementation as a function of the desired accuracy. The proposed truncated multipliers outperform previously proposed truncated multipliers, since they provide lower error, lower power dissipation, lower area occupation and a higher working frequency. The circuits are also easily implemented and allow an automatic HDL description as a function of bit width and desired error. The complete description of the errors for the truncated multipliers allows the use of these circuits as building blocks for more complex systems. It will be shown how the proposed multiplier can be used to design low-area FIR filters and an efficient PI temperature controller.
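
The truncation idea described above can be modelled in software. The sketch below is a behavioural model under stated assumptions, not the dissertation's minimum mean square error circuit: it keeps the partial-product columns from `n - k` upward (`k` guard columns, a hypothetical parameter), adds a constant compensation term, and measures the mean square error by brute force for small `n`. The mean of the dropped bits is shown as one simple candidate for the compensation constant.

```python
def truncated_multiply(a, b, n, k, comp=0):
    """Approximate the top n bits of the 2n-bit product a * b.

    Partial-product bits a_i * b_j in columns i + j < n - k are dropped,
    and the constant comp is added before the low n bits are discarded.
    """
    kept = 0
    for i in range(n):
        for j in range(n):
            if i + j >= n - k:
                kept += (((a >> i) & 1) & ((b >> j) & 1)) << (i + j)
    return (kept + comp) >> n


def mean_dropped(n, k):
    """Expected value of the dropped partial products for uniform inputs.

    Each bit a_i * b_j is 1 with probability 1/4, and column c (c < n)
    contains c + 1 such bits, each with weight 2**c.
    """
    return sum((c + 1) * 2**c for c in range(n - k)) / 4


def mean_square_error(n, k, comp):
    """Brute-force mean square error against the exact 2n-bit product."""
    total = 0
    for a in range(2**n):
        for b in range(2**n):
            err = (truncated_multiply(a, b, n, k, comp) << n) - a * b
            total += err * err
    return total / 4**n
```

Calling `mean_square_error(n, k, round(mean_dropped(n, k)))` for a few values of `k` shows how accuracy trades against the number of partial products kept. The brute-force loop is only practical for small `n`, which is the limitation of exhaustive search that the abstract contrasts with an analytical error model.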

Università degli Studi di Napoli Federico II Open Archive

Squared Law Algorithms: Theory and Applications.

Author: Rao Poornachandra Bellamkonda
Publication venue: LSU Digital Commons
Publication date: 01/01/1993

This dissertation focuses on a new approach for a hardware implementation of the cyclic convolution operation. The cyclic convolution operation is the core of several functions used in applications related to digital signal processing and error control. Since the operation is multiplication intensive and the cost of a multiplication operation is very high, most of the present research effort attempts to reduce the number of multiplications. Our approach, however, aims at obtaining an efficient implementation by relying on the properties of a special case of multiplication, namely, the squaring operation. Due to the properties exhibited by the squaring operation, a squarer unit has a lower hardware cost and a shorter time delay than a multiplication unit. This is true for both memory and non-memory based implementations. In this dissertation we have developed all the necessary theory required to express the cyclic convolution of two n-point sequences, where n is a power of 2, in terms of the elementary arithmetic operations add, square, and subtract. Our algorithms require fewer squaring operations than the multiplication operations required by a traditional implementation of the cyclic convolution operation, do not introduce any round-off errors, and place no restriction on word length. We then clearly demonstrate that our algorithms are also more hardware efficient for both memory and non-memory based implementations. Further, schemes to multiply two numbers based on the cyclic convolution operation are presented. Finally, efficient ways of computing the squaring operation when arithmetic is performed in modular rings are developed.
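
One way to see how a convolution can be built from add, square, and subtract alone is the classic quarter-square identity, x*y = ((x + y)^2 - (x - y)^2) / 4. The sketch below applies it term by term to an integer cyclic convolution. It is an illustration only: it uses two squarings per product, so it does not reproduce the dissertation's algorithms, which are stated to need fewer squarings than a traditional implementation needs multiplications. The function names are hypothetical.

```python
def square(v):
    """Stand-in for a hardware squarer unit."""
    return v * v


def multiply_via_squares(x, y):
    """Integer product from two squarings (quarter-square identity).

    (x + y)**2 - (x - y)**2 equals 4*x*y exactly, so the integer
    division by 4 introduces no round-off error.
    """
    return (square(x + y) - square(x - y)) // 4


def cyclic_convolution(x, h):
    """n-point cyclic convolution of two equal-length integer sequences."""
    n = len(x)
    if len(h) != n:
        raise ValueError("sequences must have the same length")
    return [
        sum(multiply_via_squares(x[j], h[(i - j) % n]) for j in range(n))
        for i in range(n)
    ]
```

For example, convolving `[1, 2, 3, 4]` with the shifted impulse `[0, 1, 0, 0]` rotates the sequence by one position, giving `[4, 1, 2, 3]`.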

Louisiana State University

Investigation into synchronization for partial response signals and the development of a clock recovery scheme for 49QPRS signals

Author: Jordaan Gert Daniel
Publication venue: Bloemfontein: Central University of Technology, Free State
Publication date: 01/01/1997

Data communication is used increasingly in modern society. It is against this background that research is conducted worldwide toward the improvement of existing, as well as the development of new, improved communication techniques. Correlative encoding of data before transmission is a very frequency-effective communication technique. The extent to which any communication technique is used, however, is dependent on a wide variety of factors. This study regarding the synchronisation of 49QPRS signals was undertaken with this in mind. Since digital signal processing (DSP) is used increasingly in modern communication systems, both a data transmitter and receiver were implemented by making use of this technique. Not only would this result in a system with all the desirable characteristics inherent to DSP, but, by making limited changes to the supporting software, the evaluation of a wide variety of alternatives became feasible. During the study a system making use of a pilot tone at one third of the carrier frequency was developed. The receiver recovers this signal by means of DSP techniques and its frequency is tripled. The phase of this recovered signal is cross-correlated every 650 µs with a locally generated signal of the correct frequency, and the phase of the locally generated signal is adjusted accordingly. It was found that the accuracy and stability of the locally generated signal were such that sufficient synchronisation was obtained in this manner. The quality of synchronisation is a function of the level of the pilot tone, and if this tone should decrease to below a certain value, unacceptably large phase adjustments have to be made. This results in a serious degradation of the spectral purity of the recovered signal. However, the system as described exhibits extremely good noise immunity. During the development of the clock frequency recovery system, a baseband filter with a unique frequency response was defined.
Making use of this, in conjunction with a limited amount of pre-processing, and an absolute value rectifier, recovery of the clock frequency becomes possible. In order to limit the amount of processing by the receiver, the baseband filter was implemented in its entirety in the transmitter. The recovered signal showed a moderate amount of amplitude variation, but an extremely stable synchronising signal could be derived from this. During the study both levels of synchronisation required by a hypothetical 49QPRS data communication system were therefore investigated fully and solutions found

Central University Of Technology Free State - LibraryCUT, South Africa

Comparison of logarithmic and floating-point number systems implemented on Xilinx Virtex-II field-programmable gate arrays

Author: Lee Barry Roland

The aim of this thesis is to compare the implementation of parameterisable LNS (logarithmic number system) and floating-point high dynamic range number systems on FPGA. The Virtex/Virtex-II range of FPGAs from Xilinx, which are the most popular FPGA technology, are used to implement the designs. The study focuses on using the low level primitives of the technology in an efficient way and so initially the design issues in implementing fixed-point operators are considered. The four basic operations of addition, multiplication, division and square root are considered. Carry-free adders, ripple-carry adders, parallel multipliers and digit recurrence division and square root are discussed. The floating-point operators use the word format and exceptions as described by IEEE Std 754. A dual-path adder implementation is described in detail, as are floating-point multiplier, divider and square root components. Results and comparisons with other works are given. The efficient implementation of function evaluation methods is considered next. An overview of current FPGA methods is given and a new piecewise polynomial implementation using the Taylor series is presented and compared with other designs in the literature. In the next section the LNS word format, accuracy and exceptions are described and two new LNS addition/subtraction function approximations are described. The algorithms for performing multiplication, division and powering in the LNS domain are also described and are compared with other designs in the open literature. Parameterisable conversion algorithms to convert to/from the fixed-point domain from/to the LNS and floating-point domain are described and implementation results given. In the next chapter MATLAB bit-true software models are given that have exactly the same functionality as the hardware models. The interfaces of the models are given and a serial communication system to perform low speed system tests is described.
A comparison of the LNS and floating-point number systems in terms of area and delay is given. Different functions implemented in LNS and floating-point arithmetic are also compared and conclusions are drawn. The results show that when the LNS is implemented with a characteristic of 6 bits or less it is superior to floating-point. However, for larger characteristic lengths the floating-point system is more efficient due to the delay and exponential area increase of the LNS addition operator. The LNS is beneficial for characteristics larger than 6 bits only for specialist applications that require a high proportion of division, multiplication, square root and powering operations and few additions.
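
The asymmetry the abstract describes, cheap multiplication but costly addition, can be seen in a small software model of an LNS. The sketch below is an assumption-laden illustration and not the thesis design: it stores only positive values as a fixed-point base-2 logarithm with a hypothetical `FRAC_BITS` fractional bits, and it evaluates the addition correction term with `math.log2`, where hardware would need a table or function approximation.

```python
import math

FRAC_BITS = 16            # assumed number of fractional bits in the log
SCALE = 1 << FRAC_BITS


def to_lns(x):
    """Encode a positive real as a fixed-point base-2 logarithm."""
    if x <= 0:
        raise ValueError("this sketch handles positive values only")
    return round(math.log2(x) * SCALE)


def from_lns(e):
    """Decode a fixed-point logarithm back to a real value."""
    return 2.0 ** (e / SCALE)


def lns_mul(a, b):
    """Multiplication is a single fixed-point addition."""
    return a + b


def lns_div(a, b):
    """Division is a single fixed-point subtraction."""
    return a - b


def lns_sqrt(a):
    """Square root halves the logarithm (an arithmetic right shift)."""
    return a >> 1


def lns_add(a, b):
    """Addition needs the nonlinear term log2(1 + 2**d) with d <= 0.

    In hardware this term is what must be approximated, which is the
    operator the abstract identifies as expensive in area and delay.
    """
    hi, lo = max(a, b), min(a, b)
    d = (lo - hi) / SCALE
    return hi + round(math.log2(1.0 + 2.0 ** d) * SCALE)
```

For example, `from_lns(lns_mul(to_lns(3.0), to_lns(5.0)))` returns a value close to 15, with a small error caused by quantising the logarithms.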

Online Research @ Cardiff

Energy-precision tradeoffs in the graphics pipeline

Author: Pool Jeff
Publication date: 01/05/2012

The energy consumption of a graphics processing unit (GPU) is an important factor in its design, whether for a server, desktop, or mobile device. Mobile products, such as smart phones, tablets, and laptop computers, rely on batteries to function; the less the demand for power is on these batteries, the longer they will last before needing to be recharged. GPUs used in servers and desktops, while not dependent on a battery for operation, are still limited by the efficiency of power supplies and heat dissipation techniques. In this dissertation, I propose to lower the energy consumption of GPUs by reducing the precision of floating-point arithmetic in the graphics pipeline and the data sent and stored on- and off-chip. The key idea behind this work is twofold: energy can be saved through a systematic and targeted reduction in the number of bits 1) computed and 2) communicated. Reducing the number of bits computed will necessarily reduce either the precision or range of a floating point number. I focus on saving energy by way of reducing precision, which can exploit the over-provisioning of bits in many stages of the graphics pipeline. Reducing the number of bits communicated takes several forms. First, I propose enhancements to existing compression schemes for off-chip buffers to save bandwidth. I also suggest a simple extension that exploits unused bits in reduced-precision data undergoing compression. Finally, I present techniques for saving energy in on-chip communication of reduced-precision data. By designing and simulating variable-precision arithmetic circuits with promising energy versus precision characteristics and tradeoffs, I have developed an energy model for GPUs. Using this model and my techniques, I have shown that significant savings (up to 70% in computation in the vertex and pixel shader stages) are possible by reducing the precision of the arithmetic. 
Further, my compression approaches have enabled improvements of 1.26x over past work, and a general-purpose compressor design has achieved bandwidth savings of 34%, 87%, and 65% for color, depth, and geometry data, respectively, which is competitive with past work. Lastly, an initial exploration in signal gating unused lines in on-chip buses has suggested savings of 13-48% for the tested applications' traffic from a multiprocessor's register file to its L1 cache
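
To give a feel for what reducing the number of computed bits means, the sketch below emulates a reduced-precision value by clearing the low significand bits of a binary32 number. This is a software illustration under stated assumptions, not the dissertation's variable-precision circuits or its energy model; the function name and the choice of truncation rather than rounding are hypothetical.

```python
import struct


def reduce_precision(x, mantissa_bits):
    """Return x as binary32 with only mantissa_bits fraction bits kept.

    The discarded low-order fraction bits are set to zero (truncation
    toward zero). The exponent range is left unchanged.
    """
    if not 0 <= mantissa_bits <= 23:
        raise ValueError("binary32 has 23 stored fraction bits")
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    mask = 0xFFFFFFFF & ~((1 << (23 - mantissa_bits)) - 1)
    return struct.unpack("<f", struct.pack("<I", bits & mask))[0]
```

For example, `reduce_precision(1.75, 1)` returns 1.5, because only the first fraction bit of 1.11 (binary) survives. The relative error introduced is bounded by 2 to the power of minus `mantissa_bits`, which is the kind of precision budget a pipeline stage would have to tolerate.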

Carolina Digital Repository