2,807 research outputs found
Hardware-software co-design of an iris recognition algorithm
This paper describes the implementation of an iris recognition algorithm based
on hardware-software co-design. The system architecture consists of a general-purpose 32-bit
microprocessor and several slave coprocessors that accelerate the most intensive
calculations. The whole iris recognition algorithm has been implemented on a low-cost
Spartan 3 FPGA, achieving a significant reduction in execution time when compared to a
conventional software-based application. Experimental results show that with a clock
speed of 40 MHz, an IrisCode is obtained in less than 523 ms from an image of 640x480
pixels, which is just 20% of the total time needed by a software solution running on the
same microprocessor embedded in the architecture. Peer Reviewed. Preprint
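The figures quoted in this abstract imply a few derived numbers that are easy to check. The short sketch below works only from the three values stated above (40 MHz, 523 ms, 20%); the function and variable names are illustrative, and because the abstract gives 523 ms as an upper bound, the results are bounds as well.

```python
# Figures quoted in the abstract.
CLOCK_HZ = 40e6               # 40 MHz system clock
CODESIGN_TIME_S = 0.523       # upper bound on the time to obtain an IrisCode
FRACTION_OF_SOFTWARE = 0.20   # co-design time as a share of the software-only time


def implied_figures(clock_hz, codesign_time_s, fraction):
    """Derive the software-only time, the speedup, and the cycle budget."""
    software_time_s = codesign_time_s / fraction
    speedup = software_time_s / codesign_time_s
    cycles = codesign_time_s * clock_hz
    return software_time_s, speedup, cycles


software_time_s, speedup, cycles = implied_figures(
    CLOCK_HZ, CODESIGN_TIME_S, FRACTION_OF_SOFTWARE
)
print(f"software-only time: about {software_time_s:.2f} s")
print(f"speedup: about {speedup:.1f}x")
print(f"cycle budget at 40 MHz: about {cycles / 1e6:.1f} million cycles")
```

In other words, the software-only solution would need roughly 2.6 s per image, the co-design is about five times faster, and the whole pipeline fits in roughly 21 million clock cycles.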
Testing Embedded Memories in Telecommunication Systems
Extensive system testing is mandatory nowadays to achieve high product quality. Telecommunication systems are particularly sensitive to this requirement: to remain competitive, manufacturers need to combine reduced costs, shorter life cycles, advanced technologies, and high quality. Moreover, strict reliability constraints usually impose very low fault latencies and a high degree of fault detection for both permanent and transient faults. This article analyzes major problems related to testing complex telecommunication systems, with particular emphasis on their memory modules, which are often critical from the reliability point of view. In particular, advanced BIST-based solutions are analyzed, and two significant industrial case studies are presented.
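The abstract does not say which test algorithm the BIST solutions use. As a rough illustration of what a memory self-test does, the sketch below runs March C-, a widely used march test, against a small software model of a bit-wide memory. The `FaultyMemory` class and its stuck-at fault injection are hypothetical fixtures written for this example and are not taken from the article.

```python
class FaultyMemory:
    """Bit-wide memory model with optional stuck-at faults (hypothetical fixture)."""

    def __init__(self, size, stuck_at=None):
        self.cells = [0] * size
        self.stuck_at = stuck_at or {}  # address -> value the cell is stuck at

    def write(self, addr, value):
        # A stuck cell ignores the written value.
        self.cells[addr] = self.stuck_at.get(addr, value)

    def read(self, addr):
        return self.stuck_at.get(addr, self.cells[addr])


def march_c_minus(mem, size):
    """Run March C- and return the sorted addresses whose reads mismatched."""
    failures = set()
    up = range(size)
    down = range(size - 1, -1, -1)

    def element(order, expect, write):
        # One march element: optionally read-and-check, then optionally write.
        for addr in order:
            if expect is not None and mem.read(addr) != expect:
                failures.add(addr)
            if write is not None:
                mem.write(addr, write)

    element(up, None, 0)    # any order: w0
    element(up, 0, 1)       # ascending: r0, w1
    element(up, 1, 0)       # ascending: r1, w0
    element(down, 0, 1)     # descending: r0, w1
    element(down, 1, 0)     # descending: r1, w0
    element(up, 0, None)    # any order: r0
    return sorted(failures)


good = FaultyMemory(16)
bad = FaultyMemory(16, stuck_at={5: 1})
print(march_c_minus(good, 16))
print(march_c_minus(bad, 16))
```

A hardware BIST controller would implement the same sequence as a small state machine with an address counter and a comparator; the software model only shows the order of reads and writes.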
A configurable vector processor for accelerating speech coding algorithms
The growing demand for voice-over-packet (VoIP) services and multimedia-rich
applications has made the efficient, real-time implementation of
low-bit-rate speech coders on embedded VLSI platforms increasingly important. Such speech coders are
designed to substantially reduce bandwidth requirements, thus enabling dense multichannel
gateways in a small form factor. This, however, comes at a high computational cost
which mandates the use of very high performance embedded processors.
This thesis investigates the potential acceleration of two major ITU-T speech coding
algorithms, namely G.729A and G.723.1, through their efficient implementation on a
configurable extensible vector embedded CPU architecture. New scalar and vector ISAs
were introduced which resulted in up to 80% reduction in the dynamic instruction count
of both workloads. These instructions were subsequently encapsulated into a parametric,
hybrid SISD (scalar processor)–SIMD (vector) processor. This work presents the research
and implementation of the vector datapath of this vector coprocessor which is tightly-coupled
to a Sparc-V8 compliant CPU, the optimization and simulation methodologies
employed and the use of Electronic System Level (ESL) techniques to rapidly design
SIMD datapaths
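To see why vector instructions cut the dynamic instruction count, consider a multiply-accumulate loop of the kind that dominates speech coders. The toy model below is not the thesis's ISA: the per-iteration instruction mixes, the vector length, and the loop size are all assumptions chosen for illustration.

```python
import math

# Assumed instruction mixes (illustrative only, not the thesis's ISA).
SCALAR_INSTRS_PER_ELEMENT = 5  # 2 loads, 1 multiply-accumulate, 1 index update, 1 branch
VECTOR_INSTRS_PER_CHUNK = 5    # 2 vector loads, 1 vector MAC, 1 index update, 1 branch
REDUCTION_INSTRS = 1           # one instruction to sum the lanes after the loop


def scalar_count(n):
    """Dynamic instructions for a scalar multiply-accumulate loop over n samples."""
    return n * SCALAR_INSTRS_PER_ELEMENT


def vector_count(n, vector_length):
    """Dynamic instructions when each iteration handles vector_length samples."""
    chunks = math.ceil(n / vector_length)
    return chunks * VECTOR_INSTRS_PER_CHUNK + REDUCTION_INSTRS


def reduction(n, vector_length):
    """Fractional reduction in dynamic instruction count."""
    return 1 - vector_count(n, vector_length) / scalar_count(n)


n, vl = 40, 8  # assumed loop size and vector length
print(scalar_count(n), vector_count(n, vl), f"{reduction(n, vl):.0%}")
```

Under these assumptions the loop drops from 200 to 26 dynamic instructions. Real workloads also contain control code and scalar sections that do not vectorize, so whole-program reductions depend on how much of the execution time such loops account for.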
The development and implementation of a single-line intelligent digital telephone answering unit on a personal computer
Thesis. Commercial telephone answering machines are limited to some extent by
one or more of the following factors:
• limited facilities
• difficult to upgrade
• nonstandard telephone interfacing
• expensive
• lack of user-friendliness
• lack of dialogue and intelligence
The purpose of this study is to design an intelligent digital telephone
system which will overcome as many of the above-mentioned problems
as possible. The following features are proposed and will be discussed:
The use of a commonly available, but powerful, personal computer
processor and memory instead of the elementary and rigid processor and
magnetic tape storage units of the commercial telephone answering
machine. This allows the quick storage and retrieval of digitized
messages, each with its individual name, time and date stamp.
Using the personal computer's hardware and not duplicating the
processor and memory units allows a more cost-effective system
upgrade. Upgrades mainly consist of software changes and minor
hardware changes. This means that an upgrade does not entail a total
hardware redesign.
The standards prescribed for the local switching network and by
the Department of Post and Telecommunications apply to this design
and are applicable to licensing of the product.
It is evident that the cost of this project and design is kept minimal by
not duplicating expensive components like the microprocessor and the
memory units, although these are used in the design. In this respect
upgrades are software orientated to further limit the costs.
The personal computer is equipped with a display which allows the user
to make easy selections in order to execute the required instructions or to
obtain information by using the help functions. This real-time help
function eliminates the need for a user manual.
Dialogue between user and personal computer over the telephone
network offers a simple method of delivering information without the
need for any extra equipment such as modems, keyboards or display
units.
The software used on the personal computer is designed in such a way
that the system is intelligent and capable of decision making.
Communication from the public telephone network is possible by using
the telephone keypad and Dual Tone Multifrequency (DTMF) signalling
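The thesis abstract does not describe how keypad tones are decoded. One common software approach is the Goertzel algorithm, which measures signal power at each of the eight standard DTMF frequencies and picks the strongest row and column tone. The sketch below assumes an 8 kHz sample rate and a 50 ms analysis block; `detect_key` and `make_tone` are hypothetical helper names, and the code omits the silence and twist checks that a practical detector needs.

```python
import math

LOW_FREQS = (697, 770, 852, 941)        # DTMF row frequencies in Hz
HIGH_FREQS = (1209, 1336, 1477, 1633)   # DTMF column frequencies in Hz
KEYPAD = ("123A", "456B", "789C", "*0#D")


def goertzel_power(samples, freq, sample_rate):
    """Return the signal power near `freq` using the Goertzel recurrence."""
    coeff = 2.0 * math.cos(2.0 * math.pi * freq / sample_rate)
    s_prev = s_prev2 = 0.0
    for x in samples:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    return s_prev2 ** 2 + s_prev ** 2 - coeff * s_prev * s_prev2


def detect_key(samples, sample_rate=8000):
    """Pick the strongest row and column tone and map them to a key."""
    row = max(range(4), key=lambda i: goertzel_power(samples, LOW_FREQS[i], sample_rate))
    col = max(range(4), key=lambda i: goertzel_power(samples, HIGH_FREQS[i], sample_rate))
    return KEYPAD[row][col]


def make_tone(key, sample_rate=8000, n=400):
    """Synthesize n samples of the two-tone signal for `key` (test helper)."""
    for r, row in enumerate(KEYPAD):
        if key in row:
            f_lo, f_hi = LOW_FREQS[r], HIGH_FREQS[row.index(key)]
            return [
                math.sin(2 * math.pi * f_lo * t / sample_rate)
                + math.sin(2 * math.pi * f_hi * t / sample_rate)
                for t in range(n)
            ]
    raise ValueError(f"not a DTMF key: {key!r}")


decoded = "".join(detect_key(make_tone(k)) for k in "0123456789*#")
print(decoded)
```

Goertzel is attractive here because it evaluates only the eight frequencies of interest instead of a full spectrum, which keeps the per-sample cost low on a modest processor.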
Integrated Control of Microfluidics – Application in Fluid Routing, Sensor Synchronization, and Real-Time Feedback Control
Microfluidic applications range from combinatorial chemical synthesis to high-throughput screening, with platforms integrating analog perfusion components, digitally controlled microvalves, and a range of sensors that demand a variety of communication protocols. A comprehensive solution for microfluidic control has to support an arbitrary combination of microfluidic components and to meet the demand for an easy-to-operate system that arises from the growing community of unspecialized microfluidics users. It should also be an easy-to-modify and extendable platform that offers adequate computational resources, preferably without the need for a local computer terminal, for increased mobility. Here we describe several implementations of microfluidic control technologies and propose a microprocessor-based unit that unifies them. Integrated control can streamline the generation of the complex perfusion sequences required for sensor-integrated microfluidic platforms that demand iterative operation procedures such as calibration, sensing, data acquisition, and decision making. It also enables the implementation of intricate optimization protocols, which often require significant computational resources. System integration is an imperative developmental milestone for the field of microfluidics, both in terms of the scalability of increasingly complex platforms that still lack standardization, and the incorporation and adoption of emerging technologies in biomedical research. Here we describe a modular integration and synchronization of a complex multicomponent microfluidic platform
Vector processing-aware advanced clock-gating techniques for low-power fused multiply-add
The need for power efficiency is driving a rethink of design decisions in processor architectures. While vector processors succeeded in the high-performance market in the past, they need a retailoring for the mobile market that they are entering now. Floating-point (FP) fused multiply-add (FMA), being a functional unit with high power consumption, deserves special attention. Although clock gating is a well-known method to reduce switching power in synchronous designs, there are unexplored opportunities for its application to vector processors, especially when considering active operating mode. In this research, we comprehensively identify, propose, and evaluate the most suitable clock-gating techniques for vector FMA units (VFUs). These techniques ensure power savings without jeopardizing the timing. We evaluate the proposed techniques using both synthetic and “real-world” application-based benchmarking. Using vector masking and vector multilane-aware clock gating, we report power reductions of up to 52%, assuming an active VFU operating at peak performance. Among other findings, we observe that vector instruction-based clock-gating techniques achieve power savings for all vector FP instructions. Finally, when evaluating all techniques together, using “real-world” benchmarking, the power reductions are up to 80%. Additionally, in accordance with processor design trends, we perform this research in a fully parameterizable and automated fashion. The research leading to these results has received funding from the RoMoL ERC Advanced Grant GA 321253 and is supported in part by the European Union (FEDER funds) under contract TTIN2015-65316-P.
The work of I. Ratkovic was supported by a FPU research grant from the Spanish MECD. Peer Reviewed. Postprint (author's final draft)
The "Tiepstem" : an experimental Dutch keyboard-to-speech system for the speech impaired
An experimental Dutch keyboard-to-speech system has been developed to explore the possibilities and limitations of Dutch speech synthesis in a communication aid for the speech impaired. The system uses diphones and a formant synthesizer chip for speech synthesis. Input to the system is in pseudo-phonetic notation. Intonation contours using a declination line and various rises and falls are generated starting from an input consisting of punctuation and accent marks. The hardware design has resulted in a small, portable and battery-powered device. A short evaluation with users has been carried out, which has shown possibilities for such a device but has also indicated some problems with the current pseudo-phonetic input
Fast, area-efficient 32-bit LNS for computer arithmetic operations
PhD Thesis. The logarithmic number system has been proposed as an alternative to floating-point.
Multiplication, division and square-root operations are accomplished with fixed-point
arithmetic, but addition and subtraction are considerably more challenging.
Recent work has demonstrated that these operations too can be done with similar
speed and accuracy to their floating-point equivalents, but the necessary circuitry is
complex. In particular, it is dominated by the need for large lookup tables for the
storage of a non-linear function.
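The asymmetry described above follows directly from the representation. In an LNS a positive value x is stored as log2(x), so multiplication, division and square root become addition, subtraction and halving of the stored exponents, whereas addition and subtraction need the non-linear functions log2(1 + 2^d) and log2(1 - 2^d). The sketch below uses Python floats and positive operands only to show the identities; it is not the thesis's design, which would use fixed-point exponents, a sign bit, and table-based function evaluation. All function names are illustrative.

```python
import math


def to_lns(x):
    """Encode a positive real as its base-2 logarithm."""
    return math.log2(x)


def from_lns(e):
    """Decode an LNS exponent back to a real value."""
    return 2.0 ** e


def lns_mul(a, b):
    return a + b        # log2(x * y) = log2(x) + log2(y)


def lns_div(a, b):
    return a - b        # log2(x / y) = log2(x) - log2(y)


def lns_sqrt(a):
    return a / 2        # log2(sqrt(x)) = log2(x) / 2


def lns_add(a, b):
    """log2(x + y) = hi + log2(1 + 2**(lo - hi)); needs a non-linear function."""
    hi, lo = max(a, b), min(a, b)
    return hi + math.log2(1.0 + 2.0 ** (lo - hi))


def lns_sub(a, b):
    """log2(x - y) for x > y; log2(1 - 2**d) diverges as d approaches 0."""
    d = b - a
    return a + math.log2(1.0 - 2.0 ** d)


six, seven = to_lns(6.0), to_lns(7.0)
print(from_lns(lns_mul(six, seven)))
print(from_lns(lns_add(six, seven)))
print(from_lns(lns_sub(seven, six)))
```

The divergence of log2(1 - 2^d) as d approaches zero is what makes subtraction of nearly equal operands the hardest case to tabulate, and it is why hardware designs treat that region separately.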
This thesis describes the architectures required to implement a newly designed
approach for producing a fast and area-efficient 32-bit LNS arithmetic unit. The
designs are based on two different algorithms. First, a new co-transformation
procedure is introduced for subtractions in the singularity region; this
technique requires less total storage than the co-transformation
method of the previous LNS architecture. Second, an improvement to
an existing interpolation process is proposed that also reduces the total table size to an
extent that allows the tables to be synthesized easily in logic. Consequently, the total delays in the
system can be significantly reduced.
A comparison with the previous best LNS design and with
floating-point units shows that the new LNS architecture offers
significantly better speed while keeping its accuracy within the floating-point limit.
In addition, its implementation is more economical than the previous best LNS system
and almost equivalent to an existing floating-point arithmetic unit.
University Malaysia Perlis:
Ministry of Higher Education, Malaysia
Design of a portable microprocessor-based International Phonetic Alphabet (IPA) text-to-speech conversion device for use by the speech impaired
A portable microprocessor-based alternate communication device was designed and a prototype fabricated. This device allows speech impaired individuals, whose language skills remain intact, to input and edit an utterance of unrestricted vocabulary via a keyboard/LCD display system. The utterance, which is specified using the International Phonetic Alphabet (IPA), is then converted into an appropriate set of speech synthesizer parameters, using context sensitive rules. An interrupt driven system is used to pass each set of parameters, in order and at the appropriate time, to the synthesizer, thus generating an audible output. Tests conducted using the device in its present state have shown the utterance specification process to be slower than desired and the speech produced to be rather robotic in nature. Although understanding the speech generated is sometimes a problem, intelligibility improves with time as the listener becomes used to the synthesized speech and the programmer's ability to specify the utterance improves. As the current implementation does not include rules that vary the suprasegmental features, it is anticipated that the introduction of such rules will improve the quality and intelligibility of the speech output. Further suggestions for future development are provided, and focus on expediting the utterance specification process and improving the quality of the speech generated