
32 research outputs found

Exploiting primitive grouping constraints for noise robust automatic speech recognition : studies with simultaneous speech.

Author: Coy André
Publication venue: 'University of Sheffield Conference Proceedings'
Publication date: 01/01/2008

Significant strides have been made in the field of automatic speech recognition over the past three decades. However, the systems are not robust; their performance degrades in the presence of even moderate amounts of noise. This thesis presents an approach to developing a speech recognition system that takes inspiration from the approach of human speech recognition

White Rose E-theses Online

OpenGrey Repository

A Structured Design Methodology for High Performance VLSI Arrays

Publication date: 01/01/2012

Abstract: With the geometric growth of integrated circuit technology driven by transistor scaling, together with the system-on-chip design strategy, the complexity of integrated circuits has increased manifold. Achieving a short time to market with high reliability and performance is one of the most competitive challenges. Both custom and ASIC design methodologies have evolved over time to cope with this, but the high manual labor in custom design and the statistic design in ASIC are still causes for concern. This work proposes a new circuit design strategy that focuses mostly on arrayed structures such as TLB, RF, Cache, IPCAM, etc. It reduces the manual effort to a great extent and makes the design regular and repetitive while still achieving high performance. The method proposes making the complete design a custom schematic but using standard cells. This requires adding some custom cells to the already exhaustive library to optimize the design for performance. Once the schematic is finalized, the designer places these standard cells in a spreadsheet, placing the cells in the critical paths close together. A Perl script then generates a Cadence Encounter-compatible placement file. The design is then routed in Encounter. Since the designer is the best judge of the circuit architecture, placement by the designer allows the most optimal design to be achieved. Several designs, such as IPCAM, issue logic, TLB, RF and Cache, were carried out and their performance was compared against the fully custom and ASIC flows. The TLB, RF and Cache were part of the HEMES microprocessor. Dissertation/Thesis, Ph.D. Electrical Engineering 201

ASU Digital Repository

Calcul sur architecture non fiable

Author: Yangyang Tang
Publication venue: HAL CCSD
Publication date: 29/01/2013

Although circuits could in theory be fabricated error-free at a huge cost using worst-case design methodologies, in practice they remain susceptible to transient faults caused by effects such as radiation and temperature sensitivity. By contrast, an error-resilient design relieves the manufacturing process of the variability issue and so saves material cost. Since variability and transient upsets are worsening as new fabrication processes emerge and circuit dimensions shrink, the need for robust design has become pressing. This thesis addresses the issue of designing circuits on unreliable chips. The main contributions are fourfold. First, a fault-tolerant method based on fast error correction and low-cost redundancy is presented. Second, we introduce a judicious two-dimensional criterion to estimate the reliability and the hardware efficiency of a circuit. Third, a general-purpose model offers low-redundancy error resilience for contemporary logic systems as well as future nanoelectronic architectures. Finally, a decoder that tolerates internal transient faults is designed in this work.

Thèses en Ligne

HAL-Université de Bretagne Occidentale

Time Series Analysis and Classification with State-Space Models for Industrial Processes and the Life Sciences

Author: Jäger Mark Christoph
Publication date: 01/01/2007

In this thesis, the use of state-space models for the analysis and classification of time series data, gathered from industrial manufacturing processes and the life sciences, is investigated. To overcome hitherto unsolved problems in both application domains, the temporal behavior of the data is captured using state-space models. Industrial laser welding processes are monitored with a high-speed camera, and the appearance of unusual events in the image sequences correlates with errors on the produced part. Thus, novel classification frameworks are developed to robustly detect these unusual events with a small false positive rate. For classifier learning, class labels are by default only available for the complete image sequence, since scanning the sequences for anomalies is expensive. The first framework combines appearance-based features and state-space models for unusual event detection in image sequences. For the first time, ideas adapted from face recognition are used for the automatic dimension reduction of images recorded from laser welding processes. The state-space model is trained incrementally and can learn from erroneous sequences without the need to manually label the position of the error event within sequences. The limitation to weakly labeled data helps to reduce the labeling effort. In addition, a second framework for the object-based detection of sputter events in laser welding processes is developed. The framework successfully combines, for the first time, temporal change detection, object tracking and trajectory classification for the detection of weak sputter events. This is the first time that object tracking is successfully applied to automatic sputter detection. For the application in the life sciences, the improvement and further development of data analysis methods for Single Molecule Fluorescence Spectroscopy (SMFS) is considered. SMFS experiments allow biochemical processes to be studied on a single-molecule basis. The single molecule is excited with a laser, and the photons subsequently emitted by fluorescence contain important information about conformational changes of the molecule. Advanced statistical analysis techniques are necessary to infer state changes of the molecule from changes in the photon emissions. By using state-space models, it is possible to extract information from recorded photon streams which would be lost with traditional analysis techniques

Heidelberger Dokumentenserver

Enhanced coding, clock recovery and detection for a magnetic credit card

Author: Smith Daniel Felix
Publication venue: 'University of Plymouth'
Publication date: 01/01/1998

Merged with duplicate record 10026.1/2299 on 03.04.2017 by CS (TIS). This thesis describes the background, investigation and construction of a system for storing data on the magnetic stripe of a standard three-inch plastic credit card. Investigation shows that the information storage limit within a 3.375 in by 0.11 in rectangle of the stripe is bounded to about 20 kBytes. Practical issues limit the data storage to around 300 Bytes with a low raw error rate: a four-fold density increase over the standard. Removal of the timing jitter (that is probably caused by the magnetic medium particle size) would increase the limit to 1500 Bytes with no other system changes. This is enough capacity for either a small digital passport photograph or a digitized signature, making it possible to remove printed versions from the surface of the card. To achieve even these modest gains has required the development of a new variable rate code that is more resilient to timing errors than other codes in its efficiency class. The tabulation of the effects of timing errors required the construction of a new code metric and self-recovering decoders. In addition, a new method of timing recovery, based on signal 'snatches', has been invented to increase the rapidity with which a Bayesian decoder can track the changing velocity of a hand-swiped card. The timing recovery and Bayesian detector have been integrated into one computation (software) unit that is self-contained and can decode a general class of (d, k) constrained codes. Additionally, the unit has a signal truncation mechanism to alleviate some of the effects of non-linear distortion that are present when a magnetic card is read with a magneto-resistive magnetic sensor that has been driven beyond its bias magnetization. While the storage density is low and the total storage capacity is meagre in comparison with contemporary storage devices, the high density card may still have a niche role to play in society. Nevertheless, in the face of the Smart card its long term outlook is uncertain. However, several areas of coding and detection under short-duration extreme conditions have brought new decoding methods to light. The scope of these methods is not limited just to the credit card

Plymouth Electronic Archive and Research Library

Cross-Layer Optimization for Power-Efficient and Robust Digital Circuits and Systems

Author: Huang Yanxiang
Publication date: 15/09/2017

With the increasing demand for digital services, performance and power efficiency become vital requirements for digital circuits and systems. However, the enabling CMOS technology scaling has been facing significant challenges from device uncertainties, such as process, voltage, and temperature variations. To ensure system reliability, worst-case corner assumptions are usually made at each design level. However, the over-pessimistic worst-case margin leads to unnecessary power waste and performance loss as high as 2.2x. Since optimizations are traditionally confined to each specific level, those safe margins can hardly be properly exploited. To tackle the challenge, this Ph.D. thesis therefore advises performing a cross-layer optimization for digital signal processing circuits and systems, to achieve a global balance of power consumption and output quality. To conclude, the traditional over-pessimistic worst-case approach leads to huge power waste. In contrast, the adaptive voltage scaling approach saves power (25% for the CORDIC application) by providing a just-needed supply voltage. The power saving is maximized (46% for CORDIC) when a more aggressive voltage over-scaling scheme is applied. The sparsely occurring circuit errors produced by aggressive voltage over-scaling are mitigated by higher-level error-resilient designs. For functions like FFT and CORDIC, smart error mitigation schemes were proposed to enhance reliability (against soft errors and timing errors, respectively). Applications like Massive MIMO systems are robust against lower-level errors, thanks to the intrinsically redundant antennas. This property makes it feasible to embrace digital hardware that trades quality for power savings. Comment: 190 pages

arXiv.org e-Print Archive

Lirias

Processing of Ex-Situ Acquired Signals from Magnetic Disks

Author: McAvoy Patrick Charles
Publication date: 19/11/2008

The ubiquity and high performance of hard disk drives for nonvolatile digital data storage cannot be denied. As the magnetic recording industry continues to develop new techniques for increasing storage density and reducing cost per bit, diagnostic and forensic tools for characterizing and interpreting the magnetic patterns recorded onto disk drive media become increasingly important. Therefore, this dissertation presents developments to the uniquely suitable spin-stand-based method of imaging magnetization patterns on media extracted from commercial hard disk drives. The emphasis of the presented research is placed on the following three areas: microscopy enhancement techniques for longitudinal magnetic recording media, "drive-independent" characterization and reconstruction of disk data, and the exploration of spin-stand microscopy in the novel context of perpendicular magnetic recording. First, it is known that, while the spin-stand microscopy technique offers high-speed and massive scale imaging capabilities, the images obtained are corrupted by distortion due to the non-local sensing or finite spatial resolution of the imaging sensor. Two techniques for mitigating this distortion, one based on characterizing the head by means of its linear response function, and a new method based on spatial Hilbert transforms, are described and demonstrated. Furthermore, a two-dimensional extension of the Hilbert transform in the context of magnetic recording is derived based on physical arguments and its application to spin-stand imaging is demonstrated. Second, although magnetic media imaging is interesting in its own right, an extension of this capability is the identification of commercial hard disk drive write channels and the subsequent reconstruction of the data written to the associated disks in a "drive-independent" manner on the spin-stand. 
For fundamental and practical reasons, a multilayered encoding process is performed on digital data before it is written to the disk; the presented work details the theoretical and experimental results obtained in characterizing and reversing these codes. Finally, because perpendicular recording technology has recently come on the market in consumer disk drives, the spin-stand microscopy technique is extended to imaging the media employing this new mode of recording. In particular, the novel aspects of perpendicular recording are discussed and their impact on spin-stand microscopy is demonstrated

Digital Repository at the University of Maryland

Reconfigurable Antenna Systems: Platform implementation and low-power matters

Author: Ciccia Simone
Publication venue: Politecnico di Torino
Publication date: 01/01/2018

Antennas are a necessary and often critical component of all wireless systems, and they share the ever-increasing complexity of those systems and the challenges of present and emerging trends. 5G, massive low-orbit satellite architectures (e.g. OneWeb), Industry 4.0, Internet of Things (IoT), satcom on-the-move, Advanced Driver Assistance Systems (ADAS) and Autonomous Vehicles all call for highly flexible systems, and antenna reconfigurability is an enabling part of these advances. The terminal segment is particularly crucial in this sense, encompassing both very compact and low-profile antennas, all with various adaptability/reconfigurability requirements. This thesis work has dealt with hardware implementation issues of Radio Frequency (RF) antenna reconfigurability, and in particular with low-power General Purpose Platforms (GPP); the work has encompassed Software Defined Radio (SDR) implementation, as well as embedded low-power platforms (in particular on the STM32 Nucleo family of micro-controllers). The hardware-software platform work has been complemented with the design and fabrication of reconfigurable antennas in standard technology, and the resulting systems have been tested. The selected antenna technology was an antenna array with a continuously steerable beam, controlled by voltage-driven phase shifting circuits. Applications notably included a Wireless Sensor Network (WSN) deployed in the Italian scientific mission in Antarctica, a traffic-monitoring case study (EU H2020 project), and an innovative Global Navigation Satellite Systems (GNSS) antenna concept (patent application submitted). The SDR implementation focused on a low-cost and low-power open-source software-defined radio platform with IEEE 802.11 a/g/p wireless communication capability. In a second embodiment, the flexibility of the SDR paradigm has been traded off to avoid the power consumption associated with the relevant operating system. The application field of reconfigurable antennas is, however, not limited to better management of energy consumption. The analysis has also been extended to satellite positioning applications. A novel beamforming method has been presented, demonstrating improvements in the quality of signals received from satellites. For those who deal with positioning algorithms, this advancement helps improve the precision of the estimated position

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

PORTO Publications Open Repository TOrino

On the Design of Future Communication Systems with Coded Transport, Storage, and Computing

Author: Cabrera Guerrero Juan Alberto
Publication date: 04/07/2022

Communication systems are experiencing a fundamental change. There are novel applications that require increased performance from these systems, not only in throughput but also in latency, reliability, security, and heterogeneity support. To fulfil the requirements, future systems understand communication not only as the transport of bits but also as their storage, processing, and relation. In these systems, every network node has transport, storage, and computing resources that the network operator and its users can exploit through virtualisation and softwarisation of the resources. It is within this context that this work presents its results. We propose distributed coded approaches to improve communication systems. Our results improve the reliability and latency performance of the transport of information. They also increase the reliability, flexibility, and throughput of storage applications. Furthermore, based on the lesson that coded approaches improve the transport and storage performance of communication systems, we propose a distributed coded approach for the computing of novel in-network applications such as the steering and control of cyber-physical systems. Our proposed approach can increase the reliability and latency performance of distributed in-network computing in the presence of errors, erasures, and attackers

Technische Universität Dresden: Qucosa

Energy-Efficient Recurrent Neural Network Accelerators for Real-Time Inference

Author: Gao Chang
Publication date: 01/01/2022

Over the past decade, Deep Learning (DL) and Deep Neural Networks (DNN) have undergone rapid development. They are now widely applied to various applications and have profoundly changed human life. As an essential element of DNNs, Recurrent Neural Networks (RNN) are helpful in processing time-sequential data and are widely used in applications such as speech recognition and machine translation. RNNs are difficult to compute because of their massive arithmetic operations and large memory footprint. RNN inference workloads used to be executed on conventional general-purpose processors including Central Processing Units (CPU) and Graphics Processing Units (GPU); however, these have hardware blocks that are unnecessary for RNN computation, such as branch predictors and caching systems, making them suboptimal for RNN processing. To accelerate RNN computations and outperform conventional processors, previous work focused on optimization methods in both software and hardware. On the software side, previous works mainly used model compression to reduce the memory footprint and the arithmetic operations of RNNs. On the hardware side, previous works also designed domain-specific hardware accelerators based on Field Programmable Gate Arrays (FPGA) or Application Specific Integrated Circuits (ASIC) with customized hardware pipelines optimized for efficient processing of RNNs. By following this software-hardware co-design strategy, previous works achieved at least 10X speedup over conventional processors. Many previous works focused on achieving high throughput with a large batch of input streams. However, in real-time applications, such as gaming Artificial Intelligence (AI) and dynamical system control, low latency is more critical. Moreover, there is a trend of offloading neural network workloads to edge devices to provide a better user experience and privacy protection. Edge devices, such as mobile phones and wearable devices, are usually resource-constrained with a tight power budget. They require RNN hardware that is more energy-efficient to realize both low-latency inference and long battery life. Brain neurons have sparsity in both the spatial domain and the time domain. Inspired by this property, previous work mainly explored model compression to induce spatial sparsity in RNNs. The delta network algorithm instead induces temporal sparsity in RNNs and, as shown by previous works, can save over 10X arithmetic operations in RNNs. In this work, we have proposed customized hardware accelerators to exploit temporal sparsity in Gated Recurrent Unit (GRU)-RNNs and Long Short-Term Memory (LSTM)-RNNs to achieve energy-efficient real-time RNN inference. First, we have proposed DeltaRNN, the first-ever RNN accelerator to exploit temporal sparsity in GRU-RNNs. DeltaRNN has achieved 1.2 TOp/s effective throughput with a batch size of 1, which is 15X higher than related works. Second, we have designed EdgeDRNN to accelerate GRU-RNN edge inference. Compared to DeltaRNN, EdgeDRNN does not rely on on-chip memory to store RNN weights and focuses on reducing off-chip Dynamic Random Access Memory (DRAM) data traffic using a more scalable architecture. EdgeDRNN has realized real-time inference of large GRU-RNNs with submillisecond latency and only 2.3 W wall plug power consumption, achieving 4X higher energy efficiency than commercial edge AI platforms like NVIDIA Jetson Nano. Third, we have used DeltaRNN to realize the first-ever continuous speech recognition system with the Dynamic Audio Sensor (DAS) as the front-end. The DAS is a neuromorphic event-driven sensor that produces a stream of asynchronous events instead of audio data sampled at a fixed sample rate. We have also showcased how an RNN accelerator can be integrated with an event-driven sensor on the same chip to realize ultra-low-power Keyword Spotting (KWS) on the extreme edge. Fourth, we have used EdgeDRNN to control a powered robotic prosthesis using an RNN controller to replace a conventional proportional–derivative (PD) controller. EdgeDRNN has achieved 21 μs latency when running the RNN controller and could maintain stable control of the prosthesis. We have used DeltaRNN and EdgeDRNN in these applications to demonstrate their value in solving real-world problems. Finally, we have applied the delta network algorithm to LSTM-RNNs and have combined it with a customized structured pruning method, called Column-Balanced Targeted Dropout (CBTD), to induce spatio-temporal sparsity in LSTM-RNNs. Then, we have proposed another FPGA-based accelerator called Spartus, the first RNN accelerator that exploits spatio-temporal sparsity. Spartus achieved 9.4 TOp/s effective throughput with a batch size of 1, the highest among present FPGA-based RNN accelerators with a power budget around 10 W. Spartus can complete the inference of an LSTM layer having 5 million parameters within 1 μs

ZORA