27 research outputs found
Implementation of arithmetic primitives using truly deep submicron technology (TDST)
The invention of the transistor in 1947 at Bell Laboratories revolutionised the electronics industry and created a powerful platform for emergence of new industries. The quest to increase the number of devices per chip over the last four decades has resulted in rapid transition from Small-Scale-Integration (SSI) and Large-Scale-lntegration (LSI), through to the Very-Large-Scale-Integration (VLSI) technologies, incorporating approximately 10 to 100 million devices per chip. The next phase in this evolution is the Ultra-Large-Scale-Integration (ULSI) aiming to realise new application domains currently not accessible to CMOS technology. Although technology is continuously evolving to produce smaller systems with minimised power dissipation, the IC industry is facing major challenges due to constraints on power density (W/cm2) and high dynamic (operating) and static (standby) power dissipation. Mobile multimedia communication and optical based technologies have rapidly become a significant area of research and development challenging a variety of technological fronts. The future emergence or 4G (4th Generation) wireless communications networks is further driving this development, requiring increasing levels of media rich content. The processing requirements for capture, conversion, compression, decompression, enhancement and display of higher quality multimedia, place heavy demands on current ULSI systems. This is also apparent for mobile applications and intelligent optical networks where silicon chip area and power dissipation become primary considerations. In addition to the requirements for very low power, compact size and real-time processing, the rapidly evolving nature of telecommunication networks means that flexible soft programmable systems capable of adaptation to support a number of different standards and/or roles become highly desirable. In order to fully realise the capabilities promised by the 4G and supporting intelligent networks, new enabling technologies arc needed to facilitate the next generation of personal communications devices. Most of the current solutions to meet these challenges are based on various implementations of conventional architectures. For decades, silicon has been the main platform of computing, however it is slow, bulky, runs too hot, and is too expensive. Thus, new approaches to architectures, driving multimedia and future telecommunications systems, are needed in order to extend the life cycle of silicon technology. The emergence of Truly Deep Submicron Technology (TDST) and related 3-D interconnection technologies have provided potential alternatives from conventional architectures to 3-D system solutions, through integration of IDST, Vertical Software Mapping and Intelligent Interconnect Technology (IIT). The concept of Soft-Chip Technology (SCT) entails integration of Soft• Processing Circuits with Soft-Configurable Circuits . This concept can effectively manipulate hardware primitives through vertical integration of control and data. Thus the notion of 3-D Soft-Chip emerges as a new design algorithm for content-rich multimedia, telecommunication and intelligent networking system applications. 3•D architectures (design algorithms used suitable for 3-D soft-chip technology), are driven by three factors. The first is development of new device technology (TDST) that can support new architectures with complexities of 100M to 1000M devices. The second is development of advanced wafer bonding techniques such as Indium bump and the more futuristic optical interconnects for 3-D soft-chip mapping. The third is related to improving the performance of silicon CMOS systems as devices continue to scale down in dimensions. One of the fundamental building blocks of any computer system is the arithmetic component. Optimum performance of the system is determined by the efficiency of each individual component, as well as the network as a whole entity. Development of configurable arithmetic primitives is the fundamental focus in 3-D architecture design where functionality can be implemented through soft configurable hardware elements. Therefore the ability to improve the performance capability of a system is of crucial importance for a successful design. Important factors that predict the efficiency of such arithmetic components are: • The propagation delay of the circuit, caused by the gate, diffusion and wire capacitances within !he circuit, minimised through transistor sizing. and • Power dissipation, which is generally based on node transition activity. [2] Although optimum performance of 3-D soft-chip systems is primarily established by the choice of basic primitives such as adders and multipliers, the interconnecting network also has significant degree of influence on !he efficiency of the system. 3-D superposition of devices can decrease interconnect delays by up to 60% compared to a similar planar architecture. This research is based on development and implementation of configurable arithmetic primitives, suitable to the 3-D architecture, and has these foci: • To develop a variety of arithmetic components such as adders and multipliers with particular emphasis on minimum area and compatible with 3-D soft-chip design paradigm. • To explore implementation of configurable distributed primitives for arithmetic processing. This entails optimisation of basic primitives, and using them as part of array processing. In this research the detailed designs of configurable arithmetic primitives are implemented using TDST O.l3µm (130nm) technology, utilising CAD software such as Mentor Graphics and Cadence in Custom design mode, carrying through design, simulation and verification steps
Optimized hardware implementations of cryptography algorithms for resource-constraint IoT devices and high-speed applications
The advent of technologies, including the Internet and smartphones, has made people’s lives easier. Nowadays, people get used to digital applications for e-business, communicating with others, and sending or receiving sensitive messages. Sending secure data across the private network or the Internet is an open concern for every person. Cryptography plays an important role in privacy, security, and confidentiality against adversaries. Public-key cryptography (PKC) is one of the cryptography techniques that provides security over a large network, such as the Internet of Things (IoT).
The classical PKCs, such as Elliptic Curve Cryptography (ECC) and Rivest-Shamir-Adleman (RSA), are based on the hardness of certain number theoretic problems. According to Shor’s algorithm, these algorithms can be solved very efficiently on a quantum computer, and cryptography algorithms will be insecure and weak as quantum computers increase in number. Based on NIST, Lattice-based cryptography (LBC) is one of the accepted quantum-resistant public-key cryptography. Different variants of LBC include Learning With Error (LWE), Ring Learning With Error (Ring-LWE), Binary Ring Learning with Error (Ring-Bin LWE), and etc. AES is also one of the secure cryptography algorithm that has been widely used in different applications and platforms. Also, AES-256 is secure against quantum attack.
It is very important to design a crypto-system based on the need and application. In general, each network has three different layers; cloud, edge, and end-node. The cloud and edge layer require to have a high-speed crypto-system, as it is used in high-traffic application to encrypt and decrypt data. Unfortunately, most of the end-node devices are resource-constraint and do not have enough area for security guard. Providing end-to-end security is vital for every network. To mitigate this issue, designing and implementing a lightweight cryto-system for resource-constraint devices is necessary.
In this thesis, a high-throughput FPGA implementation of AES algorithm for high-traffic edge applications is introduced. To achieve this goal, some part of the algorithm has been modified to balance the latency. Inner and outer pipelining techniques and loop-unrolling have been employed. The proposed high-speed implementation of AES achieves a throughput of 79.7Gbps, FPGA efficiency of 13.3 Mbps/slice, and frequency of 622.4MHz. Compared to the state-of-the-art work, the proposed design has improved data throughput by 8.02% and FPGA-Eff by 22.63%.
Moreover, a lightweight architecture of AES for resource-constraint devices is designed and implemented on FPGA and ASIC. Each module of the architecture is specified in which occupied less area; and some units are shared among different phases. To reduce the power consumption clock gating technique is applied. Application-specific integrated circuit (ASIC) implementation results show a respective improvement in the area over the previous similar works from 35% to 2.4%. Based on the results and NIST report, the proposed design is a suitable crypto-system for tiny devices and can be supplied by low-power devices.
Furthermore, two lightweight crypto-systems based on Binary Ring-LWE are presented for IoT end-node devices. For one of them, a novel column-based multiplication is introduced. To execute the column-based multiplication only one register is employed to store the intermediate results. The multiplication unit for the other Binary Ring-LWE design is optimized in which the multiplication is executed in less clock cycles. Moreover, to increase the security for end-node devices, the fault resiliency architecture has been designed and applied to the architecture of Binary Ring-LWE. Based on the implementation results and NIST report, the proposed Binary Ring-LWE designs is a suitable crypto-system form resource-constraint devices
Simulation and implementation of novel deep learning hardware architectures for resource constrained devices
Corey Lammie designed mixed signal memristive-complementary metal–oxide–semiconductor (CMOS) and field programmable gate arrays (FPGA) hardware architectures, which were used to reduce the power and resource requirements of Deep Learning (DL) systems; both during inference and training. Disruptive design methodologies, such as those explored in this thesis, can be used to facilitate the design of next-generation DL systems
Recommended from our members
Design Techniques for High-Performance SAR A/D Converters
The design of electronics needs to account for the non-ideal characteristics of the device technologies used to realize practical circuits. This is particularly important in mixed analog-digital design since the best device technologies are very different for digital compared to analog circuits. One solution for this problem is to use a calibration correction approach to remove the errors introduced by devices, but this adds complexity and power dissipation, as well as reducing operation speed, and so must be optimised. This thesis addresses such an approach to improve the performance of certain types of analog-to-digital converter (ADC) used in advanced telecommunications, where speed, accuracy and power dissipation currently limit applications. The thesis specifically focuses on the design of compensation circuits for use in successive approximation register (SAR) ADCs.
ADCs are crucial building blocks in communication systems, in general, and for mobile networks, in particular. The recently launched fifth generation of mobile networks (5G) has required new ADC circuit techniques to meet the higher speed and lower power dissipation requirements for 5G technology. The SAR has become one of the most favoured architectures for designing high-performance ADCs, but the successive nature of the circuit operation makes it difficult to reach ∼GS/s sampling rates at reasonable power consumption.
Here, two calibration techniques for high-performance SAR ADCs are presented. The first uses an on-chip stochastic-based mismatch calibration technique that is able to accurately compute and compensate for the mismatch of a capacitive DAC in a SAR ADC. The stochastic nature of the proposed calibration method enables determination of the mismatch of the CAPDAC with a resolution much better than that of the DAC. This allows the unit capacitor to scale down to as low as 280aF for a 9-bit DAC. Since the CAP-DAC causes a large part of the overall dynamic power consumption and directly determines both the sizes of the driving and sampling switches and the size of the input capacitive load of the ADC and the kT/C noise power, a small CAP-DAC helps the power efficiency. To validate the proposed calibration idea, a 10-bit asynchronous SAR ADC was fabricated in 28-nm CMOS. Measurement results show that the proposed stochastic calibration improves the ADC’s SFDR and SNDR by 14.9 dB, 11.5 dB, respectively. After calibration, the fabricated SAR ADC achieves an ENOB of 9.14 bit at a sampling rate of 85 MS/s, resulting in a Walden FoM of 10.9 fJ/c-s.
The second calibration technique is a timing-skew calibration for a time-interleaved (TI) SAR ADC that calibrates/computes the inter-channel timing and offset mismatch simultaneously. Simulation results show the effectiveness of this calibration method. When used together, the proposed mismatch calibration technique and the timing-skew
calibration technique enables a TI SAR ADC to be designed that can achieve a sampling rate of ∼GS/s with 10-bit resolution and a power consumption as low as ∼10mW; specifications that satisfy the requirements of 5G technology
Design of clock and data recovery circuits for energy-efficient short-reach optical transceivers
Nowadays, the increasing demand for cloud based computing and social media
services mandates higher throughput (at least 56 Gb/s per data lane with 400
Gb/s total capacity 1) for short reach optical links (with the reach typically less
than 2 km) inside data centres. The immediate consequences are the huge
and power hungry data centers. To address these issues the intra-data-center
connectivity by means of optical links needs continuous upgrading.
In recent years, the trend in the industry has shifted toward the use of more
complex modulation formats like PAM4 due to its spectral efficiency over the
traditional NRZ. Another advantage is the reduced number of channels count
which is more cost-effective considering the required area and the I/O density.
However employing PAM4 results in more complex transceivers circuitry due
to the presence of multilevel transitions and reduced noise budget. In addition,
providing higher speed while accommodating the stringent requirements
of higher density and energy efficiency (< 5 pJ/bit), makes the design of the
optical links more challenging and requires innovative design techniques both
at the system and circuit level.
This work presents the design of a Clock and Data Recovery Circuit (CDR) as
one of the key building blocks for the transceiver modules used in such fibreoptic
links. Capable of working with PAM4 signalling format, the new proposed
CDR architecture targets data rates of 50−56 Gb/s while achieving the required
energy efficiency (< 5 pJ/bit).
At the system level, the design proposes a new PAM4 PD which provides a better
trade-off in terms of bandwidth and systematic jitter generation in the CDR. By
using a digital loop controller (DLC), the CDR gains considerable area reduction
with flexibility to adjust the loop dynamics.
At the circuit level it focuses on applying different circuit techniques to mitigate
the circuit imperfections. It presents a wideband analog front end (AFE),
suitable for a 56 Gb/s, 28-Gbaud PAM-4 signal, by using an 8x interleaved, master/
slave based sample and hold circuit. In addition, the AFE is equipped with
a calibration scheme which corrects the errors associated with the sampling
channels’ offset voltage and gain mismatches. The presented digital to phase
converter (DPC) features a modified phase interpolator (PI), a new quadrature
phase corrector (QPC) and multi-phase output with de-skewing capabilities.The DPC (as a standalone block) and the CDR (as the main focus of this work)
were fabricated in 65-nm CMOS technology. Based on the measurements, the
DPC achieves DNL/INL of 0.7/6 LSB respectively while consuming 40.5 mW
power from 1.05 V supply. Although the CDR was not fully operational with
the PAM4 input, the results from 25-Gbaud PAM2 (NRZ) test setup were used
to estimate the performance. Under this scenario, the 1-UI JTOL bandwidth
was measured to be 2 MHz with BER threshold of 10−4. The chip consumes 236
mW of power while operating on 1 − 1.2 V supply range achieving an energyefficiency
of 4.27 pJ/bit
Radio Communications
In the last decades the restless evolution of information and communication technologies (ICT) brought to a deep transformation of our habits. The growth of the Internet and the advances in hardware and software implementations modified our way to communicate and to share information. In this book, an overview of the major issues faced today by researchers in the field of radio communications is given through 35 high quality chapters written by specialists working in universities and research centers all over the world. Various aspects will be deeply discussed: channel modeling, beamforming, multiple antennas, cooperative networks, opportunistic scheduling, advanced admission control, handover management, systems performance assessment, routing issues in mobility conditions, localization, web security. Advanced techniques for the radio resource management will be discussed both in single and multiple radio technologies; either in infrastructure, mesh or ad hoc networks
Topical Workshop on Electronics for Particle Physics
The purpose of the workshop was to present results and original concepts for electronics research and development relevant to particle physics experiments as well as accelerator and beam instrumentation at future facilities; to review the status of electronics for the LHC experiments; to identify and encourage common efforts for the development of electronics; and to promote information exchange and collaboration in the relevant engineering and physics communities
Digital System Design - Use of Microcontroller
Embedded systems are today, widely deployed in just about every piece of machinery from toasters to spacecraft. Embedded system designers face many challenges. They are asked to produce increasingly complex systems using the latest technologies, but these technologies are changing faster than ever. They are asked to produce better quality designs with a shorter time-to-market. They are asked to implement increasingly complex functionality but more importantly to satisfy numerous other constraints. To achieve the current goals of design, the designer must be aware with such design constraints and more importantly, the factors that have a direct effect on them.One of the challenges facing embedded system designers is the selection of the optimum processor for the application in hand; single-purpose, general-purpose or application specific. Microcontrollers are one member of the family of the application specific processors.The book concentrates on the use of microcontroller as the embedded system?s processor, and how to use it in many embedded system applications. The book covers both the hardware and software aspects needed to design using microcontroller.The book is ideal for undergraduate students and also the engineers that are working in the field of digital system design.Contents• Preface;• Process design metrics;• A systems approach to digital system design;• Introduction to microcontrollers and microprocessors;• Instructions and Instruction sets;• Machine language and assembly language;• System memory; Timers, counters and watchdog timer;• Interfacing to local devices / peripherals;• Analogue data and the analogue I/O subsystem;• Multiprocessor communications;• Serial Communications and Network-based interfaces
Proceedings of the 5th International Workshop on Reconfigurable Communication-centric Systems on Chip 2010 - ReCoSoC\u2710 - May 17-19, 2010 Karlsruhe, Germany. (KIT Scientific Reports ; 7551)
ReCoSoC is intended to be a periodic annual meeting to expose and discuss gathered expertise as well as state of the art research around SoC related topics through plenary invited papers and posters. The workshop aims to provide a prospective view of tomorrow\u27s challenges in the multibillion transistor era, taking into account the emerging techniques and architectures exploring the synergy between flexible on-chip communication and system reconfigurability