109 research outputs found
Efficient DSP and Circuit Architectures for Massive MIMO: State-of-the-Art and Future Directions
Massive MIMO is a compelling wireless access concept that relies on the use
of an excess number of base-station antennas, relative to the number of active
terminals. This technology is a main component of 5G New Radio (NR) and
addresses all important requirements of future wireless standards: a great
capacity increase, the support of many simultaneous users, and improvement in
energy efficiency. Massive MIMO requires the simultaneous processing of signals
from many antenna chains, and computational operations on large matrices. The
complexity of the digital processing has been viewed as a fundamental obstacle
to the feasibility of Massive MIMO in the past. Recent advances on
system-algorithm-hardware co-design have led to extremely energy-efficient
implementations. These exploit opportunities in deeply-scaled silicon
technologies and perform partly distributed processing to cope with the
bottlenecks encountered in the interconnection of many signals. For example,
prototype ASIC implementations have demonstrated zero-forcing precoding in real
time at a 55 mW power consumption (20 MHz bandwidth, 128 antennas, multiplexing
of 8 terminals). Coarse and even error-prone digital processing in the antenna
paths permits a reduction of consumption with a factor of 2 to 5. This article
summarizes the fundamental technical contributions to efficient digital signal
processing for Massive MIMO. The opportunities and constraints on operating on
low-complexity RF and analog hardware chains are clarified. It illustrates how
terminals can benefit from improved energy efficiency. The status of technology
and real-life prototypes discussed. Open challenges and directions for future
research are suggested.Comment: submitted to IEEE transactions on signal processin
X-SRAM: Enabling In-Memory Boolean Computations in CMOS Static Random Access Memories
Silicon-based Static Random Access Memories (SRAM) and digital Boolean logic
have been the workhorse of the state-of-art computing platforms. Despite
tremendous strides in scaling the ubiquitous metal-oxide-semiconductor
transistor, the underlying \textit{von-Neumann} computing architecture has
remained unchanged. The limited throughput and energy-efficiency of the
state-of-art computing systems, to a large extent, results from the well-known
\textit{von-Neumann bottleneck}. The energy and throughput inefficiency of the
von-Neumann machines have been accentuated in recent times due to the present
emphasis on data-intensive applications like artificial intelligence, machine
learning \textit{etc}. A possible approach towards mitigating the overhead
associated with the von-Neumann bottleneck is to enable \textit{in-memory}
Boolean computations. In this manuscript, we present an augmented version of
the conventional SRAM bit-cells, called \textit{the X-SRAM}, with the ability
to perform in-memory, vector Boolean computations, in addition to the usual
memory storage operations. We propose at least six different schemes for
enabling in-memory vector computations including NAND, NOR, IMP (implication),
XOR logic gates with respect to different bit-cell topologies the 8T cell
and the 8T Differential cell. In addition, we also present a novel
\textit{`read-compute-store'} scheme, wherein the computed Boolean function can
be directly stored in the memory without the need of latching the data and
carrying out a subsequent write operation. The feasibility of the proposed
schemes has been verified using predictive transistor models and Monte-Carlo
variation analysis.Comment: This article has been accepted in a future issue of IEEE Transactions
on Circuits and Systems-I: Regular Paper
Deep learning for cancer tumor classification using transfer learning and feature concatenation
Deep convolutional neural networks (CNNs) represent one of the state-of-the-art methods for image classification in a variety of fields. Because the number of training dataset images in biomedical image classification is limited, transfer learning with CNNs is frequently applied. Breast cancer is one of most common types of cancer that causes death in women. Early detection and treatment of breast cancer are vital for improving survival rates. In this paper, we propose a deep neural network framework based on the transfer learning concept for detecting and classifying breast cancer histopathology images. In the proposed framework, we extract features from images using three pre-trained CNN architectures: VGG-16, ResNet50, and Inception-v3, and concatenate their extracted features, and then feed them into a fully connected (FC) layer to classify benign and malignant tumor cells in the histopathology images of the breast cancer. In comparison to the other CNN architectures that use a single CNN and many conventional classification methods, the proposed framework outperformed all other deep learning architectures and achieved an average accuracy of 98.76%
Design and Evaluation of Approximate Logarithmic Multipliers for Low Power Error-Tolerant Applications
In this work, the designs of both non-iterative and iterative approximate logarithmic multipliers (LMs) are studied to further reduce power consumption and improve performance. Non-iterative approximate LMs (ALMs) that use three inexact mantissa adders, are presented. The proposed iterative approximate logarithmic multipliers (IALMs) use a set-one adder in both mantissa adders during an iteration; they also use lower-part-or adders and approximate mirror adders for the final addition. Error analysis and simulation results are also provided; it is found that the proposed approximate LMs with an appropriate number of inexact bits achieve a higher accuracy and lower power consumption than conventional LMs using exact units. Compared with conventional LMs with exact units, the normalized mean error distance (NMED) of 16-bit approximate LMs is decreased by up to 18% and the power-delay product (PDP) has a reduction of up to 37%. The proposed approximate LMs are also compared with previous approximate multipliers; it is found that the proposed approximate LMs are best suitable for applications allowing larger errors, but requiring lower energy consumption and low power. Approximate Booth multipliers fit applications with less stringent power requirements, but also requiring smaller errors. Case studies for error-tolerant computing applications are provided
A Survey of hardware protection of design data for integrated circuits and intellectual properties
International audienceThis paper reviews the current situation regarding design protection in the microelectronics industry. Over the past ten years, the designers of integrated circuits and intellectual properties have faced increasing threats including counterfeiting, reverse-engineering and theft. This is now a critical issue for the microelectronics industry, mainly for fabless designers and intellectual properties designers. Coupled with increasing pressure to decrease the cost and increase the performance of integrated circuits, the design of a secure, efficient, lightweight protection scheme for design data is a serious challenge for the hardware security community. However, several published works propose different ways to protect design data including functional locking, hardware obfuscation, and IC/IP identification. This paper presents a survey of academic research on the protection of design data. It concludes with the need to design an efficient protection scheme based on several properties
- …