2,754 research outputs found

    AI/ML Algorithms and Applications in VLSI Design and Technology

    An evident challenge ahead for the integrated circuit (IC) industry in the nanometer regime is the investigation and development of methods that can reduce the design complexity ensuing from growing process variations and curtail the turnaround time of chip manufacturing. Conventional methodologies employed for such tasks are largely manual and therefore time-consuming and resource-intensive. In contrast, the learning strategies of artificial intelligence (AI) provide numerous promising automated approaches for handling complex and data-intensive tasks in very-large-scale integration (VLSI) design and testing. Employing AI and machine learning (ML) algorithms in VLSI design and manufacturing reduces the time and effort needed to understand and process data within and across different abstraction levels via automated learning algorithms, which in turn improves IC yield and reduces manufacturing turnaround time. This paper thoroughly reviews the AI/ML-based automated approaches introduced to date for VLSI design and manufacturing, and discusses the scope of future AI/ML applications at various abstraction levels to revolutionize the field of VLSI design, aiming for high-speed, highly intelligent, and efficient implementations.
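
    As a purely illustrative sketch of the kind of supervised-learning flow such surveys cover (not an example from the paper; the feature names and data below are invented placeholders), a regression model can be trained to predict a downstream design metric, such as post-route timing slack, from early, cheaply available design features:

        # Illustrative only: toy supervised flow of the kind surveyed, e.g.
        # predicting post-route timing slack from early per-net features.
        # Features and ground truth are synthetic placeholders.
        import numpy as np
        from sklearn.ensemble import RandomForestRegressor
        from sklearn.model_selection import train_test_split

        rng = np.random.default_rng(0)
        # Hypothetical features: fanout, wirelength estimate, congestion score.
        X = rng.random((1000, 3))
        # Synthetic target: slack degrades with wirelength and congestion.
        y = 1.0 - 0.6 * X[:, 1] - 0.3 * X[:, 2] + 0.05 * rng.standard_normal(1000)

        X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
        model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X_tr, y_tr)
        print("R^2 on held-out nets:", model.score(X_te, y_te))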

    Designing energy-efficient computing systems using equalization and machine learning

    As technology scaling slows down in the nanometer CMOS regime and mobile computing becomes more ubiquitous, designing energy-efficient hardware for mobile systems is becoming increasingly critical and challenging. Although various approaches, such as near-threshold computing (NTC) and aggressive voltage scaling with shadow latches, have been proposed to get the most out of limited battery life, there is still no "silver bullet" for the growing power-performance demands of mobile systems. Moreover, given that a mobile system may operate under a variety of environmental conditions (e.g., different temperatures) and with varying performance requirements, there is a growing need for tunable/reconfigurable systems to achieve energy-efficient operation. In this work we address the energy-efficiency problem of mobile systems using two different approaches: circuit tunability and distributed adaptive algorithms. Inspired by communication systems, we developed feedback-equalization-based digital logic that changes the threshold of its gates based on the input pattern. We showed that feedback equalization in static complementary CMOS logic enables up to a 20% reduction in energy dissipation while maintaining the performance metrics, and a 30% reduction in energy dissipation for pass-transistor logic (PTL) with equalization, again while maintaining performance. In addition, we proposed a mechanism that leverages feedback equalization to achieve near-optimal operation of static complementary CMOS logic blocks over the entire voltage range, from near-threshold to nominal supply voltage. Using the energy-delay product (EDP) as a metric, we analyzed the use of the feedback equalizer as part of various sequential computational blocks. Our analysis shows that for near-threshold operation with equalization, the operating frequency improves by up to 30% while the energy increase is less than 15%, for an overall EDP reduction of ≈10%; we also observe an EDP reduction of close to 5% across the entire above-threshold voltage range. On the distributed-adaptive-algorithm front, we explored energy-efficient hardware implementations of machine learning algorithms. We proposed an adaptive classifier that leverages the wide variability in data complexity to enable energy-efficient classification for mobile systems: it takes advantage of the varying classification hardness across data to dynamically allocate resources and improve energy efficiency. On average, across a wide range of classification data sets, our adaptive classifier is ≈100× more energy efficient with an ≈1% higher error rate than a complex radial-basis-function classifier, and ≈10× less energy efficient with an ≈40% lower error rate than a simple linear classifier. We also developed a field-of-groves (FoG) implementation of random forests (RF) that achieves accuracy comparable to convolutional neural networks (CNN) and support vector machines (SVM) under tight energy budgets. The FoG architecture exploits the fact that in a random forest a small portion of the weak classifiers (decision trees) may be sufficient to achieve high statistical performance. By dividing the random forest into smaller forests (groves) and only conditionally executing the rest of the forest, FoG achieves much higher energy efficiency for comparable error rates, and its distributed nature allows a high level of parallelism. Our evaluation shows that at maximum achievable accuracies FoG consumes ≈1.48×, ≈24×, ≈2.5×, and ≈34.7× lower energy per classification than conventional RF, SVM-RBF, a multi-layer perceptron (MLP), and CNN, respectively. FoG is 6.5× less energy efficient than SVM-LR but achieves 18% higher accuracy on average across all considered datasets.
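
    As a quick sanity check of the reported near-threshold figures (our arithmetic, not taken from the thesis): with EDP = E × t_d, a 30% higher operating frequency scales the delay by 1/1.3 while a 15% energy increase scales the energy by 1.15, giving EDP'/EDP = 1.15/1.3 ≈ 0.885, i.e., an ≈11% EDP reduction, consistent with the reported ≈10%.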
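
    A minimal sketch of the conditional-execution idea behind FoG, assuming a simple vote-margin early-exit rule (the thesis' actual grove scheduling and stopping criterion may differ; the grove sizes and margin threshold here are invented):

        # FoG-style conditional execution: evaluate the forest one grove at a
        # time and stop once the vote is decisive. The margin-based early-exit
        # rule is our assumption for illustration.
        import numpy as np
        from collections import Counter
        from sklearn.datasets import make_classification
        from sklearn.tree import DecisionTreeClassifier

        def fog_predict(groves, x, margin=0.5):
            """groves: list of lists of fitted trees. Returns (class, trees_used)."""
            total = sum(len(g) for g in groves)
            votes, seen = Counter(), 0
            for grove in groves:
                for tree in grove:
                    votes[int(tree.predict(x.reshape(1, -1))[0])] += 1
                seen += len(grove)
                ranked = votes.most_common(2)
                lead = ranked[0][1] - (ranked[1][1] if len(ranked) > 1 else 0)
                # Exit when remaining trees cannot overturn the leader, or the
                # normalized vote margin clears the confidence threshold.
                if lead > total - seen or lead / seen >= margin:
                    break
            return votes.most_common(1)[0][0], seen

        # Toy demo: four groves of eight bootstrap-trained trees.
        X, y = make_classification(n_samples=500, random_state=0)
        rng = np.random.default_rng(0)
        trees = []
        for i in range(32):
            idx = rng.integers(0, len(X), len(X))  # bootstrap sample
            trees.append(DecisionTreeClassifier(max_depth=4, random_state=i)
                         .fit(X[idx], y[idx]))
        groves = [trees[i:i + 8] for i in range(0, 32, 8)]
        label, used = fog_predict(groves, X[0])
        print(f"predicted class {label} using {used}/32 trees")

    Because most inputs exit after the first grove, the average number of trees evaluated per classification, and hence the energy, drops while hard inputs still see the full forest.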

    Neuromorphic Computing Systems for Tactile Sensing Perception

    Touch sensing plays an important role in humans' daily lives: tasks like exploring, grasping, and manipulating objects rely deeply on it. As such, robots and hand prostheses endowed with a sense of touch can manipulate objects more easily and physically collaborate with other agents. Towards this goal, information about touched objects and surfaces has to be inferred from raw sensor data. The orientation of edges, which is employed as a pre-processing stage in both artificial vision and touch, is a key cue for object discrimination. Inspired by the encoding of edges in human first-order tactile afferents, we developed a biologically inspired spiking architecture that mimics human tactile perception with computational primitives implementable on low-power subthreshold neuromorphic hardware. The network uses three layers of leaky integrate-and-fire (LIF) neurons to distinguish different orientations of a bar pressed on the artificial skin of the iCub robot. We demonstrated that the network can learn the appropriate connectivity through unsupervised spike-based learning, and that the number and spatial distribution of sensitive areas within receptive fields are important for edge-orientation discrimination. The unconstrained and random structure of the connectivity among layers can produce unbalanced activity in the output neurons, which are driven by variable amounts of synaptic input. We therefore explored two mechanisms of synaptic normalization (weight normalization and homeostasis), showing how each is useful during the learning and inference phases. With homeostasis and weight normalization, the network successfully discriminates 35 of 36 orientations (0 to 180 degrees in 5-degree steps). Beyond edge-orientation discrimination, we modified the architecture to classify six touch modalities (poke, press, grab, squeeze, push, and rolling a wheel), demonstrating that the network learns appropriate connectivity patterns and achieves a total accuracy of 88.3%. Furthermore, we considered tactile object-shape recognition because of its importance in robotic manipulation: a two-layer spiking network discriminated object shapes with 100% accuracy after being integrated with an array of 160 piezoresistive tactile sensors to which the object shapes were applied.
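
    A minimal sketch of a leaky integrate-and-fire layer with the weight-normalization mechanism described above, assuming illustrative parameter values rather than the thesis' tuned constants:

        # Minimal LIF layer driven by binary spike trains, with L1 weight
        # normalization so no output unit dominates during unsupervised
        # learning. All constants are illustrative assumptions.
        import numpy as np

        def lif_layer(spikes_in, w, dt=1e-3, tau=20e-3, v_th=1.0, v_reset=0.0):
            """spikes_in: (T, n_in) binary spikes; w: (n_in, n_out) weights."""
            T = spikes_in.shape[0]
            v = np.full(w.shape[1], v_reset)
            spikes_out = np.zeros((T, w.shape[1]))
            for t in range(T):
                i_syn = spikes_in[t] @ w                    # synaptic input
                v += dt * (-(v - v_reset) / tau) + i_syn    # leak + integrate
                fired = v >= v_th
                spikes_out[t, fired] = 1.0
                v[fired] = v_reset                          # reset after spike
            return spikes_out

        def normalize(w, target=1.0):
            # Keep each output neuron's total afferent weight constant.
            return w * (target / np.maximum(w.sum(axis=0, keepdims=True), 1e-12))

        rng = np.random.default_rng(0)
        w = normalize(rng.random((64, 8)))
        out = lif_layer((rng.random((200, 64)) < 0.05).astype(float), w)
        print("output spike counts:", out.sum(axis=0))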

    Design of a CMOS-Memristive Mixed-Signal Neuromorphic System with Energy and Area Efficiency in System Level Applications

    The von Neumann architecture has been the backbone of modern computers for decades. This computational framework is popular because it defines an easy, simple, and cheap design for the processing unit and memory. Unfortunately, it now faces a major bottleneck: computational complexity increasingly demands parallelism, and the architecture is not efficient at parallel processing. Moreover, the post-Moore's-law era brings a constant demand for energy-efficient computing with fewer resources and less area. Hence, researchers are interested in establishing alternatives to the von Neumann architecture, and neuromorphic computing is one of the few aspiring computing architectures that contributes effectively to this effort. Neuromorphic computing initially attracted attention because of the parallelism found in bio-inspired networks, which researchers sought to leverage on a single chip; the need for speed in real-time performance further escalated its popularity, and different research groups started working on hardware implementations of neural networks. Meanwhile, neuroscience keeps building a better understanding of biological networks, providing opportunities to bridge the gap between biological neuronal activity and artificial neural networks. As a consequence, the idea behind neuromorphic computing has continued to gain popularity. In this research, a memristive neuromorphic system with improved power and area efficiency is presented. This implementation introduces a mixed-signal platform that realizes neural networks synchronously. In addition to the mixed-signal design, a nano-scale memristive device provides power and area efficiency for the overall system. The system design also includes synchronous digital long-term plasticity (DLTP), an online learning methodology that trains the neural network during the operation phase, improving learning efficiency with respect to power consumption and area overhead. This research also proposes a stochastic neuron design with a sigmoidal firing rate; the design introduces variability in the membrane capacitance to reach different membrane potentials, leading to a variable, stochastic firing rate.
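
    A minimal sketch of such a stochastic neuron, modeling the membrane-capacitance variability as Gaussian jitter and the firing probability as a sigmoid of the resulting potential (both modeling choices are our assumptions for illustration):

        # Stochastic neuron sketch: capacitance jitter perturbs the membrane
        # potential, and firing probability follows a sigmoid of that
        # potential. Parameter values are illustrative assumptions.
        import numpy as np

        def sigmoid(x):
            return 1.0 / (1.0 + np.exp(-x))

        def firing_rate(v_mem, c_nominal=1.0, c_sigma=0.1, v_half=0.5,
                        gain=8.0, trials=10_000, seed=0):
            """Empirical firing rate for a given membrane drive v_mem."""
            rng = np.random.default_rng(seed)
            c = rng.normal(c_nominal, c_sigma, trials)  # varied capacitance
            v = v_mem * (c_nominal / c)                 # smaller C -> larger V
            p = sigmoid(gain * (v - v_half))            # sigmoidal firing prob.
            return (rng.random(trials) < p).mean()

        for drive in (0.2, 0.4, 0.5, 0.6, 0.8):
            print(f"drive {drive:.1f} -> firing rate {firing_rate(drive):.3f}")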

    Time-domain optimization of amplifiers based on distributed genetic algorithms

    Thesis presented in partial fulfillment of the requirements for the degree of Doctor of Philosophy in the subject of Electrical and Computer Engineering.
    The work presented in this thesis addresses the task of circuit optimization, helping the designer face the demand for high-performance, high-efficiency circuits imposed by the market and by technology evolution. A novel framework is introduced, based on time-domain analysis, genetic-algorithm optimization, and distributed processing. The time-domain optimization methodology is based on the step response of the amplifier. The main advantage of this methodology is that, when a given settling error is reached within the desired settling time, the amplifier is automatically guaranteed to have enough open-loop gain (AOL), output swing (OS), slew rate (SR), closed-loop bandwidth, and closed-loop stability. This simplification of the circuit's evaluation helps the optimization process converge faster. The step-response expression of the circuit is obtained by applying the inverse Laplace transform, symbolically, to the transfer function multiplied by 1/s (which represents the unit input step). Furthermore, the method may be applied to transfer functions with an unlimited number of zeros/poles, without approximations, so accuracy is preserved; complex circuits with several design/optimization degrees of freedom can therefore also be considered. The step-response expression is based on the DC bias operating point of the circuit's devices, for which complex and accurate device models (e.g., BSIM3v3) are integrated. During optimization, the time-domain evaluation of the amplifier is used by the genetic algorithm to rank the genetic individuals. The time-domain evaluator is integrated into the developed optimization platform as an independent library coded in the C programming language. Genetic algorithms have proven to be a good approach for optimization, since they are flexible and independent of the optimization objective; different levels of abstraction, system level or circuit level, can be optimized. Optimizing a new block basically requires only additional configuration files (e.g., the chromosome format, in text form) and the circuit library in which the fitness value of each individual is computed. Distributed processing is employed to address the increasing processing time demanded by complex circuit analyses and accurate device models. Communication between remote processing nodes is based on the Message Passing Interface (MPI). It is demonstrated that distributed processing reduces the optimization run-time by more than one order of magnitude. The platform is assessed through several examples of two-stage amplifiers, which have been optimized and successfully used, embedded, in larger systems such as data converters. A dedicated example of an inverter-based, self-biased, two-stage amplifier has been designed, laid out, and fabricated as a stand-alone circuit and experimentally evaluated; the measured results directly demonstrate the effectiveness of the proposed time-domain optimization methodology.
    Portuguese Foundation for Science and Technology (FCT)
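
    A minimal sketch of the step-response evaluation at the core of the methodology, multiplying a symbolic transfer function by 1/s and applying the inverse Laplace transform (the two-pole H(s) and its pole frequencies are illustrative stand-ins, not an amplifier model from the thesis):

        # Symbolic step response: inverse Laplace transform of H(s)/s.
        # The two-pole H(s) below is an invented example, not from the thesis.
        import sympy as sp

        s, t = sp.symbols('s t', positive=True)
        A0 = 1000                                   # illustrative DC gain
        p1, p2 = 2*sp.pi*1000, 2*sp.pi*10**6        # illustrative pole freqs
        H = A0 / ((1 + s/p1) * (1 + s/p2))          # example open-loop model

        # 1/s is the Laplace transform of the unit step input.
        step_response = sp.inverse_laplace_transform(H / s, s, t)
        print(sp.simplify(step_response))

        # Settling check, e.g. deviation from the final value A0 at t = 1 ms:
        err = sp.Abs(step_response.subs(t, sp.Rational(1, 1000)) / A0 - 1)
        print("settling error at 1 ms:", sp.N(err))

    Checking the settling error at the target settling time against a closed-form expression like this, rather than running a transient simulation, is what lets the optimizer classify each genetic individual quickly.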