
    A Survey of FPGA Optimization Methods for Data Center Energy Efficiency

    This article surveys the academic literature on field-programmable gate arrays (FPGAs) and their use for energy-efficient acceleration in data centers. The goal is to critically present the existing FPGA energy optimization techniques and discuss how they can be applied to such systems. To do so, the article explores current energy trends and their projection into the future, with particular attention to the requirements set out by the European Code of Conduct for Data Center Energy Efficiency. The article then presents a complete analysis of over ten years of research in energy optimization techniques, classifying them by purpose, method of application, and impact on the sources of consumption. Finally, we conclude with the challenges and possible innovations we expect for this sector. Comment: Accepted for publication in IEEE Transactions on Sustainable Computing

    Resource management and application customization for hardware accelerated systems

    Computational demands are continuously increasing, driven by the growing resource demands of applications. In the era of big-data, large-scale, and real-time applications, there is an enormous need for quick processing of large amounts of data. To meet these demands, computer systems have shifted towards multi-core solutions. Technology scaling has allowed the incorporation of ever larger numbers of transistors and cores into chips. Nevertheless, area constraints, power consumption limitations, and thermal dissipation limit the ability to design and sustain ever larger chips. To overcome these limitations, system designers have turned towards the use of hardware accelerators. These accelerators can take the form of modules attached to each core of a multi-core system, forming a network on chip of cores with attached accelerators. Another kind of hardware accelerator is the Graphics Processing Unit (GPU). GPUs can be connected to a general-purpose system through a host-device model and are used to offload parts of a workload. Additionally, accelerators can be functionality-dedicated units. They can be part of a chip, and the main processor can offload specific workloads to the hardware accelerator unit. In this dissertation we present: (a) a microcoded synchronization mechanism for systems with hardware accelerators that provide distributed shared memory, (b) a Streaming Multiprocessor (SM) allocation policy for single application execution on GPUs, (c) an SM allocation policy for concurrent applications that execute on GPUs, and (d) a framework to map neural network (NN) weights to approximate multiplier accuracy levels. The aforementioned mechanisms coexist in the resource management domain. Specifically, the methodologies introduce ways to boost system performance by using hardware accelerators. In tandem with improved performance, the methodologies explore and balance the trade-offs that the use of hardware accelerators introduces.

    Cross-Layer Optimization for Power-Efficient and Robust Digital Circuits and Systems

    With the increasing demand for digital services, performance and power efficiency have become vital requirements for digital circuits and systems. However, the enabling CMOS technology scaling has been facing significant challenges from device uncertainties, such as process, voltage, and temperature variations. To ensure system reliability, worst-case corner assumptions are usually made at each design level. However, the over-pessimistic worst-case margin leads to unnecessary power waste and performance loss as high as 2.2x. Since optimizations are traditionally confined to each specific level, those safety margins can hardly be properly exploited. To tackle this challenge, this Ph.D. thesis advises performing a cross-layer optimization for digital signal processing circuits and systems, to achieve a global balance of power consumption and output quality. To conclude, the traditional over-pessimistic worst-case approach leads to huge power waste. In contrast, the adaptive voltage scaling approach saves power (25% for the CORDIC application) by providing a just-needed supply voltage. The power saving is maximized (46% for CORDIC) when a more aggressive voltage over-scaling scheme is applied. The sparsely occurring circuit errors produced by aggressive voltage over-scaling are mitigated by higher-level error-resilient designs. For functions like FFT and CORDIC, smart error mitigation schemes were proposed to enhance reliability (against soft errors and timing errors, respectively). Applications like Massive MIMO systems are robust against lower-level errors, thanks to the intrinsically redundant antennas. This property makes it possible to embrace digital hardware that trades quality for power savings. Comment: 190 pages

    A Ringamp-Assisted, Output Capacitor-less Analog CMOS Low-Dropout Voltage Regulator

    Continued advancements in state-of-the-art integrated circuits have furthered trends toward higher computational performance and increased functionality within smaller circuit area footprints, all while improving power efficiency to meet the demands of mobile and battery-powered applications. A significant portion of these advancements has been enabled by continued scaling of CMOS technology into smaller process node sizes, facilitating faster digital systems and power-optimized computation. However, this scaling has degraded classic analog amplifying circuit structures through reduced voltage headroom and lower device output resistance, and thus lower available intrinsic gain. This work investigates these trends and their impact on fine-grain Low-Dropout (LDO) Voltage Regulators, leading to a design methodology and implementation of a state-of-the-art Ringamp-Assisted, Output Capacitor-less Analog CMOS LDO Voltage Regulator capable of both power scaling and process node scaling for general SoC applications.

    Driving the Network-on-Chip Revolution to Remove the Interconnect Bottleneck in Nanoscale Multi-Processor Systems-on-Chip

    The sustained demand for faster, more powerful chips has been met by the availability of chip manufacturing processes allowing for the integration of increasing numbers of computation units onto a single die. The resulting outcome, especially in the embedded domain, has often been called SYSTEM-ON-CHIP (SoC) or MULTI-PROCESSOR SYSTEM-ON-CHIP (MP-SoC). MPSoC design brings to the foreground a large number of challenges, one of the most prominent of which is the design of the chip interconnection. With a number of on-chip blocks presently ranging in the tens, and quickly approaching the hundreds, the novel issue of how to best provide on-chip communication resources is clearly felt. NETWORKS-ON-CHIPS (NoCs) are the most comprehensive and scalable answer to this design concern. By bringing large-scale networking concepts to the on-chip domain, they guarantee a structured answer to present and future communication requirements. The point-to-point connection and packet switching paradigms they involve are also of great help in minimizing wiring overhead and physical routing issues. However, as with any technology of recent inception, NoC design is still an evolving discipline. Several main areas of interest require deep investigation for NoCs to become viable solutions:
    • The design of the NoC architecture needs to strike the best tradeoff among performance, features and the tight area and power constraints of the on-chip domain.
    • Simulation and verification infrastructure must be put in place to explore, validate and optimize the NoC performance.
    • NoCs offer a huge design space, thanks to their extreme customizability in terms of topology and architectural parameters. Design tools are needed to prune this space and pick the best solutions.
    • Even more so given their global, distributed nature, it is essential to evaluate the physical implementation of NoCs to evaluate their suitability for next-generation designs and their area and power costs.
This dissertation performs a design space exploration of network-on-chip architectures in order to point out the trade-offs associated with the design of each individual network building block and with the design of the network topology overall. The design space exploration is preceded by a comparative analysis of state-of-the-art interconnect fabrics with one another and with early network-on-chip prototypes. The ultimate objective is to point out the key advantages that NoC realizations provide with respect to state-of-the-art communication infrastructures and to point out the challenges that lie ahead in order to make this new interconnect technology come true. Among the latter, technology-related challenges are emerging that call for dedicated design techniques at all levels of the design hierarchy, in particular leakage power dissipation and the containment of process variations and of their effects. The achievement of the above objectives was enabled by means of a NoC simulation environment for cycle-accurate modelling and simulation and by means of a back-end facility for the study of NoC physical implementation effects. Overall, all the results provided by this work have been validated on actual silicon layout.

    Enhanced applicability of loop transformations


    Integrated Circuits and Systems for Smart Sensory Applications

    Connected intelligent sensing reshapes our society by empowering people with an increasing number of new ways of mutual interaction. As integration technologies keep to their scaling roadmap, the horizon of sensory applications is rapidly widening, thanks to myriad lightweight, low-power or, in some cases, even self-powered smart devices with high-connectivity capabilities. CMOS integrated circuit technology is the best candidate to supply the required smartness and to pioneer these emerging sensory systems. As a result, new challenges are arising around the design of these integrated circuits and systems for sensory applications in terms of low-power edge computing, power management strategies, low-range wireless communications, and integration with sensing devices. In this Special Issue, recent advances in application-specific integrated circuits (ASICs) and systems for smart sensory applications are presented in the following five emerging topics: (I) dedicated short-range communications transceivers, (II) digital smart sensors, (III) implantable neural interfaces, (IV) power management strategies in wireless sensor nodes, and (V) neuromorphic hardware.

    Machine-Learning-Powered Cyber-Physical Systems

    In the last few years, we have witnessed the revolution of the Internet of Things (IoT) paradigm and the consequent growth of Cyber-Physical Systems (CPSs). IoT devices, which include a plethora of smart interconnected sensors, actuators, and microcontrollers, have the ability to sense physical phenomena occurring in an environment and provide copious amounts of heterogeneous data about the functioning of a system. As a consequence, the large amounts of generated data represent an opportunity to adopt artificial intelligence and machine learning techniques that can be used to make informed decisions aimed at the optimization of such systems, thus enabling a variety of services and applications across multiple domains. Machine learning processes and analyses such data to generate feedback, which represents the status the environment is in. Feedback given to the user in order to make an informed decision is called open-loop feedback. Thus, an open-loop CPS is characterized by the lack of an actuation directed at improving the system itself. Feedback used by the system itself to actuate a change aimed at optimizing the system is called closed-loop feedback. Thus, a closed-loop CPS pairs feedback based on sensing data with an actuation that impacts the system directly. In this dissertation, we propose several applications in the context of CPS. We propose open-loop CPSs designed for the early prediction, diagnosis, and persistency detection of Bovine Respiratory Disease (BRD) in dairy calves, and for gait activity recognition in horses. These works use data from sensors, such as pedometers and automated feeders, to perform valuable real-field data collection. Data are then processed by a mix of state-of-the-art approaches as well as novel techniques, before being fed to machine learning algorithms for classification, which informs the user on the status of their animals. Our work further evaluates a variety of trade-offs.
In the context of BRD, we adopt optimization techniques to explore the trade-offs of using sensor data as opposed to manual examination performed by domain experts. Similarly, we carry out an extensive analysis of the cost-accuracy trade-offs, which farmers can use to make informed decisions on their barn investments. In the context of horse gait recognition, we evaluate the benefits of lighter classification algorithms for improving energy and storage usage, and their impact on classification accuracy. With respect to closed-loop CPS, we propose an incentive-based demand response approach for Heating, Ventilation, and Air Conditioning (HVAC) designed for peak load reduction in the context of smart grids. Specifically, our approach uses machine learning to process power data from smart thermostats deployed in user homes, along with the users' personal temperature preferences. Our machine learning models predict power savings due to thermostat changes, which are then plugged into our optimization problem that uses auction theory coupled with behavioral science. This framework selects the set of users who fulfill the power saving requirement, while minimizing the financial incentives paid to the users and, as a consequence, their discomfort. Our work on BRD has been published in IEEE DCOSS 2022 and Frontiers in Animal Science. Our work on gait recognition has been published in IEEE SMARTCOMP 2019 and Elsevier PMC 2020, and our work on energy management and energy prediction has been published in IEEE PerCom 2022 and IEEE SMARTCOMP 2022. Several other works were under submission when this thesis was written, and are included in this document as well.

    Proceedings of the 5th International Workshop on Reconfigurable Communication-centric Systems on Chip 2010 - ReCoSoC'10 - May 17-19, 2010 Karlsruhe, Germany. (KIT Scientific Reports ; 7551)

    ReCoSoC is intended to be a periodic annual meeting to expose and discuss gathered expertise as well as state of the art research around SoC related topics through plenary invited papers and posters. The workshop aims to provide a prospective view of tomorrow's challenges in the multibillion transistor era, taking into account the emerging techniques and architectures exploring the synergy between flexible on-chip communication and system reconfigurability.