67 research outputs found

    A Fuzzy Logic Reconfiguration Engine for Symmetric Chip Multiprocessors

    Get PDF
    Recent developments in reconfigurable multiprocessor system on chip (MPSoC) have offered system designers a great amount of flexibility to exploit task concurrency with higher throughput and less energy consumption. This paper presents a novel fuzzy logic reconfiguration engine (FLRE) for coarse grain MPSoC reconfiguration that facilitates to identify an optimum balance between power and performance of the system. The FLRE is composed on two levels of abstraction layers. The system selects an optimal configuration of Level 1 / Level 2 cache size and Associativity, processor operating frequency and voltage, the number of cores based on miss rate, and energy and throughput information of the system both at core and SoC level. An 8-core symmetric chip multiprocessor has been used to evaluate the proposed scheme. The results show an overall decrease of energy consumption with not more than 30% decrease in the throughput

    A fuzzy logic based dynamic reconfiguration scheme for optimal energy and throughput in symmetric chip multiprocessors

    Get PDF
    Embedded systems architectures have traditionally often been investigated and designed in order to achieve a greater throughput combined with minimum energy consumption. With the advent of reconfigurable architectures it is now possible to support algorithms to find optimal solutions for an improved energy and throughput balance. As a result of ongoing research several online and offline techniques and algorithm have been proposed for hardware adaptation. This paper presents a novel coarse-grained reconfigurable symmetric chip multiprocessor (SCMP) architecture managed by a fuzzy logic engine that balances performance and energy consumption. The architecture incorporates reconfigurable level 1 (L1) caches, power gated cores and adaptive on-chip network routers to allow minimizing leakage energy effects for inactive components. A coarse grained architecture was selected as to be a focus for this study as it typically allows for fast reconfiguration as compared to the fine-grained architectures, thus making it more feasible to be used for runtime adaption schemes. The presented architecture is analyzed using a set of OpenMP based parallel benchmarks and the results show significant improvements in performance while maintaining minimum energy consumption

    Fuzzy logic based energy and throughput aware design space exploration for MPSoCs

    Get PDF
    Multicore architectures were introduced to mitigate the issue of increase in power dissipation with clock frequency. Introduction of deeper pipelines, speculative threading etc. for single core systems were not able to bring much increase in performance as compared to their associated power overhead. However for multicore architectures performance scaling with number of cores has always been a challenge. The Amdahl's law shows that the theoretical maximum speedup of a multicore architecture is not even close to the multiple of number of cores. With less amount of code in parallel having more number of cores for an application might just contribute in greater power dissipation instead of bringing some performance advantage. Therefore there is a need of an adaptive multicore architecture that can be tailored for the application in use for higher energy efficiency. In this paper a fuzzy logic based design space exploration technique is presented that is targeted to optimize a multicore architecture according to the workload requirements in order to achieve optimum balance between throughput and energy of the system

    High-performance and hardware-aware computing: proceedings of the second International Workshop on New Frontiers in High-performance and Hardware-aware Computing (HipHaC\u2711), San Antonio, Texas, USA, February 2011 ; (in conjunction with HPCA-17)

    Get PDF
    High-performance system architectures are increasingly exploiting heterogeneity. The HipHaC workshop aims at combining new aspects of parallel, heterogeneous, and reconfigurable microprocessor technologies with concepts of high-performance computing and, particularly, numerical solution methods. Compute- and memory-intensive applications can only benefit from the full hardware potential if all features on all levels are taken into account in a holistic approach

    Self-adaptivity of applications on network on chip multiprocessors: the case of fault-tolerant Kahn process networks

    Get PDF
    Technology scaling accompanied with higher operating frequencies and the ability to integrate more functionality in the same chip has been the driving force behind delivering higher performance computing systems at lower costs. Embedded computing systems, which have been riding the same wave of success, have evolved into complex architectures encompassing a high number of cores interconnected by an on-chip network (usually identified as Multiprocessor System-on-Chip). However these trends are hindered by issues that arise as technology scaling continues towards deep submicron scales. Firstly, growing complexity of these systems and the variability introduced by process technologies make it ever harder to perform a thorough optimization of the system at design time. Secondly, designers are faced with a reliability wall that emerges as age-related degradation reduces the lifetime of transistors, and as the probability of defects escaping post-manufacturing testing is increased. In this thesis, we take on these challenges within the context of streaming applications running in network-on-chip based parallel (not necessarily homogeneous) systems-on-chip that adopt the no-remote memory access model. In particular, this thesis tackles two main problems: (1) fault-aware online task remapping, (2) application-level self-adaptation for quality management. For the former, by viewing fault tolerance as a self-adaptation aspect, we adopt a cross-layer approach that aims at graceful performance degradation by addressing permanent faults in processing elements mostly at system-level, in particular by exploiting redundancy available in multi-core platforms. We propose an optimal solution based on an integer linear programming formulation (suitable for design time adoption) as well as heuristic-based solutions to be used at run-time. We assess the impact of our approach on the lifetime reliability. We propose two recovery schemes based on a checkpoint-and-rollback and a rollforward technique. For the latter, we propose two variants of a monitor-controller- adapter loop that adapts application-level parameters to meet performance goals. We demonstrate not only that fault tolerance and self-adaptivity can be achieved in embedded platforms, but also that it can be done without incurring large overheads. In addressing these problems, we present techniques which have been realized (depending on their characteristics) in the form of a design tool, a run-time library or a hardware core to be added to the basic architecture

    Virtual Runtime Application Partitions for Resource Management in Massively Parallel Architectures

    Get PDF
    This thesis presents a novel design paradigm, called Virtual Runtime Application Partitions (VRAP), to judiciously utilize the on-chip resources. As the dark silicon era approaches, where the power considerations will allow only a fraction chip to be powered on, judicious resource management will become a key consideration in future designs. Most of the works on resource management treat only the physical components (i.e. computation, communication, and memory blocks) as resources and manipulate the component to application mapping to optimize various parameters (e.g. energy efficiency). To further enhance the optimization potential, in addition to the physical resources we propose to manipulate abstract resources (i.e. voltage/frequency operating point, the fault-tolerance strength, the degree of parallelism, and the configuration architecture). The proposed framework (i.e. VRAP) encapsulates methods, algorithms, and hardware blocks to provide each application with the abstract resources tailored to its needs. To test the efficacy of this concept, we have developed three distinct self adaptive environments: (i) Private Operating Environment (POE), (ii) Private Reliability Environment (PRE), and (iii) Private Configuration Environment (PCE) that collectively ensure that each application meets its deadlines using minimal platform resources. In this work several novel architectural enhancements, algorithms and policies are presented to realize the virtual runtime application partitions efficiently. Considering the future design trends, we have chosen Coarse Grained Reconfigurable Architectures (CGRAs) and Network on Chips (NoCs) to test the feasibility of our approach. Specifically, we have chosen Dynamically Reconfigurable Resource Array (DRRA) and McNoC as the representative CGRA and NoC platforms. The proposed techniques are compared and evaluated using a variety of quantitative experiments. Synthesis and simulation results demonstrate VRAP significantly enhances the energy and power efficiency compared to state of the art.Siirretty Doriast

    Secure Network-on-Chip Against Black Hole and Tampering Attacks

    Get PDF
    The Network-on-Chip (NoC) has become the communication heart of Multiprocessors-System-on-Chip (MPSoC). Therefore, it has been subject to a plethora of security threats to degrade the system performance or steal sensitive information. Due to the globalization of the modern semiconductor industry, many different parties take part in the hardware design of the system. As a result, the NoC could be infected with a malicious circuit, known as a Hardware Trojan (HT), to leave a back door for security breach purposes. HTs are smartly designed to be too small to be uncovered by offline circuit-level testing, so the system requires an online monitoring to detect and prevent the HT in runtime. This dissertation focuses on HTs inside the router of a NoC designed by a third party. It explores two HT-based threat models for the MPSoC, where the NoC experiences packet-loss and packet-tampering once the HT in the infected router is activated and is in the attacking state. Extensive experiments for each proposed architecture were conducted using a cycle-accurate simulator to demonstrate its effectiveness on the performance of the NoC-based system. The first threat model is the Black Hole Router (BHR) attack, where it silently discards the packets that are passing through without further announcement. The effect of the BHR is presented and analyzed to show the potency of the attack on a NoC-based system. A countermeasure protocol is proposed to detect the BHR at runtime and counteract the deliberate packet-dropping attack with a 26.9% area overhead, an average 21.31% performance overhead and a 22% energy consumption overhead. The protocol is extended to provide an efficient and power-gated scheme to enhance the NoC throughput and reduce the energy consumption by using end-to-end (e2e) approach. The power-gated e2e technique locates the BHR and avoids it with a 1% performance overhead and a 2% energy consumption overhead. The second threat model is a packet-integrity attack, where the HT tampers with the packet to apply a denial-of-service attack, steal sensitive information, gain unauthorized access, or misroute the packet to an unintended node. An authentic and secure NoC platform is proposed to detect and countermeasure the packet-tampering attack to maintain data-integrity and authenticity while keeping its secrecy with a 24.21% area overhead. The proposed NoC architecture is not only able to detect the attack, but also locates the infected router and isolates it from the network

    Application Centric Networks-On-Chip Design Solutions for Future Multicore Systems

    Get PDF
    With advances in technology, future multicore systems scaled to 100s and 1000s of cores/accelerators are being touted as an effective solution for extracting huge performance gains using parallel programming paradigms. However with the failure of Dennard Scaling all the components on the chip cannot be run simultaneously without breaking the power and thermal constraints leading to strict chip power envelops. The scaling up of the number of on chip components has also brought upon Networks-On-Chip (NoC) based interconnect designs like 2D mesh. The contribution of NoC to the total on chip power and overall performance has been increasing steadily and hence high performance power-efficient NoC designs are becoming crucial. Future multicore paradigms can be broadly classified, based on the applications they are tailored to, into traditional Chip Multi processor(CMP) based application based systems, characterized by low core and NoC utilization, and emerging big data application based systems, characterized by large amounts of data movement necessitating high throughput requirements. To this order, we propose NoC design solutions for power-savings in future CMPs tailored to traditional applications and higher effective throughput gains in multicore systems tailored to bandwidth intensive applications. First, we propose Fly-over, a light-weight distributed mechanism for power-gating routers attached to switched off cores to reduce NoC power consumption in low load CMP environment. Secondly, we plan on utilizing a promising next generation memory technology, Spin-Transfer Torque Magnetic RAM(STT-MRAM), to achieve enhanced NoC performance to satisfy the high throughput demands in emerging bandwidth intensive applications, while reducing the power consumption simultaneously. Thirdly, we present a hardware data approximation framework for NoCs, APPROX-NoC, with an online data error control mechanism, which can leverage the approximate computing paradigm in the emerging data intensive big data applications to attain higher performance per watt

    The Fifth NASA Symposium on VLSI Design

    Get PDF
    The fifth annual NASA Symposium on VLSI Design had 13 sessions including Radiation Effects, Architectures, Mixed Signal, Design Techniques, Fault Testing, Synthesis, Signal Processing, and other Featured Presentations. The symposium provides insights into developments in VLSI and digital systems which can be used to increase data systems performance. The presentations share insights into next generation advances that will serve as a basis for future VLSI design
    corecore