26 research outputs found

    Architecture for dual-mode quadruple precision floating point adder

    This paper presents a configurable dual-mode architecture for a floating-point (F.P.) adder. The architecture (named QPdDP) works in dual mode, operating either on quadruple-precision operands or on dual (two-parallel) double-precision operands. The architecture follows the standard state-of-the-art flow for a floating-point adder. It supports computation on normal as well as sub-normal operands, along with exceptional case handling. The key sub-components of the architecture are redesigned and optimized for on-the-fly dual-mode processing, which enables efficient resource sharing for dual precision operands. The data path is optimized for minimal multiplexing circuitry overhead. The presented dual-mode architecture provides SIMD support for double-precision operands, along with high (quadruple) precision support. The proposed architecture is synthesized as an ASIC implementation using UMC 90nm technology. It is compared with the best available works in the literature and shows better design metrics in terms of area, period and area × period, along with more computational support.

    José Luís Almada Güntzel


    A Bandwidth Control Arbitration for SoC Interconnections Performing Applications With Task Dependencies

    Current System-on-Chips (SoCs) execute applications with task dependencies that compete for shared resources such as buses, memories, and accelerators. In such a structure, the arbitration policy becomes a critical part of the system to guarantee access and bandwidth suitable for the competing applications. Some strategies proposed in the literature to cope with these issues are Round-Robin, Weighted Round-Robin, Lottery, Time Division Access Multiplexing (TDMA), and combinations of these. However, a fine-grained bandwidth control arbitration policy is missing from the literature. We propose an innovative arbitration policy based on opportunistic access and supervised utilization of the bus, measured in transmitted flits (transmission units), that settles access and provides fine-grained control. In our proposal, every competing element has a budget of flits. Opportunistic access grants the bus to a requesting component even if it has spent all its flits. Supervised debt keeps a record of every flit a component transmits when it has no flits left to spend. Our proposal applies to interconnection systems such as buses, switches, and routers. The presented approach achieves deadlock-free behavior even with task-dependent applications in the scenarios analyzed through cycle-accurate simulation models. The synergy between the opportunistic access and supervised debt techniques outperforms Lottery, TDMA, and Weighted Round-Robin in terms of bandwidth control in the experimental studies performed.
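
    The budget, opportunistic-access, and debt ideas in this abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the paper's arbiter: the names `Requester`, `quota`, `credit`, `refill`, and `arbitrate` are hypothetical, and the tie-breaking and debt-repayment rules (most credit wins, least debt wins when nobody has credit, debt is deducted from the next quota) are assumptions chosen to keep the example short.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Requester:
    name: str
    quota: int       # flits granted per accounting window (assumed parameter)
    credit: int = 0  # flits still available in the current window
    debt: int = 0    # flits sent opportunistically after credit ran out

    def refill(self) -> None:
        # Assumed repayment rule: outstanding debt is deducted from the new quota.
        repaid = min(self.debt, self.quota)
        self.credit = self.quota - repaid
        self.debt -= repaid


def arbitrate(requesting: List[Requester]) -> Optional[Requester]:
    """Grant the bus for one flit and charge the winner's credit or debt."""
    if not requesting:
        return None
    funded = [r for r in requesting if r.credit > 0]
    if funded:
        # Normal access: the requester with the most remaining credit wins.
        winner = max(funded, key=lambda r: r.credit)
        winner.credit -= 1
    else:
        # Opportunistic access: nobody has credit, so grant anyway
        # and record the flit as debt (least-indebted requester first).
        winner = min(requesting, key=lambda r: r.debt)
        winner.debt += 1
    return winner


if __name__ == "__main__":
    a, b = Requester("A", quota=3), Requester("B", quota=1)
    a.refill()
    b.refill()
    print([arbitrate([a, b]).name for _ in range(6)])
    print(a, b)
```

    In this model the bus never idles while a request is pending, and a component that overspends starts the next window with a smaller budget, which is one simple way to keep long-run bandwidth shares close to the configured quotas.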

    Exploiting memory allocations in clusterized many-core architectures

    Power-efficient architectures have become the most important feature required for future embedded systems. Modern designs, like those released on mobile devices, reveal that clusterization is the way to improve energy efficiency. However, such architectures are still limited by the memory subsystem (i.e., memory latency problems). This work investigates an alternative approach that exploits on-chip data locality to a large extent, through distributed shared memory systems that permit efficient reuse of on-chip mapped data in clusterized many-core architectures. First, this work reviews the current literature on memory allocations and explores the limitations of cluster-based many-core architectures. Then, several memory allocations are introduced and benchmarked in terms of scalability, performance, and energy against the conventional centralized shared memory solution, to reveal which memory allocation is the most appropriate for future mobile architectures. Our results show that distributed shared memory allocations bring performance gains and opportunities to reduce energy consumption.

    Modeling Power Consumption and Temperature in TLM Models

    Many techniques and tools exist to estimate the power consumption and the temperature map of a chip. These tools help the hardware designers develop power efficient chips in the presence of temperature constraints. For this task, the application can be ignored or at least abstracted by some high level scenarios; at this stage, the actual embedded software is generally not available yet. However, after the hardware is defined, the embedded software can still have a significant influence on the power consumption; i.e., two implementations of the same application can consume more or less power. Moreover, the actual software powe

    Comparative Analysis of 6T, 7T, 8T, 9T, and 10T Realistic CNTFET Based SRAM


    State-Of-The-Art Convolutional Neural Networks for Smart Farms: A Review

    Farming has seen a number of technological transformations in the last decade, becoming more industrialized and technology-driven. This means the use of Internet of Things (IoT), Cloud Computing (CC), Big Data (BD) and automation to gain better control over the process of farming. As the use of these technologies in farms has grown exponentially with massive data production, there is a need to develop and use state-of-the-art tools in order to gain more insight from the data within a reasonable time. In this paper, we present an initial understanding of Convolutional Neural Networks (CNNs), the recent architectures of state-of-the-art CNNs, and their underlying complexities. Then we propose a classification taxonomy tailored for agricultural applications of CNNs. Finally, we present a comprehensive review of research dedicated to applications of state-of-the-art CNNs in agricultural production systems. Our contribution is twofold. First, for end users of agricultural deep learning tools, our benchmarking findings can serve as a guide to selecting an appropriate architecture to use. Second, for agricultural software developers of deep learning tools, our in-depth analysis explains the state-of-the-art CNN complexities and points out possible future directions to further optimize the running performance.