
    Sequential and optimal experimental design. An example of application to the synthesis of metal alloys

    In this article we define and illustrate how to improve experimental design methodology further, through a real example from the field of chemistry. The approach rests on two principles. The first is to provide the experimenter with several optimal experimental designs (based on mathematically defined criteria), rather than the traditional single design, so that several objectives can be reconciled simultaneously. The second, underlying the data analysis stage, is the correct use of statistical tests (in regression as well as in analysis of variance), in particular checking their validity conditions after applying them. This is the methodology we briefly illustrate here, with a small sketch below.
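    As a minimal, hypothetical Python sketch of the two principles (the designs, data, and coefficients are invented, not taken from the article): candidate designs are ranked by the classical D-optimality criterion det(X'X), and one validity condition of the regression tests (residual normality) is checked after fitting.

    ```python
    import numpy as np
    from scipy import stats

    def d_criterion(X: np.ndarray) -> float:
        """D-optimality score of a design matrix X: det(X'X)."""
        return float(np.linalg.det(X.T @ X))

    rng = np.random.default_rng(0)

    # Two candidate 8-run designs for two factors plus an intercept column.
    full_factorial = np.array([[1, a, b] for a in (-1, 1) for b in (-1, 1)] * 2)
    random_design = np.column_stack([np.ones(8), rng.uniform(-1, 1, (8, 2))])

    # Offer the experimenter several criterion-ranked designs, not a single one.
    for name, X in [("factorial", full_factorial), ("random", random_design)]:
        print(f"{name}: det(X'X) = {d_criterion(X):.1f}")

    # After fitting, check a validity condition of the t/F tests:
    # normality of the residuals (Shapiro-Wilk).
    y = full_factorial @ np.array([1.0, 0.5, -0.3]) + rng.normal(0, 0.1, 8)
    beta, *_ = np.linalg.lstsq(full_factorial, y, rcond=None)
    residuals = y - full_factorial @ beta
    print("Shapiro-Wilk p-value:", stats.shapiro(residuals).pvalue)
    ```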

    Computational SRAM Design Automation using Pushed-Rule Bitcells for Energy-Efficient Vector Processing

    This paper presents a new methodology for automating Computational SRAM (C-SRAM) design based on off-the-shelf memory compilers and a configurable RTL IP. The main goal is to drastically reduce the development effort compared to a full-custom design, while offering flexibility of use and high-yield production. The proposed C-SRAM architecture performs energy-efficient vector processing coupled with a scalar processor, while limiting data transfers on the system bus. Post-P&R simulation results show that 2RW and 4RW C-SRAM configurations using the double-pumping technique achieve the highest performance for vectorized MAC operations compared to the other configurations. Moreover, the impact of the digital wrapper that decodes and executes the instructions can be mitigated by increasing the memory cut size, so that it represents less than 10% of the area and 20% of the power consumption.
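    As a rough illustration of the double-pumping idea, the following hypothetical Python behavioral model (not the paper's RTL IP; the class and parameter names are invented) performs two array accesses per external cycle, feeding a lane-wise vector MAC.

    ```python
    import numpy as np

    class DoublePumpedSram:
        """Toy behavioral model: the array runs internally at twice the system
        clock, so one external cycle yields two reads for a vector MAC."""

        def __init__(self, rows: int, word_bits: int = 128, lane_bits: int = 8):
            self.lanes = word_bits // lane_bits
            self.mem = np.zeros((rows, self.lanes), dtype=np.int32)

        def write(self, row: int, word: np.ndarray) -> None:
            self.mem[row] = word

        def mac_cycle(self, row_a: int, row_b: int, acc: np.ndarray) -> np.ndarray:
            a = self.mem[row_a]   # first internal half-cycle: read operand A
            b = self.mem[row_b]   # second internal half-cycle: read operand B
            return acc + a * b    # lane-wise multiply-accumulate

    sram = DoublePumpedSram(rows=32)
    sram.write(0, np.arange(16))
    sram.write(1, np.full(16, 2))
    acc = sram.mac_cycle(0, 1, np.zeros(16, dtype=np.int32))
    print(acc)  # [0 2 4 ... 30]
    ```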

    Memory Sizing of a Scalable SRAM In-Memory Computing Tile Based Architecture

    Modern computing applications require ever more data to be processed. Unfortunately, memory technologies do not scale as fast as computing performance, leading to the so-called memory wall. New architectures are currently being explored to address this issue, for both embedded and off-chip memories. Recent techniques that bring computation as close as possible to the memory array, such as In-Memory Computing (IMC), Near-Memory Computing (NMC), and Processing-In-Memory (PIM), reduce the cost of data movement between computing cores and memories. For embedded computing, the In-Memory Computing scheme offers attractive performance and energy gains for certain classes of applications. However, current solutions do not scale to large memories and large amounts of data. In this paper, we propose a new methodology to tile an SRAM/IMC-based architecture and size the memory according to an application set. Using a high-level LLVM-based simulation platform, we extract IMC memory requirements for a given class of applications. We then detail the physical and performance costs of tiling SRAM instances. By exploring multi-tile SRAM Place & Route in 28nm FD-SOI, we evaluate the respective performance, energy, and cost of the memory interconnect, yielding a detailed wire cost model for exploring memory sizing trade-offs. To reach a large-capacity IMC memory, splitting it into multiple sub-tiles yields a lower-energy (up to 78% gain) and faster (up to 49% gain) IMC tile compared to a single large IMC memory instance.
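    The tiling trade-off can be sketched with a toy first-order cost model, assuming only that per-access energy grows with tile size while interconnect energy grows with tile count; all coefficients below are illustrative placeholders, not the paper's extracted 28nm FD-SOI wire cost model.

    ```python
    def tile_cost(total_kib: float, n_tiles: int,
                  e_access_per_kib: float = 0.02,  # pJ/bit per KiB of tile size
                  e_wire_per_tile: float = 0.5):   # pJ/bit per extra tile crossed
        """Toy energy-per-bit model of an IMC memory split into n_tiles sub-tiles."""
        tile_kib = total_kib / n_tiles
        access = e_access_per_kib * tile_kib   # shorter bitlines in smaller tiles
        wire = e_wire_per_tile * (n_tiles - 1)  # interconnect grows with tiling
        return access + wire

    # Sweep the tile count for a 256 KiB memory: the sum exhibits a sweet spot.
    for n in (1, 2, 4, 8, 16):
        print(f"{n:2d} tiles -> {tile_cost(256, n):.2f} pJ/bit")
    ```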

    Quantitative Assessment of Exposure to the Mycotoxin Ochratoxin A in Food

    This article presents the methodology and simulation results of a quantitative assessment of human exposure to the fungal toxin ochratoxin A (OA) in food in France. We show that it is possible to provide reliable estimates of exposure to OA by combining a nonparametric simulation method, a parametric simulation method, and bootstrap confidence intervals. In the Monte Carlo simulation, the nonparametric method accounts for consumption and contamination only via the raw data, whereas the parametric method relies on random sampling from distribution functions fitted to the consumption and contamination data. Our conclusions are based on only eight types of food; nevertheless, they are meaningful given the major importance of these foodstuffs in human nourishment in France. The methodology can be applied to any food contaminant (pesticides, other mycotoxins, cadmium, etc.) when data are available.
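    A minimal Python sketch of the nonparametric scheme, using invented dummy data (the article's consumption and contamination surveys are not reproduced here): exposure is simulated by resampling the raw data, and a bootstrap percentile interval bounds a high quantile of the exposure distribution.

    ```python
    import numpy as np

    rng = np.random.default_rng(42)
    # Dummy stand-ins for raw survey data (units illustrative).
    consumption = rng.lognormal(mean=2.0, sigma=0.5, size=500)     # g/day
    contamination = rng.lognormal(mean=-1.0, sigma=0.8, size=200)  # ng/g

    def simulate_exposure(cons, cont, n_sim=10_000):
        # Nonparametric Monte Carlo: draw directly from the raw data.
        return rng.choice(cons, n_sim) * rng.choice(cont, n_sim)  # ng/day

    p95 = np.percentile(simulate_exposure(consumption, contamination), 95)

    # Bootstrap: resample the raw datasets, redo the simulation, collect P95.
    boots = []
    for _ in range(200):
        c_star = rng.choice(consumption, consumption.size)
        k_star = rng.choice(contamination, contamination.size)
        boots.append(np.percentile(simulate_exposure(c_star, k_star), 95))
    lo, hi = np.percentile(boots, [2.5, 97.5])
    print(f"P95 exposure: {p95:.2f} ng/day, 95% bootstrap CI [{lo:.2f}, {hi:.2f}]")
    ```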

    A 35.6 TOPS/W/mm² 3-Stage Pipelined Computational SRAM with Adjustable Form Factor for Highly Data-Centric Applications

    In the context of highly data-centric applications, bringing computation and storage closer together should significantly reduce the energy-consuming process of data movement. This paper proposes a Computational SRAM (C-SRAM) combining In- and Near-Memory Computing (IMC/NMC) approaches, used by a scalar processor as an energy-efficient vector processing unit. Parallel computation is performed on vectorized integer data within large words, using the usual logic and arithmetic operators. Furthermore, multiple rows can be activated simultaneously to increase this parallelism. The proposed C-SRAM is designed with a two-port pushed-rule foundry bitcell, available in most existing design platforms, and an adjustable form factor to ease physical implementation in a SoC. The 4kB C-SRAM testchip with 128-bit words, manufactured in a 22nm FD-SOI process, displays a sub-array efficiency of 72% with an additional computing area of less than 5%. Measurements averaged over 10 dies at 0.85V and 1GHz demonstrate an energy efficiency per unit area of 35.6 and 1.48 TOPS/W/mm² for 8-bit additions and multiplications, with 3ns and 24ns computing latency, respectively. Compared to a 128-bit SIMD processor architecture, up to 2x energy reduction and 1.8x speed-up are achievable for a representative set of highly data-centric application kernels.
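    The multi-row activation mentioned above can be illustrated behaviorally: activating several wordlines on the same bitlines lets the sense amplifiers read a wired combination of the stored words, a bitwise AND in this simplified view. The Python model below is a hypothetical sketch of that principle, not the testchip's circuit behavior.

    ```python
    import numpy as np

    # Three stored 8-bit words (one per wordline).
    words = np.array([0b11001010, 0b10100110, 0b11110000], dtype=np.uint8)

    def multi_row_read(rows: np.ndarray) -> int:
        # Simultaneous activation: each bitline settles to the AND of the
        # activated cells in this simplified behavioral view.
        return int(np.bitwise_and.reduce(rows))

    print(f"{multi_row_read(words[:2]):08b}")  # rows 0 AND 1 -> 10000010
    ```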