4,351 research outputs found
Highly Efficient Twin Module Structure of 64-Bit Exponential Function Implemented on SGI RASC Platform
This paper presents an implementation of the double precision exponential function. A novel table-based architecture, together with a short Taylor expansion, provides a low latency (30 clock cycles), which is comparable to 32-bit implementations. The low area consumption of a single exp() module (roughly 4% of XC4LX200) allows several modules to be implemented in a single FPGA. The use of massive parallelism results in high performance of the module. Nevertheless, because of the external memory interface limitation, only a twin module structure is presented in this paper. This implementation aims primarily to meet the demanding requirements of quantum chemistry for precision and speed. Each module is capable of processing at a speed of 200 MHz with a maximum error of 1 ulp; RMSE equals 0.6
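The general idea of combining a lookup table with a short Taylor expansion can be illustrated in software. The following is only a minimal sketch under stated assumptions, not the architecture described in the abstract: the table size (`TABLE_BITS`), the polynomial degree, and the function name `exp_table_taylor` are all hypothetical choices, and the sketch does not attempt to reach the 1 ulp accuracy quoted above.

```python
import math

# Hypothetical parameters (not taken from the paper): a 256-entry table
# and a degree-3 Taylor polynomial for the small remainder.
TABLE_BITS = 8
N = 1 << TABLE_BITS
TABLE = [2.0 ** (j / N) for j in range(N)]   # TABLE[j] = 2^(j/N)
LN2_OVER_N = math.log(2.0) / N


def exp_table_taylor(x: float) -> float:
    """Approximate exp(x) as 2^k * TABLE[j] * P(r).

    x is split as x = (k*N + j) * ln(2)/N + r with |r| <= ln(2)/(2N),
    so the Taylor polynomial P only has to cover a very small interval.
    """
    m = round(x / LN2_OVER_N)        # nearest multiple of ln(2)/N
    r = x - m * LN2_OVER_N           # small remainder
    k, j = divmod(m, N)              # m = k*N + j with 0 <= j < N
    # Short Taylor expansion of exp(r) around 0, in Horner form.
    p = 1.0 + r * (1.0 + r * (0.5 + r / 6.0))
    return math.ldexp(TABLE[j] * p, k)   # scale by 2^k


if __name__ == "__main__":
    for x in (-20.0, -1.5, 0.0, 0.3, 1.0, 10.0):
        approx = exp_table_taylor(x)
        rel_err = abs(approx - math.exp(x)) / math.exp(x)
        print(f"x={x:6.2f}  approx={approx:.15e}  rel_err={rel_err:.2e}")
```

A larger table shrinks the remainder `r` and so permits a shorter polynomial, which is the usual trade-off between memory and arithmetic in table-based designs.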
DDC-PIM: Efficient Algorithm/Architecture Co-design for Doubling Data Capacity of SRAM-based Processing-In-Memory
Processing-in-memory (PIM), as a novel computing paradigm, provides
significant performance benefits from the aspect of effective data movement
reduction. SRAM-based PIM has been demonstrated as one of the most promising
candidates due to its endurance and compatibility. However, the integration
density of SRAM-based PIM is much lower than other non-volatile memory-based
ones, due to its inherent 6T structure for storing a single bit. Within
comparable area constraints, SRAM-based PIM exhibits notably lower capacity.
Thus, aiming to unleash its capacity potential, we propose DDC-PIM, an
efficient algorithm/architecture co-design methodology that effectively doubles
the equivalent data capacity. At the algorithmic level, we propose a
filter-wise complementary correlation (FCC) algorithm to obtain a bitwise
complementary pair. At the architecture level, we exploit the intrinsic
cross-coupled structure of 6T SRAM to store the bitwise complementary pair in
their complementary states (), thereby maximizing the data
capacity of each SRAM cell. The dual-broadcast input structure and
reconfigurable unit support both depthwise and pointwise convolution, adhering
to the requirements of various neural networks. Evaluation results show that
DDC-PIM yields about speedup on MobileNetV2 and on
EfficientNet-B0 with negligible accuracy loss compared with PIM baseline
implementation. Compared with state-of-the-art SRAM-based PIM macros, DDC-PIM
achieves up to and improvement in weight density and
area efficiency, respectively. Comment: 14 pages, to be published in IEEE Transactions on Computer-Aided
Design of Integrated Circuits and Systems (TCAD)
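To make the term "bitwise complementary pair" concrete, here is a minimal sketch of what it means for two fixed-width words to be bitwise complements of each other. It does not implement the FCC algorithm or the SRAM macro from the abstract; the 8-bit width and all function names are assumptions chosen for illustration.

```python
# Hypothetical word width; the abstract does not state the weight precision.
WIDTH = 8
MASK = (1 << WIDTH) - 1


def complement(word: int) -> int:
    """Return the bitwise complement of an unsigned WIDTH-bit word."""
    return ~word & MASK


def is_complementary_pair(a: int, b: int) -> bool:
    """True when every bit of a is the inverse of the same bit of b."""
    return (a ^ b) == MASK


def to_signed(word: int) -> int:
    """Interpret an unsigned WIDTH-bit word as a two's-complement integer."""
    return word - (1 << WIDTH) if word & (1 << (WIDTH - 1)) else word


if __name__ == "__main__":
    w = 0b00101101
    w_bar = complement(w)
    print(f"{w:0{WIDTH}b} / {w_bar:0{WIDTH}b}", is_complementary_pair(w, w_bar))
    # In two's complement, the complement of w equals -w - 1.
    print(to_signed(w), to_signed(w_bar))
```

Because a cross-coupled cell holds a bit and its inverse at the same time, a pair of words related in this way carries no more independent bits than one word, which is the property the abstract says is exploited.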
Vienna Circle and Logical Analysis of Relativity Theory
In this paper we present some of our school's results in the area of building
up relativity theory (RT) as a hierarchy of theories in the sense of logic. We
use plain first-order logic (FOL) as in the foundation of mathematics (FOM) and
we build on experience gained in FOM.
The main aims of our school are the following: We want to base the theory on
simple, unambiguous axioms with clear meanings. It should be absolutely
understandable for any reader what the axioms say and the reader can decide
about each axiom whether he likes it. The theory should be built up from these
axioms in a straightforward, logical manner. We want to provide an analysis of
the logical structure of the theory. We investigate which axioms are needed for
which predictions of RT. We want to make RT more transparent logically, easier
to understand, easier to change, modular, and easier to teach. We want to
obtain deeper understanding of RT.
Our work can be considered a case study showing that the Vienna Circle's
(VC) approach to doing science is workable and fruitful when carried out
using the insights and tools of mathematical logic acquired since its formative
years, at the very time of the VC's activity. We think that logical positivism was
based on the insight and anticipation of what mathematical logic is capable of
when elaborated to some depth. Logical positivism, in great part represented by
the VC, influenced and took part in the birth of modern mathematical logic. The
members of the VC were brave forerunners and pioneers. Comment: 25 pages, 1 figure
Benchmarking and modeling of analog and digital SRAM in-memory computing architectures
In-memory-computing is emerging as an efficient hardware paradigm for deep
neural network accelerators at the edge, making it possible to break the memory wall and
exploit massive computational parallelism. Two design models have surged:
analog in-memory-computing (AIMC) and digital in-memory-computing (DIMC),
offering a different design space in terms of accuracy, efficiency and dataflow
flexibility. This paper targets the fair comparison and benchmarking of both
approaches to guide future designs, through a.) an overview of published
architectures; b.) an analytical cost model for energy and throughput; c.)
scheduling of workloads on a variety of modeled IMC architectures for
end-to-end network efficiency analysis, offering valuable workload-hardware
co-design insights
The C23A system, an example of quantitative control of plant growth associated with a data base
The architecture of the C23A (Chambers de Culture Automatique en Atmosphere Artificielles) system for the controlled study of plant physiology is described. The following components are discussed: modular plant growth chambers and associated instruments (I.R. CO2 analyser, mass spectrometer, and chemical analyser); a network of frontal processors controlling this apparatus; a central computer for the periodic control and the multiplex work of the processors; and a network of terminal computers able to query the data base for data processing and modeling. Examples of present results are given. A growth curve analysis study of CO2 and O2 gas exchanges of shoots and roots, and the daily evolution of algal photosynthesis and of the pools of dissolved CO2 in sea water are discussed
Synthetic Aperture Radar (SAR) data processing
The available and optimal methods for generating SAR imagery for NASA applications were identified. The SAR image quality and data processing requirements associated with these applications were studied. Mathematical operations and algorithms required to process sensor data into SAR imagery were defined. The architecture of SAR image formation processors was discussed, and technology necessary to implement the SAR data processors used in both general purpose and dedicated imaging systems was addressed