15,699 research outputs found
A proposed synthesis method for Application-Specific Instruction Set Processors
Due to the rapid technology advancement in integrated circuit era, the need for the high computation
performance together with increasing complexity and manufacturing costs has raised the demand for
high-performance con
fi
gurable designs; therefore, the Application-Speci
fi
c Instruction Set Processors
(ASIPs) are widely used in SoC design. The automated generation of software tools for ASIPs is a
commonly used technique, but the automated hardware model generation is less frequently applied in
terms of
fi
nal RTL implementations. Contrary to this, the
fi
nal register-transfer level models are usually
created, at least partly, manually. This paper presents a novel approach for automated hardware model
generation for ASIPs. The new solution is based on a novel abstract ASIP model and a modeling language
(Algorithmic Microarchitecture Description Language, AMDL) optimized for this architecture model. The
proposed AMDL-based pre-synthesis method is based on a set of pre-de
fi
ned VHDL implementation
schemes, which ensure the qualities of the automatically generated register-transfer level models in
terms of resource requirement and operation frequency. The design framework implementing the
algorithms required by the synthesis method is also presented
Low Power Processor Architectures and Contemporary Techniques for Power Optimization – A Review
The technological evolution has increased the number of transistors for a given die area significantly and increased the switching speed from few MHz to GHz range. Such inversely proportional decline in size and boost in performance consequently demands shrinking of supply voltage and effective power dissipation in chips with millions of transistors. This has triggered substantial amount of research in power reduction techniques into almost every aspect of the chip and particularly the processor cores contained in the chip. This paper presents an overview of techniques for achieving the power efficiency mainly at the processor core level but also visits related domains such as buses and memories. There are various processor parameters and features such as supply voltage, clock frequency, cache and pipelining which can be optimized to reduce the power consumption of the processor. This paper discusses various ways in which these parameters can be optimized. Also, emerging power efficient processor architectures are overviewed and research activities are discussed which should help reader identify how these factors in a processor contribute to power consumption. Some of these concepts have been already established whereas others are still active research areas. © 2009 ACADEMY PUBLISHER
Vector processing-aware advanced clock-gating techniques for low-power fused multiply-add
The need for power efficiency is driving a rethink of design decisions in processor architectures. While vector processors succeeded in the high-performance market in the past, they need a retailoring for the mobile market that they are entering now. Floating-point (FP) fused multiply-add (FMA), being a functional unit with high power consumption, deserves special attention. Although clock gating is a well-known method to reduce switching power in synchronous designs, there are unexplored opportunities for its application to vector processors, especially when considering active operating mode. In this research, we comprehensively identify, propose, and evaluate the most suitable clock-gating techniques for vector FMA units (VFUs). These techniques ensure power savings without jeopardizing the timing. We evaluate the proposed techniques using both synthetic and “real-world” application-based benchmarking. Using vector masking and vector multilane-aware clock gating, we report power reductions of up to 52%, assuming active VFU operating at the peak performance. Among other findings, we observe that vector instruction-based clock-gating techniques achieve power savings for all vector FP instructions. Finally, when evaluating all techniques together, using “real-world” benchmarking, the power reductions are up to 80%. Additionally, in accordance with processor design trends, we perform this research in a fully parameterizable and automated fashion.The research leading to these results has received funding from the RoMoL ERC Advanced Grant GA 321253 and is supported in part by the European Union (FEDER funds) under contract TTIN2015-65316-P.
The work of I. Ratkovic was supported by a FPU research grant from the Spanish MECD.Peer ReviewedPostprint (author's final draft
Survey on Combinatorial Register Allocation and Instruction Scheduling
Register allocation (mapping variables to processor registers or memory) and
instruction scheduling (reordering instructions to increase instruction-level
parallelism) are essential tasks for generating efficient assembly code in a
compiler. In the last three decades, combinatorial optimization has emerged as
an alternative to traditional, heuristic algorithms for these two tasks.
Combinatorial optimization approaches can deliver optimal solutions according
to a model, can precisely capture trade-offs between conflicting decisions, and
are more flexible at the expense of increased compilation time.
This paper provides an exhaustive literature review and a classification of
combinatorial optimization approaches to register allocation and instruction
scheduling, with a focus on the techniques that are most applied in this
context: integer programming, constraint programming, partitioned Boolean
quadratic programming, and enumeration. Researchers in compilers and
combinatorial optimization can benefit from identifying developments, trends,
and challenges in the area; compiler practitioners may discern opportunities
and grasp the potential benefit of applying combinatorial optimization
Radiation safety based on the sky shine effect in reactor
In the reactor operation, neutrons and gamma rays are the most dominant radiation.
As protection, lead and concrete shields are built around the reactor. However, the radiation
can penetrate the water shielding inside the reactor pool. This incident leads to the occurrence
of sky shine where a physical phenomenon of nuclear radiation sources was transmitted
panoramic that extends to the environment. The effect of this phenomenon is caused by the
fallout radiation into the surrounding area which causes the radiation dose to increase. High
doses of exposure cause a person to have stochastic effects or deterministic effects. Therefore,
this study was conducted to measure the radiation dose from sky shine effect that scattered
around the reactor at different distances and different height above the reactor platform. In this
paper, the analysis of the radiation dose of sky shine effect was measured using the
experimental metho
A Survey of Techniques For Improving Energy Efficiency in Embedded Computing Systems
Recent technological advances have greatly improved the performance and
features of embedded systems. With the number of just mobile devices now
reaching nearly equal to the population of earth, embedded systems have truly
become ubiquitous. These trends, however, have also made the task of managing
their power consumption extremely challenging. In recent years, several
techniques have been proposed to address this issue. In this paper, we survey
the techniques for managing power consumption of embedded systems. We discuss
the need of power management and provide a classification of the techniques on
several important parameters to highlight their similarities and differences.
This paper is intended to help the researchers and application-developers in
gaining insights into the working of power management techniques and designing
even more efficient high-performance embedded systems of tomorrow
A Reconfigurable Vector Instruction Processor for Accelerating a Convection Parametrization Model on FPGAs
High Performance Computing (HPC) platforms allow scientists to model
computationally intensive algorithms. HPC clusters increasingly use
General-Purpose Graphics Processing Units (GPGPUs) as accelerators; FPGAs
provide an attractive alternative to GPGPUs for use as co-processors, but they
are still far from being mainstream due to a number of challenges faced when
using FPGA-based platforms. Our research aims to make FPGA-based high
performance computing more accessible to the scientific community. In this work
we present the results of investigating the acceleration of a particular
atmospheric model, Flexpart, on FPGAs. We focus on accelerating the most
computationally intensive kernel from this model. The key contribution of our
work is the architectural exploration we undertook to arrive at a solution that
best exploits the parallelism available in the legacy code, and is also
convenient to program, so that eventually the compilation of high-level legacy
code to our architecture can be fully automated. We present the three different
types of architecture, comparing their resource utilization and performance,
and propose that an architecture where there are a number of computational
cores, each built along the lines of a vector instruction processor, works best
in this particular scenario, and is a promising candidate for a generic
FPGA-based platform for scientific computation. We also present the results of
experiments done with various configuration parameters of the proposed
architecture, to show its utility in adapting to a range of scientific
applications.Comment: This is an extended pre-print version of work that was presented at
the international symposium on Highly Efficient Accelerators and
Reconfigurable Technologies (HEART2014), Sendai, Japan, June 911, 201
- …