4,289 research outputs found
Just In Time Assembly (JITA) - A Run Time Interpretation Approach for Achieving Productivity of Creating Custom Accelerators in FPGAs
The reconfigurable computing community has yet to be successful in allowing programmers to access FPGAs through traditional software development flows. Existing barriers that prevent programmers from using FPGAs include: 1) knowledge of hardware programming models, 2) the need to work within the vendor specific CAD tools and hardware synthesis. This thesis presents a series of published papers that explore different aspects of a new approach being developed to remove the barriers and enable programmers to compile accelerators on next generation reconfigurable manycore architectures. The approach is entitled Just In Time Assembly (JITA) of hardware accelerators. The approach has been defined to allow hardware accelerators to be built and run through software compilation and run time interpretation outside of CAD tools and without requiring each new accelerator to be synthesized. The approach advocates the use of libraries of pre-synthesized components that can be referenced through symbolic links in a similar fashion to dynamically linked software libraries. Synthesis still must occur but is moved out of the application programmers software flow and into the initial coding process that occurs when programming patterns that define a Domain Specific Language (DSL) are first coded. Programmers see no difference between creating software or hardware functionality when using the DSL. A new run time interpreter is introduced to assemble the individual pre-synthesized hardware accelerators that comprise the accelerator functionality within a configurable tile array of partially reconfigurable slots at run time. Quantitative results are presented that compares utilization, performance, and productivity of the approach to what would be achieved by full custom accelerators created through traditional CAD flows using hardware programming models and passing through synthesis
A Hybrid Partially Reconfigurable Overlay Supporting Just-In-Time Assembly of Custom Accelerators on FPGAs
The state of the art in design and development flows for FPGAs are not sufficiently mature to allow programmers to implement their applications through traditional software development flows. The stipulation of synthesis as well as the requirement of background knowledge on the FPGAs\u27 low-level physical hardware structure are major challenges that prevent programmers from using FPGAs. The reconfigurable computing community is seeking solutions to raise the level of design abstraction at which programmers must operate, and move the synthesis process out of the programmers\u27 path through the use of overlays. A recent approach, Just-In-Time Assembly (JITA), was proposed that enables hardware accelerators to be assembled at runtime, all from within a traditional software compilation flow. The JITA approach presents a promising path to constructing hardware designs on FPGAs using pre-synthesized parallel programming patterns, but suffers from two major limitations. First, all variant programming patterns must be pre-synthesized. Second, conditional operations are not supported.
In this thesis, I present a new reconfigurable overlay, URUK, that overcomes the two limitations imposed by the JITA approach. Similar to the original JITA approach, the proposed URUK overlay allows hardware accelerators to be constructed on FPGAs through software compilation flows. To this basic capability, URUK adds additional support to enable the assembly of presynthesized fine-grained computational operators to be assembled within the FPGA.
This thesis provides analysis of URUK from three different perspectives; utilization, performance, and productivity. The analysis includes comparisons against High-Level Synthesis (HLS) and the state of the art approach to creating static overlays. The tradeoffs conclude that URUK can achieve approximately equivalent performance for algebra operations compared to HLS custom accelerators, which are designed with simple experience on FPGAs. Further, URUK shows a high degree of flexibility for runtime placement and routing of the primitive operations. The analysis shows how this flexibility can be leveraged to reduce communication overhead among tiles, compared to traditional static overlays. The results also show URUK can enable software programmers without any hardware skills to create hardware accelerators at productivity levels consistent with software development and compilation
Actors: The Ideal Abstraction for Programming Kernel-Based Concurrency
GPU and multicore hardware architectures are commonly
used in many different application areas to accelerate problem solutions
relative to single CPU architectures. The typical approach to accessing
these hardware architectures requires embedding logic into the programming
language used to construct the application; the two primary forms
of embedding are: calls to API routines to access the concurrent functionality,
or pragmas providing concurrency hints to a language compiler
such that particular blocks of code are targeted to the concurrent functionality.
The former approach is verbose and semantically bankrupt,
while the success of the latter approach is restricted to simple, static
uses of the functionality.
Actor-based applications are constructed from independent, encapsulated
actors that interact through strongly-typed channels. This paper
presents a first attempt at using actors to program kernels targeted at
such concurrent hardware. Besides the glove-like fit of a kernel to the actor
abstraction, quantitative code analysis shows that actor-based kernels
are always significantly simpler than API-based coding, and generally
simpler than pragma-based coding. Additionally, performance measurements
show that the overheads of actor-based kernels are commensurate
to API-based kernels, and range from equivalent to vastly improved for
pragma-based annotations, both for sample and real-world applications
ASCR/HEP Exascale Requirements Review Report
This draft report summarizes and details the findings, results, and
recommendations derived from the ASCR/HEP Exascale Requirements Review meeting
held in June, 2015. The main conclusions are as follows. 1) Larger, more
capable computing and data facilities are needed to support HEP science goals
in all three frontiers: Energy, Intensity, and Cosmic. The expected scale of
the demand at the 2025 timescale is at least two orders of magnitude -- and in
some cases greater -- than that available currently. 2) The growth rate of data
produced by simulations is overwhelming the current ability, of both facilities
and researchers, to store and analyze it. Additional resources and new
techniques for data analysis are urgently needed. 3) Data rates and volumes
from HEP experimental facilities are also straining the ability to store and
analyze large and complex data volumes. Appropriately configured
leadership-class facilities can play a transformational role in enabling
scientific discovery from these datasets. 4) A close integration of HPC
simulation and data analysis will aid greatly in interpreting results from HEP
experiments. Such an integration will minimize data movement and facilitate
interdependent workflows. 5) Long-range planning between HEP and ASCR will be
required to meet HEP's research needs. To best use ASCR HPC resources the
experimental HEP program needs a) an established long-term plan for access to
ASCR computational and data resources, b) an ability to map workflows onto HPC
resources, c) the ability for ASCR facilities to accommodate workflows run by
collaborations that can have thousands of individual members, d) to transition
codes to the next-generation HPC platforms that will be available at ASCR
facilities, e) to build up and train a workforce capable of developing and
using simulations and analysis to support HEP scientific research on
next-generation systems.Comment: 77 pages, 13 Figures; draft report, subject to further revisio
Concurrent in situ ion irradiation transmission electron microscope
AbstractAn in situ ion irradiation transmission electron microscope has been developed and is operational at Sandia National Laboratories. This facility permits high spatial resolution, real time observation of electron transparent samples under ion irradiation, implantation, mechanical loading, corrosive environments, and combinations thereof. This includes the simultaneous implantation of low-energy gas ions (0.8ā30keV) during high-energy heavy ion irradiation (0.8ā48MeV). Initial results in polycrystalline gold foils are provided to demonstrate the range of capabilities
Managing Research Data in Big Science
The project which led to this report was funded by JISC in 2010--2011 as part of its 'Managing Research Data' programme, to examine the way in which Big Science data is managed, and produce any recommendations which may be appropriate. Big science data is different: it comes in large volumes, and it is shared and exploited in ways which may differ from other disciplines. This project has explored these differences using as a case-study Gravitational Wave data generated by the LSC, and has produced recommendations intended to be useful variously to JISC, the funding council (STFC) and the LSC community. In Sect. 1 we define what we mean by 'big science', describe the overall data culture there, laying stress on how it necessarily or contingently differs from other disciplines. In Sect. 2 we discuss the benefits of a formal data-preservation strategy, and the cases for open data and for well-preserved data that follow from that. This leads to our recommendations that, in essence, funders should adopt rather light-touch prescriptions regarding data preservation planning: normal data management practice, in the areas under study, corresponds to notably good practice in most other areas, so that the only change we suggest is to make this planning more formal, which makes it more easily auditable, and more amenable to constructive criticism. In Sect. 3 we briefly discuss the LIGO data management plan, and pull together whatever information is available on the estimation of digital preservation costs. The report is informed, throughout, by the OAIS reference model for an open archive
The Desktop Muon Detector: A simple, physics-motivated machine- and electronics-shop project for university students
This paper describes an undergraduate-level physics project that incorporates
various aspects of machine- and electronics-shop technical development. The
desktop muon detector is a self-contained apparatus that employs plastic
scintillator as a detection medium and a silicon photomultiplier for light
collection. These detectors can be used in conjunction with the provided
software to make interesting physics measurements. The total cost of each
counter is approximately $100.Comment: 29 pages, 14 figure
Managing Research Data: Gravitational Waves
The project which led to this report was funded by JISC in 2010ā2011 as part of its
āManaging Research Dataā programme, to examine the way in which Big Science data
is managed, and produce any recommendations which may be appropriate.
Big science data is different: it comes in large volumes, and it is shared and
exploited in ways which may differ from other disciplines. This project has explored
these differences using as a case-study Gravitational Wave data generated by the LSC,
and has produced recommendations intended to be useful variously to JISC, the funding
council (STFC) and the LSC community.
In Sect. 1 we deļ¬ne what we mean by ābig scienceā, describe the overall data
culture there, laying stress on how it necessarily or contingently differs from other
disciplines.
In Sect. 2 we discuss the beneļ¬ts of a formal data-preservation strategy, and the
cases for open data and for well-preserved data that follow from that. This leads to our
recommendations that, in essence, funders should adopt rather light-touch prescriptions
regarding data preservation planning: normal data management practice, in the areas
under study, corresponds to notably good practice in most other areas, so that the only
change we suggest is to make this planning more formal, which makes it more easily
auditable, and more amenable to constructive criticism.
In Sect. 3 we brieļ¬y discuss the LIGO data management plan, and pull together
whatever information is available on the estimation of digital preservation costs.
The report is informed, throughout, by the OAIS reference model for an open
archive. Some of the reportās ļ¬ndings and conclusions were summarised in [1].
See the document history on page 37
- ā¦