9 research outputs found
On Timing Model Extraction and Hierarchical Statistical Timing Analysis
In this paper, we investigate the challenges to apply Statistical Static
Timing Analysis (SSTA) in hierarchical design flow, where modules supplied by
IP vendors are used to hide design details for IP protection and to reduce the
complexity of design and verification. For the three basic circuit types,
combinational, flip-flop-based and latch-controlled, we propose methods to
extract timing models which contain interfacing as well as compressed internal
constraints. Using these compact timing models the runtime of full-chip timing
analysis can be reduced, while circuit details from IP vendors are not exposed.
We also propose a method to reconstruct the correlation between modules during
full-chip timing analysis. This correlation can not be incorporated into timing
models because it depends on the layout of the corresponding modules in the
chip. In addition, we investigate how to apply the extracted timing models with
the reconstructed correlation to evaluate the performance of the complete
design. Experiments demonstrate that using the extracted timing models and
reconstructed correlation full-chip timing analysis can be several times faster
than applying the flattened circuit directly, while the accuracy of statistical
timing analysis is still well maintained
Rapid SoC Design: On Architectures, Methodologies and Frameworks
Modern applications like machine learning, autonomous vehicles, and 5G networking require an order of magnitude boost in processing capability. For several decades, chip designers have relied on Moore’s Law - the doubling of transistor count every two years to deliver improved performance, higher energy efficiency, and an increase in transistor density. With the end of Dennard’s scaling and a slowdown in Moore’s Law, system architects have developed several techniques to deliver on the traditional performance and power improvements we have come to expect. More recently, chip designers have turned towards heterogeneous systems comprised of more specialized processing units to buttress the traditional processing units. These specialized units improve the overall performance, power, and area (PPA) metrics across a wide variety of workloads and applications. While the GPU serves as a classical example, accelerators for machine learning, approximate computing, graph processing, and database applications have become commonplace. This has led to an exponential growth in the variety (and count) of these compute units found in modern embedded and high-performance computing platforms.
The various techniques adopted to combat the slowing of Moore’s Law directly translates to an increase in complexity for modern system-on-chips (SoCs). This increase in complexity in turn leads to an increase in design effort and validation time for hardware and the accompanying software stacks. This is further aggravated by fabrication challenges (photo-lithography, tooling, and yield) faced at advanced technology nodes (below 28nm). The inherent complexity in modern SoCs translates into increased costs and time-to-market delays. This holds true across the spectrum, from mobile/handheld processors to high-performance data-center appliances.
This dissertation presents several techniques to address the challenges of rapidly birthing complex SoCs. The first part of this dissertation focuses on foundations and architectures that aid in rapid SoC design. It presents a variety of architectural techniques that were developed and leveraged to rapidly construct complex SoCs at advanced process nodes. The next part of the dissertation focuses on the gap between a completed design model (in RTL form) and its physical manifestation (a GDS file that will be sent to the foundry for fabrication). It presents methodologies and a workflow for rapidly walking a design through to completion at arbitrary technology nodes. It also presents progress on creating tools and a flow that is entirely dependent on open-source tools. The last part presents a framework that not only speeds up the integration of a hardware accelerator into an SoC ecosystem, but emphasizes software adoption and usability.PHDElectrical and Computer EngineeringUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttp://deepblue.lib.umich.edu/bitstream/2027.42/168119/1/ajayi_1.pd
First International Conference on Ada (R) Programming Language Applications for the NASA Space Station, volume 2
Topics discussed include: reusability; mission critical issues; run time; expert systems; language issues; life cycle issues; software tools; and computers for Ada
Recommended from our members
Laboratory Directed Research and Development Program FY 2004 Annual Report
The Oak Ridge National Laboratory (ORNL) Laboratory Directed Research and Development (LDRD) Program reports its status to the U.S. Department of Energy (DOE) in March of each year. The program operates under the authority of DOE Order 413.2A, 'Laboratory Directed Research and Development' (January 8, 2001), which establishes DOE's requirements for the program while providing the Laboratory Director broad flexibility for program implementation. LDRD funds are obtained through a charge to all Laboratory programs. This report describes all ORNL LDRD research activities supported during FY 2004 and includes final reports for completed projects and shorter progress reports for projects that were active, but not completed, during this period. The FY 2004 ORNL LDRD Self-Assessment (ORNL/PPA-2005/2) provides financial data about the FY 2004 projects and an internal evaluation of the program's management process. ORNL is a DOE multiprogram science, technology, and energy laboratory with distinctive capabilities in materials science and engineering, neutron science and technology, energy production and end-use technologies, biological and environmental science, and scientific computing. With these capabilities ORNL conducts basic and applied research and development (R&D) to support DOE's overarching national security mission, which encompasses science, energy resources, environmental quality, and national nuclear security. As a national resource, the Laboratory also applies its capabilities and skills to the specific needs of other federal agencies and customers through the DOE Work For Others (WFO) program. Information about the Laboratory and its programs is available on the Internet at <http://www.ornl.gov/>. LDRD is a relatively small but vital DOE program that allows ORNL, as well as other multiprogram DOE laboratories, to select a limited number of R&D projects for the purpose of: (1) maintaining the scientific and technical vitality of the Laboratory; (2) enhancing the Laboratory's ability to address future DOE missions; (3) fostering creativity and stimulating exploration of forefront science and technology; (4) serving as a proving ground for new research; and (5) supporting high-risk, potentially high-value R&D. Through LDRD the Laboratory is able to improve its distinctive capabilities and enhance its ability to conduct cutting-edge R&D for its DOE and WFO sponsors. To meet the LDRD objectives and fulfill the particular needs of the Laboratory, ORNL has established a program with two components: the Director's R&D Fund and the Seed Money Fund. As outlined in Table 1, these two funds are complementary. The Director's R&D Fund develops new capabilities in support of the Laboratory initiatives, while the Seed Money Fund is open to all innovative ideas that have the potential for enhancing the Laboratory's core scientific and technical competencies. Provision for multiple routes of access to ORNL LDRD funds maximizes the likelihood that novel and seminal ideas with scientific and technological merit will be recognized and supported
Online learning on the programmable dataplane
This thesis makes the case for managing computer networks with datadriven methods automated statistical inference and control based on measurement data and runtime observations—and argues for their tight integration with programmable dataplane hardware to make management decisions faster and from more precise data. Optimisation, defence, and measurement of networked infrastructure are each challenging tasks in their own right, which are currently dominated by the use of hand-crafted heuristic methods. These become harder to reason about and deploy as networks scale in rates and number of forwarding elements, but their design requires expert knowledge and care around unexpected protocol interactions. This makes tailored, per-deployment or -workload solutions infeasible to develop. Recent advances in machine learning offer capable function approximation and closed-loop control which suit many of these tasks. New, programmable dataplane hardware enables more agility in the network— runtime reprogrammability, precise traffic measurement, and low latency on-path processing. The synthesis of these two developments allows complex decisions to be made on previously unusable state, and made quicker by offloading inference to the network.
To justify this argument, I advance the state of the art in data-driven defence of networks, novel dataplane-friendly online reinforcement learning algorithms, and in-network data reduction to allow classification of switchscale data. Each requires co-design aware of the network, and of the failure modes of systems and carried traffic. To make online learning possible in the dataplane, I use fixed-point arithmetic and modify classical (non-neural) approaches to take advantage of the SmartNIC compute model and make use of rich device local state. I show that data-driven solutions still require great care to correctly design, but with the right domain expertise they can improve on pathological cases in DDoS defence, such as protecting legitimate UDP traffic. In-network aggregation to histograms is shown to enable accurate classification from fine temporal effects, and allows hosts to scale such classification to far larger flow counts and traffic volume. Moving reinforcement learning to the dataplane is shown to offer substantial benefits to stateaction latency and online learning throughput versus host machines; allowing policies to react faster to fine-grained network events. The dataplane environment is key in making reactive online learning feasible—to port further algorithms and learnt functions, I collate and analyse the strengths of current and future hardware designs, as well as individual algorithms