High-order incompressible computational fluid dynamics on modern hardware architectures
In this thesis, a high-order incompressible Navier-Stokes solver is developed in the
Python-based PyFR framework. The solver is based on the artificial compressibility
formulation with a Flux Reconstruction (FR) discretisation in space and explicit
dual time stepping in time. In order to reduce time to solution, explicit convergence
acceleration techniques are developed and implemented. These techniques include
polynomial multigrid, a novel locally adaptive pseudo-time stepping approach and
novel stability-optimised Runge-Kutta schemes.
Choices regarding the numerical methods and implementation are motivated as
follows. Firstly, high-order FR is selected as the spatial discretisation due to its low
dissipation and ability to work with unstructured meshes of complex geometries.
Being discontinuous, it also allows the majority of computation to be performed locally.
Secondly, convergence acceleration techniques are restricted to explicit methods in
order to retain the spatial locality provided by FR, which allows efficient harnessing
of the massively parallel compute capability of modern hardware. Thirdly, the solver
is implemented in the PyFR framework with cross-platform support such that it can
run on modern heterogeneous systems via an MPI + X model, with X being CUDA,
OpenCL or OpenMP. As such, it is well-placed to remain relevant in an era of rapidly
evolving hardware architectures.
The new software constitutes the first high-order accurate cross-platform
implementation of an incompressible Navier-Stokes solver via artificial compressibility. The
solver and the convergence acceleration techniques are validated for a range of
turbulent test cases. Furthermore, performance of the convergence acceleration techniques
is assessed with a 2D cylinder test case, showing speed-up factors of over 20 relative
to global RK4 pseudo-time stepping when all of the technologies are combined.
Finally, a simulation of the DARPA SUBOFF submarine model is undertaken using the
solver and all convergence acceleration techniques. Excellent agreement with previous
studies is obtained, demonstrating that the technology can be used to conduct
high-fidelity implicit Large Eddy Simulation of industrially relevant problems at scale
using hundreds of GPUs.
Layout regularity metric as a fast indicator of process variations
Integrated circuit design faces increasing challenges as feature sizes scale down, because sensitivity to process variations grows. Systematic variations induced by different steps in the lithography process affect both the parametric and functional yields of designs. These variations are themselves known to be affected by layout topologies. Design for Manufacturability (DFM) aims at defining techniques that mitigate variations and improve yield. Layout regularity is one of the trending techniques suggested by DFM to mitigate the effect of process variations. There are several solutions for creating regular designs, such as restricted design rules and regular fabrics. These regular solutions raised the need for a regularity metric. Metrics in the literature are insufficient because they are either qualitative or computationally intensive. Furthermore, there is no study relating either lithography or electrical variations to layout regularity. In this work, layout regularity is studied in detail and a new geometry-based layout regularity metric is derived. This metric is verified against lithographic simulations and shows good correlation. Calculating the metric takes only a few minutes on a 1 mm x 1 mm design, which is fast compared to the time taken by simulations. This makes it a good candidate for pre-processing the layout data and selecting areas of interest for lithographic simulation, giving faster throughput. The layout regularity metric is also compared against a model that measures electrical variations due to systematic lithographic variations. The comparison shows that the regularity metric is valid for flagging circuits with high variability according to the developed electrical variations model. The regularity metric results, compared to the electrical variability model results, show a matching percentage of up to 80%, which means that the metric can be used as a fast indicator of designs that are more susceptible to lithography variations and hence electrical variations.
Critical appraisal of product development expertise in Irish SMEs
The focus of this research was the product development expertise of Irish SMEs, in particular SMEs developing physical products (a physical product is defined as an electronic, medical device, plastic or general engineering product). A survey of Irish SMEs was conducted across industry sectors developing physical products, with the objective of understanding how indigenous SMEs, and therefore Ireland, are progressing towards becoming a knowledge economy. SME characteristics (customers and markets, organisational structures, systems, processes and procedures, human and financial resources, culture and behaviour) were researched and used to understand the issues SMEs have with product development (PD research is mostly considered from the perspective of large companies). In relation to product development, the following were all examined: strategy, innovation and learning, strategic techniques, organisational structure, product development process design, types of product development processes, tools and methodologies, technology, intellectual property, change management, marketing and branding, and performance measurement. Survey items (variables) were identified from the literature review and used to create a survey based on ‘best practice’ PD and SME characteristics. The survey was conducted following identified survey best practice in order to increase the response rate, and went through two pre-tests and a pilot before the final study. Descriptive analysis, reliability/consistency analysis and regression analysis were conducted on the constructs of product development. Specific relationships identified in the literature review were examined. The results of this analysis revealed that Irish SMEs are operating in a ‘Knowledge Based Development’ or learning environment. They carry out many of the techniques associated with various tools and methodologies (T&M) but reported no use of these T&M, which could aid their approach. There is a high use of technology, especially CAD, and technology is mostly developed within the product development process. There was a high use of Cross Functional Teams and, in general, strategy and fuzzy front end/voice of the customer activities were carried out well. There were no issues with change management; in relation to intellectual property, the use of an IP policy, strategy and portfolios was low. Generally, Irish SMEs are ready to reach the next stage of company evolution by linking ‘organisational (innovation) processes’
EDA design for Microscale Modular Assembled ASIC (M2A2) circuits
As the semiconductor industry has driven the minimum feature size to well below 50nm, the mask cost to make devices has skyrocketed. The cost for a full set of masks is estimated to be about 2M for 65nm lithography nodes. According to some estimates, mask writing time goes up as a power of five as feature sizes are decreased below 50nm. In addition, the higher complexity of large designs increases the number of design re-spins. These two factors lead to a considerable increase in the nonrecurring engineering (NRE) cost for standard cell ASICs, which has become prohibitively expensive for low to mid volume applications. Field programmable gate arrays (FPGAs) offer an acceptable solution for fast prototyping and ultra-low volume applications, but are generally not seen as a replacement for ASICs because of their highly inefficient space utilization, lower performance/speed and high power consumption. This is particularly the case as mobility has driven expectations for small form factor and low power consumption. In this work, a new type of ASIC, named Microscale Modular Assembled ASIC (M2A2), is proposed. This technology is a novel application of a high-speed, precision assembly technique to the fabrication of ASICs using a limited number of mass-produced feedstock logic circuits. The idea is to share the mask cost for sub-100nm feature sizes across a large number of ASIC designs, decreasing the NRE for individual designs. The concept of constructing ASICs from repeating logic elements is based on previous work showing that ASICs made of via/metal configured structured elements can achieve space utilization and performance comparable to cell based ASICs. However, the proposed technique provides significantly more choice in the transistor layer, in terms of feedstock types and their configuration. This thesis deals with the electronic design automation (EDA) design for microscale modular assembled ASIC based circuits. The document discusses the design of feedstock cells, the generation of the feedstock preplaced design, the generation of design collaterals to support the M2A2 EDA flow, and a front end M2A2 synthesis flow that meets the required functionality of the design and achieves optimal quality of results (QoR) metrics in terms of circuit performance/speed, power and area.
Methodology and Ecosystem for the Design of a Complex Network ASIC
Performance of HPC systems has risen steadily. While the 10 Petaflop/s barrier was breached in 2011, the next large step into the exascale era is expected sometime between 2018 and 2020. The EXTOLL project will be an integral part of this venture. Originally designed as a research project on an FPGA basis, it will make the transition to an ASIC to improve its already excellent performance even further. This transition poses many challenges, which are presented in this thesis. Nowadays, it is not enough to look only at single components in a system. EXTOLL is part of a complex ecosystem that must be optimized as a whole, since everything is tightly interwoven and disregarding some aspects can cause the whole system either to work with limited performance or even to fail.
This thesis examines four different aspects in the design hierarchy and proposes efficient solutions or improvements for each of them. First, it looks at the design implementation and the differences between FPGA and ASIC design. It introduces a methodology to equip all on-chip memory with ECC logic automatically, without the user’s input and in a transparent way, so that the underlying code that uses the memory does not have to be changed. Second, the floorplanning process is analyzed and an iterative solution is worked out based on the physical and logical constraints of the EXTOLL design. In addition, a workflow for collaborative design is presented that allows multiple users to work on the design concurrently. The third part concentrates on the high-speed signal path from the chip to the connector and how it is affected by technological limitations. All constraints are analyzed and a package layout for the EXTOLL chip is proposed that is seen as the optimal solution. The last part develops a cost model for wafer and package level test and raises technological concerns that will affect the testing methodology. In order to run testing internally, it proposes the development of a stand-alone test platform that is able to test packaged EXTOLL chips in every aspect.
A Dynamically Scheduled HLS Flow in MLIR
In High-Level Synthesis (HLS), we consider abstractions that span from software to hardware and target heterogeneous architectures. Managing the complexity this introduces is therefore key to implementing good, maintainable, and extendible HLS compilers. Traditionally, HLS flows have been built on top of software compilation infrastructure such as LLVM, with hardware aspects of the flow existing peripherally to the core of the compiler. Through this work, we aim to show that MLIR, a compiler infrastructure with a focus on domain-specific intermediate representations (IR), is a better infrastructure for HLS compilers. Using MLIR, we define HLS and hardware abstractions as first-class citizens of the compiler, simplifying analysis, transformations, and optimization. To demonstrate this, we present a C-to-RTL, dynamically scheduled HLS flow. We find that our flow generates circuits comparable to those of an equivalent LLVM-based HLS compiler. Notably, we achieve this while lacking key optimization passes typically found in HLS compilers and through the use of an experimental front-end. To this end, we show that significant improvements in the generated RTL are but low-hanging fruit, requiring engineering effort to attain. We believe that our flow is more modular and more extendible than comparable open-source HLS compilers and is thus a good candidate as a basis for future research. Apart from the core HLS flow, we provide MLIR-based tooling for C-to-RTL cosimulation and visual debugging, with the ultimate goal of building an MLIR-based HLS infrastructure that will drive innovation in the field.