Search CORE

5 research outputs found

The Development of Hardware Multi-core Test-bed on Field Programmable Gate Array

Author: Shivashanker Mohan
Publication venue: FIU Digital Commons
Publication date: 24/03/2011
Field of study

The goal of this project is to develop a flexible multi-core hardware test-bed on field programmable gate array (FPGA) that can be used to effectively validate the theoretical research on multi-core computing, especially for the power/thermal aware computing. Based on a commercial FPGA test platform, i.e. Xilinx Virtex5 XUPV5 LX110T, we develop a homogeneous multi-core test-bed with four software cores, each of which can dynamically adjust its performance using software. We also enhance the operating system support for this test platform with the development of hardware and software primitives that are useful in dealing with inter-process communication, synchronization, and scheduling for processes on multiple cores. An application based on matrix addition and multiplication on multi-core is implemented to validate the applicability of the test bed

DigitalCommons@Florida International University

Design and implementation of an array language for computational science on a heterogeneous multicore architecture

Author: Keir Paul
Publication venue
Publication date: 01/01/2012
Field of study

The packing of multiple processor cores onto a single chip has become a mainstream solution to fundamental physical issues relating to the microscopic scales employed in the manufacture of semiconductor components. Multicore architectures provide lower clock speeds per core, while aggregate floating-point capability continues to increase. Heterogeneous multicore chips, such as the Cell Broadband Engine (CBE) and modern graphics chips, also address the related issue of an increasing mismatch between high processor speeds, and huge latency to main memory. Such chips tackle this memory wall by the provision of addressable caches; increased bandwidth to main memory; and fast thread context switching. An associated cost is often reduced functionality of the individual accelerator cores; and the increased complexity involved in their programming. This dissertation investigates the application of a programming language supporting the first-class use of arrays; and capable of automatically parallelising array expressions; to the heterogeneous multicore domain of the CBE, as found in the Sony PlayStation 3 (PS3). The language is a pre-existing and well-documented proper subset of Fortran, known as the ‘F’ programming language. A bespoke compiler, referred to as E , is developed to support this aim, and written in the Haskell programming language. The output of the compiler is in an extended C++ dialect known as Offload C++, which targets the PS3. A significant feature of this language is its use of multiple, statically typed, address spaces. By focusing on generic, polymorphic interfaces for both the generated and hand constructed code, a number of interesting design patterns relating to the memory locality are introduced. A suite of medium-sized (100-700 lines), real-world benchmark programs are used to evaluate the performance, correctness, and scalability of the compiler technology. Absolute speedup values, well in excess of one, are observed for all of the programs. The work ultimately demonstrates that an array language can significantly reduce the effort expended to utilise a parallel heterogeneous multicore architecture, while retaining high performance. A substantial, related advantage in using standard ‘F’ is that any Fortran compiler can create debuggable, and competitively performing serial programs

Glasgow Theses Service

Automated Debugging Methodology for FPGA-based Systems

Author: Khan Habib ul Hasan
Publication venue
Publication date: 30/12/2019
Field of study

Electronic devices make up a vital part of our lives. These are seen from mobiles, laptops, computers, home automation, etc. to name a few. The modern designs constitute billions of transistors. However, with this evolution, ensuring that the devices fulfill the designer’s expectation under variable conditions has also become a great challenge. This requires a lot of design time and effort. Whenever an error is encountered, the process is re-started. Hence, it is desired to minimize the number of spins required to achieve an error-free product, as each spin results in loss of time and effort. Software-based simulation systems present the main technique to ensure the verification of the design before fabrication. However, few design errors (bugs) are likely to escape the simulation process. Such bugs subsequently appear during the post-silicon phase. Finding such bugs is time-consuming due to inherent invisibility of the hardware. Instead of software simulation of the design in the pre-silicon phase, post-silicon techniques permit the designers to verify the functionality through the physical implementations of the design. The main benefit of the methodology is that the implemented design in the post-silicon phase runs many order-of-magnitude faster than its counterpart in pre-silicon. This allows the designers to validate their design more exhaustively. This thesis presents five main contributions to enable a fast and automated debugging solution for reconfigurable hardware. During the research work, we used an obstacle avoidance system for robotic vehicles as a use case to illustrate how to apply the proposed debugging solution in practical environments. The first contribution presents a debugging system capable of providing a lossless trace of debugging data which permits a cycle-accurate replay. This methodology ensures capturing permanent as well as intermittent errors in the implemented design. The contribution also describes a solution to enhance hardware observability. It is proposed to utilize processor-configurable concentration networks, employ debug data compression to transmit the data more efficiently, and partially reconfiguring the debugging system at run-time to save the time required for design re-compilation as well as preserve the timing closure. The second contribution presents a solution for communication-centric designs. Furthermore, solutions for designs with multi-clock domains are also discussed. The third contribution presents a priority-based signal selection methodology to identify the signals which can be more helpful during the debugging process. A connectivity generation tool is also presented which can map the identified signals to the debugging system. The fourth contribution presents an automated error detection solution which can help in capturing the permanent as well as intermittent errors without continuous monitoring of debugging data. The proposed solution works for designs even in the absence of golden reference. The fifth contribution proposes to use artificial intelligence for post-silicon debugging. We presented a novel idea of using a recurrent neural network for debugging when a golden reference is present for training the network. Furthermore, the idea was also extended to designs where golden reference is not present

Qucosa

HSSS - Hochschulschriftenserver der SLUB

Technische Universität Dresden: Qucosa

Android Application Development for the Intel Platform

Author: Cohen Ryan
Wang Tao
Publication venue: 'Springer Science and Business Media LLC'
Publication date
Field of study

Computer scienc

OAPEN Library

Enabling modeling framework with surrogate modeling capabilities and complex networks

Author: Serafin Francesco
Publication venue: University of Trento
Publication date: 10/05/2019
Field of study

Conceptual and physically based environmental simulation models as products of research environments efforts became complex software over time in order to allow describing the behaviour of natural phenomena more accurately. Results from these models are considered accurate but often require to operate an entire system of modeling resources with dedicated knowledge, an extensive set up, and sometimes significant computational time. Model complexity limits wide model adaptation among consultants because of lower available technical resources and capabilities. However, models should be ubiquitous to use in both research and consulting environments. This dissertation aims to address and alleviate two aspects of research model complexity: 1) for researchers, the model design complexity with respect to its internal software structure and 2) for consultants, the model application complexity with respect to data and parameter setup, runtime requirements, and proper model infrastructure setup. The first contribution provides modeling design and implementation support by managing interacting modeling solutions as “Directed Acyclic Graph”, while the second one helps to create surrogate models of complex physical models as a streamlined process. Both contributions are implemented within the OMS/CSIP modeling framework and infrastructure and were applied in various studies. First, a machine learning (ML)-based surrogate model approach is presented to respond to field application requirements to get quick but “accurate enough” model results with limited input and limited a-priori knowledge of the internal physical processes involved. The surrogate model aims to capture the behaviour of a physical model as an ensemble system of artificial neural networks (ANN). Here, the NeuroEvolution of Augmenting Topology (NEAT) technique has been leveraged because of its integration of a genetic approach to build and evolve its ANNs during supervised training. Throughout this phase, the thorough design of the services facilitate seamless monitoring of structural mutations of the artificial neural network and its performances with respect to behavioural emulation of the original model response. This results in a streamlined surrogate model generation. Furthermore, the stochasticity inherent to the evolutionary genetic algorithm combined with a specially designed cross-validation approach allows for straightforward use of the ensemble application. Several, slightly different artificial neural networks are concurrently trained. The ensemble system is built upon the selection of the utmost performant surrogate models and is used collectively to provide uncertainty quantified results when applied against new data. Secondly, a Directed Acyclic Graph (DAG) modeling structure NET3 was developed. NET3 provides appropriate data structures to represent modeling states interactions as relationships based on network topologies. The inherent structure of the DAG commands the execution of modeling tasks. NET3 implicitly manages the parallel computation depending on the network topology. A node of a NET3 modeling structure encapsulates any sort of modeling solution such as a system of ordinary differential equations, a set of statistical rules, or a system of partial differential equations. Each link connects these modeling solutions by handling their data flow. As a result, NET3 simplifies 1) the translation of physical mathematical concepts into model components, and 2) the management of complex interactions of modeling solutions. NET3 also pushes forward the idea of separating concerns between software architecture and scientific model codebase. It manages aspects that relate to the architectural design of the graph modeling structure and lets research scientist focus on their model’s domain. NET3 improves encapsulation and reusability of scientific/mathematical concepts. It avoids code duplication by allowing the same modeling solution to be adopted in different nodes and finely adapted to specific requirements. In summary, NET3 enables a new level of modeling flexibility by allowing to quickly change model representations to explore new modeling solutions. The two presented contributions were integrated into the Object Modeling System/Cloud Services Integrated Platform (OMS/CSIP) environmental modeling framework (EMF). EMFs are standard practice in environmental modeling because they represent a software solution of separating the burden of software architectural design management from scientific research. Here, OMS/CSIP has been identified “advanced” in terms of EMFs design. It offers high flexibility, low language invasiveness, fine and thorough architectural design, and innovative cloud computing deployment infrastructure. These aspects make OMS/CSIP infrastructure the suitable platform to host NEAT based surrogate modeling and NET3 extensions. Framework-enabled NEAT based Surrogate modeling (FeNS) results from the full integration of NEAT based surrogate modeling approach with OMS/CSIP platform. Here, the surrogate model approach was developed as CSIP services to help transitioning from research models to “field models” by enabling the modeling framework to interact with CSIP services, ML libraries, and a NoSQL database to emerge model surrogates for a(ny) modelling solution. OMS/CSIP was extended to harvest data from each model run and automatically derive the surrogate model at the modeling framework level. NET3 extends OMS modeling simulations to run as a graph network of interconnected modeling solutions. Furthermore, it enhances available OMS calibration algorithms to become multi-site calibration procedures. OMS already provided implicit parallel computation of independent components in a modeling solution. NET3 now adds a further layer of implicit parallelism by concurrently running independent modeling solutions. Two studies were carried out to develop and test FeSN while three applications supported the development and testing of NET3. Surrogate models of the Revised Universal Soil Loss Equation, Version 2 (R2) were generated to scale up from simple test cases with a constrained input space to more generic applications including a larger variety of input parameters. The main goal of the surrogate model was to streamline and simplify access to the R2 model behaviour. We performed sensitivity analysis of R2 to limit the input space to only relevant parameters (e.g. soil properties, climate parameter, field geometries, crop rotation description). The main study area was the State of Iowa starting from a single county (Clay county) ending up to four counties (Buena Vista, Cherokee, Clay, and Wright). Clustering methodologies were applied to improve surrogate model accuracy and to accelerate the training process by reducing the dataset size. The overall “goodness-of-fit” against the testing dataset estimated on the median of the uncertainty quantified result of the surrogate models ensemble was always above 0.95 Nash-Sutcliffe (NS), root mean squared error (RMSE) between 0.13 and 0.36, and bias between -0.07 and 0.02. In many cases, accuracy of the surrogate model with respect to testing dataset was above 0.98 NS. Surrogate models of the AgroEcoSystem (AgES) were generated to apply and test FeNS methodology to a semi-distributed hydrologic model. The main goal of the surrogate model was to streamline and simplify access to the AgES model behaviour. Only relevant lumped parameters on watershed centroid were used to train the surrogate models and limit the input space to only relevant parameters (e.g. precipitation, groundwater level, LAI, and potential evapotranspiration). The main study area was the South Fork Iowa River (SFIR) watershed in the State of Iowa across Wright, Franklin, Hamilton, and Hardin counties. The overall “goodness-of-fit” against the testing dataset estimated on the median of the uncertainty quantified result of the surrogate models ensemble was above 0.97 Nash-Sutcliffe (NS), root mean squared error (RMSE) of 2.24, and bias of -0.0794. With respect to NET3, the first application is the real-time modeling of flood forecasting through GEOframe system for the Civil Protection of Regione Basilicata implemented by PhD Bancheri. To scale the computation and finely tune calibration parameters, the Basilicata river basins were split into subcatchments where each was represented by a different NET3 node. The second application was part of Mr. Dalla Torre’s master thesis where the computational core of the rainfall-runoff model of Storm Water Management Model (SWMM by EPA) was componentized. NET3 now allows for reimplementing a concise and lightweight SWMM modeling core and highly parallel model runs. Software architectural design of rainfall-runoff, routing and sewer pipe design components targeted separation of concerns, single responsibility, and encapsulation principles. It resulted in clean and minimized code base. NET3 manages component connections and scalable computation by hosting rainfall-runoff modeling solution into separated nodes from routing and sewer pipe design modeling solution. It also enables each node of the modeling structure to 1) access a shared data structure to fetch input data from and push results to (SWMMobject), and 2) internally analyze the upstream subtree in order to adjust sewer pipe design parameters. The third test case is the application of a “system of systems” of urban models where each node of the graph modeling structure encapsulates a single responsibility system of models. Because of the stochasticity involved in each system of models, the entire graph modeling solution was required to run several times and generate independent realizations. Hence, NET3 was enabled to run a “graph of graphs” modeling structure

Unitn-eprints PhD