4,441 research outputs found
Closing the loop of design and analysis: Parametric modelling tools for early decision support
There is a growing need for parametric design software that communicates building performance feedback in early architectural exploration to support decision-making. This paper examines how the circuit of design and analysis process can be closed to provide active and concurrent feedback between architecture and services engineering domains. It presents the structure for an openly customisable design system that couples parametric modelling and energy analysis software to allow designers to assess the performance of early design iterations quickly. Finally, it discusses how user interactions with the system foster information exchanges that facilitate the sharing of design intelligence across disciplines
A load-sharing architecture for high performance optimistic simulations on multi-core machines
In Parallel Discrete Event Simulation (PDES), the simulation model is partitioned into a set of distinct Logical Processes (LPs) which are allowed to concurrently execute simulation events. In this work we present an innovative approach to load-sharing on multi-core/multiprocessor machines, targeted at the optimistic PDES paradigm, where LPs are speculatively allowed to process simulation events with no preventive verification of causal consistency, and actual consistency violations (if any) are recovered via rollback techniques. In our approach, each simulation kernel instance, in charge of hosting and executing a specific set of LPs, runs a set of worker threads, which can be dynamically activated/deactivated on the basis of a distributed algorithm. The latter relies in turn on an analytical model that provides indications on how to reassign processor/core usage across the kernels in order to handle the simulation workload as efficiently as possible. We also present a real implementation of our load-sharing architecture within the ROme OpTimistic Simulator (ROOT-Sim), namely an open-source C-based simulation platform implemented according to the PDES paradigm and the optimistic synchronization approach. Experimental results for an assessment of the validity of our proposal are presented as well
Methodology for object-oriented real-time systems analysis and design: Software engineering
Successful application of software engineering methodologies requires an integrated analysis and design life-cycle in which the various phases flow smoothly 'seamlessly' from analysis through design to implementation. Furthermore, different analysis methodologies often lead to different structuring of the system so that the transition from analysis to design may be awkward depending on the design methodology to be used. This is especially important when object-oriented programming is to be used for implementation when the original specification and perhaps high-level design is non-object oriented. Two approaches to real-time systems analysis which can lead to an object-oriented design are contrasted: (1) modeling the system using structured analysis with real-time extensions which emphasizes data and control flows followed by the abstraction of objects where the operations or methods of the objects correspond to processes in the data flow diagrams and then design in terms of these objects; and (2) modeling the system from the beginning as a set of naturally occurring concurrent entities (objects) each having its own time-behavior defined by a set of states and state-transition rules and seamlessly transforming the analysis models into high-level design models. A new concept of a 'real-time systems-analysis object' is introduced and becomes the basic building block of a series of seamlessly-connected models which progress from the object-oriented real-time systems analysis and design system analysis logical models through the physical architectural models and the high-level design stages. The methodology is appropriate to the overall specification including hardware and software modules. In software modules, the systems analysis objects are transformed into software objects
Benchmarking Memory Management Capabilities within ROOT-Sim
In parallel discrete event simulation techniques, the simulation model is partitioned into objects, concurrently executing events on different CPUs and/or multiple CPUCores. In such a context, run-time supports for logical time synchronization across the different simulation objects play a central role in determining the effectiveness of the specific parallel simulation environment. In this paper we present an experimental evaluation of the memory management capabilities offered by the ROme OpTimistic Simulator (ROOT-Sim). This is an open source parallel simulation environment transparently supporting optimistic synchronization via recoverability (based on incremental log/restore techniques) of any type of memory operation affecting the state of simulation objects, i.e., memory allocation, deallocation and update operations. The experimental study is based on a synthetic benchmark which mimics different read/write patterns inside the dynamic memory map associated with the state of simulation objects. This allows sensibility analysis of time and space effects due to the memory management subsystem while varying the type and the locality of the accesses associated with event processin
Report from GI-Dagstuhl Seminar 16394: Software Performance Engineering in the DevOps World
This report documents the program and the outcomes of GI-Dagstuhl Seminar
16394 "Software Performance Engineering in the DevOps World".
The seminar addressed the problem of performance-aware DevOps. Both, DevOps
and performance engineering have been growing trends over the past one to two
years, in no small part due to the rise in importance of identifying
performance anomalies in the operations (Ops) of cloud and big data systems and
feeding these back to the development (Dev). However, so far, the research
community has treated software engineering, performance engineering, and cloud
computing mostly as individual research areas. We aimed to identify
cross-community collaboration, and to set the path for long-lasting
collaborations towards performance-aware DevOps.
The main goal of the seminar was to bring together young researchers (PhD
students in a later stage of their PhD, as well as PostDocs or Junior
Professors) in the areas of (i) software engineering, (ii) performance
engineering, and (iii) cloud computing and big data to present their current
research projects, to exchange experience and expertise, to discuss research
challenges, and to develop ideas for future collaborations
Astrophysical Data Analytics based on Neural Gas Models, using the Classification of Globular Clusters as Playground
In Astrophysics, the identification of candidate Globular Clusters through
deep, wide-field, single band HST images, is a typical data analytics problem,
where methods based on Machine Learning have revealed a high efficiency and
reliability, demonstrating the capability to improve the traditional
approaches. Here we experimented some variants of the known Neural Gas model,
exploring both supervised and unsupervised paradigms of Machine Learning, on
the classification of Globular Clusters, extracted from the NGC1399 HST data.
Main focus of this work was to use a well-tested playground to scientifically
validate such kind of models for further extended experiments in astrophysics
and using other standard Machine Learning methods (for instance Random Forest
and Multi Layer Perceptron neural network) for a comparison of performances in
terms of purity and completeness.Comment: Proceedings of the XIX International Conference "Data Analytics and
Management in Data Intensive Domains" (DAMDID/RCDL 2017), Moscow, Russia,
October 10-13, 2017, 8 pages, 4 figure
Mixing multi-core CPUs and GPUs for scientific simulation software
Recent technological and economic developments have led to widespread availability of
multi-core CPUs and specialist accelerator processors such as graphical processing units
(GPUs). The accelerated computational performance possible from these devices can be very
high for some applications paradigms. Software languages and systems such as NVIDIA's
CUDA and Khronos consortium's open compute language (OpenCL) support a number of
individual parallel application programming paradigms. To scale up the performance of some
complex systems simulations, a hybrid of multi-core CPUs for coarse-grained parallelism and
very many core GPUs for data parallelism is necessary. We describe our use of hybrid applica-
tions using threading approaches and multi-core CPUs to control independent GPU devices.
We present speed-up data and discuss multi-threading software issues for the applications
level programmer and o er some suggested areas for language development and integration
between coarse-grained and ne-grained multi-thread systems. We discuss results from three
common simulation algorithmic areas including: partial di erential equations; graph cluster
metric calculations and random number generation. We report on programming experiences
and selected performance for these algorithms on: single and multiple GPUs; multi-core CPUs;
a CellBE; and using OpenCL. We discuss programmer usability issues and the outlook and
trends in multi-core programming for scienti c applications developers
Load sharing for optimistic parallel simulations on multicore machines
Parallel Discrete Event Simulation (PDES) is based on the partitioning of the simulation model into distinct Logical Processes (LPs), each one modeling a portion of the entire system, which are allowed to execute simulation events concurrently. This allows exploiting parallel computing architectures to speedup model execution, and to make very large models tractable. In this article we cope with the optimistic approach to PDES, where LPs are allowed to concurrently process their events in a speculative fashion, and rollback/ recovery techniques are used to guarantee state consistency in case of causality violations along the speculative execution path. Particularly, we present an innovative load sharing approach targeted at optimizing resource usage for fruitful simulation work when running an optimistic PDES environment on top of multi-processor/multi-core machines. Beyond providing the load sharing model, we also define a load sharing oriented architectural scheme, based on a symmetric multi-threaded organization of the simulation platform. Finally, we present a real implementation of the load sharing architecture within the open source ROme OpTimistic Simulator (ROOT-Sim) package. Experimental data for an assessment of both viability and effectiveness of our proposal are presented as well. Copyright is held by author/owner(s)
Extensible Component Based Architecture for FLASH, A Massively Parallel, Multiphysics Simulation Code
FLASH is a publicly available high performance application code which has
evolved into a modular, extensible software system from a collection of
unconnected legacy codes. FLASH has been successful because its capabilities
have been driven by the needs of scientific applications, without compromising
maintainability, performance, and usability. In its newest incarnation, FLASH3
consists of inter-operable modules that can be combined to generate different
applications. The FLASH architecture allows arbitrarily many alternative
implementations of its components to co-exist and interchange with each other,
resulting in greater flexibility. Further, a simple and elegant mechanism
exists for customization of code functionality without the need to modify the
core implementation of the source. A built-in unit test framework providing
verifiability, combined with a rigorous software maintenance process, allow the
code to operate simultaneously in the dual mode of production and development.
In this paper we describe the FLASH3 architecture, with emphasis on solutions
to the more challenging conflicts arising from solver complexity, portable
performance requirements, and legacy codes. We also include results from user
surveys conducted in 2005 and 2007, which highlight the success of the code.Comment: 33 pages, 7 figures; revised paper submitted to Parallel Computin
- …