Building Near-Real-Time Processing Pipelines with the Spark-MPI Platform
Advances in detectors and computational technologies provide new
opportunities for applied research and the fundamental sciences. Concurrently,
dramatic increases in the three Vs (Volume, Velocity, and Variety) of
experimental data and the scale of computational tasks have created a demand for
new real-time processing systems at experimental facilities. Recently, this
demand was addressed by the Spark-MPI approach, which connects the Spark
data-intensive platform with the MPI high-performance framework. In contrast
with existing data management and analytics systems, Spark introduced a new
middleware based on resilient distributed datasets (RDDs), which decoupled
various data sources from high-level processing algorithms. The RDD middleware
significantly advanced the scope of data-intensive applications, spreading from
SQL queries to machine learning to graph processing. Spark-MPI further extended
the Spark ecosystem with MPI applications via the Process Management
Interface. The paper explores this integrated platform within the context of
online ptychographic and tomographic reconstruction pipelines.
Comment: New York Scientific Data Summit, August 6-9, 201
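The RDD idea described above, a lazy dataset abstraction that decouples the data source from the processing algorithm, can be sketched in plain Python. The `LazyDataset` class below is a toy illustration of the concept only, not Spark's actual API:

```python
# Toy sketch of the RDD concept: transformations are recorded lazily
# against an abstract source; nothing executes until collect().
class LazyDataset:
    def __init__(self, source, ops=None):
        self._source = source        # any iterable data source
        self._ops = ops or []        # deferred (kind, fn) pairs

    def map(self, fn):
        return LazyDataset(self._source, self._ops + [("map", fn)])

    def filter(self, fn):
        return LazyDataset(self._source, self._ops + [("filter", fn)])

    def collect(self):
        out = list(self._source)
        for kind, fn in self._ops:
            if kind == "map":
                out = [fn(x) for x in out]
            else:
                out = [x for x in out if fn(x)]
        return out

# The same pipeline definition works unchanged over any source: a file,
# a sensor stream, or detector frames in an online reconstruction run.
frames = LazyDataset(range(10))
result = frames.filter(lambda x: x % 2 == 0).map(lambda x: x * x).collect()
# result == [0, 4, 16, 36, 64]
```

Because the pipeline is only a description until `collect()` runs, the same high-level algorithm can be pointed at different data sources, which is the decoupling the abstract refers to.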
BIM-to-BRICK: Using graph modeling for IoT/BMS and spatial semantic data interoperability within digital data models of buildings
The holistic management of a building requires data from heterogeneous
sources such as building management systems (BMS), Internet-of-Things (IoT)
sensor networks, and building information models. Data interoperability is a
key component to eliminate silos of information, and using semantic web
technologies like the BRICK schema, an effort to standardize semantic
descriptions of the physical, logical, and virtual assets in buildings and the
relationships between them, is a suitable approach. However, current data
integration processes can involve significant manual interventions. This paper
presents a methodology to automatically collect, assemble, and integrate
information from a building information model to a knowledge graph. The
resulting application, called BIM-to-BRICK, is run on the SDE4 building located
in Singapore. BIM-to-BRICK generated a bidirectional link between a BIM model
of 932 instances and experimental data collected for 17 subjects into 458 BRICK
objects and 1219 relationships in 17 seconds. The automation of this approach
can be compared to traditional manual mapping of data types. This scientific
innovation incentivizes the convergence of disparate data types and structures
in built-environment applications.
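The triple-based graph modeling that BRICK relies on can be illustrated with plain (subject, predicate, object) tuples. The asset and relationship names below are illustrative stand-ins and may not match the exact BRICK class names:

```python
# Minimal knowledge-graph sketch: BRICK-style triples linking building
# assets, without an RDF library. Names are illustrative only.
triples = set()

def add(subject, predicate, obj):
    triples.add((subject, predicate, obj))

# A few assets as they might be extracted from a BIM model.
add("AHU-1", "a", "brick:Air_Handling_Unit")
add("VAV-2A", "a", "brick:VAV")
add("Room-201", "a", "brick:Room")
add("TS-7", "a", "brick:Temperature_Sensor")
add("AHU-1", "brick:feeds", "VAV-2A")
add("VAV-2A", "brick:feeds", "Room-201")
add("TS-7", "brick:isPointOf", "Room-201")

def query(predicate):
    """Return all (subject, object) pairs linked by a predicate."""
    return sorted((s, o) for s, p, o in triples if p == predicate)

feeds = query("brick:feeds")
# feeds == [("AHU-1", "VAV-2A"), ("VAV-2A", "Room-201")]
```

An automated BIM-to-BRICK mapping would generate such triples from the BIM model's element and relationship tables rather than by hand.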
A Taxonomy of Workflow Management Systems for Grid Computing
With the advent of Grid and application technologies, scientists and
engineers are building more and more complex applications to manage and process
large data sets, and execute scientific experiments on distributed resources.
Such application scenarios require means for composing and executing complex
workflows. Therefore, many efforts have been made towards the development of
workflow management systems for Grid computing. In this paper, we propose a
taxonomy that characterizes and classifies various approaches for building and
executing workflows on Grids. We also survey several representative Grid
workflow systems developed by various projects world-wide to demonstrate the
comprehensiveness of the taxonomy. The taxonomy not only highlights the design
and engineering similarities and differences of state-of-the-art in Grid
workflow systems, but also identifies the areas that need further research.
Comment: 29 pages, 15 figures
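The core capability shared by the surveyed systems, executing a workflow whose tasks have dependencies, can be sketched as topological scheduling of a task DAG. The task names and the scheduler below are illustrative, not any particular Grid system's implementation:

```python
# Minimal workflow scheduler: order tasks so every task runs only after
# its prerequisites (Kahn's algorithm over a {task: [prereqs]} mapping).
def topological_order(deps):
    deps = {t: set(p) for t, p in deps.items()}
    order = []
    ready = sorted(t for t, p in deps.items() if not p)
    while ready:
        task = ready.pop(0)
        order.append(task)
        for t, p in deps.items():
            if task in p:
                p.remove(task)
                if not p and t not in order and t not in ready:
                    ready.append(t)
                    ready.sort()
    return order

# Hypothetical workflow: stage data in, run two analyses in parallel,
# then merge their results.
workflow = {
    "stage_in": [],
    "analyze_a": ["stage_in"],
    "analyze_b": ["stage_in"],
    "merge": ["analyze_a", "analyze_b"],
}
schedule = topological_order(workflow)
# schedule == ["stage_in", "analyze_a", "analyze_b", "merge"]
```

Real Grid workflow systems layer resource brokering, data movement, and fault tolerance on top of this basic dependency-ordering step.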
Fuzzy investment decision support for brownfield redevelopment
This dissertation focuses on decision making, investing, and brownfield redevelopment, in particular on the analysis, evaluation, and selection of previously used real estate suitable for commercial use. The objective is to design a method that facilitates the decision-making process when there are many possible alternatives and a large number of relevant parameters influencing the final decision. The proposed method is based on fuzzy logic, modeling, statistical analysis, cluster analysis, graph theory, and sophisticated methods of information collection and processing. The new method allows decision makers to process a much larger amount of information and to evaluate possible investment alternatives more efficiently. As a result, the set of the most suitable alternative investments is narrowed down according to a hierarchy of parameters specified by the investor.
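A minimal sketch of the kind of fuzzy evaluation the dissertation describes, assuming triangular membership functions and investor-chosen weights. The parameters, membership shapes, and scoring rule are illustrative, not the dissertation's actual method:

```python
# Fuzzy scoring of brownfield alternatives: fuzzify raw parameters into
# [0, 1] suitability degrees, then rank by a weighted average.
def triangular(x, a, b, c):
    """Triangular membership: 0 at or beyond a and c, 1 at the peak b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)

def score(site, weights):
    """Weighted average of fuzzy suitability over all parameters."""
    total = sum(weights.values())
    return sum(w * site[p] for p, w in weights.items()) / total

# Two hypothetical sites; higher membership means more suitable.
sites = {
    "site_A": {"access": triangular(8, 0, 10, 20),        # km to centre
               "contamination": triangular(2, 0, 0, 10)},  # cleanup index
    "site_B": {"access": triangular(15, 0, 10, 20),
               "contamination": triangular(7, 0, 0, 10)},
}
weights = {"access": 2.0, "contamination": 3.0}  # investor's hierarchy
ranking = sorted(sites, key=lambda s: score(sites[s], weights), reverse=True)
# ranking == ["site_A", "site_B"]
```

The investor's parameter hierarchy enters through the weights; adding more parameters or alternatives only extends the dictionaries.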
Metadata And Data Management In High Performance File And Storage Systems
With the advent of emerging e-Science applications, today's scientific research increasingly relies on petascale-and-beyond computing over large data sets of the same magnitude. While the computational power of supercomputers has recently entered the petascale era, the performance of their storage systems lags behind by many orders of magnitude. This places an imperative demand on revolutionizing their underlying I/O systems, in which the management of both metadata and data has significant performance implications. Prefetching/caching and data-locality-aware optimizations, as conventional and effective management techniques for metadata and data I/O performance enhancement, still play crucial roles in current parallel and distributed file systems. In this study, we examine the limitations of existing prefetching/caching techniques and explore the untapped potential of data locality optimization techniques in the new era of petascale computing. For metadata I/O access, we propose a novel weighted-graph-based prefetching technique, built on both direct and indirect successor relationships, to reap performance benefits from prefetching specifically for clustered metadata servers, an arrangement envisioned as necessary for petabyte-scale distributed storage systems. For data I/O access, we design and implement Segment-structured On-disk data Grouping and Prefetching (SOGP), a combined prefetching and data placement technique that boosts local data read performance for parallel file systems, especially for applications with partially overlapped access patterns. One high-performance local I/O software package from the SOGP work for the Parallel Virtual File System, comprising about 2,000 lines of C, was released to Argonne National Laboratory in 2007 for potential integration into production mode.
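The weighted-graph-based prefetching idea, with edges weighted by how often one metadata object follows another either directly or within a short window, can be sketched as follows. The window size and weighting scheme are illustrative choices, not the study's actual parameters:

```python
# Successor-graph prefetch predictor: record metadata accesses, weight
# edges by direct (full weight) and indirect (discounted) succession,
# then prefetch the highest-weight successors of the current access.
from collections import defaultdict

class SuccessorGraph:
    def __init__(self, window=2):
        self.window = window
        self.edges = defaultdict(lambda: defaultdict(float))
        self.history = []

    def record(self, item):
        # Nearest predecessor gets weight 1, farther ones 1/distance.
        for dist, prev in enumerate(reversed(self.history[-self.window:]), 1):
            self.edges[prev][item] += 1.0 / dist
        self.history.append(item)

    def prefetch(self, item, k=2):
        """Top-k predicted successors of `item` by accumulated weight."""
        succ = self.edges[item]
        return sorted(succ, key=succ.get, reverse=True)[:k]

g = SuccessorGraph()
for name in ["a", "b", "c", "a", "b", "d", "a", "b", "c"]:
    g.record(name)
prediction = g.prefetch("a", k=1)
# prediction == ["b"]: "b" follows "a" directly every time
```

A metadata server would issue the predicted reads ahead of the client, hiding latency when the access pattern repeats.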
Mixing multi-core CPUs and GPUs for scientific simulation software
Recent technological and economic developments have led to widespread availability of
multi-core CPUs and specialist accelerator processors such as graphical processing units
(GPUs). The accelerated computational performance possible from these devices can be very
high for some applications paradigms. Software languages and systems such as NVIDIA's
CUDA and Khronos consortium's open compute language (OpenCL) support a number of
individual parallel application programming paradigms. To scale up the performance of some
complex systems simulations, a hybrid of multi-core CPUs for coarse-grained parallelism and
very many-core GPUs for data parallelism is necessary. We describe our use of hybrid
applications using threading approaches and multi-core CPUs to control independent GPU devices.
We present speed-up data and discuss multi-threading software issues for the applications-
level programmer and offer some suggested areas for language development and integration
between coarse-grained and fine-grained multi-thread systems. We discuss results from three
common simulation algorithmic areas including: partial differential equations; graph cluster
metric calculations and random number generation. We report on programming experiences
and selected performance for these algorithms on: single and multiple GPUs; multi-core CPUs;
a CellBE; and using OpenCL. We discuss programmer usability issues and the outlook and
trends in multi-core programming for scientific applications developers.
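The coarse-grained pattern described, host threads controlling independent accelerator devices, can be sketched with Python threads pulling tasks from a shared queue. The kernel launch is stood in for by a plain function, since real CUDA or OpenCL calls depend on a device runtime; the device ids and work function are illustrative:

```python
# One host thread per device drains a shared task queue, so devices
# stay busy without a central scheduler (coarse-grained parallelism).
import threading
import queue

def run_on_device(device_id, task):
    """Placeholder for a kernel launch on `device_id` (e.g. via CUDA)."""
    return (device_id, task * task)

def worker(device_id, tasks, results):
    while True:
        try:
            task = tasks.get_nowait()
        except queue.Empty:
            return                     # queue drained; thread exits
        results.append(run_on_device(device_id, task))

tasks = queue.Queue()
for t in range(8):
    tasks.put(t)

results = []
threads = [threading.Thread(target=worker, args=(d, tasks, results))
           for d in range(2)]          # pretend we control two GPUs
for th in threads:
    th.start()
for th in threads:
    th.join()
# every task processed exactly once, split across the two devices
```

In the real hybrid setting, each thread would also own its device context and manage host-device transfers, which is where most of the multi-threading issues the paper discusses arise.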
Integrated Safety and Security Risk Assessment Methods: A Survey of Key Characteristics and Applications
Over recent years, we have seen several security incidents that compromised
system safety, of which some caused physical harm to people. Meanwhile, various
risk assessment methods have been developed that integrate safety and security,
and these could help to address the corresponding threats by implementing
suitable risk treatment plans. However, an overarching overview of these
methods, one that systematizes their characteristics, is missing. In this
paper, we conduct a systematic literature review, and identify 7 integrated
safety and security risk assessment methods. We analyze these methods based on
5 different criteria, and identify key characteristics and applications. A key
outcome is the distinction between sequential and non-sequential integration of
safety and security, related to the order in which safety and security risks
are assessed. This study provides a basis for developing more effective
integrated safety and security risk assessment methods in the future.
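The sequential versus non-sequential distinction can be made concrete with a toy scoring sketch. The hazards, the scores, and the rule that the second dimension is only examined for significant primary risks are invented for illustration:

```python
# Toy contrast of integration orders: a sequential method assesses one
# dimension first and only revisits significant risks for the other,
# while a non-sequential (joint) method considers both in one pass.
def assess(hazards, order):
    """hazards: {name: (safety_score, security_score)}, scores 1-5."""
    scores = {}
    for name, (safety, security) in hazards.items():
        if order == "joint":
            scores[name] = max(safety, security)   # assessed together
        else:
            first, _ = order
            primary = safety if first == "safety" else security
            other = security if first == "safety" else safety
            # Second dimension examined only for significant risks (>= 3).
            scores[name] = primary if primary < 3 else max(primary, other)
    return scores

hazards = {"valve_failure": (4, 1), "remote_tamper": (2, 5)}
sequential = assess(hazards, ("safety", "security"))
joint = assess(hazards, "joint")
# sequential rates "remote_tamper" 2, joint rates it 5: the order of
# integration changes which risks surface.
```

The point of the sketch is only that the integration order is itself a design decision, which is the distinction the survey draws out.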