Open TURNS: An industrial software for uncertainty quantification in simulation
The needs to assess robust performances for complex systems and to answer
tighter regulatory processes (security, safety, environmental control, and
health impacts, etc.) have led to the emergence of a new industrial simulation
challenge: to take uncertainties into account when dealing with complex
numerical simulation frameworks. Therefore, a generic methodology has emerged
from the joint effort of several industrial companies and academic
institutions. EDF R&D, Airbus Group and Phimeca Engineering started a
collaboration at the beginning of 2005, joined by IMACS in 2014, for the
development of an Open Source software platform dedicated to uncertainty
propagation by probabilistic methods, named OpenTURNS for Open source Treatment
of Uncertainty, Risk 'N Statistics. OpenTURNS addresses the specific industrial
challenges attached to uncertainties, which are transparency, genericity,
modularity and multi-accessibility. This paper focuses on OpenTURNS and
presents its main features: OpenTURNS is open source software under the LGPL
license, available as a C++ library and a Python TUI, and it runs in Linux
and Windows environments. All the methodological tools are
described in the different sections of this paper: uncertainty quantification,
uncertainty propagation, sensitivity analysis and metamodeling. A section also
explains the generic wrappers used to link OpenTURNS to any external code. The
paper illustrates the methodological tools as far as possible on an
educational example that simulates the height of a river and compares it to the
height of a dyke that protects industrial facilities. Finally, it gives an
overview of the main developments planned for the next few years.
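The river and dyke example lends itself to a small illustration of uncertainty propagation by Monte Carlo sampling. The sketch below does not use the OpenTURNS API. It assumes a Manning-Strickler height formula for a wide rectangular channel, and every distribution and parameter value (flow rate, friction coefficient, bed levels, channel width and length) is hypothetical, chosen only to make the example run.

```python
import math
import random


def river_height(q, ks, zv, zm, width=300.0, length=5000.0):
    """Water height (m) from the Manning-Strickler formula, assuming a wide
    rectangular channel. width and length are hypothetical values."""
    slope = (zm - zv) / length
    return (q / (ks * width * math.sqrt(slope))) ** 0.6


def overflow_probability(dyke_height, n_samples=20000, seed=0):
    """Estimate P(river height > dyke height) by plain Monte Carlo sampling.
    All input distributions below are illustrative assumptions."""
    rng = random.Random(seed)
    overflows = 0
    for _ in range(n_samples):
        q = max(rng.gauss(1000.0, 300.0), 1.0)  # flow rate, m^3/s
        ks = max(rng.gauss(30.0, 5.0), 1.0)     # Strickler friction coefficient
        zv = rng.uniform(49.0, 51.0)            # downstream bed level, m
        zm = rng.uniform(54.0, 56.0)            # upstream bed level, m
        if river_height(q, ks, zv, zm) > dyke_height:
            overflows += 1
    return overflows / n_samples


if __name__ == "__main__":
    for h in (2.0, 3.0, 4.0):
        print(f"dyke height {h} m: P(overflow) ~ {overflow_probability(h):.4f}")
```

A fixed seed keeps the estimate reproducible. Plain Monte Carlo needs many samples to resolve small overflow probabilities.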
Evolutionary model type selection for global surrogate modeling
Due to the scale and computational complexity of currently used simulation codes, global surrogate models (metamodels) have become indispensable tools for exploring and understanding the design space. Because of their compact formulation they are cheap to evaluate and thus readily facilitate visualization, design space exploration, rapid prototyping, and sensitivity analysis. They can also be used as accurate building blocks in design packages or larger simulation environments. Consequently, there is great interest in techniques that facilitate the construction of such approximation models while minimizing the computational cost and maximizing model accuracy. Many surrogate model types exist (Support Vector Machines, Kriging, Neural Networks, etc.) but no type is optimal in all circumstances, nor is there any hard theory available that can help make this choice. In this paper we present an automatic approach to the model type selection problem. We describe an adaptive global surrogate modeling environment with adaptive sampling, driven by speciated evolution. Different model types are evolved cooperatively using a Genetic Algorithm (heterogeneous evolution) and compete to approximate the iteratively selected data. In this way the optimal model type and complexity for a given data set or simulation code can be dynamically determined. The utility and performance of the approach are demonstrated on a number of problems where it outperforms traditional sequential execution of each model type.
Parallel Implementation of Lossy Data Compression for Temporal Data Sets
Many scientific data sets contain temporal dimensions. Such data store
information for the same spatial locations at different time stamps.
Some of the biggest temporal data sets are produced by parallel computing
applications such as simulations of climate change and fluid dynamics. Temporal
data sets can be very large and take a long time to transfer among
storage locations. With data compression, files can be transferred
faster and occupy less storage space. NUMARCK is a lossy data compression algorithm
for temporal data sets that can learn emerging distributions of element-wise
change ratios along the temporal dimension and encodes them into an index table
to be concisely represented. This paper presents a parallel implementation of
NUMARCK. Evaluated with six data sets obtained from climate and astrophysics
simulations, parallel NUMARCK achieved scalable speedups of up to 8788 when
running 12800 MPI processes on a parallel computer. We also compare the
compression ratios against two lossy data compression algorithms, ISABELA and
ZFP. The results show that NUMARCK achieved higher compression ratios than
ISABELA and ZFP. Comment: 10 pages, HiPC 201
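The idea of encoding element-wise change ratios into an index table can be sketched in a few lines. This is a minimal serial illustration, not the NUMARCK implementation. It assumes equal-width binning of the ratios (the abstract does not say how the distribution is learned). The function names, the `error_bound` parameter and the choice to store out-of-bound elements exactly are also assumptions.

```python
def encode_step(prev, curr, n_bins=255, error_bound=0.001):
    """Encode `curr` relative to `prev` as indices into a table of change ratios.

    Returns (centers, indices, exact). An index of -1 means the element is
    stored exactly in `exact` instead of being approximated by a bin center.
    Equal-width binning is an assumption made for this sketch.
    """
    ratios = [(c - p) / p if p != 0 else None for p, c in zip(prev, curr)]
    valid = [r for r in ratios if r is not None]
    lo, hi = (min(valid), max(valid)) if valid else (0.0, 0.0)
    step = (hi - lo) / (n_bins - 1) if n_bins > 1 and hi > lo else 0.0
    centers = [lo + k * step for k in range(n_bins)]

    indices, exact = [], {}
    for i, r in enumerate(ratios):
        k = -1
        if r is not None:
            # Nearest bin center, clamped to the valid index range.
            nearest = round((r - lo) / step) if step > 0 else 0
            nearest = min(max(nearest, 0), n_bins - 1)
            if abs(centers[nearest] - r) <= error_bound:
                k = nearest
        if k == -1:
            exact[i] = curr[i]  # undefined ratio or out of tolerance
        indices.append(k)
    return centers, indices, exact


def decode_step(prev, centers, indices, exact):
    """Rebuild the current time step from the previous one and the index table."""
    return [exact[i] if k == -1 else p * (1.0 + centers[k])
            for i, (p, k) in enumerate(zip(prev, indices))]
```

With a small index type (one byte per element for 255 bins), most of the storage goes to the indices rather than to full floating-point values, which is where the compression comes from in this simplified picture.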
A new and efficient intelligent collaboration scheme for fashion design
Technology-mediated collaboration processes have been extensively studied for over a decade. Most applications with collaboration concepts reported in the literature focus on enhancing the efficiency and effectiveness of decision-making processes in objective and well-structured workflows. However, relatively few previous studies have investigated the application of collaboration schemes to problems of a subjective and unstructured nature. In this paper, we explore a new intelligent collaboration scheme for fashion design which, by nature, relies heavily on human judgment and creativity. Techniques such as multicriteria decision making, fuzzy logic, and artificial neural network (ANN) models are employed. Industrial data sets are used for the analysis. Our experimental results suggest that the proposed scheme exhibits significant improvement over the traditional method in terms of time–cost effectiveness, and a company interview with design professionals has confirmed its effectiveness and significance.
GreedyDual-Join: Locality-Aware Buffer Management for Approximate Join Processing Over Data Streams
We investigate adaptive buffer management techniques for approximate evaluation of sliding window joins over multiple data streams. In many applications, data stream processing systems have limited memory or have to deal with very high speed data streams. In both cases, computing the exact results of joins between these streams may not be feasible, mainly because the buffers used to compute the joins contain far fewer tuples than the sliding windows. A stream buffer management policy is therefore needed. We show that the buffer replacement policy is an important determinant of the quality of the produced results. To that end, we propose GreedyDual-Join (GDJ), an adaptive and locality-aware buffering technique for managing these buffers. GDJ exploits the temporal correlations (at both long and short time scales), which we found to be prevalent in many real data streams. We note that our algorithm is readily applicable to multiple data streams and multiple joins and requires almost no additional system resources. We report results of an experimental study using both synthetic and real-world data sets. Our results demonstrate the superiority and flexibility of our approach when contrasted with other recently proposed techniques.
Direct optimisation of the discovery significance when training neural networks to search for new physics in particle colliders
We introduce two new loss functions designed to directly optimise the
statistical significance of the expected number of signal events when training
neural networks to classify events as signal or background in the scenario of a
search for new physics at a particle collider. The loss functions are designed
to directly maximise commonly used estimates of the statistical significance,
$s/\sqrt{s+b}$, and the Asimov estimate, $Z_A$. We consider their use in a toy
SUSY search with 30 fb$^{-1}$ of 14 TeV data collected at the LHC. In the case
that the search for the SUSY model is dominated by systematic uncertainties, it
is found that the loss function based on $Z_A$ can outperform the binary cross
entropy in defining an optimal search region.
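For orientation, the two significance estimates that usually go by these names can be computed as follows for an expected signal count s and background count b. This is a sketch of the standard textbook formulas, assuming the simple estimate is s/sqrt(s+b) and the Asimov estimate includes an absolute background uncertainty `sigma_b`. The function names are hypothetical. How the paper turns the estimates into differentiable loss functions is not shown here.

```python
import math


def simple_significance(s, b):
    """Simple estimate s / sqrt(s + b)."""
    return s / math.sqrt(s + b)


def asimov_significance(s, b, sigma_b=0.0):
    """Asimov estimate of the discovery significance.

    sigma_b is the absolute uncertainty on the background count; with
    sigma_b == 0 the formula reduces to sqrt(2 * ((s + b) * ln(1 + s/b) - s)).
    """
    if sigma_b == 0.0:
        return math.sqrt(2.0 * ((s + b) * math.log(1.0 + s / b) - s))
    var = sigma_b ** 2
    term1 = (s + b) * math.log((s + b) * (b + var) / (b * b + (s + b) * var))
    term2 = (b * b / var) * math.log(1.0 + var * s / (b * (b + var)))
    return math.sqrt(2.0 * (term1 - term2))


if __name__ == "__main__":
    s, b = 10.0, 90.0
    print(simple_significance(s, b))
    print(asimov_significance(s, b))
    print(asimov_significance(s, b, sigma_b=0.1 * b))
```

A larger background uncertainty lowers the Asimov estimate for the same s and b, which is why the two estimates can favour different search regions when systematic uncertainties dominate.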