Indexing Metric Spaces for Exact Similarity Search
With the continued digitalization of societal processes, we are seeing an
explosion in available data, often referred to as big data. In a research
setting, three aspects of such data are commonly viewed as the main sources of
challenges when attempting to enable value creation from big data: volume,
velocity, and variety. Many studies address volume or velocity, while far fewer
address variety. Metric spaces are well suited to addressing variety
because they can accommodate any type of data as long as the associated distance
notion satisfies the triangle inequality. To accelerate search in metric spaces,
a collection of indexing techniques for metric data has been proposed.
However, existing surveys each offer only narrow coverage, and no
comprehensive empirical study of these techniques exists. We offer a survey of
all existing metric indexes that support exact similarity search, by i)
summarizing all the existing partitioning, pruning, and validation techniques
used in metric indexes, ii) providing time and storage complexity analyses
of index construction, and iii) reporting on a comprehensive empirical
comparison of their similarity query processing performance. Empirical
comparison is used to evaluate search performance because complexity analysis
alone reveals few differences in similarity query processing, and query
performance depends on pruning and validation abilities, which in turn depend
on the data distribution. This article aims to reveal the strengths and
weaknesses of different indexing techniques in order to offer guidance on
selecting an appropriate indexing technique for a given setting, and to direct
future research on metric indexes.
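The triangle-inequality pruning that underpins such indexes can be illustrated with a minimal pivot-based sketch. This is a generic illustration, not any specific index from the survey; the pivot choice and data here are made up.

```python
import math

def euclidean(a, b):
    # Any distance satisfying the triangle inequality works here.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def range_query(data, pivot, pivot_dists, q, r, dist=euclidean):
    """Answer a range query, pruning objects via the triangle inequality.

    pivot_dists[i] = dist(pivot, data[i]) is precomputed at index time.
    For any object o: |d(q, p) - d(p, o)| <= d(q, o), so if
    |d(q, p) - d(p, o)| > r, object o cannot lie within radius r of q.
    """
    dq_p = dist(q, pivot)
    results = []
    for o, dp_o in zip(data, pivot_dists):
        if abs(dq_p - dp_o) > r:   # pruned without computing d(q, o)
            continue
        if dist(q, o) <= r:        # validation: exact distance check
            results.append(o)
    return results

data = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
pivot = (0.0, 0.0)
pivot_dists = [euclidean(pivot, o) for o in data]
print(range_query(data, pivot, pivot_dists, (0.5, 0.5), 1.0))
```

Here (5.0, 5.0) is eliminated by the pivot filter alone, which is exactly the saving these indexes trade storage for.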
An empirical learning-based validation procedure for simulation workflow
A simulation workflow is a top-level model for the design and control of a
simulation process. It connects multiple simulation components under time and
interaction restrictions to form a complete simulation system. Before the
construction and evaluation of the component models, the validation of the
upper-layer simulation workflow is of the utmost importance in a simulation
system. However, methods specifically for validating simulation workflows are
very limited. Many of the existing validation techniques are domain-dependent
and rely on cumbersome questionnaire design and expert scoring. This paper
therefore presents an empirical learning-based validation procedure that
implements a semi-automated evaluation of simulation workflows. First,
representative features of general simulation workflows and their relations
with validation indices are proposed. The calculation of workflow credibility
based on the Analytic Hierarchy Process (AHP) is then introduced. To make full
use of historical data and enable more efficient validation, four learning
algorithms, namely back propagation neural network (BPNN), extreme learning
machine (ELM), evolving neo-fuzzy neuron (eNFN), and fast incremental Gaussian
mixture model (FIGMN), are introduced to construct the empirical relation
between workflow credibility and workflow features. A case study on a
landing-process simulation workflow is conducted to test the feasibility of the
proposed procedure. The experimental results also provide a useful overview of
state-of-the-art learning algorithms for the credibility evaluation of
simulation models.
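The AHP step the abstract mentions can be sketched as follows. The pairwise comparison matrix below is purely illustrative, not taken from the paper; the weights use the row-geometric-mean approximation to the principal eigenvector, a standard AHP shortcut.

```python
import math

# Hypothetical pairwise comparison matrix over three validation indices,
# on Saaty's 1-9 scale (values are illustrative, not from the paper).
A = [[1.0, 3.0, 5.0],
     [1/3, 1.0, 3.0],
     [1/5, 1/3, 1.0]]

def ahp_weights(A):
    """Approximate AHP priority weights via row geometric means,
    a standard approximation to the principal eigenvector of A."""
    n = len(A)
    gm = [math.prod(row) ** (1.0 / n) for row in A]
    s = sum(gm)
    return [g / s for g in gm]

def consistency_ratio(A, w, random_index=0.58):
    """Saaty consistency ratio; CR < 0.1 is the usual acceptance bar.
    random_index = 0.58 is the tabulated value for n = 3."""
    n = len(A)
    lam = sum(sum(A[i][j] * w[j] for j in range(n)) / w[i]
              for i in range(n)) / n
    return (lam - n) / (n - 1) / random_index

w = ahp_weights(A)          # priority weights for the validation indices
cr = consistency_ratio(A, w)
```

The workflow credibility would then be a weighted aggregate of the index scores under these priorities, which is where the learning algorithms above substitute for repeated expert scoring.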
Bayesian leave-one-out cross-validation for large data
Model inference, including model comparison, model checking, and model
selection, is an important part of model development. Leave-one-out
cross-validation (LOO) is a general approach for assessing the generalizability
of a model, but unfortunately LOO does not scale well to large datasets. We
propose a combination of approximate inference techniques and
probability-proportional-to-size (PPS) sampling for fast LOO model evaluation
on large datasets. We provide both theoretical and empirical results showing
good properties for large data.

Comment: Accepted to ICML 2019. This version is the submitted paper.
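The core idea of combining cheap approximations with PPS sampling can be sketched as below. This is a generic Hansen-Hurwitz-style illustration under assumed positive size measures, not the paper's estimator; the function names and toy data are invented for the example.

```python
import math
import random

random.seed(0)

def pps_loo_estimate(size_measures, exact_loo_i, m):
    """Hansen-Hurwitz estimate of the total LOO criterion from m PPS draws.

    size_measures: cheap, positive per-observation approximations used as
    PPS size measures; exact_loo_i: callable returning the expensive exact
    per-observation value. Each draw i contributes exact_loo_i(i) / p_i,
    where p_i is proportional to its size measure, so only m << n exact
    evaluations are needed.
    """
    n = len(size_measures)
    total = sum(size_measures)
    p = [s / total for s in size_measures]
    idx = random.choices(range(n), weights=size_measures, k=m)
    return sum(exact_loo_i(i) / p[i] for i in idx) / m

# Toy check: the exact values differ from the size measures by at most 1%,
# so every weighted draw is close to the true total and the PPS estimate
# concentrates tightly around it.
n = 1000
approx = [1.0 + 0.5 * math.sin(i) ** 2 for i in range(n)]
def exact(i):
    return approx[i] * (1.0 + 0.01 * math.cos(i))

estimate = pps_loo_estimate(approx, exact, m=200)
true_total = sum(exact(i) for i in range(n))
```

The better the cheap approximation tracks the exact per-observation values, the lower the variance of the estimate, which is why pairing PPS with good approximate inference pays off.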
Empirical validation of dynamic thermal computer models of buildings
A methodology for the validation of dynamic thermal models of buildings has been presented. The three techniques, analytical verification, inter-model comparison, and empirical validation, have been described and their relative merits assessed by reference to previous validation work on ESP, SERI-RES, DEROB, and BLAST. Previous empirical validation work on these models has been reviewed. This research has shown that numerous sources of error existed in previous studies, leading to uncertainty in model predictions. The effect of these errors is that none of the previous empirical validation studies would have produced conclusive evidence of internal errors in the models themselves. An approach towards developing tests for empirically validating dynamic thermal models is given.
METAPHOR: Probability density estimation for machine learning based photometric redshifts
We present METAPHOR (Machine-learning Estimation Tool for Accurate
PHOtometric Redshifts), a method able to provide a reliable PDF for photometric
galaxy redshifts estimated through empirical techniques. METAPHOR is a modular
workflow, mainly based on the MLPQNA neural network as the internal engine to
derive photometric galaxy redshifts, but offering the possibility to easily
replace MLPQNA with any other method to predict photo-z's and their PDFs. We
present here the results of a validation test of the workflow on galaxies
from SDSS-DR9, also showing the universality of the method by replacing MLPQNA
with KNN and Random Forest models. The validation test also includes a
comparison with the PDFs derived from a traditional SED template fitting
method (Le Phare).

Comment: Proceedings of the International Astronomical Union, IAU-325 symposium, Cambridge University Press.
Empirical Studies in End-User Software Engineering and Viewing Scientific Programmers as End-Users -- POSITION STATEMENT --
My work has two relationships with end-user software engineering. First, as an empirical software engineer, I am interested in meeting people who research techniques for improving end-user software engineering. All of these techniques need some type of empirical validation. In many cases this validation is performed by the researcher, but in other cases it is not. Regardless, an independent validation of a new approach is vital. Second, an area where I have done a fair amount of work is software engineering for scientific software (typically written for a parallel supercomputer). These programmers are typically scientists who have little or no training in formal software engineering. Yet, to accomplish their work, they often write very complex simulation and computation software. I believe these programmers are a unique class of end-users that must be addressed.
Introductory remarks
Suggested fluid mechanics research to be conducted in the National Transonic Facility includes: wind tunnel calibration; flat plate skin friction; flow visualization and measurement techniques; leading edge separation; high angle of attack separation; shock-boundary layer interaction; submarine shapes; low speed studies of a cylinder normal to the flow; and wall interference effects. These aerodynamic investigations will provide empirical inputs or validation data for computational aerodynamics and increase the usefulness of existing wind tunnels.
From Bioeconomic Farm Models to Multi-Agent Systems: Challenges for Parameterization and Validation
Bioeconomic farm models have been very instrumental in capturing the technical aspects of human-nature interactions and in highlighting the economic consequences of resource use changes. They may elucidate the tradeoffs that farm households face in crop choice and farming practices, assess the profitability of various land-use options, and capture the internal costs of adjusting to changes in environmental and market conditions. But they also face limitations when it comes to analyzing situations in which the heterogeneity of households and landscapes is large and increasing. Multi-agent models building on the bioeconomic farm approach hold the promise of capturing more fully the heterogeneity and interactions of farm households. The fulfillment of this promise, however, depends on the empirical parameterization and validation of multi-agent models. Although multi-agent models have been widely applied in experimental and hypothetical settings, only a few studies have tried to build empirical multi-agent models, and the literature on methods of parameterization and validation is therefore limited. This paper suggests novel methods for the empirical parameterization and validation of multi-agent models that may comply with the high standards established in bioeconomic farm modeling. The biophysical measurements (here: soil properties) are extrapolated over the landscape using multiple regressions and a Digital Elevation Model. The socioeconomic surveys are used to estimate probability functions for key characteristics of human actors, which are then assigned to the model agents with Monte-Carlo techniques. This approach generates a landscape and agent populations that are robust and statistically consistent with empirical observations.

Keywords: Farm Management
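The Monte-Carlo assignment of agent characteristics can be sketched as below. The attribute names and distribution parameters are hypothetical stand-ins for the probability functions a survey would actually yield, not values from the paper.

```python
import random

random.seed(42)

# Hypothetical attribute distributions, standing in for the probability
# functions estimated from the socioeconomic survey (parameters are
# illustrative, not taken from the paper).
def sample_agent():
    return {
        "household_size": max(1, round(random.gauss(5.0, 1.8))),
        "farm_area_ha": random.lognormvariate(0.5, 0.6),  # right-skewed holdings
        "has_irrigation": random.random() < 0.35,          # survey proportion
    }

# Monte-Carlo generation of an agent population that follows the
# estimated distributions rather than copying surveyed households 1:1.
population = [sample_agent() for _ in range(500)]
```

Sampling from fitted distributions, instead of cloning surveyed households, is what lets the generated population scale to landscape size while remaining statistically consistent with the survey.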
Alternative Approaches to the Empirical Validation of Agent-Based Models
This paper draws on the metaphor of a spectrum of models ranging from the most theory-driven to the most evidence-driven. The issue of concern is the practice and criteria that will be appropriate to the validation of different models. To address this concern, two modelling approaches are investigated in some detail, one from each end of our metaphorical spectrum. Windrum et al. (2007) (http://jasss.soc.surrey.ac.uk/10/2/8.html) claimed strong similarities between agent-based social simulation and conventional social science, specifically econometric, approaches to empirical modelling, and on that basis considered how econometric validation techniques might be used in empirical social simulations more broadly. An alternative is the approach of the French school of 'companion modelling' associated with Bousquet, Barreteau, Le Page and others, which engages stakeholders in the modelling and validation process. The conventional approach is constrained by prior theory and the French school approach by evidence. In this sense they are at opposite ends of the theory-evidence spectrum. The problems for validation identified by Windrum et al. are shown to be irrelevant to companion modelling, which readily incorporates complexity due to realistically descriptive specifications of individual behaviour and social interaction. The result combines the precision of formal approaches with the richness of narrative scenarios. Companion modelling is therefore found to be practicable and to achieve what is claimed for it, and this alone is a key difference from conventional social science, including agent-based computational economics.

Keywords: Social Simulation, Validation, Companion Modelling, Data Generating Mechanisms, Complexity