
    Indexing Metric Spaces for Exact Similarity Search

    With the continued digitalization of societal processes, we are seeing an explosion in available data, often referred to as big data. In research settings, three aspects of such data are commonly viewed as the main sources of challenges when attempting to enable value creation from big data: volume, velocity, and variety. Many studies address volume or velocity, while far fewer address variety. The metric space abstraction is well suited to addressing variety because it can accommodate any type of data as long as the associated distance notion satisfies the triangle inequality. To accelerate search in metric spaces, a range of indexing techniques for metric data has been proposed. However, existing surveys each offer only narrow coverage, and no comprehensive empirical study of these techniques exists. We survey all existing metric indexes that support exact similarity search by i) summarizing the partitioning, pruning, and validation techniques used in metric indexes, ii) providing time and storage complexity analyses of index construction, and iii) reporting on a comprehensive empirical comparison of their similarity query processing performance. Empirical comparison is used to evaluate search performance because complexity analysis reveals little about the differences in similarity query processing, whose performance depends on pruning and validation abilities that are tied to the data distribution. This article aims to reveal the strengths and weaknesses of different indexing techniques, in order to offer guidance on selecting an appropriate indexing technique for a given setting and to direct future research on metric indexes.
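    The pruning and validation steps that such indexes rely on rest on the triangle inequality: for any pivot p, |d(q,p) - d(o,p)| is a lower bound on d(q,o). A minimal sketch of a one-pivot range query over a toy 1-D metric (all names here are illustrative, not taken from any surveyed index):

    ```python
    def range_query(query, radius, index, pivot, dist):
        """Range query with triangle-inequality pruning over one pivot.

        index holds (object, d(object, pivot)) pairs precomputed at build time.
        """
        d_qp = dist(query, pivot)
        results = []
        for obj, d_op in index:
            # Pruning: |d(q,p) - d(o,p)| <= d(q,o), so if the lower bound
            # already exceeds the radius, skip the expensive distance call.
            if abs(d_qp - d_op) > radius:
                continue
            # Validation: compute the actual distance only for survivors.
            if dist(query, obj) <= radius:
                results.append(obj)
        return results

    # Toy usage: absolute difference is a metric on the reals.
    dist = lambda a, b: abs(a - b)
    pivot = 0.0
    index = [(p, dist(p, pivot)) for p in [1.0, 2.0, 5.0, 9.0]]
    print(range_query(4.0, 1.5, index, pivot, dist))  # → [5.0]
    ```

    Real metric indexes combine many pivots and hierarchical partitioning, but every exact method ultimately alternates these two steps: cheap lower-bound pruning, then exact-distance validation.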

    An empirical learning-based validation procedure for simulation workflow

    A simulation workflow is a top-level model for the design and control of a simulation process. It connects multiple simulation components under timing and interaction constraints to form a complete simulation system. Before the component models are constructed and evaluated, validating the upper-layer simulation workflow is of prime importance to a simulation system. However, methods specifically for validating simulation workflows are very limited; many existing validation techniques are domain-dependent and rely on cumbersome questionnaire design and expert scoring. This paper therefore presents an empirical learning-based validation procedure that implements a semi-automated evaluation of simulation workflows. First, representative features of general simulation workflows and their relations to validation indices are proposed. The calculation of workflow credibility based on the Analytic Hierarchy Process (AHP) is then introduced. To make full use of historical data and enable more efficient validation, four learning algorithms, including back propagation neural network (BPNN), extreme learning machine (ELM), evolving neo-fuzzy neuron (eNFN), and fast incremental Gaussian mixture model (FIGMN), are introduced for constructing the empirical relation between workflow credibility and workflow features. A case study on a landing-process simulation workflow is established to test the feasibility of the proposed procedure. The experimental results also provide a useful overview of state-of-the-art learning algorithms for the credibility evaluation of simulation models.
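    The AHP step amounts to deriving priority weights from a pairwise-comparison matrix over the validation indices and aggregating index scores into a credibility value. A minimal sketch, where the matrix and scores are hypothetical and the column-normalization weights are a standard approximation to the principal-eigenvector method rather than the paper's exact procedure:

    ```python
    def ahp_weights(pairwise):
        """Approximate AHP priority weights by averaging normalized columns."""
        n = len(pairwise)
        col_sums = [sum(row[j] for row in pairwise) for j in range(n)]
        return [sum(pairwise[i][j] / col_sums[j] for j in range(n)) / n
                for i in range(n)]

    def credibility(scores, weights):
        """Workflow credibility as a weighted sum of validation-index scores."""
        return sum(s * w for s, w in zip(scores, weights))

    # Hypothetical pairwise comparisons for three validation indices
    # (entry [i][j] is how much more important index i is than index j).
    pairwise = [[1.0, 3.0, 5.0],
                [1/3, 1.0, 3.0],
                [1/5, 1/3, 1.0]]
    w = ahp_weights(pairwise)
    print(round(credibility([0.9, 0.8, 0.7], w), 3))  # aggregate credibility
    ```

    The learning algorithms then replace the hand-built aggregation with a mapping from workflow features to this credibility value, fitted on historical validation data.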

    Bayesian leave-one-out cross-validation for large data

    Model inference, such as model comparison, model checking, and model selection, is an important part of model development. Leave-one-out cross-validation (LOO) is a general approach for assessing the generalizability of a model, but unfortunately it does not scale well to large datasets. We propose combining approximate inference techniques with probability-proportional-to-size sampling (PPS) for fast LOO model evaluation on large datasets. We provide both theoretical and empirical results showing good properties for large data.
    Comment: Accepted to ICML 2019. This version is the submitted paper.
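    The idea behind the PPS step can be sketched as follows: draw a subsample of observations with probability proportional to a cheap per-observation size measure, compute the expensive exact LOO value only for those, and reweight by the draw probabilities (a Hansen-Hurwitz style estimator). The function names and the use of the cheap approximation as the size measure are illustrative assumptions, not the paper's exact algorithm:

    ```python
    import random

    def pps_loo_estimate(approx, exact_loo, m, rng=random.Random(0)):
        """Estimate the total LOO criterion from m PPS draws.

        approx: cheap per-observation approximations (assumed positive),
                used as the size measure for sampling.
        exact_loo: callable returning the expensive exact value for one index.
        """
        total = sum(approx)
        probs = [a / total for a in approx]
        idx = rng.choices(range(len(approx)), weights=probs, k=m)
        # Hansen-Hurwitz: average the exact values, each inversely
        # weighted by its draw probability.
        return sum(exact_loo(i) / (m * probs[i]) for i in idx)

    # Sanity check: if the exact values equal the approximations, the
    # estimator recovers their sum regardless of which indices are drawn.
    approx = [1.0, 2.0, 3.0, 4.0]
    print(round(pps_loo_estimate(approx, lambda i: approx[i], m=5), 6))  # → 10.0
    ```

    In practice the exact values differ from the approximations, and the closer the size measure tracks them, the lower the estimator's variance.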

    Empirical validation of dynamic thermal computer models of buildings

    A methodology for the validation of dynamic thermal models of buildings has been presented. The three techniques, analytical verification, inter-model comparison and empirical validation, have been described, and their relative merits assessed by reference to previous validation work on ESP, SERI-RES, DEROB and BLAST. Previous empirical validation work on these models has been reviewed. This research has shown that numerous sources of error existed in previous studies, leading to uncertainty in model predictions. The effect of these errors is that none of the previous empirical validation studies could have produced conclusive evidence of internal errors in the models themselves. An approach towards developing tests to empirically validate dynamic thermal models is given.

    METAPHOR: Probability density estimation for machine learning based photometric redshifts

    We present METAPHOR (Machine-learning Estimation Tool for Accurate PHOtometric Redshifts), a method able to provide a reliable PDF for photometric galaxy redshifts estimated through empirical techniques. METAPHOR is a modular workflow, mainly based on the MLPQNA neural network as the internal engine for deriving photometric galaxy redshifts, but offering the possibility to easily replace MLPQNA with any other method for predicting photo-z's and their PDFs. We present here the results of a validation test of the workflow on galaxies from SDSS-DR9, also demonstrating the universality of the method by replacing MLPQNA with KNN and Random Forest models. The validation test also includes a comparison with the PDFs derived from a traditional SED template fitting method (Le Phare).
    Comment: proceedings of the International Astronomical Union, IAU-325 symposium, Cambridge University Press.
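    The modularity claim, that MLPQNA can be swapped for KNN or Random Forest, amounts to accepting any engine that maps a photometric feature vector to a redshift point estimate and a PDF. A toy KNN stand-in makes the interface concrete (purely illustrative; this is not METAPHOR's actual engine, binning, or PDF construction):

    ```python
    def knn_photoz_pdf(train, query, k=3, bins=(0.0, 0.5, 1.0, 1.5)):
        """Toy KNN photo-z engine: the PDF is the histogram of the k
        nearest training redshifts; the point estimate is their mean."""
        # train: list of (features, redshift) pairs; query: feature vector.
        d2 = lambda a, b: sum((x - y) ** 2 for x, y in zip(a, b))
        nearest = sorted(train, key=lambda t: d2(t[0], query))[:k]
        zs = [z for _, z in nearest]
        pdf = [sum(lo <= z < hi for z in zs) / k
               for lo, hi in zip(bins, bins[1:])]
        return sum(zs) / k, pdf

    # Hypothetical training set: (colour features, spectroscopic redshift).
    train = [((0.1, 0.2), 0.1), ((0.2, 0.1), 0.2),
             ((0.9, 0.8), 1.2), ((0.8, 0.9), 1.3)]
    point, pdf = knn_photoz_pdf(train, (0.15, 0.15))
    print(point, pdf)
    ```

    Any regressor exposing the same (point estimate, PDF) contract could slot into the workflow in place of this toy engine.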

    Empirical Studies in End-User Software Engineering and Viewing Scientific Programmers as End-Users -- POSITION STATEMENT --

    My work has two relationships with End-User Software Engineering. First, as an empirical software engineer, I am interested in meeting people who research techniques for improving end-user software engineering. All of these techniques need some form of empirical validation. In many cases this validation is performed by the researcher, but in other cases it is not; regardless, an independent validation of a new approach is vital. Second, an area where I have done a fair amount of work is software engineering for scientific software (typically written for parallel supercomputers). These programmers are typically scientists with little or no training in formal software engineering, yet to accomplish their work they often write very complex simulation and computation software. I believe these programmers are a unique class of end-users that must be addressed.

    Introductory remarks

    Suggested fluid mechanics research to be conducted in the National Transonic Facility includes: wind tunnel calibration; flat plate skin friction; flow visualization and measurement techniques; leading edge separation; high angle of attack separation; shock-boundary layer interaction; submarine shapes; low speed studies of a cylinder normal to the flow; and wall interference effects. These aerodynamic investigations will provide empirical inputs or validation data for computational aerodynamics and increase the usefulness of existing wind tunnels.

    From Bioeconomic Farm Models to Multi-Agent Systems: Challenges for Parameterization and Validation

    Bioeconomic farm models have been very instrumental in capturing the technical aspects of human-nature interactions and in highlighting the economic consequences of changes in resource use. They can elucidate the tradeoffs that farm households face in crop choice and farming practices, assess the profitability of various land-use options, and capture the internal costs of adjusting to changes in environmental and market conditions. But they also face limitations when it comes to analyzing situations in which the heterogeneity of households and landscapes is large and increasing. Multi-agent models building on the bioeconomic farm approach hold the promise of capturing more fully the heterogeneity and interactions of farm households. The fulfillment of this promise, however, depends on the empirical parameterization and validation of multi-agent models. Although multi-agent models have been widely applied in experimental and hypothetical settings, only a few studies have tried to build empirical multi-agent models, and the literature on methods of parameterization and validation is therefore limited. This paper suggests novel methods for the empirical parameterization and validation of multi-agent models that may comply with the high standards established in bioeconomic farm modeling. Biophysical measurements (here: soil properties) are extrapolated over the landscape using multiple regressions and a Digital Elevation Model. Socioeconomic surveys are used to estimate probability functions for key characteristics of human actors, which are then assigned to the model agents with Monte-Carlo techniques. This approach generates a landscape and agent populations that are robust and statistically consistent with empirical observations.
    Farm Management
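    The Monte-Carlo assignment step can be sketched as drawing each agent characteristic from a probability function estimated from the surveys. The distribution families and parameters below are hypothetical placeholders, not the paper's fitted values:

    ```python
    import random

    def generate_agents(n, draws, rng=random.Random(42)):
        """Create n model agents, sampling each characteristic independently
        from its estimated probability function."""
        return [{name: draw(rng) for name, draw in draws.items()}
                for _ in range(n)]

    # Hypothetical distributions standing in for survey-fitted ones.
    draws = {
        "farm_size_ha": lambda r: r.lognormvariate(1.0, 0.5),
        "household_size": lambda r: 1 + int(r.expovariate(0.4)),
        "risk_aversion": lambda r: r.betavariate(2.0, 5.0),
    }
    agents = generate_agents(1000, draws)
    print(len(agents), sorted(agents[0]))
    ```

    Statistical consistency with the surveys is then checked by comparing the marginal distributions of the generated population against the empirical ones, which independent per-characteristic sampling preserves by construction (joint dependencies between characteristics would need copulas or joint estimation).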

    Alternative Approaches to the Empirical Validation of Agent-Based Models

    This paper draws on the metaphor of a spectrum of models ranging from the most theory-driven to the most evidence-driven. The issue of concern is which practices and criteria are appropriate to the validation of different models. To address this concern, two modelling approaches are investigated in some detail, one from each end of our metaphorical spectrum. Windrum et al. (2007) (http://jasss.soc.surrey.ac.uk/10/2/8.html) claimed strong similarities between agent-based social simulation and conventional social science, specifically econometric, approaches to empirical modelling, and on that basis considered how econometric validation techniques might be used in empirical social simulations more broadly. An alternative is the approach of the French school of 'companion modelling' associated with Bousquet, Barreteau, Le Page and others, which engages stakeholders in the modelling and validation process. The conventional approach is constrained by prior theory and the French school approach by evidence; in this sense they are at opposite ends of the theory-evidence spectrum. The problems for validation identified by Windrum et al. are shown to be irrelevant to companion modelling, which readily incorporates complexity due to realistically descriptive specifications of individual behaviour and social interaction. The result combines the precision of formal approaches with the richness of narrative scenarios. Companion modelling is therefore found to be practicable and to achieve what is claimed for it, and this alone is a key difference from conventional social science, including agent-based computational economics.
    Social Simulation, Validation, Companion Modelling, Data Generating Mechanisms, Complexity