7,160 research outputs found

    Model Calibration in Watershed Hydrology

    Hydrologic models use relatively simple mathematical equations to conceptualize and aggregate the complex, spatially distributed, and highly interrelated water, energy, and vegetation processes in a watershed. A consequence of process aggregation is that the model parameters often do not represent directly measurable entities and must, therefore, be estimated using measurements of the system inputs and outputs. During this process, known as model calibration, the parameters are adjusted so that the behavior of the model approximates, as closely and consistently as possible, the observed response of the hydrologic system over some historical period of time. This chapter reviews the current state of the art of model calibration in watershed hydrology, with special emphasis on our own contributions in the last few decades. We discuss the historical background that has led to current perspectives, and review different approaches for manual and automatic single- and multi-objective parameter estimation. In particular, we highlight recent developments in the calibration of distributed hydrologic models using parameter dimensionality reduction sampling, parameter regularization, and parallel computing.
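    As an illustration of the basic calibration loop described above, the sketch below adjusts the single parameter of a toy linear-reservoir model so that simulated discharge matches an observed series by minimizing RMSE. The model, data, and objective are illustrative placeholders, not the chapter's actual formulation.

    ```python
    # A minimal calibration sketch: a one-parameter linear-reservoir
    # rainfall-runoff model fitted to "observed" flows by minimizing RMSE.
    import numpy as np
    from scipy.optimize import minimize_scalar

    def simulate(k, rainfall, s0=0.0):
        """Linear reservoir: storage S fills with rain and drains as Q = k * S."""
        s, flows = s0, []
        for p in rainfall:
            s += p          # rainfall enters storage
            q = k * s       # outflow proportional to storage
            s -= q
            flows.append(q)
        return np.array(flows)

    # Synthetic "observed" response generated with a known k, plus noise.
    rng = np.random.default_rng(0)
    rain = rng.exponential(2.0, size=200)
    obs = simulate(0.3, rain) + rng.normal(0, 0.05, size=200)

    # Calibration step: adjust k so simulated flow tracks observed flow.
    def rmse(k):
        return np.sqrt(np.mean((simulate(k, rain) - obs) ** 2))

    result = minimize_scalar(rmse, bounds=(0.01, 0.99), method="bounded")
    print(f"calibrated k = {result.x:.3f}, RMSE = {result.fun:.4f}")
    ```

    Real calibration problems differ mainly in scale: many parameters, multiple objectives, and global rather than local search, but the loop of simulate, compare, and adjust is the same.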

    A Comprehensive Survey on Rare Event Prediction

    Rare event prediction involves identifying and forecasting events with a low probability using machine learning and data analysis. Because of imbalanced data distributions, in which the frequency of common events vastly outweighs that of rare events, it requires specialized methods at each step of the machine learning pipeline, from data processing to algorithms to evaluation protocols. Predicting the occurrence of rare events is important for real-world applications, such as Industry 4.0, and is an active research area in statistics and machine learning. This paper comprehensively reviews current approaches to rare event prediction along four dimensions: rare event data, data processing, algorithmic approaches, and evaluation approaches. Specifically, we consider 73 datasets from different modalities (i.e., numerical, image, text, and audio), four major categories of data processing, five major algorithmic groupings, and two broader evaluation approaches. This paper aims to identify gaps in the current literature, highlight the challenges of predicting rare events, and suggest potential research directions that can help guide practitioners and researchers.
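    To make the pipeline-level point concrete, here is a minimal sketch, assuming scikit-learn and a synthetic dataset, of two of the surveyed ingredients: class weighting at the algorithm step and per-class precision/recall (rather than plain accuracy) at the evaluation step.

    ```python
    # Rare-event sketch: 1% positive class, class-weighted training,
    # and an evaluation metric that does not hide the rare class.
    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import classification_report
    from sklearn.model_selection import train_test_split

    # Synthetic imbalanced data: the positive class is the "rare event".
    X, y = make_classification(n_samples=20000, weights=[0.99, 0.01],
                               random_state=0)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

    # class_weight="balanced" upweights rare-event errors during training.
    clf = LogisticRegression(class_weight="balanced", max_iter=1000)
    clf.fit(X_tr, y_tr)

    # Accuracy would be ~99% even for a useless model here;
    # per-class precision and recall expose rare-event performance.
    print(classification_report(y_te, clf.predict(X_te), digits=3))
    ```

    Resampling (over- or under-sampling) at the data-processing step is the other common remedy; both aim at the same imbalance problem from different pipeline stages.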

    Efficient Algorithms for Prokaryotic Whole Genome Assembly and Finishing

    De novo genome assembly from DNA fragments is primarily based on sequence overlap information. In addition, mate-pair or paired-end reads provide linking information for joining gaps and bridging repeat regions. Genome assemblers generally assemble long contiguous sequences (contigs) using both overlapping and linked reads until the assembly runs into an ambiguous repeat region. These contigs are then bridged into scaffolds using linked-read information. However, errors can be made in both phases of assembly, due to a high error threshold for overlap acceptance and to linking based on too few mate reads. Identical as well as similar repeat regions can often cause errors in overlap and mate-pair evidence. In addition, the problem of setting the correct threshold to minimize errors and optimize the assembly of reads is not trivial, and often requires a time-consuming trial-and-error process to obtain optimal results. This trial-and-error process with multiple assemblers is computationally intensive and very inefficient, especially when users must learn how to use a wide variety of assemblers, many of which are serial, require long execution times, and may not return usable or accurate results. Further, we show that comparing assembly results may not provide users with a clear winner under all circumstances. Therefore, we propose a novel scaffolding tool, Correlative Algorithm for Repeat Placement (CARP), capable of joining short, low-error contigs using mate-pair reads, computationally resolved repeat structures, and synteny with one or more reference organisms. The CARP tool requires a set of repeat sequences, such as insertion sequences (ISs), that can be found computationally without assembling the genome. Development of methods to identify such repeat regions directly from raw sequence reads or draft genomes led to the ISQuest software package. ISQuest identifies bacterial ISs and their sequence elements (inverted and direct repeats) in raw read data or contigs using flexible search parameters. ISQuest can find ISs in hundreds of partially assembled genomes within hours, making it a valuable high-throughput tool for a global search of IS and repeat elements. The CARP tool matches very low-error contigs that overlap strongly in the ambiguous partial repeat sequences at contig ends, annotated using the repeat sequences discovered by ISQuest. These matches are verified by synteny with the genomes of one or more reference organisms. We show that the CARP tool can be used to verify regions with weak mate-pair evidence, independently find new joins, and significantly reduce the number of scaffolds. Finally, we demonstrate a novel viewer that presents the computationally derived joins along with the evidence used to make them. The viewer allows users to independently assess their confidence in the joins made by the finishing tools and to make an informed decision about whether to invest the resources necessary to confirm a particular portion of the assembly. Further, we allow users to manually record join evidence, re-order contigs, and track the assembly finishing process.
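    As a toy illustration of the mate-pair linking idea underlying scaffolding (not CARP itself), the sketch below joins contigs into scaffolds only when enough mate pairs connect them, which shows why joins supported by too few mate reads constitute weak evidence. All names and the threshold are hypothetical.

    ```python
    # Toy scaffolding by mate-pair support: contigs are merged only when
    # the number of linking mate pairs meets a minimum-evidence threshold.
    from collections import Counter

    def scaffold(contigs, mate_links, min_links=5):
        """mate_links: iterable of (contig_a, contig_b) pairs, one per mate pair."""
        support = Counter(frozenset(link) for link in mate_links)
        # Union-find over contigs, merging only well-supported links.
        parent = {c: c for c in contigs}
        def find(c):
            while parent[c] != c:
                parent[c] = parent[parent[c]]   # path halving
                c = parent[c]
            return c
        for link, n in support.items():
            if n >= min_links and len(link) == 2:
                a, b = link
                parent[find(a)] = find(b)       # join the two scaffolds
        groups = {}
        for c in contigs:
            groups.setdefault(find(c), []).append(c)
        return list(groups.values())

    # c1-c2 has 7 supporting mate pairs; c2-c3 has only 2 and is rejected.
    links = [("c1", "c2")] * 7 + [("c2", "c3")] * 2
    print(scaffold(["c1", "c2", "c3"], links))   # [['c1', 'c2'], ['c3']]
    ```

    Raising or lowering `min_links` trades missed joins against false joins, which is exactly the threshold-tuning problem the abstract describes.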

    A new approach for the quantification of qualitative measures of economic expectations

    In this study, a new approach to quantifying qualitative survey data about the direction of change is presented. We propose a data-driven procedure based on evolutionary computation that avoids making any assumption about agents’ expectations. The research focuses on experts’ expectations about the state of the economy from the World Economic Survey in twenty-eight countries of the Organisation for Economic Co-operation and Development. The proposed method is used to transform qualitative responses into estimates of economic growth. In a first experiment, we combine agents’ expectations about the future to construct a leading indicator of economic activity. In a second experiment, agents’ judgements about the present are combined to generate a coincident indicator. Then, we use index tracking to derive the optimal combination of weights for both indicators that best replicates the evolution of economic activity in each country. Finally, we compute several accuracy measures to assess the performance of these estimates in tracking economic growth. The differing results across countries led us to use multidimensional scaling analysis to group all economies into four clusters according to their performance. We obtain the best results for Belgium, Norway, Austria, Lithuania, Japan, and the United Kingdom.
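    As a minimal sketch of the index-tracking step, assuming the qualitative survey responses have already been converted into candidate indicator series, the code below finds non-negative weights summing to one that best replicate an observed growth series. The data, dimensions, and weights are synthetic placeholders, not the study's actual series.

    ```python
    # Index tracking as constrained least squares: choose indicator weights
    # (non-negative, summing to 1) that minimize the tracking error to growth.
    import numpy as np
    from scipy.optimize import minimize

    rng = np.random.default_rng(1)
    T, K = 80, 4                       # 80 quarters, 4 candidate indicator series
    indicators = rng.normal(size=(T, K))
    true_w = np.array([0.5, 0.3, 0.2, 0.0])
    growth = indicators @ true_w + rng.normal(0, 0.1, size=T)  # "observed" growth

    def tracking_error(w):
        return np.mean((indicators @ w - growth) ** 2)

    cons = ({"type": "eq", "fun": lambda w: w.sum() - 1.0},)
    res = minimize(tracking_error, np.full(K, 1.0 / K),
                   bounds=[(0.0, 1.0)] * K, constraints=cons)
    print("optimal weights:", np.round(res.x, 3))
    ```

    Accuracy measures such as RMSE between the weighted indicator and realized growth can then be computed per country, which is what allows the cross-country performance comparison described above.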