
    Modelling uncertainty for leak localization in Water Networks

    The performance and success of model-based leak localization methods applied to water distribution networks (WDNs) depend strongly on the uncertainty of the system considered. This work proposes an original method for modelling the effect of uncertainties in these networks. The proposed method is based on the collection of real data from the water network in the absence of leaks. The discrepancy (residual) between the measured data and that provided by a simulator of the network in normal operation is used to extrapolate the possible residuals under the different leak scenarios. In addition, indicators are provided for assessing the effect of uncertainty on the performance of leak localization methods based on residual correlation analysis. The error, in terms of correlation intervals and leak localization assessment, between the proposed approximation and the real one is studied by means of a simplified model of the WDN of Hanoi (Vietnam).
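    The residual-correlation idea can be sketched as follows. This is a generic illustration, not the paper's actual model: the sensor readings, node names and leak signatures below are invented, and a real WDN study would obtain the signatures from hydraulic simulations of each candidate leak.

    ```python
    import numpy as np

    def localize_leak(measured, simulated, leak_signatures):
        """Rank candidate leak locations by correlating the observed
        pressure residual with each candidate's simulated residual signature.

        measured, simulated : arrays of sensor pressures (same length)
        leak_signatures     : dict mapping node id -> expected residual vector
        """
        residual = measured - simulated  # discrepancy vs. leak-free operation
        scores = {}
        for node, signature in leak_signatures.items():
            # Pearson correlation between observed and candidate residuals
            scores[node] = np.corrcoef(residual, signature)[0, 1]
        # The most likely leak location maximises the correlation
        return max(scores, key=scores.get), scores

    # Toy example: three pressure sensors, two candidate leak nodes
    measured  = np.array([2.9, 3.8, 5.1])
    simulated = np.array([3.0, 4.0, 5.0])   # leak-free model prediction
    signatures = {
        "node_A": np.array([-0.1, -0.2, 0.1]),
        "node_B": np.array([0.2, 0.1, -0.3]),
    }
    best, scores = localize_leak(measured, simulated, signatures)
    ```

    The paper's contribution is precisely in how the residual distribution (and hence the correlation intervals) is estimated from leak-free data; the sketch above only shows the correlation step itself.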

    Modelling uncertainty of the radiation energy emitted by extensive air showers

    Recently, the energy determination of extensive air showers using radio emission has been shown to be both precise and accurate. In particular, radio detection offers the opportunity for an independent measurement of the absolute energy of cosmic rays, since the radiation energy (the energy radiated in the form of radio signals) can be predicted using first-principle calculations involving no free parameters, and the measurement of radio waves is not subject to any significant absorption or scattering in the atmosphere. Here, we verify the implementation of radiation-energy calculations in microscopic simulation codes by comparing Monte Carlo simulations made with the two codes CoREAS and ZHAireS. To isolate potential differences in the radio-emission calculation from differences in the air-shower simulation, the simulations are performed with equivalent settings, in particular the same model for the hadronic interactions and the same description of the atmosphere. Comparing a large set of simulations with different primary energies and shower directions, we observe differences amounting to a total of only 3.3 %. This corresponds to an uncertainty of only 1.6 % in the determination of the absolute energy scale and thus opens the potential of using the radiation energy as an accurate calibration method for cosmic ray experiments.Comment: 8 pages, 2 figures, ICRC2017 contribution
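    The factor of two between the 3.3 % and 1.6 % figures follows from the (approximately) quadratic scaling of the coherently radiated energy with the cosmic-ray energy, so that relative uncertainties halve when propagated back to the energy scale:

    ```latex
    E_{\mathrm{rad}} \propto E_{\mathrm{CR}}^{2}
    \quad\Longrightarrow\quad
    \frac{\sigma_{E_{\mathrm{CR}}}}{E_{\mathrm{CR}}}
    \approx \frac{1}{2}\,\frac{\sigma_{E_{\mathrm{rad}}}}{E_{\mathrm{rad}}}
    = \frac{3.3\,\%}{2} \approx 1.6\,\%
    ```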

    Modelling Uncertainty Caused by Internal Waves on the Accuracy of MBES

    A 3D ray tracing model has been developed to estimate the effects of internal waves upon the accuracy of multibeam echosounders (MBES). A case study examines the variability in these effects as a function of survey line direction, and also considers the case of improving 2D ray tracing models with wave parameters derived from MBES water column imagery. Results indicate that, under certain circumstances, the effects of internal waves can prove to be a significant source of uncertainty that detracts from the ability to efficiently map the seafloor with wide swath angles.
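    The core of any such ray-tracing model is refraction of the acoustic ray through a stratified sound-speed structure via Snell's law; an internal wave matters precisely because it perturbs that structure. A minimal 2D sketch follows; the layer thicknesses and sound speeds are invented for illustration and much simpler than the 3D model described above.

    ```python
    import math

    def trace_ray(theta0_deg, layers):
        """Trace an acoustic ray through horizontally stratified water.

        theta0_deg : launch angle from the vertical, in degrees
        layers     : list of (thickness_m, sound_speed_m_per_s) tuples,
                     top to bottom; the ray starts in the first layer
        Returns the horizontal offset (m) at the bottom of the last layer.
        """
        sin0 = math.sin(math.radians(theta0_deg))
        c0 = layers[0][1]
        x = 0.0
        for thickness, c in layers:
            # Snell's law: sin(theta)/c is constant along the ray path
            sin_t = sin0 * c / c0
            if sin_t >= 1.0:          # total internal reflection: ray turns back
                break
            x += thickness * math.tan(math.asin(sin_t))
        return x

    # Two 50 m layers; an internal wave would locally perturb the sound speed,
    # shifting the computed bottom position for the outer beams
    offset = trace_ray(45.0, [(50.0, 1500.0), (50.0, 1480.0)])
    ```

    Comparing offsets traced with and without the wave-perturbed profile gives the positional uncertainty the abstract refers to; wide swath angles (large launch angles) amplify the effect.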

    Modelling uncertainty for flash floods in coastal plains using adjoint methods

    This paper shows the application of adjoint sensitivity analysis to flash flood wave propagation in a river channel. The adjoint sensitivity analysis is used to assess flood hazard in a coastal area caused by river discharge. The numerical model determines the sensitivities of predicted water levels to uncertainties in key controls such as the inflow hydrograph, channel topography, frictional resistance and infiltration rate. Sensitivities are calculated using the adjoint equations and are specified in terms of water levels exceeding certain safe threshold levels along the channel. The flood propagation model is based on the St. Venant equations, while the propagation of sensitivity information is based on the corresponding adjoint equations. This analysis is achieved using a numerical model that integrates the St. Venant equations forward in time using a staggered finite difference scheme. An enhanced method of characteristics at the downstream boundary provides open boundary conditions and overcomes the problem of reflections from the boundaries. The adjoint model is then integrated backwards in time to trace the sensitivity information back through the model domain towards the inflow control boundary. The adjoint model has been verified by means of an identical twin experiment.
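    In one spatial dimension, the St. Venant (shallow-water) equations used by such a model take the standard form below; the notation is the conventional one, not necessarily the paper's:

    ```latex
    \frac{\partial A}{\partial t} + \frac{\partial Q}{\partial x} = q_l,
    \qquad
    \frac{\partial Q}{\partial t}
    + \frac{\partial}{\partial x}\!\left(\frac{Q^{2}}{A}\right)
    + g A \frac{\partial h}{\partial x}
    = g A \left(S_0 - S_f\right),
    ```

    where $A$ is the wetted cross-section, $Q$ the discharge, $h$ the water level, $q_l$ the lateral inflow (minus infiltration), $S_0$ the bed slope and $S_f$ the friction slope. The adjoint equations are obtained by differentiating a hazard functional such as $J = \int\!\!\int \max(h - h_{\mathrm{safe}}, 0)\, \mathrm{d}x\, \mathrm{d}t$ with respect to the controls, and carry the gradient information backwards in time to the inflow boundary.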

    Modelling Uncertainty in Physical Database Design

    Physical database design can be marked as a crucial step in the overall design process of databases. The outcome of physical database design is a physical schema which describes the storage and access structures of the stored database. The selection of an efficient physical schema is an NP-complete problem. A significant number of efforts has been reported to develop tools that assist in the selection of physical schemas. Most of the efforts implicitly apply a number of heuristics to avoid the evaluation of all schemas. In this paper, we present an approach, based on the Dempster-Shafer theory, that explicitly models a rich set of heuristics (used for the selection of an efficient physical schema) into knowledge rules. These rules may be loaded into a knowledge base, which, in turn, can be embedded in physical database design tools.
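    Dempster's rule of combination, the evidence-fusion step at the heart of Dempster-Shafer theory, combines two independent mass functions over the same frame of discernment. A minimal sketch follows; the index-selection hypotheses are invented for illustration and are not taken from the paper.

    ```python
    from itertools import product

    def combine(m1, m2):
        """Dempster's rule of combination.

        m1, m2 : mass functions given as dicts mapping
                 frozenset-of-hypotheses -> mass (each summing to 1).
        Returns the combined, conflict-normalised mass function.
        """
        combined = {}
        conflict = 0.0
        for (a, ma), (b, mb) in product(m1.items(), m2.items()):
            inter = a & b
            if inter:
                combined[inter] = combined.get(inter, 0.0) + ma * mb
            else:
                conflict += ma * mb  # mass landing on contradictory evidence
        k = 1.0 - conflict           # normalise by the non-conflicting mass
        return {s: m / k for s, m in combined.items()}

    # Two heuristics giving (hypothetical) evidence about which index to build
    m1 = {frozenset({"btree"}): 0.6, frozenset({"btree", "hash"}): 0.4}
    m2 = {frozenset({"hash"}): 0.5, frozenset({"btree", "hash"}): 0.5}
    m = combine(m1, m2)
    ```

    Each knowledge rule in the paper's sense would contribute one such mass function, and the combined belief then ranks candidate physical schemas.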

    Modelling uncertainty in population monitoring data

    Uncertainties in ecology are pervasive; therefore, communicating the level of uncertainty for any inference derived from scientific research is key to sound decision-making and management of species and ecosystems. Characterising uncertainty is part of converting information into knowledge, and has the added benefit of identifying fruitful avenues of further investigation. Without such care in accounting for uncertainties, we risk drawing misleading conclusions and making inappropriate management decisions. In this thesis, it is argued that, far from being something to avoid discussing, measuring and reducing uncertainty is fundamental to good ecological science. Uncertainty can come from a number of sources. Parameter estimation for demographic studies has inherently high uncertainty due to substantial variation between individuals, years and spatial locations, thus requiring considerable resources to obtain accurate estimates of survival, reproduction and growth. In some cases, certain life stages may be unseen during sampling procedures, such as seeds in the soil seed bank, or non-breeding components of the population that are absent from the selected sampling sites. While the potential sources of uncertainty are diverse, I attempted to cover a range of key areas of uncertainty relevant to ecologists over the course of this thesis. Specific areas of uncertainty were targeted using case studies, providing examples of how these uncertainties can be addressed, how they can be used to aid inference, and how they inform recommendations for future data collection procedures. First, I highlighted the prevalence of authors excluding a cryptic but important life stage, the dormant seed bank, from their data collection procedures and population models (Chapter 2).
The evolution of seed banks acts as a bet-hedging strategy, improving the persistence of plant populations in variable environments, so it is crucial to address this potential knowledge gap to avoid misleading conclusions. The consequences of this exclusion for model parameters such as population growth rates and extinction risks were explored using a joint empirical and simulation approach, combining information from the published literature with Monte Carlo simulations. These simulations explored a range of assumptions that need to be considered when including a seed bank in the model, such as seed longevity, viability and germination rates. A key result of these simulations is that the apparent importance of the seed bank differs depending on the species and the type of demographic year. For example, inclusion of the seed bank, and of demographic uncertainty in seed bank parameters, was found to have little effect for stable populations with high post-seedling survival. In such cases, the seed bank can be excluded; however, this should be accompanied by appropriate justification, either through literature confirming that dormancy is absent or through simulations demonstrating that it is of little consequence. Conversely, seed banks had a more demonstrable impact on growth and extinction rates for variable populations, particularly when populations experience poor demographic years. The use of simulations and published literature can thus be an effective means to explore uncertainties resulting from the presence of cryptic life stages. Second, I explored and demonstrated the use of multivariate autoregressive state-space (MARSS) models as a versatile framework for capturing and addressing several sources of uncertainty, including observation error, and showed how these models can be used to update and improve monitoring design (Chapter 3).
MARSS models were constructed for a common, ephemeral plant using a nine-year time series from multiple study sites within the Simpson Desert to explain trends over time and space. Modelling multi-dimensional time series data allowed the identification of spatial sub-population structure with respect to location and fire history, and the incorporation of population structure by making use of count data for above-ground plants and the seed bank. Model results suggested that population dynamics are driven primarily by geographical location, possibly reflecting differences in soil conditions, local competition and local microclimate, overshadowing variation caused by fire history. The seed bank was also found to be characterised by high observation error with low environmental variability, while the converse was true for the above-ground population estimates. Knowledge of the relative uncertainty of the above- and below-ground abundance estimates, and of the spatial distribution of population dynamics, can then be used to provide guidelines for future monitoring efforts. For example, it may be more strategic to sample the seed bank less frequently, as it is less variable over time, and instead focus on obtaining more accurate counts when it is sampled, to offset the high observation error. Additionally, the level of spatial heterogeneity in the Simpson Desert provides some justification for expanding spatial replication. Third, the validity of using visual cover estimates as a means of monitoring vegetation and environmental changes was assessed. Visual cover estimates are particularly susceptible to observation error, and previous studies on the repeatability and reliability of such measurements have raised concerns over their value in ecological monitoring and decision-making.
I made use of two primary long-term monitoring datasets on spinifex grasslands, each obtained with different motivations, methods of data collection, and varying degrees of spatial and temporal coverage, to assess the consistency of spatial and temporal trends between the datasets, and thus to determine whether the different sampling strategies and observation errors produced inconsistent and conflicting results. Observation errors were found to be quite large, often exceeding variation due to environmental changes. However, when these errors were accounted for, trends in the spatial dynamics of spinifex cover were consistent between the datasets, with population dynamics being driven primarily by time since last fire. Models also showed similar population traces over time, reflecting the effects of major temporal drivers such as rainfall and fire history. These findings support visual cover estimates as a useful source of information, provided that uncertainties in the measurements are appropriately addressed. Finally, I shift the focus from single-species analyses and apply dynamic factor analysis (DFA) to a large, multispecies database of abundances over time, reducing the temporal dynamics of a large number of species to a small number of common trends. In producing these trends, interpretation of large multispecies data is greatly simplified. Furthermore, the common trends group species with similar temporal responses, revealing where there is potential to borrow strength across species to supplement those that are less well sampled. Five common trends were identified for each site and, crucially, these trends were strongly associated with life form, which showed distinctive signatures in the shape of the trends. Forbs and grasses, for example, demonstrated high levels of synchrony in their responses to rain events, although the signal for shrubs and subshrubs was weaker.
These responses were also found to differ over relatively large (>20 km) spatial scales. Thus, plant life form is a reasonable predictor of changes in abundance over time and offers some justification for borrowing information to supplement data from poorly sampled species, provided the data come from the same locality. The results of this thesis underscore the value of acknowledging, measuring and managing uncertainties, and show that these uncertainties can be used advantageously to guide inference, extract value from datasets thought to be unreliable, justify drawing on some additional sources of information while excluding others, and inform future data collection protocols. Several methods for addressing uncertainty are highlighted, such as the use of simulations when data are unavailable, powerful state-space modelling techniques to account for observation error, and identifying opportunities to supplement data from the literature, similar sites, or species with similar dynamics. There are several more options available for reducing and managing uncertainty, and it is ultimately up to the researcher to first recognise where uncertainties are likely to exist, explore the options, and decide how such uncertainties are to be addressed.
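The MARSS framework used in Chapter 3 separates process (environmental) variation from observation error explicitly. In its standard form (the symbols below follow the usual MARSS notation, not necessarily the thesis'):

```latex
\mathbf{x}_t = \mathbf{B}\,\mathbf{x}_{t-1} + \mathbf{u} + \mathbf{w}_t,
\quad \mathbf{w}_t \sim \mathrm{MVN}(\mathbf{0}, \mathbf{Q}),
\qquad
\mathbf{y}_t = \mathbf{Z}\,\mathbf{x}_t + \mathbf{a} + \mathbf{v}_t,
\quad \mathbf{v}_t \sim \mathrm{MVN}(\mathbf{0}, \mathbf{R}),
```

where $\mathbf{x}_t$ are the true (log-)abundances, $\mathbf{y}_t$ the observed counts, $\mathbf{Q}$ the environmental (process) variance and $\mathbf{R}$ the observation-error variance. The finding that the seed bank shows high observation error but low environmental variability, and the above-ground counts the converse, corresponds to the relative magnitudes of the fitted $\mathbf{R}$ and $\mathbf{Q}$ matrices.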

    Gaussian Process Modelling for Uncertainty Quantification in Convectively-Enhanced Dissolution Processes in Porous Media

    Numerical groundwater flow and dissolution models of physico-chemical processes in deep aquifers are usually subject to uncertainty in one or more of the model input parameters. This uncertainty is propagated through the equations and needs to be quantified and characterised in order to rely on the model outputs. In this paper we present a Gaussian process emulation method as a tool for performing uncertainty quantification in mathematical models for convection and dissolution processes in porous media. One of the advantages of this method is its ability to significantly reduce the computational cost of an uncertainty analysis compared to classical Monte Carlo methods, while yielding accurate results. We apply the methodology to a model of convectively-enhanced dissolution processes occurring during carbon capture and storage. In this model, the Gaussian process methodology fails due to the presence of multiple branches of solutions emanating from a bifurcation point, i.e., two equilibrium states exist rather than one. To overcome this issue we use a classifier as a precursor to the Gaussian process emulation, after which we are able to successfully perform a full uncertainty analysis in the vicinity of the bifurcation point.
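    A Gaussian process emulator replaces the expensive simulator with a cheap statistical surrogate fitted to a handful of simulator runs. A minimal NumPy sketch with a squared-exponential kernel follows; the training points, hyperparameters and stand-in "simulator" are illustrative assumptions, and a real study would also estimate the hyperparameters and add the classifier step described above to handle the two solution branches.

    ```python
    import numpy as np

    def sq_exp_kernel(xa, xb, lengthscale=0.3, variance=1.0):
        """Squared-exponential covariance between two 1-D input arrays."""
        d = xa[:, None] - xb[None, :]
        return variance * np.exp(-0.5 * (d / lengthscale) ** 2)

    def gp_emulate(x_train, y_train, x_new, noise=1e-8):
        """Posterior mean and variance of a zero-mean GP emulator."""
        K = sq_exp_kernel(x_train, x_train) + noise * np.eye(len(x_train))
        Ks = sq_exp_kernel(x_new, x_train)
        Kss = sq_exp_kernel(x_new, x_new)
        alpha = np.linalg.solve(K, y_train)       # K^{-1} y
        mean = Ks @ alpha                          # posterior mean
        cov = Kss - Ks @ np.linalg.solve(K, Ks.T)  # posterior covariance
        return mean, np.diag(cov)

    # Pretend these five points are expensive simulator runs
    x_train = np.array([0.0, 0.25, 0.5, 0.75, 1.0])
    y_train = np.sin(2 * np.pi * x_train)          # stand-in simulator output
    mean, var = gp_emulate(x_train, y_train, np.array([0.1, 0.6]))
    ```

    Once fitted, thousands of emulator evaluations cost a few matrix-vector products, which is where the saving over direct Monte Carlo on the simulator comes from.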

    Modelling Uncertainty in Black-box Classification Systems

    Currently, thanks to the Big Data boom, the excellent results obtained by deep learning models and the strong digital transformation experienced over the last years, many companies have decided to incorporate machine learning models into their systems. Some companies have detected this opportunity and are making a portfolio of artificial intelligence services available to third parties in the form of application programming interfaces (APIs). Developers then include calls to these APIs to incorporate AI functionalities in their products. Although this option saves time and resources, in most cases these APIs are exposed as black boxes whose details are unknown to their clients. The complexity of such products typically leads to a lack of control and knowledge of the internal components, which, in turn, can lead to potentially uncontrolled risks. Therefore, it is necessary to develop methods capable of evaluating the performance of these black boxes when applied to a specific application. In this work, we present a robust uncertainty-based method for evaluating the performance of both probabilistic and categorical classification black-box models, in particular APIs, that enriches the predictions obtained with an uncertainty score. This uncertainty score enables the detection of inputs with very confident but erroneous predictions, while protecting against out-of-distribution data points when deploying the model in a production setting. In the first part of the thesis, we develop a thorough revision of the concept of uncertainty, focusing on the uncertainty of classification systems. We review the existing related literature, describing the different approaches for modelling this uncertainty, its application to different use cases and some of its desirable properties. Next, we introduce the proposed method for modelling uncertainty in black-box settings.
Moreover, in the last chapters of the thesis, we showcase the method applied to different domains, including NLP and computer vision problems. Finally, we include two real-life applications of the method: classification of overqualification in job descriptions and readability assessment of texts. In short, the thesis proposes a method for computing the uncertainty associated with the predictions of APIs or external libraries for classification systems.
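One simple instance of such an uncertainty score, for a probabilistic black box that returns class probabilities, is the predictive (Shannon) entropy. The sketch below is a generic illustration of the idea, not the thesis' actual method, and the threshold value is an arbitrary assumption.

```python
import math

def predictive_entropy(probs):
    """Shannon entropy of a predicted class distribution:
    0 for a fully confident prediction, log(n_classes) at maximum."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def flag_uncertain(probs, threshold=0.5):
    """Enrich a black-box prediction with an uncertainty flag.
    High-entropy predictions are routed for review rather than trusted."""
    h = predictive_entropy(probs)
    label = max(range(len(probs)), key=probs.__getitem__)
    return {"label": label, "entropy": h, "uncertain": h > threshold}

confident = flag_uncertain([0.97, 0.02, 0.01])  # low entropy, trusted
ambiguous = flag_uncertain([0.40, 0.35, 0.25])  # high entropy, flagged
```

Note that entropy alone cannot detect a black box that is confidently wrong or out of distribution, which is precisely the gap the thesis' richer uncertainty score aims to close.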