Similarity-Detection and Localization
The detection of similarities between long DNA and protein sequences is
studied using concepts of statistical physics. It is shown that mutual
similarities can be detected by sequence alignment methods only if their amount
exceeds a threshold value. The onset of detection is a continuous phase
transition which can be viewed as a localization-delocalization transition. The
``fidelity'' of the alignment is the order parameter of that transition; it
leads to criteria for the selection of optimal alignment parameters. Comment: 4 pages including 4 figures (308kb post-script file)
A statistical concept to assess the uncertainty in Bayesian model weights and its impact on model ranking
Bayesian model averaging (BMA) ranks the plausibility of alternative conceptual models according to Bayes' theorem. A prior belief about each model's adequacy is updated to a posterior model probability based on the model's skill in reproducing observed data and on the principle of parsimony. The posterior model probabilities are then used as model weights for model ranking, selection, or averaging. Despite the statistically rigorous BMA procedure, model weights can become uncertain quantities due to measurement noise in the calibration data set or due to uncertainty in model input. Uncertain weights may in turn compromise the reliability of BMA results. We present a new statistical concept to investigate this weighting uncertainty, and thus to assess the significance of model weights and the confidence in model ranking. Our concept is to resample the uncertain input or output data and then to analyze the induced variability in model weights. In the special case of weighting uncertainty due to measurement noise in the calibration data set, we interpret statistics of Bayesian model evidence to assess the distance of a model's performance from the theoretical upper limit. To illustrate our suggested approach, we investigate the reliability of soil-plant model selection following up on a study by Wöhling et al. (2015). Results show that the BMA routine should be equipped with our suggested upgrade to (1) reveal the significant but otherwise undetected impact of measurement noise on model ranking results and (2) decide whether the considered set of models should be extended with better performing alternatives.
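The resampling idea described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the Gaussian error model with known standard deviation `sigma`, the brute-force Monte Carlo estimate of the model evidence, and the two toy models are all assumptions made for illustration. The sketch perturbs the calibration data with fresh noise realizations, recomputes the posterior model weights each time, and reports the spread of those weights.

```python
import numpy as np


def log_evidence(y_obs, predictions, sigma):
    """Monte Carlo estimate of log p(y | M).

    `predictions` has shape (n_prior_samples, n_obs): one model output per
    parameter set drawn from the model's prior (an assumed setup).
    """
    resid = y_obs[None, :] - predictions
    log_like = (-0.5 * np.sum((resid / sigma) ** 2, axis=1)
                - y_obs.size * np.log(sigma * np.sqrt(2.0 * np.pi)))
    # Log-mean-exp for numerical stability.
    m = log_like.max()
    return m + np.log(np.mean(np.exp(log_like - m)))


def bma_weights(log_evidences, prior=None):
    """Posterior model probabilities from log evidences via Bayes' theorem."""
    log_evidences = np.asarray(log_evidences, dtype=float)
    if prior is None:
        # Assumption: equal prior belief in every model.
        prior = np.full(log_evidences.size, 1.0 / log_evidences.size)
    log_post = log_evidences + np.log(prior)
    log_post -= log_post.max()
    w = np.exp(log_post)
    return w / w.sum()


def resampled_weights(y_obs, model_predictions, sigma, n_resamples=200, seed=0):
    """Model weights under repeated perturbation of the data by noise."""
    rng = np.random.default_rng(seed)
    weights = np.empty((n_resamples, len(model_predictions)))
    for i in range(n_resamples):
        y_pert = y_obs + rng.normal(0.0, sigma, size=y_obs.shape)
        log_ev = [log_evidence(y_pert, p, sigma) for p in model_predictions]
        weights[i] = bma_weights(log_ev)
    return weights


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    x = np.linspace(0.0, 1.0, 10)
    sigma = 0.3
    y_obs = 1.0 + 0.5 * x + rng.normal(0.0, sigma, size=x.size)

    # Hypothetical toy models: a constant and a straight line,
    # each represented by outputs for 2000 prior parameter draws.
    a = rng.normal(1.0, 1.0, size=2000)
    b = rng.normal(0.0, 1.0, size=2000)
    constant = np.repeat(a[:, None], x.size, axis=1)
    linear = a[:, None] + b[:, None] * x[None, :]

    w = resampled_weights(y_obs, [constant, linear], sigma)
    print("mean weights:", w.mean(axis=0))
    print("weight std:  ", w.std(axis=0))
```

A large standard deviation of the weights relative to their mean would suggest that the model ranking is sensitive to measurement noise, which is the kind of diagnostic the abstract argues for.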
Finding the right balance between groundwater model complexity and experimental effort via Bayesian model selection
Groundwater modelers face the challenge of how to assign representative parameter values to the studied aquifer. Several approaches are available to parameterize spatial heterogeneity in aquifer parameters. They differ in their conceptualization and complexity, ranging from homogeneous models to heterogeneous random fields. While it is common practice to invest more effort into data collection for models with a finer resolution of heterogeneities, there is little guidance on how much data is required to justify a certain level of model complexity. In this study, we propose to use concepts related to Bayesian model selection to identify this balance. We demonstrate our approach on the characterization of a heterogeneous aquifer via hydraulic tomography in a sandbox experiment (Illman et al., 2010). We consider four increasingly complex parameterizations of hydraulic conductivity: (1) effective homogeneous medium, (2) geology-based zonation, (3) interpolation by pilot points, and (4) geostatistical random fields. First, we investigate the shift in justified complexity with increasing amount of available data by constructing a model confusion matrix. This matrix indicates the maximum level of complexity that can be justified given a specific experimental setup. Second, we determine which parameterization is most adequate given the observed drawdown data. Third, we test how the different parameterizations perform in a validation setup. The results of our test case indicate that aquifer characterization via hydraulic tomography does not necessarily require (or justify) a geostatistical description. Instead, a zonation-based model might be a more robust choice, but only if the zonation is geologically adequate.
The impact of illustrated side effect information on understanding and sustained retention of antiretroviral side effect knowledge:
About 7.5 million South Africans access free drinking water via communal taps provided by municipalities under a free basic water policy. Supplying running water for free to low-income communities is essential but can result in water wastage due to the potential indifference of non-paying end-consumers. The consequence is a loss of municipal water and financial resources. We outline a new strategy that rewards low-income communities for reducing water wastage. The incentive strategy promotes water conservation and community development and decreases recurring water-related public expenses. The concept is funded by a percentage of the municipal cost savings generated by the resulting water conservation.