84 research outputs found
A Bayesian assessment of an approximate model for unconfined water flow in sloping layered porous media
The prediction of water table height in unconfined layered porous media is a difficult modelling problem that typically requires numerical simulation. This paper proposes an analytical model to approximate the exact solution based on a steady-state Dupuit–Forchheimer analysis. The key contribution relative to a similar model in the literature lies in the ability of the proposed model to consider more than two layers with different thicknesses and slopes, so that the existing model becomes a special case of the one proposed here. In addition, a model assessment methodology based on the Bayesian inverse problem is proposed to efficiently identify the values of the physical parameters for which the proposed model is accurate when compared against a reference model given by MODFLOW-NWT, the open-source finite-difference code by the U.S. Geological Survey. Based on numerical results for a representative case study, the ratio of vertical recharge rate to hydraulic conductivity emerges as a key parameter in terms of model accuracy so that, when appropriately bounded, both the proposed model and MODFLOW-NWT provide almost identical results.
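As a rough illustration of why the recharge-to-conductivity ratio matters, the sketch below evaluates the classical steady-state Dupuit–Forchheimer water table for the simplest possible case: a single homogeneous layer on a horizontal base with fixed heads at both ends. This is not the layered, sloping model of the paper. The function name, the argument names and the numbers in the example are assumptions made for illustration only.

```python
import math


def dupuit_head(x, length, h0, h_end, recharge_over_k):
    """Water table height h(x) for a single horizontal layer (illustrative).

    Solves d/dx(K h dh/dx) = -N with h(0) = h0 and h(length) = h_end,
    which gives h^2 = h0^2 + (h_end^2 - h0^2) x / L + (N / K) x (L - x).
    recharge_over_k is the dimensionless ratio N / K.
    """
    h_squared = (
        h0 ** 2
        + (h_end ** 2 - h0 ** 2) * x / length
        + recharge_over_k * x * (length - x)
    )
    return math.sqrt(h_squared)


# Hypothetical geometry: 100 m long aquifer, boundary heads of 10 m and 8 m.
profile = [dupuit_head(x, 100.0, 10.0, 8.0, 0.001) for x in range(0, 101, 25)]
```

In this simplified setting the recharge rate and the conductivity enter only through their ratio, so doubling both leaves the profile unchanged.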
A Predictive Model of Intein Insertion Site for Use in the Engineering of Molecular Switches
Inteins are intervening protein domains with self-splicing ability that can be used as molecular switches to control activity of their host protein. Successfully engineering an intein into a host protein requires identifying an insertion site that permits intein insertion and splicing while allowing for proper folding of the mature protein post-splicing. By analyzing sequence- and structure-based properties of native intein insertion sites, we have identified four features that showed significant correlation with the location of the intein insertion sites and may therefore be useful in predicting insertion sites in other proteins that provide native-like intein function. Three of these properties (the distance to the active site and dimer interface site, the SVM score of the splice site cassette, and the sequence conservation of the site) showed statistically significant correlation and strong predictive power, with area under the curve (AUC) values of 0.79, 0.76, and 0.73, respectively, while the distance to secondary structure/loop junction showed significance but with less predictive power (AUC of 0.54). In a case study of 20 insertion sites in the XynB xylanase, two features of native insertion sites showed correlation with the splice sites and demonstrated predictive value in selecting non-native splice sites. Structural modeling of intein insertions at two sites highlighted the role that the insertion site location could play in the ability of the intein to modulate activity of the host protein. These findings can be used to enrich the selection of insertion sites capable of supporting intein splicing and hosting an intein switch.
Pan-cancer analysis of whole genomes
Cancer is driven by genetic change, and the advent of massively parallel sequencing has enabled systematic documentation of this variation at the whole-genome scale(1-3). Here we report the integrative analysis of 2,658 whole-cancer genomes and their matching normal tissues across 38 tumour types from the Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium of the International Cancer Genome Consortium (ICGC) and The Cancer Genome Atlas (TCGA). We describe the generation of the PCAWG resource, facilitated by international data sharing using compute clouds. On average, cancer genomes contained 4-5 driver mutations when combining coding and non-coding genomic elements; however, in around 5% of cases no drivers were identified, suggesting that cancer driver discovery is not yet complete. Chromothripsis, in which many clustered structural variants arise in a single catastrophic event, is frequently an early event in tumour evolution; in acral melanoma, for example, these events precede most somatic point mutations and affect several cancer-associated genes simultaneously. Cancers with abnormal telomere maintenance often originate from tissues with low replicative activity and show several mechanisms of preventing telomere attrition to critical levels. Common and rare germline variants affect patterns of somatic mutation, including point mutations, structural variants and somatic retrotransposition. A collection of papers from the PCAWG Consortium describes non-coding mutations that drive cancer beyond those in the TERT promoter(4); identifies new signatures of mutational processes that cause base substitutions, small insertions and deletions and structural variation(5,6); analyses timings and patterns of tumour evolution(7); describes the diverse transcriptional consequences of somatic mutation on splicing, expression levels, fusion genes and promoter activity(8,9); and evaluates a range of more-specialized features of cancer genomes(8,10-18).
Prediction of pile set-up in clays and sands
IOP Conference Series: Materials Science and Engineering v. 10 is the conference proceedings of the 9th WCCM/APCOM 2010. An increase in pile capacity after initial driving has been widely observed in clays and sands over decades. The phenomenon is referred to as pile set-up by geotechnical engineers. More economical pile design may benefit from this time-dependent increase, subject to a reliable prediction. Simple empirical relations linking the current capacity to the initial capacity and the elapsed time after driving are available in the literature, with different model parameters being suggested for clays and sands, respectively. Nevertheless, the appropriateness of the relations and the confidence interval of the model parameters are rarely investigated, and this hinders the application of these formulae. In this study, a revised single-parameter empirical relation is proposed based on the existing formulae. A comprehensive database of pile field test data in clayey and sandy ground is compiled from the literature, and Bayesian analysis is conducted on both these empirical formulae independently for clays and sands. Bayesian inference allows not only the estimation of the uncertain parameter but also the quantification of the associated uncertainty in the form of a probability distribution. This study sheds light on the confidence interval of the model parameter and provides designers with a more reliable prediction of the additional capacity due to pile set-up. The 9th World Congress on Computational Mechanics and 4th Asian Pacific Congress on Computational Mechanics (WCCM/APCOM 2010), Sydney, Australia, 19-23 July 2010. In IOP Conference Series: Materials Science and Engineering, 2010, v. 10 n. 1, p. 1-8, article no. 01210
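To make the idea of a single-parameter set-up relation with a quantified uncertainty concrete, the sketch below assumes a log-linear form often quoted in the set-up literature, Q(t)/Q0 = 1 + A log10(t/t0), and evaluates a posterior for A on a grid. The abstract does not state the revised relation, so this form, the uniform prior, the known Gaussian noise level sigma, the reference time t0 and all data values are assumptions for illustration rather than the study's method.

```python
import math


def setup_ratio(t, a, t0=1.0):
    """Assumed log-linear set-up relation: capacity ratio Q(t)/Q0 at time t."""
    return 1.0 + a * math.log10(t / t0)


def grid_posterior(times, ratios, a_grid, sigma=0.1, t0=1.0):
    """Normalised posterior weights for A on a grid (uniform prior, Gaussian noise)."""
    log_post = []
    for a in a_grid:
        sse = sum((r - setup_ratio(t, a, t0)) ** 2 for t, r in zip(times, ratios))
        log_post.append(-0.5 * sse / sigma ** 2)
    shift = max(log_post)  # subtract the maximum for numerical stability
    weights = [math.exp(lp - shift) for lp in log_post]
    total = sum(weights)
    return [w / total for w in weights]


def posterior_mean_std(a_grid, weights):
    """Mean and standard deviation of A under the gridded posterior."""
    mean = sum(a * w for a, w in zip(a_grid, weights))
    var = sum((a - mean) ** 2 * w for a, w in zip(a_grid, weights))
    return mean, math.sqrt(var)


# Hypothetical measurements: elapsed times in days and measured Q(t)/Q0.
a_values = [i / 100 for i in range(0, 101)]
post = grid_posterior([10.0, 100.0], [1.5, 2.0], a_values)
mean_a, std_a = posterior_mean_std(a_values, post)
```

The spread of the gridded posterior plays the role of the confidence interval discussed in the abstract: more test data or a smaller noise level narrows it.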
Statistical modal identification using ambient or strong wind response data
The problem of identification of the modal parameters of a structural model using measured ambient or strong wind response time histories is addressed. A Bayesian probabilistic approach is followed to obtain not only the most probable (optimal) values but also the probability distribution of the updated modal parameters. This is very important when one plans to use these estimates for further processing, such as for updating the theoretical finite-element model of the structure, because it provides a rational basis for weighting differently the errors of the various modal parameters, the errors being the differences between the theoretical and identified values of these parameters. The approach is introduced for an SDOF system and it can be extended to general MDOF systems. The statistical properties of an estimator of the spectral density are presented. Based on these statistical results, expressions for the updated probability density function (PDF) of the modal parameters are derived. The updated PDF is well approximated by a Gaussian distribution centered at the optimal parameters at which the updated PDF is maximized. Numerical examples using simulated data are presented to illustrate the proposed method.
Bayesian probabilistic approach for the correlations of compression index for marine clays
The compression index is an important soil property that is essential to many geotechnical designs. Over the decades, a number of empirical correlations have been proposed to relate the compressibility to other soil index properties, such as the liquid limit, plasticity index, in situ water content, void ratio, specific gravity, etc. The reliability and thus predictability of these correlations are frequently questioned. Moreover, selection between simple and complicated models is a difficult task and often depends on subjective judgments. A more complicated model obviously provides a "better fit" to the data but does not necessarily offer an acceptable degree of robustness to measurement noise and modeling error. In the present study, the Bayesian probabilistic approach for model class selection is used to revisit the empirical multivariate linear regression formula of the compression index. The criterion in the formula structure selection is based on the plausibility of a class of formulas conditional on the measurement, instead of considering the likelihood only. The plausibility balances between the data fitting capability and sensitivity to measurement and modeling error, which is quantified by the Ockham factor. The Bayesian method is applied to analyze a data set of 795 records, including the compression index and other well-known geotechnical index properties of marine clay samples collected from various sites in South Korea. It turns out that the correlation formula linking the compression index to the initial void ratio and liquid limit possesses the highest plausibility among a total of 18 candidate classes of formulas. The physical significance of this most plausible correlation is addressed. It is consistent with previous studies, and the Bayesian method provides confirmation from another angle. © 2009 ASCE.
- …