
    Retroviral Integration Process in the Human Genome: Is It Really Non-Random? A New Statistical Approach

    Retroviral vectors are widely used in gene therapy to introduce therapeutic genes into patients' cells, since, once delivered to the nucleus, the genes of interest are stably inserted (integrated) into the target cell genome. There is now compelling evidence that integration of retroviral vectors follows non-random patterns in the mammalian genome, with a preference for active genes and regulatory regions. In particular, Moloney Leukemia Virus (MLV)–derived vectors show a tendency to integrate in the proximity of the transcription start site (TSS) of genes, occasionally resulting in the deregulation of gene expression and, where proto-oncogenes are targeted, in tumor initiation. This has drawn the attention of the scientific community to the molecular determinants of the retroviral integration process as well as to statistical methods to evaluate the genome-wide distribution of integration sites. In recent approaches, the observed distribution of MLV integration distances (IDs) from the TSS of the nearest gene is assumed to be non-random by empirical comparison with a random distribution generated by computational simulation procedures. To provide a statistical procedure to test the randomness of the retroviral insertion pattern, we propose a probability model (Beta distribution) based on IDs between two consecutive genes. We apply the procedure to a set of 595 unique MLV insertion sites retrieved from human hematopoietic stem/progenitor cells. The statistical goodness-of-fit test shows the suitability of this distribution to the observed data. Our statistical analysis confirms the preference of MLV-based vectors to integrate in promoter-proximal regions.
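
    As a rough illustration of the kind of test described above, the sketch below (a minimal, assumed example, not the authors' code) fits a Beta distribution to relative integration distances and checks goodness of fit with a Kolmogorov–Smirnov test; the data and variable names are hypothetical.

```python
# Minimal sketch (assumed): fit a Beta distribution to relative integration
# distances and test goodness of fit; the data here are simulated, not real
# MLV insertion sites.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Hypothetical relative position of each insertion site within the interval
# between two consecutive genes, scaled to (0, 1).
relative_distances = rng.beta(a=0.6, b=2.0, size=595)

# Maximum-likelihood fit of a Beta distribution with support fixed to [0, 1].
a_hat, b_hat, loc, scale = stats.beta.fit(relative_distances, floc=0, fscale=1)

# Goodness of fit: Kolmogorov-Smirnov test against the fitted Beta distribution.
ks_stat, p_value = stats.kstest(relative_distances, "beta",
                                args=(a_hat, b_hat, loc, scale))
print(f"alpha={a_hat:.2f}, beta={b_hat:.2f}, KS p-value={p_value:.3f}")

# Under purely random integration the relative distance would be Beta(1, 1)
# (uniform); a fitted alpha well below 1 concentrates mass near the TSS.
```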

    Beyond element-wise interactions: identifying complex interactions in biological processes

    Background: Biological processes typically involve the interactions of a number of elements (genes, cells) acting on each other. Such processes are often modelled as networks whose nodes are the elements in question and whose edges are pairwise relations between them (transcription, inhibition). But more often than not, elements actually work cooperatively or competitively to achieve a task, or an element can act on the interaction between two others, as in the case of an enzyme controlling a reaction rate. We call these types of interaction “complex” and propose ways to identify them from time-series observations. Methodology: We use Granger causality, a measure of the interaction between two signals, to characterize the influence of an enzyme on a reaction rate. We extend its traditional formulation to the case of multi-dimensional signals in order to capture group interactions, and not only element interactions. Our method is extensively tested on simulated data and applied to three biological datasets: microarray data of the Saccharomyces cerevisiae yeast, local field potential recordings of two brain areas, and a metabolic reaction. Conclusions: Our results demonstrate that complex Granger causality can reveal new types of relation between signals and is particularly suited to biological data. Our approach raises some fundamental issues for the systems biology approach, since finding all complex causalities (interactions) is an NP-hard problem.
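
    The method builds on classical pairwise Granger causality. The sketch below (a hedged illustration, not the authors' implementation) shows the standard two-regression formulation for scalar signals, which the paper generalises to multi-dimensional group signals; the toy data and function names are assumptions.

```python
# Minimal sketch (assumed): pairwise Granger causality from x to y via two
# least-squares autoregressions on a toy simulated system.
import numpy as np

def granger_causality(x, y, lag=2):
    """Return ln(var_restricted / var_full); values > 0 suggest x helps predict y."""
    n = len(y)
    Y = y[lag:]
    # Lagged copies of y (restricted model) and of x (added in the full model).
    own_lags = np.column_stack([y[lag - k:n - k] for k in range(1, lag + 1)])
    cross_lags = np.column_stack([x[lag - k:n - k] for k in range(1, lag + 1)])

    def resid_var(design):
        design = np.column_stack([np.ones(len(Y)), design])
        beta, *_ = np.linalg.lstsq(design, Y, rcond=None)
        return np.var(Y - design @ beta)

    return np.log(resid_var(own_lags) / resid_var(np.hstack([own_lags, cross_lags])))

# Toy example: y is partly driven by the previous value of x.
rng = np.random.default_rng(1)
x = rng.normal(size=500)
y = np.zeros(500)
for t in range(1, 500):
    y[t] = 0.5 * y[t - 1] + 0.8 * x[t - 1] + 0.1 * rng.normal()

print(granger_causality(x, y))  # noticeably positive
print(granger_causality(y, x))  # close to zero
```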

    Application of Bayesian network structure learning to identify causal variant SNPs from resequencing data

    Using single-nucleotide polymorphism (SNP) genotypes from the 1000 Genomes Project pilot3 data provided for Genetic Analysis Workshop 17 (GAW17), we applied Bayesian network structure learning (BNSL) to identify potential causal SNPs associated with the Affected phenotype. We focused on the setting in which target genes that harbor causal variants have already been chosen for resequencing; the goal was to detect true causal SNPs from among the measured variants in these genes. Examining all available SNPs in the known causal genes, BNSL produced a Bayesian network from which subsets of SNPs connected to the Affected outcome were identified and assessed for statistical significance using the hypergeometric distribution. The exploratory phase of analysis for pooled replicates sometimes identified a set of involved SNPs that contained more true causal SNPs than expected by chance in the Asian population. Analyses of single replicates gave inconsistent results. No nominally significant results were found in analyses of the African or European populations. Overall, the method was not able to identify sets of involved SNPs that included a higher proportion of true causal SNPs than expected by chance alone. We conclude that this method, as currently applied, is not effective for identifying causal SNPs that follow the simulation model for the GAW17 data set, which includes many rare causal SNPs.
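
    A minimal sketch of the enrichment test mentioned above, with purely hypothetical counts: given a set of SNPs connected to the Affected node in the learned network, the hypergeometric distribution gives the probability of seeing at least that many true causal SNPs by chance.

```python
# Minimal sketch (hypothetical counts, not the GAW17 pipeline): hypergeometric
# test for enrichment of true causal SNPs in the network-selected SNP set.
from scipy.stats import hypergeom

total_snps = 200        # all measured SNPs in the resequenced genes (assumed)
true_causal = 15        # true causal SNPs among them (known from the simulation model)
selected = 20           # SNPs connected to the Affected node in the Bayesian network
causal_in_selected = 5  # overlap between the selected set and the true causal SNPs

# P(X >= causal_in_selected) when drawing `selected` SNPs at random without replacement.
p_value = hypergeom.sf(causal_in_selected - 1, total_snps, true_causal, selected)
print(f"enrichment p-value: {p_value:.3f}")
```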

    Lifecourse socioeconomic status and type 2 diabetes: the role of chronic inflammation in the English Longitudinal Study of Ageing

    We examined the association between lifecourse socioeconomic status (SES) and the risk of type 2 diabetes at older ages, ascertaining the extent to which adult lifestyle factors and systemic inflammation explain this relationship. Data were drawn from the English Longitudinal Study of Ageing (ELSA) which, established in 2002, is a representative cohort study of individuals aged 50 years and over living in England. SES indicators were paternal social class, participants' education, participants' wealth, and a lifecourse socioeconomic index. Inflammatory markers (C-reactive protein and fibrinogen) and lifestyle factors were measured repeatedly; diabetes incidence (new cases) was monitored over 7.5 years of follow-up. Of the 6218 individuals free from diabetes at baseline (44% women, mean age 66 years), 423 developed diabetes during follow-up. Relative to the most advantaged people, those in the lowest lifecourse SES group experienced more than double the risk of diabetes (hazard ratio 2.59; 95% confidence interval (CI) = 1.81–3.71). Lifestyle factors explained 52% (95% CI: 30–85) and inflammatory markers 22% (95% CI: 13–37) of this gradient. Similar results were apparent with the separate SES indicators. In a general population sample, socioeconomic inequalities in the risk of type 2 diabetes extend to older ages and appear to partially originate from socioeconomic variations in modifiable factors, which include lifestyle and inflammation.
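
    The percentage of an SES gradient explained by mediators is conventionally computed from the attenuation of the log hazard ratio after adjustment. The sketch below illustrates that arithmetic using the unadjusted hazard ratio quoted above and a hypothetical adjusted value; the formula is a common convention, not necessarily the authors' exact method.

```python
# Minimal sketch (assumed convention): percentage of the SES-diabetes association
# explained by a set of mediators, from unadjusted and mediator-adjusted HRs.
import math

hr_unadjusted = 2.59  # lowest vs. highest lifecourse SES group (from the abstract)
hr_adjusted = 1.65    # hypothetical HR after additionally adjusting for lifestyle

pct_explained = 100 * (math.log(hr_unadjusted) - math.log(hr_adjusted)) / math.log(hr_unadjusted)
print(f"{pct_explained:.0f}% of the excess risk explained")  # ~47% in this toy example
```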

    A comparison between the APACHE II and Charlson Index Score for predicting hospital mortality in critically ill patients

    Background: Risk adjustment and mortality prediction in studies of critical care are usually performed using acuity of illness scores, such as Acute Physiology and Chronic Health Evaluation II (APACHE II), which emphasize physiological derangement. Common risk adjustment systems used in administrative datasets, like the Charlson index, are entirely based on the presence of co-morbid illnesses. The purpose of this study was to compare the discriminative ability of the Charlson index to the APACHE II in predicting hospital mortality in adult multisystem ICU patients. Methods: This was a population-based cohort design. The study sample consisted of adult (>17 years of age) residents of the Calgary Health Region admitted to a multisystem ICU between April 2002 and March 2004. Clinical data were collected prospectively and linked to hospital outcome data. Multiple regression analyses were used to compare the performance of APACHE II and the Charlson index. Results: The Charlson index was a poor predictor of mortality (C = 0.626). There was minimal difference between a baseline model containing age, sex and acute physiology score (C = 0.74) and models containing either chronic health points (C = 0.76) or Charlson index variations (C = 0.75, 0.76, 0.77). No important improvement in prediction occurred when the Charlson index was added to the full APACHE II model (C = 0.808 to C = 0.813). Conclusion: The Charlson index does not perform as well as the APACHE II in predicting hospital mortality in ICU patients. However, when acuity of illness scores are unavailable or are not recorded in a standard way, the Charlson index might be considered as an alternative method of risk adjustment and therefore facilitate comparisons between intensive care units.
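
    As an illustration of the discrimination comparison, the sketch below computes a C-statistic (area under the ROC curve) for two single-predictor logistic models; the predictors, effect sizes and data are entirely simulated, not the study's dataset.

```python
# Minimal sketch (simulated data): compare the discrimination (C-statistic) of two
# logistic models for hospital mortality, one using an acuity score and one using
# a comorbidity index.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(2)
n = 2000
apache = rng.normal(20, 8, n)                # simulated APACHE II scores
charlson = rng.poisson(2, n).astype(float)   # simulated Charlson index
logit = -4 + 0.12 * apache + 0.10 * charlson
died = rng.random(n) < 1 / (1 + np.exp(-logit))

for name, X in [("APACHE II", apache[:, None]), ("Charlson", charlson[:, None])]:
    model = LogisticRegression().fit(X, died)
    c_stat = roc_auc_score(died, model.predict_proba(X)[:, 1])
    print(f"{name}: C = {c_stat:.3f}")
```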

    The strong emergence of molecular structure

    One of the most plausible and widely discussed examples of strong emergence is molecular structure. The only detailed account of it, which has been very influential, is due to Robin Hendry and is formulated in terms of downward causation. This paper explains Hendry’s account of the strong emergence of molecular structure and argues that it is coherent only if one assumes a diachronic reflexive notion of downward causation. However, in the context of this notion of downward causation, the strong emergence of molecular structure faces three challenges that have not been met and which have so far remained unnoticed. First, the putative empirical evidence presented for the strong emergence of molecular structure equally undermines supervenience, which is one of the main tenets of strong emergence. Secondly, it is ambiguous how the assumption of determinate nuclear positions is invoked in support of strong emergence, as the role of this assumption in Hendry’s argument can be interpreted in more than one way. Lastly, there are understandings of causation which render the postulation of a downward causal relation between a molecule’s structure and its quantum mechanical entities untenable.

    Quantitative utilization of prior biological knowledge in the Bayesian network modeling of gene expression data

    Background: Bayesian Network (BN) is a powerful approach to reconstructing genetic regulatory networks from gene expression data. However, expression data by itself suffers from high noise and lack of power. Incorporating prior biological knowledge can improve the performance. As each type of prior knowledge on its own may be incomplete or limited by quality issues, integrating multiple sources of prior knowledge to utilize their consensus is desirable. Results: We introduce a new method to incorporate the quantitative information from multiple sources of prior knowledge. It first uses the Naïve Bayesian classifier to assess the likelihood of functional linkage between gene pairs based on prior knowledge; in this study we included co-citation in PubMed and semantic similarity in Gene Ontology annotation. A candidate network edge reservoir is then created in which the copy number of each edge is proportional to the estimated likelihood of linkage between the two corresponding genes. In network simulation, the Markov chain Monte Carlo sampling algorithm is adopted and samples from this reservoir at each iteration to generate new candidate networks. We evaluated the new algorithm using both simulated and real gene expression data, including that from a yeast cell cycle and a mouse pancreas development/growth study. Incorporating prior knowledge led to a ~2-fold increase in the number of known transcription regulations recovered, without significant change in the false positive rate. In contrast, without the prior knowledge BN modeling is not always better than a random selection, demonstrating the necessity in network modeling of supplementing the gene expression data with additional information. Conclusion: Our new development provides a statistical means to utilize the quantitative information in prior biological knowledge in the BN modeling of gene expression data, which significantly improves the performance.
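
    A toy sketch of the edge-reservoir idea described above, with made-up prior likelihoods: each candidate edge is placed in the reservoir with multiplicity proportional to its prior probability of linkage, and proposals for the next candidate network are drawn from that reservoir. Gene names and numbers are illustrative assumptions.

```python
# Minimal sketch (assumed numbers): a candidate-edge reservoir whose copy numbers
# are proportional to the prior likelihood of linkage; MCMC moves sample from it.
import random

# Prior likelihood of functional linkage for each gene pair, e.g. from a
# Naive Bayes combination of co-citation and GO similarity (values invented).
edge_prior = {("geneA", "geneB"): 0.80, ("geneA", "geneC"): 0.30, ("geneB", "geneC"): 0.05}

# Reservoir: each edge appears with multiplicity proportional to its prior.
reservoir = [edge for edge, p in edge_prior.items() for _ in range(max(1, round(100 * p)))]

def propose_edge():
    """Draw the edge to add or modify in the next candidate network."""
    return random.choice(reservoir)

print(propose_edge())
```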

    Search for CP violation in D+ → ϕπ+ and Ds+ → KS0π+ decays

    A search for CP violation in D+ → ϕπ+ decays is performed using data collected in 2011 by the LHCb experiment, corresponding to an integrated luminosity of 1.0 fb−1 at a centre-of-mass energy of 7 TeV. The CP-violating asymmetry is measured to be (−0.04 ± 0.14 ± 0.14)% for candidates with K−K+ mass within 20 MeV/c² of the ϕ meson mass. A search for a CP-violating asymmetry that varies across the ϕ mass region of the D+ → K−K+π+ Dalitz plot is also performed, and no evidence for CP violation is found. In addition, the CP asymmetry in the Ds+ → KS0π+ decay is measured to be (0.61 ± 0.83 ± 0.14)%.
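
    For orientation, the raw charge asymmetry underlying such a measurement is the normalised difference of signal yields for the two charge-conjugate decays; the sketch below uses invented yields and ignores the production and detection asymmetries that the full analysis corrects for.

```python
# Minimal sketch (hypothetical yields, not LHCb numbers): raw charge asymmetry
# from signal yields of the two charge-conjugate decays.
n_dplus = 101_200   # assumed D+ -> phi pi+ signal yield
n_dminus = 101_000  # assumed D- -> phi pi- signal yield

a_raw = (n_dplus - n_dminus) / (n_dplus + n_dminus)
print(f"A_raw = {100 * a_raw:.3f}%")  # ~0.1% in this toy example
```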

    Utilisation of an operative difficulty grading scale for laparoscopic cholecystectomy

    Background A reliable system for grading the operative difficulty of laparoscopic cholecystectomy would standardise the description of findings and the reporting of outcomes. The aim of this study was to validate a difficulty grading system (Nassar scale), testing its applicability and consistency in two large prospective datasets. Methods Patient and disease-related variables and 30-day outcomes were identified in two prospective cholecystectomy databases: the multi-centre prospective cohort of 8820 patients from the recent CholeS Study and the single-surgeon series containing 4089 patients. Operative data and patient outcomes were correlated with the Nassar operative difficulty scale, using Kendall’s tau for dichotomous variables or Jonckheere–Terpstra tests for continuous variables. A ROC curve analysis was performed to quantify the predictive accuracy of the scale for each outcome, with continuous outcomes dichotomised prior to analysis. Results A higher operative difficulty grade was consistently associated with worse outcomes for the patients in both the reference and CholeS cohorts. The median length of stay increased from 0 to 4 days, and the 30-day complication rate from 7.6% to 24.4%, as the difficulty grade increased from 1 to 4/5 (both p < 0.001). In the CholeS cohort, a higher difficulty grade was found to be most strongly associated with conversion to open surgery and 30-day mortality (AUROC = 0.903 and 0.822, respectively). On multivariable analysis, the Nassar operative difficulty scale was found to be a significant independent predictor of operative duration, conversion to open surgery, 30-day complications and 30-day reintervention (all p < 0.001). Conclusion We have shown that an operative difficulty scale can standardise the description of operative findings by multiple grades of surgeons to facilitate audit, training assessment and research. It provides a tool for reporting operative findings, disease severity and technical difficulty, and can be utilised in future research to reliably compare outcomes according to case mix and intra-operative difficulty.
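
    The sketch below illustrates the kind of ROC analysis reported above on simulated data: an ordinal difficulty grade is used directly as the score for a dichotomous outcome such as conversion to open surgery. The grades, effect size and outcome rates are hypothetical.

```python
# Minimal sketch (simulated data): AUROC of an ordinal operative difficulty grade
# for predicting a dichotomous outcome (conversion to open surgery).
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(3)
grade = rng.integers(1, 6, size=5000)              # difficulty grade 1-5
p_convert = 1 / (1 + np.exp(-(-6 + 1.4 * grade)))  # conversion more likely at high grades
converted = rng.random(5000) < p_convert

print(f"AUROC = {roc_auc_score(converted, grade):.3f}")
```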

    Analysis and Practical Guideline of Constraint-Based Boolean Method in Genetic Network Inference

    The Boolean-based method, despite its simplicity, would be a more attractive approach for inferring a network from high-throughput expression data if its effectiveness were not limited by a high rate of false positive predictions. In this study, we explored factors that could simply be adjusted to improve the accuracy of network inference. Our work focused on the analysis of the effects of discretisation methods, biological constraints, and the stringency of Boolean function assignment on the performance of the Boolean network, including accuracy, precision, specificity and sensitivity, using three sets of microarray time-series data. The study showed that biological constraints have a pivotal influence on network performance, more so than the other factors. They can reduce the variation in network performance resulting from the arbitrary selection of discretisation methods and stringency settings. We also presented the master Boolean network as an approach to establish a unique solution for Boolean analysis. The information acquired from the analysis was summarised and deployed as a general guideline for the efficient use of the Boolean-based method in network inference. Finally, we provided an example of the use of this guideline in the study of the Arabidopsis circadian clock genetic network, from which much interesting biological information can be inferred.
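
    As a toy illustration of the Boolean-based approach discussed above (not the paper's pipeline), the sketch below binarises expression time series by their medians and scores simple one-regulator activation/inhibition rules against the observed transitions; biological constraints would further restrict which regulators are allowed. Gene names and data are invented.

```python
# Minimal sketch (toy data): binarise expression by the median, then score
# single-regulator Boolean rules by how many t -> t+1 transitions they explain.
import numpy as np
from itertools import product

rng = np.random.default_rng(4)
expr = {"geneA": rng.normal(size=20), "geneB": rng.normal(size=20)}
binary = {g: (v > np.median(v)).astype(int) for g, v in expr.items()}

def rule_accuracy(regulator, target, rule):
    """Fraction of transitions where rule(regulator[t]) == target[t+1]."""
    r, s = binary[regulator], binary[target]
    return np.mean([rule(r[t]) == s[t + 1] for t in range(len(s) - 1)])

# Candidate rules: identity (activation) and negation (inhibition).
rules = {"activates": lambda x: x, "inhibits": lambda x: 1 - x}
for reg, tgt in product(binary, repeat=2):
    if reg != tgt:
        best = max(rules, key=lambda name: rule_accuracy(reg, tgt, rules[name]))
        print(f"{reg} {best} {tgt}: accuracy {rule_accuracy(reg, tgt, rules[best]):.2f}")
```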