34 research outputs found

    A new pairwise kernel for biological network inference with support vector machines

    BACKGROUND: Much recent work in bioinformatics has focused on the inference of various types of biological networks, representing gene regulation, metabolic processes, protein-protein interactions, etc. A common setting involves inferring network edges in a supervised fashion from a set of high-confidence edges, possibly characterized by multiple, heterogeneous data sets (protein sequence, gene expression, etc.). RESULTS: Here, we distinguish between two modes of inference in this setting: direct inference based upon similarities between nodes joined by an edge, and indirect inference based upon similarities between one pair of nodes and another pair of nodes. We propose a supervised approach for the direct case by translating it into a distance metric learning problem. A relaxation of the resulting convex optimization problem leads to the support vector machine (SVM) algorithm with a particular kernel for pairs, which we call the metric learning pairwise kernel. This new kernel for pairs can easily be used by most SVM implementations to solve problems of supervised classification and inference of pairwise relationships from heterogeneous data. We demonstrate, using several real biological networks and genomic datasets, that this approach often improves upon the state-of-the-art SVM for indirect inference with another pairwise kernel, and that the combination of both kernels always improves upon each individual kernel. CONCLUSION: The metric learning pairwise kernel is a new formulation to infer pairwise relationships with SVM, which provides state-of-the-art results for the inference of several biological networks from heterogeneous genomic data.
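    The abstract above does not state the kernel formula. As a minimal sketch, assuming the form commonly given for the metric learning pairwise kernel (the squared base-kernel inner product of the two pair differences) and assuming the "other pairwise kernel" is a tensor product pairwise kernel, the two kernels and their sum could be written as follows. The function names and the linear base kernel are illustrative choices, not taken from the paper.

```python
import numpy as np


def linear_kernel(a, b):
    """Base kernel between two node feature vectors (assumed linear here)."""
    return float(np.dot(a, b))


def mlpk(pair1, pair2, base=linear_kernel):
    """Assumed metric learning pairwise kernel between pairs (x1, x2) and (x3, x4).

    Equals the squared base-kernel inner product of (x1 - x2) and (x3 - x4),
    so it compares how the two nodes within each pair differ (direct inference).
    """
    x1, x2 = pair1
    x3, x4 = pair2
    return (base(x1, x3) - base(x1, x4) - base(x2, x3) + base(x2, x4)) ** 2


def tppk(pair1, pair2, base=linear_kernel):
    """Assumed tensor product pairwise kernel: compares one pair to another pair."""
    x1, x2 = pair1
    x3, x4 = pair2
    return base(x1, x3) * base(x2, x4) + base(x1, x4) * base(x2, x3)


def combined(pair1, pair2, base=linear_kernel):
    """Sum of both kernels; a sum of valid kernels is itself a valid kernel."""
    return mlpk(pair1, pair2, base) + tppk(pair1, pair2, base)


a, b = np.array([1.0, 0.0]), np.array([0.0, 1.0])
c, d = np.array([1.0, 1.0]), np.array([0.0, 0.0])
k_same = mlpk((a, b), (a, b))   # pair compared with itself
k_orth = mlpk((a, b), (c, d))   # pair differences are orthogonal
```

    Both kernels are symmetric in the order of the nodes within a pair, which is what lets them represent undirected edges. A Gram matrix built from either function can be passed to an SVM implementation that accepts precomputed kernels.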

    BayesPI - a new model to study protein-DNA interactions: a case study of condition-specific protein binding parameters for Yeast transcription factors

    BACKGROUND: We have combined Bayesian model regularization with biophysical modeling of protein-DNA interactions and of genome-wide nucleosome positioning to study protein-DNA interactions using a high-throughput dataset. The newly developed method (BayesPI) includes the estimation of transcription factor (TF) binding energy matrices, the computation of the binding affinity of a TF target site, and the corresponding chemical potential. RESULTS: The method was successfully tested on synthetic ChIP-chip datasets and real yeast ChIP-chip experiments. Subsequently, it was used to estimate condition-specific and species-specific protein-DNA interactions for several yeast TFs. CONCLUSION: The results revealed that the modification of the protein binding parameters and the variation of the individual nucleotide affinity in either recognition or flanking sequences occurred under different stresses and in different species. The findings suggest that such modifications may be adaptive and play roles in the formation of the environment-specific binding patterns of yeast TFs and in the divergence of TF binding sites across the related yeast species.

    Towards standardized measurement of adverse events in spine surgery: conceptual model and pilot evaluation

    BACKGROUND: Independent of efficacy, information on safety of surgical procedures is essential for informed choices. We seek to develop standardized methodology for describing the safety of spinal operations and apply these methods to study lumbar surgery. We present a conceptual model for evaluating the safety of spine surgery and describe development of tools to measure principal components of this model: (1) specifying outcome by explicit criteria for adverse event definition, mode of ascertainment, cause, severity, or preventability, and (2) quantitatively measuring predictors such as patient factors, comorbidity, severity of degenerative spine disease, and invasiveness of spine surgery. METHODS: We created operational definitions for 176 adverse occurrences and established multiple mechanisms for reporting them. We developed new methods to quantify the severity of adverse occurrences, degeneration of lumbar spine, and invasiveness of spinal procedures. Using kappa statistics and intra-class correlation coefficients, we assessed agreement for the following: four reviewers independently coding etiology, preventability, and severity for 141 adverse occurrences, two observers coding lumbar spine degenerative changes in 10 selected cases, and two researchers coding invasiveness of surgery for 50 initial cases. RESULTS: During the first six months of prospective surveillance, rigorous daily medical record reviews identified 92.6% of the adverse occurrences we recorded, and voluntary reports by providers identified 38.5% (surgeons reported 18.3%, inpatient rounding team reported 23.1%, and conferences discussed 6.1%). Trained observers had fair agreement in classifying etiology of 141 adverse occurrences into 18 categories (kappa = 0.35), but agreement was substantial (kappa ≥ 0.61) for 4 specific categories: technical error, failure in communication, systems failure, and no error. Preventability assessment had moderate agreement (mean weighted kappa = 0.44). Adverse occurrence severity rating had fair agreement (mean weighted kappa = 0.33) when using a scale based on the JCAHO Sentinel Event Policy, but agreement was substantial for severity ratings on a new 11-point numerical severity scale (ICC = 0.74). There was excellent inter-rater agreement for a lumbar degenerative disease severity score (ICC = 0.98) and an index of surgery invasiveness (ICC = 0.99). CONCLUSION: Composite measures of disease severity and surgery invasiveness may allow development of risk-adjusted predictive models for adverse events in spine surgery. Standard measures of adverse events and risk adjustment may also facilitate post-marketing surveillance of spinal devices, effectiveness research, and quality improvement.
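    The agreement figures above are kappa statistics, which correct raw agreement for the agreement expected by chance. As a minimal sketch, assuming the simplest case of two raters and unweighted categories (the study itself used four reviewers and weighted kappa, which this does not reproduce), Cohen's kappa can be computed as below. The category labels and ratings are made up for illustration.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed agreement: fraction of items given the same label by both raters.
    observed = sum(x == y for x, y in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal label frequencies.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical etiology codes for four adverse occurrences.
reviewer_1 = ["technical", "technical", "no_error", "no_error"]
reviewer_2 = ["technical", "no_error", "no_error", "no_error"]
kappa = cohen_kappa(reviewer_1, reviewer_2)
```

    In this toy example the raters agree on 3 of 4 items (0.75) while chance agreement is 0.5, giving kappa = 0.5. Under the commonly used Landis and Koch bands, which match the abstract's wording, that would count as moderate agreement.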

    Learning a Prior on Regulatory Potential from eQTL Data

    Genome-wide RNA expression data provide a detailed view of an organism's biological state; hence, a dataset measuring expression variation between genetically diverse individuals (eQTL data) may provide important insights into the genetics of complex traits. However, with data from a relatively small number of individuals, it is difficult to distinguish true causal polymorphisms from the large number of possibilities. The problem is particularly challenging in populations with significant linkage disequilibrium, where traits are often linked to large chromosomal regions containing many genes. Here, we present a novel method, Lirnet, that automatically learns a regulatory potential for each sequence polymorphism, estimating how likely it is to have a significant effect on gene expression. This regulatory potential is defined in terms of “regulatory features”—including the function of the gene and the conservation, type, and position of genetic polymorphisms—that are available for any organism. The extent to which the different features influence the regulatory potential is learned automatically, making Lirnet readily applicable to different datasets, organisms, and feature sets. We apply Lirnet both to the human HapMap eQTL dataset and to a yeast eQTL dataset and provide statistical and biological results demonstrating that Lirnet produces significantly better regulatory programs than other recent approaches. We demonstrate in the yeast data that Lirnet can correctly suggest a specific causal sequence variation within a large, linked chromosomal region. In one example, Lirnet uncovered a novel, experimentally validated connection between Puf3—a sequence-specific RNA binding protein—and P-bodies—cytoplasmic structures that regulate translation and RNA stability—as well as the particular causative polymorphism, a SNP in Mkt1, that induces the variation in the pathway

    Computational Identification of Transcriptional Regulators in Human Endotoxemia

    One of the great challenges in the post-genomic era is to decipher the underlying principles governing the dynamics of biological responses. As modulating gene expression levels is among the key regulatory responses of an organism to changes in its environment, identifying biologically relevant transcriptional regulators and their putative regulatory interactions with target genes is an essential step towards studying the complex dynamics of transcriptional regulation. We present an analysis that integrates various computational and biological aspects to explore the transcriptional regulation of systemic inflammatory responses through a human endotoxemia model. Given a high-dimensional transcriptional profiling dataset from human blood leukocytes, we extract an elementary set of temporal dynamic responses that capture the essence of a pro-inflammatory phase, a counter-regulatory response, and a dysregulation in leukocyte bioenergetics. Based on these expression patterns, we propose fourteen inflammation-specific gene batteries that represent groups of hypothetically ‘coregulated’ genes. Subsequently, statistically significant cis-regulatory modules (CRMs) are identified and decomposed into a list of 34 critical transcription factors, which are validated largely against the primary literature. Finally, our analysis allows the construction of a dynamic representation of the temporal transcriptional regulatory program across the host, deciphering possible combinatorial interactions among factors under which they might be active. Although much remains to be explored, this study has computationally identified key transcription factors and proposed a putative time-dependent transcriptional regulatory program associated with critical transcriptional inflammatory responses. These results provide a solid foundation for future investigations to elucidate the transcriptional regulatory mechanisms underlying the host inflammatory response. Also, the assumption that functionally relevant coexpressed genes are more likely to share a common transcriptional regulatory mechanism appears promising, making the proposed framework essential for unravelling context-specific transcriptional regulatory interactions underlying diverse mammalian biological processes.

    Does electrification spur the fertility transition? evidence from Indonesia

    We analyze various pathways through which access to electricity affects fertility in Indonesia, using a district-level difference-in-differences approach. The electrification rate increased by 65% over the study period, and our results suggest that the subsequent effects on fertility account for about 18% to 24% of the overall decline in fertility. A key channel is increased exposure to television. Drawing additionally on several waves of Demographic and Health Surveys, we find suggestive evidence that increased exposure to TV affects fertility preferences in particular and increases the effective use of contraception. Reduced child mortality seems to be another important pathway.