125 research outputs found

    High Resolution Models of Transcription Factor-DNA Affinities Improve In Vitro and In Vivo Binding Predictions

    Accurately modeling the DNA sequence preferences of transcription factors (TFs), and using these models to predict in vivo genomic binding sites for TFs, are key pieces in deciphering the regulatory code. These efforts have been frustrated by the limited availability and accuracy of TF binding site motifs, usually represented as position-specific scoring matrices (PSSMs), which may match large numbers of sites and produce an unreliable list of target genes. Recently, protein binding microarray (PBM) experiments have emerged as a new source of high resolution data on in vitro TF binding specificities. PBM data has been analyzed either by estimating PSSMs or via rank statistics on probe intensities, so that individual sequence patterns are assigned enrichment scores (E-scores). This representation is informative but unwieldy because every TF is assigned a list of thousands of scored sequence patterns. Meanwhile, high-resolution in vivo TF occupancy data from ChIP-seq experiments is also increasingly available. We have developed a flexible discriminative framework for learning TF binding preferences from high resolution in vitro and in vivo data. We first trained support vector regression (SVR) models on PBM data to learn the mapping from probe sequences to binding intensities. We used a novel k-mer-based string kernel called the di-mismatch kernel to represent probe sequence similarities. The SVR models are more compact than E-scores, more expressive than PSSMs, and can be readily used to scan genomic regions to predict in vivo occupancy. Using a large data set of yeast and mouse TFs, we found that our SVR models can better predict probe intensity than the E-score method or PBM-derived PSSMs. Moreover, by using SVRs to score yeast, mouse, and human genomic regions, we were better able to predict genomic occupancy as measured by ChIP-chip and ChIP-seq experiments. Finally, we found that by training kernel-based models directly on ChIP-seq data, we greatly improved in vivo occupancy prediction, and by comparing a TF's in vitro and in vivo models, we could identify cofactors and disambiguate direct and indirect binding.
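    A minimal sketch of the general idea, assuming NumPy and scikit-learn: probe sequences are mapped to k-mer count features and a support vector regression is trained to predict binding intensity. This is an illustration only; it uses a plain k-mer spectrum with an RBF kernel rather than the paper's di-mismatch string kernel, and the probe sequences and intensities are hypothetical.

```python
# Illustrative only: k-mer count features + SVR, not the paper's di-mismatch kernel.
from itertools import product

import numpy as np
from sklearn.svm import SVR

K = 4  # k-mer length; the di-mismatch kernel additionally credits near-matching k-mers
KMERS = ["".join(p) for p in product("ACGT", repeat=K)]
INDEX = {kmer: i for i, kmer in enumerate(KMERS)}

def kmer_features(seq):
    """Count every length-K substring of a probe sequence."""
    x = np.zeros(len(KMERS))
    for i in range(len(seq) - K + 1):
        kmer = seq[i:i + K]
        if kmer in INDEX:  # skip windows containing ambiguous bases such as 'N'
            x[INDEX[kmer]] += 1
    return x

# Hypothetical PBM probes and measured intensities.
probes = ["ACGTACGTACGT", "TTTTACGTTTTT", "GGGGCCCCGGGG", "ACGTTTTTACGT"]
intensities = [2.1, 1.7, 0.3, 1.5]

X = np.array([kmer_features(s) for s in probes])
svr = SVR(kernel="rbf", C=1.0).fit(X, intensities)

# The trained model can then score windows of a genomic region to predict occupancy.
print(svr.predict(np.array([kmer_features("ACGTACGTTTTT")])))
```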

    Segmentation and intensity estimation for microarray images with saturated pixels

    Background: Microarray image analysis processes scanned digital images of hybridized arrays to produce the spot-level input data for downstream analyses, so it can have a potentially large impact on those analyses. Signal saturation is an optical effect that occurs when some pixel values for highly expressed genes or peptides exceed the upper detection threshold of the scanner software (2^16 - 1 = 65,535 for 16-bit images). In practice, spots with a sizable number of saturated pixels are often flagged and discarded; alternatively, the saturated values are used without adjustment when estimating spot intensities. The resulting expression data tend to be biased downwards and can distort high-level analyses that rely on them. Hence, it is crucial to correct effectively for signal saturation. Results: We developed a flexible mixture model-based segmentation and spot intensity estimation procedure that accounts for saturated pixels by incorporating a censored component in the mixture model. As demonstrated with biological data and simulation, our method extends the dynamic range of expression data beyond the saturation threshold and is effective in correcting saturation-induced bias when the information loss is not too severe. We further illustrate the impact of image processing on downstream classification, showing that the proposed method can increase diagnostic accuracy using data from a lymphoma diagnosis study. Conclusions: The presented method adjusts for signal saturation at the segmentation stage, which identifies each pixel as part of the foreground, background or other; a pixel's cluster membership can therefore differ from what it would be if saturated values were treated as truly observed. Thus, the resulting spot intensity estimates may be more accurate than those obtained from existing methods that correct for saturation based on already segmented data. As a model-based segmentation method, our procedure is able to identify inner holes, fuzzy edges and blank spots that are common in microarray images. The approach is independent of microarray platform and applicable to both single- and dual-channel microarrays.
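    A toy sketch of the censoring idea alone, assuming NumPy and SciPy; it is not the full mixture-model segmentation. Pixel values at the 16-bit ceiling are treated as right-censored observations when estimating the intensity distribution, which removes the downward bias of the naive mean. All values are simulated.

```python
# Toy sketch of the censored-component idea only; the paper's method embeds this
# in a mixture model used for segmentation.
import numpy as np
from scipy import optimize, stats

CEILING = 2**16 - 1  # 65,535: the scanner's saturation threshold for 16-bit images

rng = np.random.default_rng(0)
true_pixels = rng.normal(loc=62000, scale=5000, size=500)  # simulated true intensities
observed = np.minimum(true_pixels, CEILING)                # what a saturated scan reports
censored = observed >= CEILING

def neg_log_lik(params):
    """Negative log-likelihood of a right-censored normal model."""
    mu, log_sigma = params
    sigma = np.exp(log_sigma)
    ll = stats.norm.logpdf(observed[~censored], mu, sigma).sum()
    ll += censored.sum() * stats.norm.logsf(CEILING, mu, sigma)
    return -ll

fit = optimize.minimize(neg_log_lik, x0=[observed.mean(), np.log(observed.std())])

print("naive mean:          ", observed.mean())  # biased downward by saturation
print("censored-model mean: ", fit.x[0])         # closer to the simulated mean of 62000
```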

    The importance of service quality in British Muslims’ choice of an Islamic or non-Islamic bank account

    Using an extended SERVQUAL model, this study identifies and compares the importance of service quality to Muslim consumers with an Islamic or non-Islamic bank account in a non-Muslim country, Britain. Eight group discussions and a survey of 300 Muslims were conducted. Five dimensions of service quality were identified, i.e., Responsiveness, Credibility, Islamic Tangibles, Accessibility and Reputation. These differ in structure and content from the original SERVQUAL developed in the West and the subsequent CARTER model constructed in a Muslim country. In addition, significant differences were found in the importance rating of items by respondents holding an account with an Islamic bank compared to those with a non-Islamic bank account. This study is one of the first to identify and compare the importance of service quality between Islamic and non-Islamic bank account holders in a western non-Muslim country. The results advance our understanding of the impact of culture on SERVQUAL. The study provides insight into Muslims’ bank choice and helps bank managers of both Islamic and non-Islamic banks to focus their attention on the service quality dimensions that matter most to Muslim customers.

    What Happened to Gray Whales during the Pleistocene? The Ecological Impact of Sea-Level Change on Benthic Feeding Areas in the North Pacific Ocean

    Gray whales (Eschrichtius robustus) undertake long migrations, from Baja California to Alaska, to feed on seasonally productive benthos of the Bering and Chukchi seas. The invertebrates that form their primary prey are restricted to shallow water environments, but global sea-level changes during the Pleistocene eliminated or reduced this critical habitat multiple times. Because the fossil record of gray whales is coincident with the onset of Northern Hemisphere glaciation, gray whales must have survived these massive changes to their feeding habitat, but it is unclear how. We reconstructed gray whale carrying capacity fluctuations during the past 120,000 years by quantifying gray whale feeding habitat availability using bathymetric data for the North Pacific Ocean, constrained by their maximum diving depth. We calculated carrying capacity based on modern estimates of metabolic demand, prey availability, and feeding duration; we also constrained our estimates to reflect current population size and account for glaciated and non-glaciated areas in the North Pacific. Our results show that key feeding areas eliminated by sea-level lowstands were not replaced by commensurate areas. Our reconstructions show that such reductions affected carrying capacity, and harmonic means of these fluctuations do not differ dramatically from genetic estimates of carrying capacity. Assuming current carrying capacity estimates, Pleistocene glacial maxima may have created multiple, weak genetic bottlenecks, although the current temporal resolution of genetic datasets does not allow testing for such signals. Our results do not, however, falsify molecular estimates of pre-whaling population size because those abundances would have been sufficient to survive the loss of major benthic feeding areas (i.e., the majority of the Bering Shelf) during glacial maxima. We propose that gray whales survived the disappearance of their primary feeding ground by employing generalist filter-feeding modes, similar to the resident gray whales found between northern Washington State and Vancouver Island.
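    A back-of-the-envelope sketch of how a bathymetry-derived feeding area translates into a carrying-capacity estimate. Every number below is a placeholder assumption for illustration, not a value from the study, which used modern estimates of metabolic demand, prey availability and feeding duration.

```python
# Illustrative arithmetic only; all inputs are assumed placeholder values.
feeding_area_km2 = 400_000       # benthic habitat shallower than the maximum diving depth (assumed)
prey_energy_kj_per_km2 = 5.0e7   # harvestable benthic prey energy per km^2 per season (assumed)
whale_demand_kj_per_day = 4.0e6  # per-whale metabolic demand (assumed)
feeding_days = 150               # length of the summer feeding season (assumed)

seasonal_demand_per_whale = whale_demand_kj_per_day * feeding_days
carrying_capacity = (feeding_area_km2 * prey_energy_kj_per_km2) / seasonal_demand_per_whale
print(f"carrying capacity ~ {carrying_capacity:,.0f} whales")

# A sea-level lowstand that removes feeding area lowers the estimate proportionally.
```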

    Linking Proteomic and Transcriptional Data through the Interactome and Epigenome Reveals a Map of Oncogene-induced Signaling

    Cellular signal transduction generally involves cascades of post-translational protein modifications that rapidly catalyze changes in protein-DNA interactions and gene expression. High-throughput measurements are improving our ability to study each of these stages individually, but do not capture the connections between them. Here we present an approach for building a network of physical links among these data that can be used to prioritize targets for pharmacological intervention. Our method recovers the critical missing links between proteomic and transcriptional data by relating changes in chromatin accessibility to changes in expression and then uses these links to connect proteomic and transcriptome data. We applied our approach to integrate epigenomic, phosphoproteomic and transcriptome changes induced by the variant III mutation of the epidermal growth factor receptor (EGFRvIII) in a cell line model of glioblastoma multiforme (GBM). To test the relevance of the network, we used small molecules to target highly connected nodes implicated by the network model that were not detected by the experimental data in isolation, and we found that a large fraction of these agents alter cell viability. Among these are two compounds, ICG-001, targeting CREB binding protein (CREBBP), and PKF118–310, targeting β-catenin (CTNNB1), which have not been tested previously for effectiveness against GBM. At the level of transcriptional regulation, we used chromatin immunoprecipitation sequencing (ChIP-Seq) to experimentally determine the genome-wide binding locations of p300, a transcriptional co-regulator highly connected in the network. Analysis of p300 target genes suggested its role in tumorigenesis. We propose that this general method, in which experimental measurements are used as constraints for building regulatory networks from the interactome while taking into account noise and missing data, should be applicable to a wide range of high-throughput datasets. Funding: National Science Foundation (U.S.) (DB1-0821391); National Institutes of Health (U.S.) (Grant U54-CA112967); National Institutes of Health (U.S.) (Grant R01-GM089903); National Institutes of Health (U.S.) (P30-ES002109).
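    A toy sketch of the connectivity idea, assuming the networkx package; it is not the paper's network-optimization method. Hypothetical interactome and chromatin-accessibility edges connect an upstream signaling change to downstream expression changes, and intermediate nodes are ranked by how many source-to-target paths pass through them.

```python
# Toy illustration only: rank intermediate nodes linking signaling to expression changes.
import networkx as nx

G = nx.DiGraph()
# Hypothetical edges: protein-protein (interactome) and regulator-gene (accessible binding sites).
G.add_edges_from([
    ("EGFRvIII", "SRC"), ("EGFRvIII", "PI3K"),
    ("SRC", "CTNNB1"), ("PI3K", "CREBBP"),
    ("CTNNB1", "geneA"), ("CREBBP", "geneA"), ("CREBBP", "geneB"),
])

sources = ["EGFRvIII"]        # upstream change detected by phosphoproteomics
targets = ["geneA", "geneB"]  # downstream changes detected in the transcriptome

# Count how often each intermediate node lies on a simple source-to-target path.
counts = {}
for s in sources:
    for t in targets:
        for path in nx.all_simple_paths(G, s, t):
            for node in path[1:-1]:
                counts[node] = counts.get(node, 0) + 1

# Heavily traversed intermediates (here PI3K and CREBBP) are candidate intervention points.
print(sorted(counts.items(), key=lambda kv: -kv[1]))
```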

    Postoperative acute kidney injury in adult non-cardiac surgery: joint consensus report of the Acute Disease Quality Initiative and PeriOperative Quality Initiative

    Postoperative acute kidney injury (PO-AKI) is a common complication of major surgery that is strongly associated with short-term surgical complications and long-term adverse outcomes, including increased risk of chronic kidney disease, cardiovascular events and death. Risk factors for PO-AKI include older age and comorbid diseases such as chronic kidney disease and diabetes mellitus. PO-AKI is best defined as AKI occurring within 7 days of an operative intervention using the Kidney Disease Improving Global Outcomes (KDIGO) definition of AKI; however, additional prognostic information may be gained from detailed clinical assessment and other diagnostic investigations in the form of a focused kidney health assessment (KHA). Prevention of PO-AKI is largely based on identification of high baseline risk, monitoring and reduction of nephrotoxic insults, whereas treatment involves the application of a bundle of interventions to avoid secondary kidney injury and mitigate the severity of AKI. As PO-AKI is strongly associated with long-term adverse outcomes, some form of follow-up KHA is essential; however, the form and location of this will be dictated by the nature and severity of the AKI. In this Consensus Statement, we provide graded recommendations for AKI after non-cardiac surgery and highlight priorities for future research.
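    A simplified illustration of the KDIGO AKI definition referenced above, written as a plain function; it is not a clinical tool, omits AKI staging, and glosses over the details of the timing windows. The thresholds used (creatinine rise of at least 0.3 mg/dL within 48 h, or at least 1.5 times baseline within 7 days, or urine output below 0.5 mL/kg/h for at least 6 h) are the standard KDIGO criteria.

```python
# Simplified KDIGO AKI check: illustration only, not for clinical use.
def meets_kdigo_aki(baseline_scr_mg_dl, scr_rise_48h_mg_dl,
                    peak_scr_7d_mg_dl, urine_ml_per_kg_per_h, oliguria_hours):
    """Return True if any of the three KDIGO criteria for AKI is met."""
    absolute_rise = scr_rise_48h_mg_dl >= 0.3                       # >= 0.3 mg/dL within 48 h
    relative_rise = peak_scr_7d_mg_dl >= 1.5 * baseline_scr_mg_dl   # >= 1.5 x baseline within 7 days
    oliguria = urine_ml_per_kg_per_h < 0.5 and oliguria_hours >= 6  # < 0.5 mL/kg/h for >= 6 h
    return absolute_rise or relative_rise or oliguria

# Hypothetical postoperative patient: creatinine rises from 1.0 to 1.6 mg/dL by day 3.
print(meets_kdigo_aki(1.0, 0.2, 1.6, 0.7, 0))  # True, via the relative-rise criterion
```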

    The Human Phenotype Ontology in 2024: phenotypes around the world

    © The Author(s) 2023. Published by Oxford University Press on behalf of Nucleic Acids Research. The Human Phenotype Ontology (HPO) is a widely used resource that comprehensively organizes and defines the phenotypic features of human disease, enabling computational inference and supporting genomic and phenotypic analyses through semantic similarity and machine learning algorithms. The HPO has widespread applications in clinical diagnostics and translational research, including genomic diagnostics, gene-disease discovery, and cohort analytics. In recent years, groups around the world have developed translations of the HPO from English to other languages, and the HPO browser has been internationalized, allowing users to view HPO term labels and in many cases synonyms and definitions in ten languages in addition to English. Since our last report, a total of 2239 new HPO terms and 49235 new HPO annotations were developed, many in collaboration with external groups in the fields of psychiatry, arthrogryposis, immunology and cardiology. The Medical Action Ontology (MAxO) is a new effort to model treatments and other measures taken for clinical management. Finally, the HPO consortium is contributing to efforts to integrate the HPO and the GA4GH Phenopacket Schema into electronic health records (EHRs) with the goal of more standardized and computable integration of rare disease data in EHRs.
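    A toy illustration of one computation the HPO enables, information-content (Resnik-style) semantic similarity between phenotype terms, in plain Python. The mini-ontology and annotation frequencies below are invented for the example and are not actual HPO content.

```python
# Toy Resnik-style similarity over an invented ontology fragment (not real HPO data).
import math

# child -> parent edges of a hypothetical fragment
parents = {
    "Seizure": "Abnormal nervous system physiology",
    "Focal seizure": "Seizure",
    "Ataxia": "Abnormal nervous system physiology",
    "Abnormal nervous system physiology": "Phenotypic abnormality",
}

# fraction of diseases annotated to each term or its descendants (invented)
annotation_freq = {
    "Phenotypic abnormality": 1.00,
    "Abnormal nervous system physiology": 0.30,
    "Seizure": 0.10,
    "Focal seizure": 0.02,
    "Ataxia": 0.05,
}

def ancestors(term):
    """Return a term together with all of its ancestors."""
    out = {term}
    while term in parents:
        term = parents[term]
        out.add(term)
    return out

def resnik_similarity(t1, t2):
    """Information content of the most informative common ancestor."""
    common = ancestors(t1) & ancestors(t2)
    return max(-math.log(annotation_freq[t]) for t in common)

print(resnik_similarity("Focal seizure", "Ataxia"))  # IC of their shared neurological ancestor
```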

    Multi-ancestry genome-wide association meta-analysis of Parkinson’s disease

    © 2023. This is a U.S. Government work and not under copyright protection in the US; foreign copyright protection may apply. Although over 90 independent risk variants have been identified for Parkinson’s disease using genome-wide association studies, most studies have been performed in just one population at a time. Here we performed a large-scale multi-ancestry meta-analysis of Parkinson’s disease with 49,049 cases, 18,785 proxy cases and 2,458,063 controls including individuals of European, East Asian, Latin American and African ancestry. In this meta-analysis, we identified 78 independent genome-wide significant loci, including 12 potentially novel loci (MTF2, PIK3CA, ADD1, SYBU, IRS2, USP8, PIGL, FASN, MYLK2, USP25, EP300 and PPP6R2) and fine-mapped 6 putative causal variants at 6 known PD loci. By combining our results with publicly available eQTL data, we identified 25 putative risk genes in these novel loci whose expression is associated with PD risk. This work lays the groundwork for future efforts aimed at identifying PD loci in non-European populations.
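    A generic sketch of a fixed-effect, inverse-variance-weighted meta-analysis of one variant's per-ancestry effect estimates, in plain Python; it illustrates how estimates from different populations combine and is not necessarily the exact model used in the study. The betas and standard errors are placeholders.

```python
# Generic inverse-variance-weighted fixed-effect meta-analysis; placeholder inputs.
import math

# (beta, standard error) for one variant in each ancestry group (placeholder values)
per_ancestry = [(0.12, 0.03), (0.09, 0.05), (0.15, 0.08), (0.10, 0.06)]

weights = [1.0 / se ** 2 for _, se in per_ancestry]
beta_meta = sum(w * beta for (beta, _), w in zip(per_ancestry, weights)) / sum(weights)
se_meta = math.sqrt(1.0 / sum(weights))
z = beta_meta / se_meta

print(f"meta-analysis beta = {beta_meta:.3f}, SE = {se_meta:.3f}, Z = {z:.2f}")
# Genome-wide significance (P < 5e-8, two-sided) corresponds to |Z| above roughly 5.45.
```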