PolyDoms: a whole genome database for the identification of non-synonymous coding SNPs with the potential to impact disease
As knowledge of human genetic polymorphisms grows, so does the opportunity and challenge of identifying those polymorphisms that may impact the health or disease risk of an individual person. A critical need is to organize large-scale polymorphism analyses and to prioritize candidate non-synonymous coding SNPs (nsSNPs) that should be tested in experimental and epidemiological studies to establish their context-specific impacts on protein function. In addition, with emerging high-resolution clinical genetics testing, new polymorphisms must be analyzed in the context of all available protein feature knowledge, including other known mutations and polymorphisms. To address this, we developed PolyDoms as a database to integrate the results of multiple algorithmic procedures and functional criteria applied to the entire Entrez dbSNP dataset. In addition to predicting structural and functional impacts of all nsSNPs, filtering functions enable group-based identification of potentially harmful nsSNPs among multiple genes associated with specific diseases, anatomies, mammalian phenotypes, gene ontologies, pathways or protein domains. PolyDoms thus provides a means to derive a list of candidate SNPs to be evaluated in experimental or epidemiological studies for impact on protein functions and disease risk associations. PolyDoms will continue to be curated to improve its usefulness.
Dissecting microregulation of a master regulatory network
Background: The master regulator p53 tumor-suppressor protein, through coordination of several downstream target genes and upstream transcription factors, controls many pathways important for tumor suppression. While it has been reported that some of p53's functions are microRNA-mediated, it is not known how many other microRNAs might contribute to p53-mediated tumorigenesis. Results: Here, we use a bioinformatics-based integrative approach to identify and prioritize putative p53-regulated miRNAs, and unravel the miRNA-based microregulation of the p53 master regulatory network. Specifically, we identify putative microRNA regulators of a) transcription factors that are upstream or downstream of p53 and b) p53 interactants. The putative p53-miRs and their targets are prioritized using current knowledge of cancer biology and literature-reported cancer-miRNAs. Conclusion: Our predicted p53-miRNA-gene networks strongly suggest that coordinated transcriptional and p53-miR mediated networks could be integral to tumorigenesis and the underlying processes and pathways.
Improved human disease candidate gene prioritization using mouse phenotype
Background: The majority of common diseases are multi-factorial and modified by genetically and mechanistically complex polygenic interactions and environmental factors. High-throughput genome-wide studies like linkage analysis and gene expression profiling tend to be most useful for classification and characterization but do not provide sufficient information to identify or prioritize specific disease causal genes. Results: Extending an earlier hypothesis that the majority of genes that impact or cause disease share membership in any of several functional relationships, we show, for the first time, the utility of mouse phenotype data in human disease gene prioritization. We study the effect of different data integration methods and, based on validation studies, show that our approach, ToppGene (http://toppgene.cchmc.org), outperforms two existing candidate gene prioritization methods, SUSPECTS and ENDEAVOUR. Conclusion: The incorporation of phenotype information for mouse orthologs of human genes greatly improves human disease candidate gene analysis and prioritization.
CisMols Analyzer: identification of compositionally similar cis-element clusters in ortholog conserved regions of coordinately expressed genes
Combinatorial interactions of sequence-specific trans-acting factors with localized genomic cis-element clusters are the principal mechanism for regulating tissue-specific and developmental gene expression. With the emergence of expanding numbers of genome-wide expression analyses, the identification of the cis-elements responsible for specific patterns of transcriptional regulation represents a critical area of investigation. Computational methods for the identification of functional cis-regulatory modules are difficult to devise, principally because of the short length and degenerate nature of individual cis-element binding sites and the inherent complexity that is generated by combinatorial interactions within cis-clusters. Filtering candidate cis-element clusters based on phylogenetic conservation is helpful for an individual ortholog gene pair, but combining data from cis-conservation and coordinate expression across multiple genes is a more difficult problem. To address this, we have extended an ortholog gene-pair database with additional analytical architecture to allow for the analysis and identification of maximal numbers of compositionally similar and phylogenetically conserved cis-regulatory element clusters from a list of user-selected genes. The system has been successfully tested with a series of functionally related and microarray profile-based co-expressed ortholog pairs of promoters and genes, using known regulatory regions as training sets and co-expressed genes in the olfactory and immunohematologic systems as test sets. CisMols Analyzer is accessible via a Web interface.
GenomeTrafac: a whole genome resource for the detection of transcription factor binding site clusters associated with conventional and microRNA encoding genes conserved between mouse and human gene orthologs
Transcriptional cis-regulatory control regions are frequently found within non-coding DNA segments conserved across multi-species gene orthologs. Adopting a systematic gene-centric pipeline approach, we report here the development of a web-accessible database resource, GenomeTraFac, that allows genome-wide detection and characterization of compositionally similar cis-clusters that occur in gene orthologs between any two genomes, for both microRNA genes and conventional RNA-encoding genes. Each ortholog gene pair can be scanned to visualize overall conserved sequence regions, and within these, the relative density of conserved cis-element motif clusters forms graph peak structures. The results of these analyses can be mined en masse to identify the most frequently represented cis-motifs in a list of genes. The system also provides a method for rapid evaluation and visualization of gene model consistency between orthologs, and facilitates consideration of the potential of sequence variation in conserved non-coding regions to impact complex cis-element structures. Using the mouse and human genomes via the NCBI Reference Sequence database and the Sanger Institute miRBase, the system demonstrated the ability to identify validated transcription factor targets within promoter and distal genomic regulatory regions of both conventional and microRNA genes.
Large-scale evaluation of automated clinical note de-identification and its impact on information extraction
Objective: (1) To evaluate a state-of-the-art natural language processing (NLP)-based approach to automatically de-identify a large set of diverse clinical notes. (2) To measure the impact of de-identification on the performance of information extraction algorithms on the de-identified documents. Material and methods: A cross-sectional study that included 3503 stratified, randomly selected clinical notes (over 22 note types) from five million documents produced at one of the largest US pediatric hospitals. Sensitivity, precision, and F value of two automated de-identification systems for removing all 18 HIPAA-defined protected health information elements were computed. Performance was assessed against a manually generated ‘gold standard’. Statistical significance was tested. The automated de-identification performance was also compared with that of two humans on a 10% subsample of the gold standard. The effect of de-identification on the performance of subsequent medication extraction was measured. Results: The gold standard included 30 815 protected health information elements and more than one million tokens. The most accurate NLP method had 91.92% sensitivity (R) and 95.08% precision (P) overall. The performance of the system was indistinguishable from that of human annotators (annotators' performance was 92.15%(R)/93.95%(P) and 94.55%(R)/88.45%(P) overall, while the best system obtained 92.91%(R)/95.73%(P) on the same text). The impact of automated de-identification was minimal on the utility of the narrative notes for subsequent information extraction, as measured by the sensitivity and precision of medication name extraction. Discussion and conclusion: NLP-based de-identification shows excellent performance that rivals the performance of human annotators. Furthermore, unlike manual de-identification, the automated approach scales up to millions of documents quickly and inexpensively.
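The abstract above reports sensitivity (R) and precision (P) but does not state the resulting F value. As a minimal sketch, assuming the F value meant is the balanced F-measure (F1, the harmonic mean of precision and sensitivity), the reported figures could be combined as follows. The function name `f_measure` and the dictionary labels are illustrative and are not taken from the study.

```python
def f_measure(precision: float, recall: float, beta: float = 1.0) -> float:
    """Weighted harmonic mean of precision and recall (F1 when beta == 1)."""
    b2 = beta * beta
    denominator = b2 * precision + recall
    if denominator == 0:
        # Both inputs are zero, so the score is defined as zero here.
        return 0.0
    return (1 + b2) * precision * recall / denominator


# (precision, sensitivity) pairs as reported in the abstract.
reported = {
    "best NLP method, full gold standard": (0.9508, 0.9192),
    "best system, 10% subsample": (0.9573, 0.9291),
    "human annotator 1, 10% subsample": (0.9395, 0.9215),
    "human annotator 2, 10% subsample": (0.8845, 0.9455),
}

for label, (p, r) in reported.items():
    print(f"{label}: F1 = {f_measure(p, r):.4f}")
```

Because F1 weights precision and sensitivity equally, it is only one possible summary; a de-identification setting that treats missed identifiers as costlier than false alarms might instead use `beta > 1`.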
PU.1 positively regulates GATA-1 expression in mast cells
Coexpression of PU.1 and GATA-1 is required for proper specification of the mast cell lineage; however, in the myeloid and erythroid lineages, PU.1 and GATA-1 are functionally antagonistic. In this study, we report a transcriptional network in which PU.1 positively regulates GATA-1 expression in mast cell development. We isolated a variant mRNA isoform of GATA-1 in murine mast cells that is significantly upregulated during mast cell differentiation. This isoform contains an alternatively spliced first exon (IB) that is distinct from the first exon (IE) incorporated in the major erythroid mRNA transcript. We show that in mast cells, in contrast to erythroid and megakaryocyte cells, PU.1 and GATA-2 predominantly occupy potential cis-regulatory elements in the IB exon region in vivo. Using reporter assays, we identify an enhancer flanking the IB exon that is activated by PU.1. Furthermore, we observe that in PU.1 -/- fetal liver cells, low levels of the IE GATA-1 isoform are expressed, but the variant IB isoform is absent. Reintroduction of PU.1 restores the variant IB isoform and upregulates total GATA-1 protein expression, which is concurrent with mast cell differentiation. Our results are consistent with a transcriptional hierarchy in which PU.1, possibly in concert with GATA-2, activates GATA-1 expression in mast cells in a pathway distinct from that seen in the erythroid and megakaryocytic lineages. Copyright © 2010 by The American Association of Immunologists, Inc.
Dysregulation of Mesenchymal Cell Survival Pathways in Severe Fibrotic Lung Disease: The Effect of Nintedanib Therapy
Impaired apoptotic clearance of myofibroblasts can result in the continuous expansion of scar tissue during persistent injury in the lung. However, the molecular and cellular mechanisms underlying the apoptotic clearance of multiple mesenchymal cells, including fibrocytes, fibroblasts and myofibroblasts, in severe fibrotic lung diseases such as idiopathic pulmonary fibrosis (IPF) remain largely unknown. We analyzed the apoptotic pathways activated in mesenchymal cells of IPF and in a mouse model of TGFα-induced pulmonary fibrosis. We found that fibrocytes and myofibroblasts in fibrotic lung lesions have acquired resistance to Fas-induced apoptosis, and that an FDA-approved anti-fibrotic agent, nintedanib, effectively induced apoptotic cell death in both. In support, comparative gene expression analyses suggest that apoptosis-linked gene networks are similarly dysregulated in both IPF and a mouse model of TGFα-induced pulmonary fibrosis. TGFα mice treated with nintedanib show increased active caspase 3-positive cells in fibrotic lesions and reduced fibroproliferation and collagen production. Further, long-term nintedanib therapy attenuated fibrocyte accumulation, collagen deposition, and lung function decline during TGFα-induced pulmonary fibrosis. These results highlight the importance of inhibiting survival pathways and other pro-fibrotic processes in the various types of mesenchymal cells and suggest that the TGFα mouse model is relevant for testing of anti-fibrotic drugs either alone or in combination with nintedanib.