Unified translation repression mechanism for microRNAs and upstream AUGs
Abstract
Background
MicroRNAs (miRNAs) are endogenous small RNAs that modulate gene expression at the post-transcriptional level by binding complementary sites in the 3'-UTR. In a recent genome-wide study reporting a new miRNA target class (miBridge), we identified and validated interactions between 5'-UTRs and miRNAs. Separately, upstream AUGs (uAUGs) in 5'-UTRs are known to regulate genes translationally without affecting mRNA levels, one of the mechanisms for miRNA-mediated repression.
Results
Using sequence data from whole-genome cDNA alignments we identified 1418 uAUG sequences on the 5'-UTR that specifically interact with 3'-ends of conserved miRNAs. We computationally identified miRNAs that can target six genes through their uAUGs that were previously reported to suppress translation. We extended this meta-analysis by confirming expression of these miRNAs in cell-lines used in the uAUG studies. Similarly, seven members of the KLF family of genes containing uAUGs were computationally identified as interacting with several miRNAs. Using KLF9 as an example (whose protein expression is limited to brain tissue despite the mRNA being expressed ubiquitously), we show computationally that miRNAs expressed only in HeLa cells and not in neuroblastoma (N2A) cells can bind the uAUGs responsible for translation inhibition. Our computed results demonstrate that tissue- or cell-line specific repression of protein translation by uAUGs can be explained by the presence or absence of miRNAs that target these uAUG sequences. We propose that these uAUGs represent a subset of miRNA interaction sites on 5'-UTRs in miBridge, whereby a miRNA binding a uAUG hinders the progression of ribosome scanning the mRNA before it reaches the open reading frame (ORF).
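As a rough illustration of the kind of scan described above (a minimal sketch, not the authors' pipeline), the following Python locates AUG triplets in a 5'-UTR and reports those that fall inside a perfect Watson-Crick match to the last k bases of a miRNA. The function names, the window length k, and the toy sequences in the usage note are all hypothetical; the actual study may use a different pairing model.

```python
# Hypothetical sketch: perfect-complement matching only, no G:U wobble or mismatches.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Return the reverse complement of an RNA string (A/C/G/U only)."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def find_uaugs(utr5):
    """Return the 0-based start position of every AUG in the 5'-UTR."""
    return [i for i in range(len(utr5) - 2) if utr5[i:i + 3] == "AUG"]


def uaugs_bound_by_mirna_3p_end(utr5, mirna, k=7):
    """Return uAUG positions lying fully inside a site that pairs with
    the last k bases (3'-end) of the miRNA. k is an assumed window size."""
    site = reverse_complement(mirna[-k:])  # mRNA sequence, read 5'->3'
    uaugs = find_uaugs(utr5)
    hits = set()
    start = utr5.find(site)
    while start != -1:
        for pos in uaugs:
            if start <= pos and pos + 3 <= start + k:
                hits.add(pos)
        start = utr5.find(site, start + 1)
    return sorted(hits)
```

With the toy inputs `utr5 = "GCCGGAUGUCAAAUGCC"` and `mirna = "UUAACCGGUUAACCGGACAUCC"`, the UTR contains two uAUGs, and only the first lies inside the site complementary to the miRNA 3'-end.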
Conclusions
While both miRNAs and uAUGs are separately known to down-regulate protein expression, we show that they may be functionally related by identifying potential interactions through a sequence-specific binding mechanism. Using prior experimental evidence that shows uAUG effects on translation repression together with miRNA expression data specific to cell lines, we demonstrate through computational analysis that cell-specific down-regulation of protein expression (while maintaining mRNA levels) correlates well with the simultaneous presence of miRNA and target uAUG sequences in one cell type and not others, suggesting tissue-specific translation repression by miRNAs through uAUGs.
http://deepblue.lib.umich.edu/bitstream/2027.42/112383/1/12864_2009_Article_2749.pd
miBLAST: scalable evaluation of a batch of nucleotide sequence queries with BLAST
A common task in many modern bioinformatics applications is to match a set of nucleotide query sequences against a large sequence dataset. Existing tools, such as BLAST, are designed to evaluate a single query at a time and can be unacceptably slow when the number of sequences in the query set is large. In this paper, we present a new algorithm, called miBLAST, that evaluates such batch workloads efficiently. At its core, miBLAST employs q-gram filtering and an index join to efficiently detect similarity between the query sequences and database sequences. This set-oriented technique, which indexes both the query and the database sets, results in substantial performance improvements over existing methods. Our results show that miBLAST is significantly faster than BLAST in many cases. For example, miBLAST aligned 247 965 oligonucleotide sequences in the Affymetrix probe set against the Human UniGene in 1.26 days, compared with 27.27 days with BLAST (an improvement by a factor of 22). The relative performance of miBLAST increases for larger word sizes; however, it decreases for longer queries. miBLAST employs the familiar BLAST statistical model and output format, guaranteeing the same accuracy as BLAST and facilitating a seamless transition for existing BLAST users.
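The general idea of q-gram filtering with an index join can be sketched as follows. This is a simplified illustration under stated assumptions, not miBLAST's actual implementation: both the query set and the database set are indexed by their q-grams, the two indexes are joined on shared q-grams, and only pairs sharing enough distinct q-grams are kept as candidates for a full alignment. The function names and the `q` and `min_shared` parameters are hypothetical.

```python
from collections import defaultdict


def qgram_index(seqs, q):
    """Map each q-gram to the set of sequence ids that contain it."""
    index = defaultdict(set)
    for seq_id, seq in seqs.items():
        for i in range(len(seq) - q + 1):
            index[seq[i:i + q]].add(seq_id)
    return index


def candidate_pairs(queries, database, q=4, min_shared=2):
    """Join the two q-gram indexes and return {(query_id, db_id): count}
    for pairs sharing at least min_shared distinct q-grams."""
    query_idx = qgram_index(queries, q)
    db_idx = qgram_index(database, q)
    shared = defaultdict(int)
    # Index join: visit only q-grams present in both indexes.
    for gram in query_idx.keys() & db_idx.keys():
        for query_id in query_idx[gram]:
            for db_id in db_idx[gram]:
                shared[(query_id, db_id)] += 1
    return {pair: n for pair, n in shared.items() if n >= min_shared}
```

Because the join touches only q-grams that occur in both sets, query and database sequences with nothing in common are never compared, which is where a batch workload saves time relative to running one query at a time.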
Integrated metabolome and transcriptome analysis of the NCI60 dataset
Abstract
Background
Metabolite profiles can be used for identifying molecular signatures and mechanisms underlying diseases since they reflect the outcome of complex upstream genomic, transcriptomic, proteomic and environmental events. The scarcity of publicly accessible large scale metabolome datasets related to human disease has been a major obstacle for assessing the potential of metabolites as biomarkers as well as understanding the molecular events underlying disease-related metabolic changes. The availability of metabolite and gene expression profiles for the NCI-60 cell lines offers the possibility of identifying significant metabolome and transcriptome features and discovering unique molecular processes related to different cancer types.
Methods
We utilized a combination of analytical methods in the R statistical package to evaluate metabolic features associated with cancer cell lines from different tissue origins, identify metabolite-gene correlations and detect outlier cell lines based on metabolome and transcriptome data. Statistical analysis results are integrated with metabolic pathway annotations as well as COSMIC and Tumorscape databases to explore associated molecular mechanisms.
Results
Our analysis reveals that although the NCI-60 metabolome dataset is quite noisy compared with microarray-based transcriptome data, it does contain tissue-origin-specific signatures. We also identified biologically meaningful gene-metabolite associations. Most remarkably, several abnormal gene-metabolite relationships identified by our approach can be directly linked to known gene mutations and copy number variations in the corresponding cell lines.
Conclusions
Our results suggest that integrative metabolome and transcriptome analysis is a powerful method for understanding the molecular machinery underlying various pathophysiological processes. We expect that the availability of large-scale metabolome data in the coming years will significantly promote the discovery of novel biomarkers, which will in turn improve the understanding of the molecular mechanisms underlying diseases.
http://deepblue.lib.umich.edu/bitstream/2027.42/112946/1/12859_2011_Article_4394.pd
MiSearch adaptive PubMed search tool
Summary: MiSearch is an adaptive biomedical literature search tool that ranks citations based on a statistical model for the likelihood that a user will choose to view them. Citation selections are automatically acquired during browsing and used to dynamically update a likelihood model that includes authorship, journal and PubMed indexing information. The user can optionally elect to include or exclude specific features and vary the importance of timeliness in the ranking.
High-throughput next-generation sequencing technologies foster new cutting-edge computing techniques in bioinformatics
The advent of high-throughput next generation sequencing technologies has fostered enormous potential applications of supercomputing techniques in genome sequencing, epigenetics, metagenomics, personalized medicine, discovery of non-coding RNAs and protein-binding sites. To this end, the 2008 International Conference on Bioinformatics and Computational Biology (Biocomp) – 2008 World Congress on Computer Science, Computer Engineering and Applied Computing (Worldcomp) was designed to promote synergistic inter/multidisciplinary research and education in response to the current research trends and advances. The conference, held in Las Vegas, Nevada, USA, during July 14–17, attracted more than two thousand scientists, medical doctors, engineers, professors and students and was a great success. Supported by the International Society of Intelligent Biological Medicine (ISIBM), International Journal of Computational Biology and Drug Design (IJCBDD), International Journal of Functional Informatics and Personalized Medicine (IJFIPM) and the leading research laboratories from Harvard, M.I.T., Purdue, UIUC, UCLA, Georgia Tech, UT Austin, U. of Minnesota, U. of Iowa, etc., the conference received thousands of research papers. Each submitted paper was reviewed by at least three reviewers and accepted papers were required to satisfy reviewers' comments. Finally, the review board and the committee selected only 19 high-quality research papers for inclusion in this supplement to BMC Genomics, based solely on the peer reviews. The conference committee was very grateful for the Plenary Keynote Lectures given by: Dr. Brian D. Athey (University of Michigan Medical School), Dr. Vladimir N. Uversky (Indiana University School of Medicine), Dr. David A. Patterson (Member of United States National Academy of Sciences and National Academy of Engineering, University of California at Berkeley) and Anousheh Ansari (Prodea Systems, Space Ambassador). The conference's theme of promoting synergistic research and education was achieved successfully.
Ontology-Based Combinatorial Comparative Analysis of Adverse Events Associated with Killed and Live Influenza Vaccines
Vaccine adverse events (VAEs) are adverse bodily changes occurring after vaccination. Understanding adverse event (AE) profiles is a crucial step in identifying serious AEs. Two types of seasonal influenza vaccine have been used on the market: trivalent (killed) inactivated influenza vaccine (TIV) and trivalent live attenuated influenza vaccine (LAIV). The different adverse event profiles induced by these two groups of seasonal influenza vaccines were studied using data drawn from the CDC Vaccine Adverse Event Reporting System (VAERS). Extracted from VAERS were 37,621 AE reports for four TIVs (Afluria, Fluarix, Fluvirin, and Fluzone) and 3,707 AE reports for the only LAIV (FluMist). The AE report data were analyzed by a novel combinatorial, ontology-based detection of AE method (CODAE). CODAE detects AEs using Proportional Reporting Ratio (PRR), Chi-square significance test, and base level filtration, and groups identified AEs by ontology-based hierarchical classification. In total, 48 TIV-enriched and 68 LAIV-enriched AEs were identified (PRR > 2, Chi-square score > 4, and the number of cases > 0.2% of total reports). These AE terms were classified using the Ontology of Adverse Events (OAE), MedDRA, and SNOMED-CT. The OAE method provided better classification results than the two other methods. Thirteen out of 48 TIV-enriched AEs were related to neurological and muscular processing such as paralysis, movement disorders, and muscular weakness. In contrast, 15 out of 68 LAIV-enriched AEs were associated with inflammatory response and respiratory system disorders. There was evidence of two severe adverse events (Guillain-Barre Syndrome (GBS) and paralysis) present in TIV. Although these severe adverse events occurred at a low incidence rate, they were found to be more significantly enriched in TIV-vaccinated patients than in LAIV-vaccinated patients. Therefore, our novel combinatorial bioinformatics analysis discovered that LAIV had a lower chance of inducing these two severe adverse events than TIV. In addition, our meta-analysis found that all previously reported positive correlations between GBS and influenza vaccine immunization were based on trivalent influenza vaccines instead of monovalent influenza vaccines.
This work was supported by the National Institutes of Health (NIH) grant U54 DA021519 for the National Center for Integrative Biomedical Informatics and NIH National Institute of Allergy and Infectious Diseases (NIAID) grant R01AI081062. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Peer Reviewed
http://deepblue.lib.umich.edu/bitstream/2027.42/99110/1/journal.pone.0049941.pd
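The three screening criteria named in the abstract (PRR, Chi-square score, and a base-level case filter) can be sketched for a single AE term as below. This is a minimal illustration, not the CODAE implementation. It assumes the standard 2x2 PRR definition, an uncorrected Pearson chi-square (the abstract does not say whether a continuity correction was used), and cutoffs read as PRR > 2, Chi-square > 4, and case share > 0.2% of the vaccine group's reports. The function names are hypothetical.

```python
def prr(a, b, c, d):
    """Proportional Reporting Ratio for a 2x2 table:
    a = target-vaccine reports with the AE, b = target reports without it,
    c = comparator reports with the AE,     d = comparator reports without it."""
    return (a / (a + b)) / (c / (c + d))


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square for a 2x2 table (no continuity correction; assumed)."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def is_enriched(a, total_target, c, total_other,
                min_prr=2.0, min_chi2=4.0, min_share=0.002):
    """Apply the three cutoffs to one AE term. Returns False when either
    group has zero cases, since PRR is then zero or undefined."""
    if a == 0 or c == 0:
        return False
    b = total_target - a
    d = total_other - c
    return (prr(a, b, c, d) > min_prr
            and chi_square_2x2(a, b, c, d) > min_chi2
            and a / total_target > min_share)
```

For example, using the report totals from the abstract with made-up AE counts, `is_enriched(200, 3707, 400, 37621)` asks whether an AE seen in 200 of the 3,707 LAIV reports and 400 of the 37,621 TIV reports would count as LAIV-enriched under these assumed cutoffs.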
Evolving gene/transcript definitions significantly alter the interpretation of GeneChip data
Genome-wide expression profiling is a powerful tool for implicating novel gene ensembles in cellular mechanisms of health and disease. The most popular platform for genome-wide expression profiling is the Affymetrix GeneChip. However, its selection of probes relied on earlier genome and transcriptome annotation which is significantly different from current knowledge. The resultant informatics problems have a profound impact on analysis and interpretation of the data. Here, we address these critical issues and offer a solution. We identified several classes of problems at the individual probe level in the existing annotation, under the assumption that current genome and transcriptome databases are more accurate than those used for GeneChip design. We then reorganized probes on more than a dozen popular GeneChips into gene-, transcript- and exon-specific probe sets in light of up-to-date genome, cDNA/EST clustering and single nucleotide polymorphism information. Comparing analysis results between the original and the redefined probe sets reveals ∼30–50% discrepancy in the genes previously identified as differentially expressed, regardless of analysis method. Our results demonstrate that the original Affymetrix probe set definitions are inaccurate, and many conclusions derived from past GeneChip analyses may be significantly flawed. It will be beneficial to re-analyze existing GeneChip data with updated probe set definitions.
2K09 and thereafter: the coming era of integrative bioinformatics, systems biology and intelligent computing for functional genomics and personalized medicine research
Significant interest exists in establishing synergistic research in bioinformatics, systems biology and intelligent computing. Supported by the United States National Science Foundation (NSF), the International Society of Intelligent Biological Medicine (http://www.ISIBM.org), the International Journal of Computational Biology and Drug Design (IJCBDD) and the International Journal of Functional Informatics and Personalized Medicine, the ISIBM International Joint Conferences on Bioinformatics, Systems Biology and Intelligent Computing (ISIBM IJCBS 2009) attracted more than 300 papers and 400 researchers and medical doctors world-wide. It was the only inter/multidisciplinary conference aimed at promoting synergistic research and education in bioinformatics, systems biology and intelligent computing. The conference committee was very grateful for the valuable advice and suggestions from honorary chairs, steering committee members and scientific leaders including Dr. Michael S. Waterman (USC, Member of United States National Academy of Sciences), Dr. Chih-Ming Ho (UCLA, Member of United States National Academy of Engineering and Academician of Academia Sinica), Dr. Wing H. Wong (Stanford, Member of United States National Academy of Sciences), Dr. Ruzena Bajcsy (UC Berkeley, Member of United States National Academy of Engineering and Member of United States Institute of Medicine of the National Academies), Dr. Mary Qu Yang (United States National Institutes of Health and Oak Ridge, DOE), Dr. Andrzej Niemierko (Harvard), Dr. A. Keith Dunker (Indiana), Dr. Brian D. Athey (Michigan), Dr. Weida Tong (FDA, United States Department of Health and Human Services), Dr. Cathy H. Wu (Georgetown), Dr. Dong Xu (Missouri), Drs. Arif Ghafoor and Okan K Ersoy (Purdue), Dr. Mark Borodovsky (Georgia Tech, President of ISIBM), Dr. Hamid R. Arabnia (UGA, Vice-President of ISIBM), and other scientific leaders. The committee presented the 2009 ISIBM Outstanding Achievement Awards to Dr. Joydeep Ghosh (UT Austin), Dr. Aidong Zhang (Buffalo) and Dr. Zhi-Hua Zhou (Nanjing) for their significant contributions to the field of intelligent biological medicine.
CLO: The cell line ontology
Abstract
Background
Cell lines have been widely used in biomedical research. The community-based Cell Line Ontology (CLO) is a member of the OBO Foundry library that covers the domain of cell lines. Since its publication two years ago, significant updates have been made, including new groups joining the CLO consortium, new cell line cells, upper level alignment with the Cell Ontology (CL) and the Ontology for Biomedical Investigations (OBI), and logical extensions.
Construction and content
Collaboration among the CLO, CL, and OBI has established consensus definitions of cell line-specific terms such as ‘cell line’, ‘cell line cell’, ‘cell line culturing’, and ‘mortal’ vs. ‘immortal cell line cell’. A cell line is a genetically stable cultured cell population that contains individual cell line cells. The hierarchical structure of the CLO is built based on the hierarchy of the in vivo cell types defined in CL and tissue types (from which cell line cells are derived) defined in the UBERON cross-species anatomy ontology. The new hierarchical structure makes it easier to browse, query, and perform automated classification. We have recently added classes representing more than 2,000 cell line cells from the RIKEN BRC Cell Bank to CLO. Overall, the CLO now contains ~38,000 classes of specific cell line cells derived from over 200 in vivo cell types from various organisms.
Utility and discussion
The CLO has been applied to different biomedical research studies. Example case studies include annotation and analysis of EBI ArrayExpress data, bioassays, and host-vaccine/pathogen interaction. CLO’s utility goes beyond a catalogue of cell line types. The alignment of the CLO with related ontologies combined with the use of ontological reasoners will support sophisticated inferencing to advance translational informatics development.
http://deepblue.lib.umich.edu/bitstream/2027.42/109554/1/13326_2013_Article_185.pd