67 research outputs found

    Algorithms and tools for splicing junction donor recognition in genomic DNA sequences

    The consensus sequences at splicing junctions in genomic DNA are required for pre-mRNA breaking and rejoining, which must be carried out precisely. Programs currently available for identification or prediction of transcribed sequences from within genomic DNA are far from being powerful enough to elucidate genomic structure completely[4]. In this research, we develop a degenerate pattern match algorithm for 5' splicing site (donor site) recognition. Using the motif models we developed, we can mine the degenerate pattern information from the consensus splicing junction sequences. Our experimental results show that this algorithm correctly recognizes 93% of the total donor sites at the right positions in the test DNA group, and more than 91% of the donor sites the algorithm predicted are correct. These precision rates are higher than those of the best existing donor classification algorithm[25]. This research marks important progress toward our full gene structure detection algorithm.
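
    The abstract does not spell out its motif models, so the following is only a minimal sketch of degenerate pattern matching in general, not the authors' algorithm. It assumes the commonly cited donor consensus MAG|GTRAGT written in IUPAC codes, requires the GT dinucleotide exactly, and tolerates a configurable number of mismatches elsewhere. The function name `find_donor_candidates` and the `max_mismatches` parameter are hypothetical.

```python
# Standard IUPAC degenerate nucleotide codes (only those needed here).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "M": "AC", "R": "AG", "N": "ACGT",
}

# Assumption: the widely cited donor consensus MAG|GTRAGT,
# NOT the motif model developed in the paper.
CONSENSUS = "MAGGTRAGT"
GT_OFFSET = 3  # index of the near-invariant GT within CONSENSUS


def find_donor_candidates(seq, max_mismatches=2):
    """Return (intron_start_index, mismatches) for each window that
    matches the degenerate pattern within the mismatch budget."""
    seq = seq.upper()
    hits = []
    for start in range(len(seq) - len(CONSENSUS) + 1):
        window = seq[start:start + len(CONSENSUS)]
        # Require the GT dinucleotide to match exactly.
        if window[GT_OFFSET:GT_OFFSET + 2] != "GT":
            continue
        # Count positions whose base is not allowed by the degenerate code.
        mismatches = sum(
            base not in IUPAC[code] for base, code in zip(window, CONSENSUS)
        )
        if mismatches <= max_mismatches:
            hits.append((start + GT_OFFSET, mismatches))
    return hits
```

    A plain mismatch count weights every position equally; a trained motif model would typically weight positions by how informative they are, which this sketch does not attempt.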

    Knowledge discovery and modeling in genomic databases

    This dissertation research is targeted toward developing effective and accurate methods for identifying gene structures in the genomes of higher eukaryotes, such as vertebrate organisms. Several effective hidden Markov models (HMMs) are developed to represent the consensus and degeneracy features of the functional sites, including protein-translation start sites and mRNA splicing junction donor and acceptor sites in vertebrate genes. The HMM system based on the developed models is fully trained using an expectation maximization (EM) algorithm, and the system performance is evaluated using a 10-way cross-validation method. Experimental results show that the proposed HMM system achieves high sensitivity and specificity in detecting the functional sites. This HMM system is then incorporated into a new gene detection system, called GeneScout. The main hypothesis is that, given a vertebrate genomic DNA sequence S, it is always possible to construct a directed acyclic graph G such that the path for the actual coding region of S is in the set of all paths on G. Thus, the gene detection problem is reduced to the analysis of paths in the graph G. A dynamic programming algorithm is employed by GeneScout to find the optimal path in G. Experimental results on the standard test dataset collected by Burset and Guigo indicate that GeneScout is comparable to existing gene discovery tools and complements the widely used GenScan system.
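
    The reduction to a path problem on a directed acyclic graph can be illustrated with a generic sketch. The abstract does not describe how GeneScout builds G or scores its edges, so the edge scores, node labels, and the function name `best_path` below are hypothetical; the code shows only the standard technique of dynamic programming over a topological order to find the highest-scoring path.

```python
from collections import defaultdict


def best_path(edges, source, sink):
    """Highest-scoring source-to-sink path in a DAG.

    edges: iterable of (u, v, score) tuples (hypothetical scores).
    Returns (total_score, path), or (None, []) if sink is unreachable.
    """
    adj = defaultdict(list)
    indegree = defaultdict(int)
    nodes = {source, sink}
    for u, v, w in edges:
        adj[u].append((v, w))
        indegree[v] += 1
        nodes.update((u, v))

    # Kahn's algorithm: build a topological order of the nodes.
    order = []
    ready = [n for n in nodes if indegree[n] == 0]
    while ready:
        u = ready.pop()
        order.append(u)
        for v, _ in adj[u]:
            indegree[v] -= 1
            if indegree[v] == 0:
                ready.append(v)
    if len(order) != len(nodes):
        raise ValueError("graph contains a cycle")

    # Dynamic programming: relax outgoing edges in topological order.
    best = {source: 0.0}
    parent = {}
    for u in order:
        if u not in best:
            continue  # not reachable from source
        for v, w in adj[u]:
            candidate = best[u] + w
            if v not in best or candidate > best[v]:
                best[v] = candidate
                parent[v] = u

    if sink not in best:
        return None, []
    # Walk parent pointers back from sink to recover the path.
    path = [sink]
    while path[-1] != source:
        path.append(parent[path[-1]])
    return best[sink], path[::-1]
```

    Because each edge is relaxed once, the running time is linear in the number of nodes and edges, which is what makes the DAG formulation attractive for long genomic sequences.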

    DEVELOPMENT AND APPLICATION OF MASS SPECTROMETRY-BASED PROTEOMICS TO GENERATE AND NAVIGATE THE PROTEOMES OF THE GENUS POPULUS

    Historically, there has been tremendous synergy between biology and analytical technology, such that one drives the development of the other. Over the past two decades, their interrelatedness has catalyzed entirely new experimental approaches and unlocked new types of biological questions, as exemplified by the advancements of the field of mass spectrometry (MS)-based proteomics. MS-based proteomics, which provides a more complete measurement of all the proteins in a cell, has revolutionized a variety of scientific fields, ranging from characterizing proteins expressed by a microorganism to tracking cancer-related biomarkers. Though MS technology has advanced significantly, the analysis of complicated proteomes, such as those of plants or humans, remains challenging because of the incongruity between the complexity of the biological samples and the analytical techniques available. In this dissertation, analytical methods utilizing state-of-the-art MS instrumentation have been developed to address challenges associated with both qualitative and quantitative characterization of eukaryotic organisms. In particular, these efforts focus on characterizing Populus, a model organism and potential feedstock for bioenergy. The effectiveness of pre-existing MS techniques, initially developed to identify proteins reliably in microbial proteomes, was tested to define the boundaries and characterize the landscape of functional genome expression in Populus. Although these approaches were generally successful, achieving maximal proteome coverage was still limited by a number of factors, including genome complexity, the dynamic range of protein identification, and the abundance of protein variants. To overcome these challenges, improvements were needed in sample preparation, MS instrumentation, and bioinformatics.
    Optimization of experimental procedures and implementation of current state-of-the-art instrumentation afforded the most detailed look into the predicted proteome space of Populus, offering varying proteome perspectives: 1) network-wide, 2) pathway-specific, and 3) protein-level viewpoints. In addition, we implemented two bioinformatic approaches that were capable of decoding the plasticity of the Populus proteome, facilitating the identification of single amino acid polymorphisms and generating a more accurate profile of protein expression. Though the methods and results presented in this dissertation have direct implications for bioenergy research, more broadly this dissertation focuses on developing techniques to contend with the notorious challenges associated with protein characterization in all eukaryotic organisms.

    The RNA-binding protein LARP1 as potential biomarker and therapeutic target in ovarian cancer

    Ovarian cancer is the most lethal gynaecological malignancy, responsible for over 4,000 deaths each year in the UK. There is growing evidence that mRNA-binding proteins (RBPs) can be post-transcriptional drivers of cancer progression. Here, I investigated the expression of the RBP LARP1 in ovarian malignancies and the role of the protein in ovarian cancer cell biology. LARP1 is highly expressed at both the mRNA and protein level in ovarian cancers compared with benign tumours and normal ovarian tissue. I show that higher levels of LARP1 in tumour tissue are predictive of poor patient survival. Consistent with this clinical finding, in xenograft studies knockdown of LARP1 expression causes a dramatic reduction in tumour growth. In vitro, LARP1 knockdown is associated with increased apoptosis, and is sufficient to restore platinum sensitivity in chemotherapy-resistant cell lines. Furthermore, LARP1 is required to maintain cancer stem cell marker-positive populations, and knockdown decreases tumour-initiating potential, as demonstrated by in vivo limiting dilution assays. Transcriptome deep-sequencing following LARP1 knockdown revealed altered expression of multiple genes linked to survival and evasion of apoptosis, including BCL2 and BIK. Transcripts of both genes are in complex with LARP1 protein, and LARP1 maintains the stability of BCL2 mRNA, whilst actively destabilising BIK transcripts. This effect is mediated at the level of the 3’ untranslated region. I therefore conclude that by differentially regulating mRNA stability, LARP1 is a key post-transcriptional driver of tumourigenicity and cell survival in ovarian cancer.

    Development of computational methods for the analysis of proteomics and next generation sequencing data

    Grand Celebration: 10th Anniversary of the Human Genome Project

    In 1990, scientists began working together on one of the largest biological research projects ever proposed. The project proposed to sequence the three billion nucleotides in the human genome. The Human Genome Project took 13 years and was completed in April 2003, at a cost of approximately three billion dollars. It was a major scientific achievement that forever changed the understanding of our own nature. The sequencing of the human genome was in many ways a triumph for technology as much as it was for science. From the Human Genome Project, powerful technologies have been developed (e.g., microarrays and next generation sequencing) and new branches of science have emerged (e.g., functional genomics and pharmacogenomics), paving new ways for advancing genomic research and medical applications of genomics in the 21st century. The investigations have provided new tests and drug targets, as well as insights into the basis of human development and the diagnosis and treatment of cancer and several mysterious human diseases. This genomic revolution is prompting a new era in medicine, which brings both challenges and opportunities. Parallel to the promising advances over the last decade, the study of the human genome has also revealed how complicated human biology is, and how much remains to be understood. The legacy of the understanding of our genome has just begun. To celebrate the 10th anniversary of the essential completion of the Human Genome Project, in April 2013 Genes launched this Special Issue, which highlights the recent scientific breakthroughs in human genomics, with a collection of papers written by authors who are leading experts in the field.

    Use of neural networks to model molecular structure and function

    This thesis is a study of some applications of neural networks - a recent computer algorithm - to modelling the structure and function of biologically important molecules. In Chapter 1, an introduction to neural networks is given. An overview of quantitative structure activity relationships (QSARs) is presented. The applications of neural networks to QSAR and to the prediction of structural and functional features of protein and nucleic acid sequences are reviewed. The neural network algorithms used are discussed in Chapter 2. In Chapter 3, a two-layer feed-forward neural network has been trained to recognise an ATP/GTP-binding local sequence motif. A comparably sophisticated statistical method was developed, which performed marginally better than the neural network. In a second study, described in Chapters 4 and 5, one of the largest data sets available for developing a quantitative structure activity relationship - the inhibition of dihydrofolate reductase by 2,4-diamino-6,6-dimethyl-5-phenyldihydrotriazine derivatives - has been used to benchmark several computational methods. A hidden-layer neural network, a decision tree and inductive logic programming have been compared with the more established methods of linear regression and nearest neighbour. The data were represented in two ways: by the traditional Hansch parameters and by a new set of descriptors designed to allow the formulation of rules relating the activity of the inhibitors to their chemical structure. The performance of neural networks has been assessed rigorously in two distinct areas of biomolecular modelling: sequence analysis and drug design. The conclusions of these studies are presented in Chapter 6.

    Annual Report

    Plant Genetics and Molecular Biology

    This book reviews the latest advances in multiple fields of plant biotechnology and the opportunities that plant genetics, genomics and molecular biology have offered for agricultural improvement. Advanced technologies can dramatically enhance our capacity to understand the molecular basis of traits and to utilize the available resources for accelerated development of high-yielding, nutritious, input-use-efficient and climate-smart crop varieties. In this book, readers will discover the significant advances in plant genetics, structural and functional genomics, trait and gene discovery, transcriptomics, proteomics, metabolomics, epigenomics, nanotechnology and analytical and decision support tools in breeding. This book appeals to researchers, academics and other stakeholders of global agriculture.

    Impact of gene expression profiling tests on breast cancer outcomes

    Prepared for Agency for Healthcare Research and Quality, U.S. Dept. of Health and Human Services; prepared by the Johns Hopkins University Evidence-based Practice Center; investigators, Luigi Marchionni ... [et al.]. "Contract No. 290-02-0018." "January 2008." "The Agency for Healthcare Research and Quality (AHRQ), through its Evidence-Based Practice Centers (EPCs), sponsors the development of evidence reports and technology assessments to assist public- and private-sector organizations in their efforts to improve the quality of health care in the United States. The Centers for Disease Control and Prevention (CDC) requested and provided funding for this report. The reports and assessments provide organizations with comprehensive, science-based information on common, costly medical conditions and new health care technologies. The EPCs systematically review the relevant scientific literature on topics assigned to them by AHRQ and conduct additional analyses when appropriate prior to developing their reports and assessments." - p. iii. Also available via the World Wide Web. Includes bibliographical references (p. 101-105). Marchionni L, Wilson RF, Marinopoulos SS, Wolff AC, Parmigiani G, Bass EB, Goodman SN. Impact of Gene Expression Profiling Tests on Breast Cancer Outcomes. Evidence Report/Technology Assessment No. 160. (Prepared by The Johns Hopkins University Evidence-based Practice Center under contract No. 290-02-0018). AHRQ Publication No. 08-E002. Rockville, MD: Agency for Healthcare Research and Quality. January 2008.