
    Knowledge Management Approaches for predicting Biomarker and Assessing its Impact on Clinical Trials

    The recent success of companion diagnostics, along with increasing regulatory pressure for better identification of the target population, has created an unprecedented incentive for drug discovery companies to invest in novel strategies for stratified biomarker discovery. In line with this trend, trials with stratified biomarkers in drug development have quadrupled in the last decade, but they still represent a small part of all interventional trials, reflecting the multiple co-development challenges of therapeutic compounds and companion diagnostics. To overcome these challenges, varied knowledge management and systems biology approaches are adopted in the clinic to analyze and interpret an ever-increasing collection of OMICS data. By semi-automatic screening of more than 150,000 trials, we filtered trials with stratified biomarkers to analyse their therapeutic focus and major drivers, and elucidated the impact of stratified biomarker programs on trial duration and completion. The analysis clearly shows that cancer is the major focus for trials with stratified biomarkers. However, targeted therapies in cancer require more accurate stratification of the patient population. This can be augmented by a fresh approach of selecting a new class of biomolecules, i.e. miRNAs, as candidate stratification biomarkers. miRNAs play an important role in tumorigenesis by regulating the expression of oncogenes and tumor suppressors, thus affecting cell proliferation, differentiation, apoptosis, invasion and angiogenesis. miRNAs are potential biomarkers in different cancers. However, the relationship between the response of cancer patients to targeted therapy and the resulting modifications of the miRNA transcriptome in pathway regulation is poorly understood.
Together with ever-growing pathway and miRNA-mRNA interaction databases, freely available mRNA and miRNA expression data from multiple cancer therapies have created an unprecedented opportunity to decipher the role of miRNAs in early prediction of therapeutic efficacy in diseases. We present a novel SMARTmiR algorithm to predict the role of miRNAs as therapeutic biomarkers for treatment with cetuximab, an anti-EGFR monoclonal antibody, in colorectal cancer (CRC). An optimised and fully automated version of the algorithm has the potential to be used as a clinical decision support tool. Moreover, this research will provide the scientific community with a comprehensive and valuable knowledge map demonstrating functional biomolecular interactions in colorectal cancer. This research also detected seven miRNAs, i.e. hsa-miR-145, hsa-miR-27a, hsa-miR-155, hsa-miR-182, hsa-miR-15a, hsa-miR-96 and hsa-miR-106a, as top stratified biomarker candidates for cetuximab therapy in CRC that were not reported previously. Finally, a prospective plan for the future of biomarker research in cancer drug development has been drawn up, focusing on reducing the risk of the most expensive phase III drug failures.

    Localized inhibition of protein phosphatase 1 by NUAK1 promotes spliceosome activity and reveals a MYC-sensitive feedback control of transcription.

    Deregulated expression of MYC induces a dependence on the NUAK1 kinase, but the molecular mechanisms underlying this dependence have not been fully clarified. Here, we show that NUAK1 is a predominantly nuclear protein that associates with a network of nuclear protein phosphatase 1 (PP1) interactors and that PNUTS, a nuclear regulatory subunit of PP1, is phosphorylated by NUAK1. Both NUAK1 and PNUTS associate with the splicing machinery. Inhibition of NUAK1 abolishes chromatin association of PNUTS, reduces spliceosome activity, and suppresses nascent RNA synthesis. Activation of MYC does not bypass the requirement for NUAK1 for spliceosome activity but significantly attenuates transcription inhibition. Consequently, NUAK1 inhibition in MYC-transformed cells induces global accumulation of RNAPII both at the pause site and at the first exon-intron boundary but does not increase mRNA synthesis. We suggest that NUAK1 inhibition in the presence of deregulated MYC traps non-productive RNAPII because of the absence of correctly assembled spliceosomes

    Text Mining and Gene Expression Analysis Towards Combined Interpretation of High Throughput Data

    Microarrays can capture gene expression activity for thousands of genes simultaneously and thus make it possible to analyze cell physiology and disease processes on molecular level. The interpretation of microarray gene expression experiments profits from knowledge on the analyzed genes and proteins and the biochemical networks in which they play a role. The trend is towards the development of data analysis methods that integrate diverse data types. Currently, the most comprehensive biomedical knowledge source is a large repository of free text articles. Text mining makes it possible to automatically extract and use information from texts. This thesis addresses two key aspects, biomedical text mining and gene expression data analysis, with the focus on providing high-quality methods and data that contribute to the development of integrated analysis approaches. The work is structured in three parts. Each part begins by providing the relevant background, and each chapter describes the developed methods as well as applications and results. Part I deals with biomedical text mining: Chapter 2 summarizes the relevant background of text mining; it describes text mining fundamentals, important text mining tasks, applications and particularities of text mining in the biomedical domain, and evaluation issues. In Chapter 3, a method for generating high-quality gene and protein name dictionaries is described. The analysis of the generated dictionaries revealed important properties of individual nomenclatures and the used databases (Fundel and Zimmer, 2006). The dictionaries are publicly available via a Wiki, a web service, and several client applications (Szugat et al., 2005). In Chapter 4, methods for the dictionary-based recognition of gene and protein names in texts and their mapping onto unique database identifiers are described. These methods make it possible to extract information from texts and to integrate text-derived information with data from other sources. 
Three named entity identification systems have been set up, two of them building upon the previously existing tool ProMiner (Hanisch et al., 2003). All of them have shown very good performance in the BioCreAtIvE challenges (Fundel et al., 2005a; Hanisch et al., 2005; Fundel and Zimmer, 2007). In Chapter 5, a new method for relation extraction (Fundel et al., 2007) is presented. It was applied on the largest collection of biomedical literature abstracts, and thus a comprehensive network of human gene and protein relations has been generated. A classification approach (Küffner et al., 2006) can be used to specify relation types further; e. g., as activating, direct physical, or gene regulatory relation. Part II deals with gene expression data analysis: Gene expression data needs to be processed so that differentially expressed genes can be identified. Gene expression data processing consists of several sequential steps. Two important steps are normalization, which aims at removing systematic variances between measurements, and quantification of differential expression by p-value and fold change determination. Numerous methods exist for these tasks. Chapter 6 describes the relevant background of gene expression data analysis; it presents the biological and technical principles of microarrays and gives an overview of the most relevant data processing steps. Finally, it provides a short introduction to osteoarthritis, which is in the focus of the analyzed gene expression data sets. In Chapter 7, quality criteria for the selection of normalization methods are described, and a method for the identification of differentially expressed genes is proposed, which is appropriate for data with large intensity variances between spots representing the same gene (Fundel et al., 2005b). 
Furthermore, a system is described that selects an appropriate combination of feature selection method and classifier, and thus identifies genes which lead to good classification results and show consistent behavior in different sample subgroups (Davis et al., 2006). The analysis of several gene expression data sets dealing with osteoarthritis is described in Chapter 8. This chapter contains the biomedical analysis of relevant disease processes and distinct disease stages (Aigner et al., 2006a), and a comparison of various microarray platforms and osteoarthritis models. Part III deals with integrated approaches and thus provides the connection between parts I and II: Chapter 9 gives an overview of different types of integrated data analysis approaches, with a focus on approaches that integrate gene expression data with manually compiled data, large-scale networks, or text mining. In Chapter 10, a method for the identification of genes which are consistently regulated and have a coherent literature background (Küffner et al., 2005) is described. This method indicates how gene and protein name identification and gene expression data can be integrated to return clusters which contain genes that are relevant for the respective experiment together with literature information that supports interpretation. Finally, in Chapter 11 ideas on how the described methods can contribute to current research and possible future directions are presented

    Structured digital tables on the Semantic Web: toward a structured digital literature

    In parallel to the growth in bioscience databases, biomedical publications have increased exponentially in the past decade. However, the extraction of high-quality information from the corpus of scientific literature has been hampered by the lack of machine-interpretable content, despite text-mining advances. To address this, we propose creating a structured digital table as part of an overall effort in developing machine-readable, structured digital literature. In particular, we envision transforming publication tables into standardized triples using Semantic Web approaches. We identify three canonical types of tables (conveying information about properties, networks, and concept hierarchies) and show how more complex tables can be built from these basic types. We envision that authors would create tables initially using the structured triples for canonical types and then have them visually rendered for publication, and we present examples for converting representative tables into triples. Finally, we discuss how 'stub' versions of structured digital tables could be a useful bridge for connecting the literature with databases, allowing the former to more precisely document the latter.

    Proteomic and Phospho-Proteomic Profile of Human Platelets in Basal, Resting State: Insights into Integrin Signaling

    During atherogenesis and vascular inflammation quiescent platelets are activated to increase the surface expression and ligand affinity of the integrin αIIbβ3 via inside-out signaling. Diverse signals such as thrombin, ADP and epinephrine transduce signals through their respective GPCRs to activate protein kinases that ultimately lead to the phosphorylation of the cytoplasmic tail of the integrin αIIbβ3 and augment its function. The signaling pathways that transmit signals from the GPCR to the cytosolic domain of the integrin are not well defined. In an effort to better understand these pathways, we employed a combination of proteomic profiling and computational analyses of isolated human platelets. We analyzed ten independent human samples and identified a total of 1507 unique proteins in platelets. This is the most comprehensive platelet proteome assembled to date and includes 190 membrane-associated and 262 phosphorylated proteins, which were identified via independent proteomic and phospho-proteomic profiling. We used this proteomic dataset to create a platelet protein-protein interaction (PPI) network and applied novel contextual information about the phosphorylation step to introduce limited directionality in the PPI graph. This newly developed contextual PPI network computationally recapitulated an integrin signaling pathway. Most importantly, our approach not only provided insights into the mechanism of integrin αIIbβ3 activation in resting platelets but also provides an improved model for analysis and discovery of PPI dynamics and signaling pathways in the future

    Semantic models as metrics for kernel-based interaction identification

    Automatic detection of protein-protein interactions (PPIs) in biomedical publications is vital for efficient biological research. It also presents a host of new challenges for pattern recognition methodologies, some of which will be addressed by the research in this thesis. Proteins are the principal method of communication within a cell; hence, this area of research is strongly motivated by the needs of biologists investigating sub-cellular functions of organisms, diseases, and treatments. These researchers rely on the collaborative efforts of the entire field and communicate through experimental results published in reviewed biomedical journals. The substantial number of interactions detected by automated large-scale PPI experiments, combined with the ease of access to the digitised publications, has increased the number of results made available each day. The ultimate aim of this research is to provide tools and mechanisms to aid biologists and database curators in locating relevant information. As part of this objective this thesis proposes, studies, and develops new methodologies that go some way to meeting this grand challenge. Pattern recognition methodologies are one approach that can be used to locate PPI sentences; however, most accurate pattern recognition methods require a set of labelled examples to train on. For this particular task, the collection and labelling of training data is highly expensive. On the other hand, the digital publications provide a plentiful source of unlabelled data. The unlabelled data is used, along with word cooccurrence models, to improve classification using Gaussian processes, a probabilistic alternative to the state-of-the-art support vector machines. This thesis presents and systematically assesses the novel methods of using the knowledge implicitly encoded in biomedical texts and shows an improvement on the current approaches to PPI sentence detection

    Evaluation of the relevance and impact of kinase dysfunction in neurological disorders through proteomics and phosphoproteomics bioinformatics

    Phosphorylation is an important post-translational modification that is involved in various biological processes and its dysregulation has in particular been linked to diseases of the central nervous system including neurological disorders. The present thesis characterizes alterations in the phosphoproteome and protein abundance associated with schizophrenia and Parkinson's disease, with the goal of uncovering the underlying disease mechanisms. To support this goal, I eventually created an automated analysis pipeline in R to streamline the analysis process of proteomics and phosphoproteomics data. Mass spectrometry (MS) technology is utilized to generate proteomics and phosphoproteomics data. Study I of the thesis demonstrates an automated R pipeline, PhosPiR, created to perform multi-level functional analyses of MS data after the identification and quantification of the raw spectral data. The pipeline does not require coding knowledge to run. It supports 18 different organisms, and provides analyses of MS intensity data from preprocessing, normalization and imputation, through to figure overviews, statistical analysis, enrichment analysis, PTM-SEA, kinase prediction and activity analysis, network analysis, hub analysis, annotation mining, and homolog alignment. The LRRK2-G2019S mutation, a frequent genetic cause of late onset Parkinson's disease, was investigated in Study II and III. One study investigated the mechanism of LRRK2-G2019S function in brain, and the other identified proteins with significantly altered overall translation patterns in sporadic and LRRK2-G2019S patient samples. Specifically, study II identified that LRRK2 is localized to the small 40S ribosomal subunit and that LRRK2 activity suppresses RNA translation, as validated in cell and animal models of Parkinson's disease and in patient cells. 
Study III utilized bio-orthogonal non-canonical amino acid tagging to label newly translated proteins in order to identify, by mass spectrometry analysis, which proteins were affected by repressed translation in patient samples. The analysis revealed 33 and 30 nascent proteins with reduced synthesis in sporadic and LRRK2-G2019S Parkinson's cases, respectively. The biological process "cytosolic signal recognition particle (SRP)-dependent co-translational protein targeting to membrane" was significantly affected in both sporadic and LRRK2-G2019S Parkinson's, while "Tubulin/FtsZ C-terminal domain superfamily network" was significantly enriched only in LRRK2-G2019S Parkinson's cases. The findings were validated by targeted proteomics and immunoblotting. Study IV investigated the role of JNK1 in schizophrenia. Wild-type and Jnk1-/- mice were used to analyze the phosphorylation profile by LC-MS/MS. A total of 126 proteins associated with schizophrenia were found to overlap with the significantly differentially phosphorylated proteins in Jnk1-/- mouse brain. The NMDAR trafficking pathway was found to be highly enriched, and surface staining of NMDAR subunits in neurons showed that surface expression of both subunits was significantly decreased in Jnk1-/- neurons. Further behavioral tests conducted with MK801 treatment associated the Jnk1-/- molecular and behavioral phenotype with schizophrenia and neuropsychiatric disease.

    Foreword

    The aim of this Workshop is to focus on building and evaluating resources used to facilitate biomedical text mining, including their design, update, delivery, quality assessment, evaluation and dissemination. Key resources of interest are lexical and knowledge repositories (controlled vocabularies, terminologies, thesauri, ontologies) and annotated corpora, including both task-specific resources and repositories reengineered from biomedical or general language resources. Of particular interest is the process of building annotated resources, including designing guidelines and annotation schemas (aiming at both syntactic and semantic interoperability) and relying on language engineering standards. Challenging aspects are updates and evolution management of resources, as well as their documentation, dissemination and evaluation