2,203 research outputs found

Protein Ontology: Enhancing and scaling up the representation of protein entities

Author: Arighi Cecilia N.
Blake Judith A.
Bona Jonathan
Chen Chuming
Chen Sheng-Chih
Christie Karen R.
Cowart Julie
D'Eustachio Peter
Diehl Alexander D.
Drabkin Harold J.
Duncan William D.
Huang Hongzhan
Natale Darren A.
Ren Jia
Ross Karen
Ruttenberg Alan
Publication date: 01/01/2017

The Protein Ontology (PRO; http://purl.obolibrary.org/obo/pr) formally defines and describes taxon-specific and taxon-neutral protein-related entities in three major areas: proteins related by evolution; proteins produced from a given gene; and protein-containing complexes. PRO thus serves as a tool for referencing protein entities at any level of specificity. To enhance this ability, and to facilitate the comparison of such entities described in different resources, we developed a standardized representation of proteoforms using UniProtKB as a sequence reference and PSI-MOD as a post-translational modification reference. We illustrate its use in facilitating an alignment between PRO and Reactome protein entities. We also address issues of scalability, describing our first steps into the use of text mining to identify protein-related entities, the large-scale import of proteoform information from expert-curated resources, and our ability to dynamically generate PRO terms. Web views for individual terms are now more informative about closely related terms, including, for example, an interactive multiple sequence alignment. Finally, we describe recent improvements in semantic utility, with PRO now represented in OWL and as a SPARQL endpoint. These developments will further support the anticipated growth of PRO, facilitate the discoverability of data relating to protein entities, and allow such data to be aggregated.

PhilPapers

ImmPort, toward repurposing of open access immunological assay data for translational and clinical research

Author: Bhattacharya Sanchita
Butte Atul
Chen Jieming
Dunn Patrick
Hu Zicheng
Schaefer Henry
Shankar Ravi
Shen-Orr Shai
Smith Barry
Thomas Cristel
Thomson Elizabeth
Wiser Jeffrey
Zalocusky Kelly
Publication date: 01/01/2018

Immunology researchers are beginning to explore the possibilities of reproducibility, reuse and secondary analyses of immunology data. Open-access datasets are being used to validate the methods of the original studies, to support meta-analyses, and to generate new hypotheses. To promote these goals, the ImmPort data repository was created for the broader research community to explore the wide spectrum of clinical and basic research data and associated findings. The ImmPort ecosystem consists of four components (Private Data, Shared Data, Data Analysis, and Resources) for data archiving, dissemination, analyses, and reuse. To date, more than 300 studies have been made freely available through the ImmPort Shared Data portal, which allows research data to be repurposed to accelerate the translation of new insights into discoveries.

PhilPapers

eScholarship - University of California

Application of Biomedical Text Mining

Author: Gong Lejun
Publication venue: 'IntechOpen'
Publication date: 04/04/2018

The volume of biological literature is enormous and keeps growing as new publications appear at a high rate, and this growth is one of the most common motivations for biomedical text mining. Processing this massive literature makes it possible to extract more biological information and to mine biomedical knowledge from it. Such information can help in understanding the mechanisms by which diseases arise, in advancing disease diagnosis technology, and in developing new drugs. Against this background, this chapter introduces the rise of biomedical text mining. It then describes the underlying technology, namely natural language processing and its several components. The chapter emphasizes two aspects of biomedical text mining, static biomedical information recognition and dynamic biomedical information extraction, using instance analysis from our previous works. The aim is to give researchers a way to understand biomedical text mining quickly.

IntechOpen

Crossref

Otter-Knowledge: benchmarks of multimodal knowledge graph representation learning from different sources for drug discovery

Author: Fernández-Díaz Raúl
Gallindo Marcos Martínez
Lam Hoang Thanh
López Vanessa
Picco Gabriele
Ramis Cesar Berrospi
Sbodio Marco Luca
Valls Víctor
Zayats Mykhaylo
Publication date: 22/06/2023

Recent research in representation learning utilizes large databases of proteins or molecules to acquire knowledge of drug and protein structures through unsupervised learning techniques. These pre-trained representations have proven to significantly enhance the accuracy of subsequent tasks, such as predicting the affinity between drugs and target proteins. In this study, we demonstrate that by incorporating knowledge graphs from diverse sources and modalities into the sequences or SMILES representation, we can further enrich the representation and achieve state-of-the-art results on established benchmark datasets. We provide preprocessed and integrated data obtained from 7 public sources, which encompass over 30M triples. We also make available the pre-trained models based on this data, along with the reported outcomes of their performance on three widely used benchmark datasets for drug-target binding affinity prediction found in the Therapeutic Data Commons (TDC) benchmarks, and we make the source code for training models on benchmark datasets publicly available. Our objective in releasing these pre-trained models, accompanied by clean data for model pretraining and benchmark results, is to encourage research in knowledge-enhanced representation learning.

arXiv.org e-Print Archive

The Infectious Disease Ontology in the Age of COVID-19

Author: Babcock Shane
Beverley John
Cowell Lindsay G.
Smith Barry
Publication date: 01/01/2021

The Infectious Disease Ontology (IDO) is a suite of interoperable ontology modules that aims to provide coverage of all aspects of the infectious disease domain, including biomedical research, clinical care, and public health. IDO Core is designed to be a disease- and pathogen-neutral ontology, covering just those types of entities and relations that are relevant to infectious diseases generally. IDO Core is then extended by a collection of ontology modules focusing on specific diseases and pathogens. In this paper we present applications of IDO Core within various areas of infectious disease research, together with an overview of all IDO extension ontologies and the methodology on the basis of which they are built. We also survey recent developments involving IDO, including the creation of IDO Virus; the Coronaviruses Infectious Disease Ontology (CIDO); and an extension of CIDO focused on COVID-19 (IDO-CovID-19). We also discuss how these ontologies might assist in information-driven efforts to deal with the ongoing COVID-19 pandemic, to accelerate data discovery in the early stages of future pandemics, and to promote reproducibility of infectious disease research.

PhilPapers

Directory of Open Access Journals

PubMed Central

PathwayMatcher: proteoform-centric network construction enables fine-granularity multi-omics pathway mapping

Author: Barsnes Harald
Burger Bram
Fabregat Antonio
Hermjakob Henning
Hernández Sánchez Luis Francisco
Horro Marcos Carlos
Johansson Stefan
Njølstad Pål Rasmus
Vaudel Marc
Publication venue: 'Oxford University Press (OUP)'
Publication date: 01/01/2019

Background: Mapping biomedical data to functional knowledge is an essential task in bioinformatics and can be achieved by querying identifiers (e.g., gene sets) in pathway knowledge bases. However, the isoform and posttranslational modification states of proteins are lost when converting input and pathways into gene-centric lists. Findings: Based on the Reactome knowledge base, we built a network of protein-protein interactions accounting for the documented isoform and modification statuses of proteins. We then implemented a command line application called PathwayMatcher (github.com/PathwayAnalysisPlatform/PathwayMatcher) to query this network. PathwayMatcher supports multiple types of omics data as input and outputs the possibly affected biochemical reactions, subnetworks, and pathways. Conclusions: PathwayMatcher enables refining the network representation of pathways by including proteoforms defined as protein isoforms with posttranslational modifications. The specificity of pathway analyses is hence adapted to different levels of granularity, and it becomes possible to distinguish interactions between different forms of the same protein.
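
The difference between gene-centric and proteoform-centric matching described in this abstract can be sketched as follows. This is only an illustration and not PathwayMatcher's code or input format: the `Proteoform` class, the toy reaction table, the accession `P00001`, and the modification labels are all hypothetical, and strict equality is assumed as the matching criterion.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Proteoform:
    """Hypothetical proteoform: accession, isoform number, and (modification, site) pairs."""
    accession: str
    isoform: int = 1
    ptms: frozenset = frozenset()


# Toy knowledge base (made up): reaction name -> participating proteoforms.
REACTIONS = {
    "reaction_A": {Proteoform("P00001", 1, frozenset({("phospho", 15)}))},
    "reaction_B": {Proteoform("P00001", 1)},
    "reaction_C": {Proteoform("P00001", 2)},
}


def match_by_protein(accession):
    """Protein-level matching: isoform and modification state are ignored."""
    return sorted(
        name
        for name, participants in REACTIONS.items()
        if any(p.accession == accession for p in participants)
    )


def match_by_proteoform(query):
    """Proteoform-level matching: accession, isoform, and modifications must all agree."""
    return sorted(
        name for name, participants in REACTIONS.items() if query in participants
    )


if __name__ == "__main__":
    phosphorylated = Proteoform("P00001", 1, frozenset({("phospho", 15)}))
    print(match_by_protein("P00001"))
    print(match_by_proteoform(phosphorylated))
```

In this toy table the accession alone matches all three reactions, while the phosphorylated form of isoform 1 matches only the reaction that lists exactly that form, which is the finer granularity the abstract refers to.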

University of Bergen

NORA - Norwegian Open Research Archives

Enabling Web-scale data integration in biomedicine through Linked Open Data

Author: Fernandez Garcia Javier David
Kamdar Maulik R.
Musen Mark A.
Polleres Axel
Tudorache Tania
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2019

The biomedical data landscape is fragmented with several isolated, heterogeneous data and knowledge sources, which use varying formats, syntaxes, schemas, and entity notations, existing on the Web. Biomedical researchers face severe logistical and technical challenges to query, integrate, analyze, and visualize data from multiple diverse sources in the context of available biomedical knowledge. Semantic Web technologies and Linked Data principles may aid toward Web-scale semantic processing and data integration in biomedicine. The biomedical research community has been one of the earliest adopters of these technologies and principles to publish data and knowledge on the Web as linked graphs and ontologies, hence creating the Life Sciences Linked Open Data (LSLOD) cloud. In this paper, we provide our perspective on some opportunities proffered by the use of LSLOD to integrate biomedical data and knowledge in three domains: (1) pharmacology, (2) cancer research, and (3) infectious diseases. We will discuss some of the major challenges that hinder the widespread use and consumption of LSLOD by the biomedical research community. Finally, we provide a few technical solutions and insights that can address these challenges. Eventually, LSLOD can enable the development of scalable, intelligent infrastructures that support artificial intelligence methods for augmenting human intelligence to achieve better clinical outcomes for patients, to enhance the quality of biomedical research, and to improve our understanding of living systems.

Elektronische Publikationen der Wirtschaftsuniversität Wien

Reply to: Soils need to be considered when assessing the impacts of land-use change on carbon sequestration

Author: Bruckner Martin
Canelas Joana
Eisenmenger Nina
Erb Karl-Heinz
Hilbers Jelle P.
Huijbregts Mark A. J.
Kastner Thomas
Marques Alexandra
Martins Inês S.
Pereira Henrique M.
Plutzar Christoph
Stadler Konstantin
Theurl Michaela C.
Tukker Arnold
Wood Richard
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 04/11/2019
Field of study

Industrial Ecology

Elektronische Publikationen der Wirtschaftsuniversität Wien

Leiden University Scholarly Publications

The Gene Ontology knowledgebase in 2023

Author: Ahmed Saadullah H
Aimo Lucila
Aleksander Suzi A
Ali Kadhum Mohamed
Antonazzo Giulia
Argoud-Puy Ghislaine
Asanitthong Praoparn
Aspromonte Maria Cristina
Attrill Helen
Auchincloss Andrea
Axelsen Kristian
Bakker Erika
Balhoff James
Bateman Alex
Berardini Tanya Z
Blake Judith
Blatter Marie-Claude
Boutet Emmanuel
Bowler-Barnett Emily
Breuza Lionel
Bridge Alan
Bye-A-Jee Hema
Carbon Seth
Casals-Casas Cristina
Chan Juancarlos
Cherry J Michael
Chisholm Rex L
Christie Karen
Cooper Laurel
Corbani Lori
Coudert Elisabeth
Cuzick Alayne
D'Eustachio Peter
De Pons Jeffrey L
Denny Paul
Diamantakis Stavros
Diehl Alexander D
Dolan Mary E
Dos Santos Gil
Drabkin Harold J
Dwinell Melinda R
Ebert Dustin
Elser Justin
Engel Stacia R
Erdol Meltem N
Estreicher Anne
Feuermann Marc
Fey Petra
Fisher Malcolm
Gage Matthew C
Gaudet Pascale
Gene Ontology Consortium
Giglio Michelle
Gos Arnaud
Gruaz-Gumowski Nadine
Gupta Parul
Harris Nomi L
Hayman G Thomas
Hill David P
Hulo Chantal
Hyka-Nouspikel Nevila
Ignatchenko Alexandr
Ishtiaq Rizwan
Jaiswal Pankaj
James-Zorn Christina
Jungo Florence
Kaldunski Mary L
Karra Kalpana
Kwitek Anne E
Laulederkind Stanley JF
Le Mercier Philippe
Lee Raymond
Lera-Ramirez Manuel
Li Kan Yan Chloe
Lieberherr Damien
Livia Famiglietti Maria
Lock Antonia
Logie Colin
Long Miao
Lovering Ruth C
Luna Buitrago Diana
Lussi Yvonne
Magrane Michele
Martin Maria J
Marygold Steven
Masson Patrick
Mi Huaiyu
Michalak Aleksandra
Miyasato Stuart R
Morgat Anne
Moxon Sierra
Mungall Christopher J
Muruganugan Anushya
Mushayahama Tremayne
Nadendla Suvarna
Naithani Sushma
Nash Robert S
Ni Li
Nugnes Maria Victoria
Oliferenko Snezhana
Orchard Sandra
Pedruzzi Ivo
Pesala Angeline
Ponferrada Virgilio
Pourcel Lucille
Poux Sylvain
Pritazahra Armalya
Quaglia Federica
Raciti Daniela
Ramachandran Sridhar
Ramsey Jolene
Raposo Pedro
Reiser Leonore
Rivoire Catherine
Rutherford Kim
Ruzicka Leyla
Saverimuttu Shirin CC
Seager James
Siegele Deborah A
Sitnikov Dmitry
Skrzypek Marek S
Smith Cynthia
Speretta Elena
Sternberg Paul W
Strelets Victor
Su Renzhi
Sundaram Shyamala
Tabone Christopher J
Thomas Paul D
Thurlow Kate E
Thurmond Jim
Tosatto Silvio
Tutaj Marek A
Tyagi Nidhi
Van Auken Kimberly
Vedi Mahima
Wang Shur-Jen
Warner Kate
Weng Shuai
Westerfield Monte
Wong Edith D
Wood Valerie
Zarowiecki Magdalena
Zaru Rossana
Zhou Pinglei
Zorn Aaron
Publication venue: 'Oxford University Press (OUP)'
Publication date: 01/05/2023

The Gene Ontology (GO) knowledgebase (http://geneontology.org) is a comprehensive resource concerning the functions of genes and gene products (proteins and noncoding RNAs). GO annotations cover genes from organisms across the tree of life as well as viruses, though most gene function knowledge currently derives from experiments carried out in a relatively small number of model organisms. Here, we provide an updated overview of the GO knowledgebase, as well as the efforts of the broad, international consortium of scientists that develops, maintains, and updates the GO knowledgebase. The GO knowledgebase consists of three components: (1) the GO, a computational knowledge structure describing the functional characteristics of genes; (2) GO annotations, evidence-supported statements asserting that a specific gene product has a particular functional characteristic; and (3) GO Causal Activity Models (GO-CAMs), mechanistic models of molecular "pathways" (GO biological processes) created by linking multiple GO annotations using defined relations. Each of these components is continually expanded, revised, and updated in response to newly published discoveries and receives extensive QA checks, reviews, and user feedback. For each of these components, we provide a description of the current contents, recent developments to keep the knowledgebase up to date with new discoveries, and guidance on how users can best make use of the data that we provide. We conclude with future directions for the project.
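
The relationship between annotations and causal models described in this abstract can be sketched as a small data structure. This is a simplified illustration and not the GO Consortium's data model or file format: the class names, the `EXAMPLE:` identifiers, and the relation label are hypothetical. The two GO term IDs (GO:0016301, kinase activity; GO:0003700, DNA-binding transcription factor activity) and the evidence code IDA (Inferred from Direct Assay) are real, but the annotations built from them are invented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """An evidence-supported statement linking a gene product to a GO term."""
    gene_product: str  # hypothetical identifier
    go_term: str       # GO term ID
    evidence: str      # evidence code
    reference: str     # hypothetical supporting reference


@dataclass(frozen=True)
class CausalLink:
    """Two annotations joined by a defined relation, as in a causal activity model."""
    upstream: Annotation
    relation: str      # illustrative label, not an official relation name
    downstream: Annotation


# Invented annotations that reuse real GO term IDs.
kinase = Annotation("EXAMPLE:geneA", "GO:0016301", "IDA", "EXAMPLE:ref1")
tf = Annotation("EXAMPLE:geneB", "GO:0003700", "IDA", "EXAMPLE:ref2")

# A one-edge model: the kinase activity acts on the transcription factor activity.
model = [CausalLink(kinase, "example_regulates", tf)]


def gene_products(links):
    """Return the sorted, de-duplicated gene products mentioned in a model."""
    found = set()
    for link in links:
        found.add(link.upstream.gene_product)
        found.add(link.downstream.gene_product)
    return sorted(found)


if __name__ == "__main__":
    print(gene_products(model))
```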

UCL Discovery

NETME: on-the-fly knowledge network construction from biomedical literature

Author: Alaimo Salvatore
Bellomo Lorenzo
Billeci Fabrizio
Borzì Stefano
Di Maria Antonio
Ferragina Paolo
Ferro Alfredo
Muscolino Alessandro
Pulvirenti Alfredo
Rapicavoli Rosaria Valentina
Publication date: 01/01/2022

Background: The rapidly increasing biological literature is a key resource to automatically extract and gain knowledge concerning biological elements and their relations. Knowledge Networks are helpful tools in the context of biological knowledge discovery and modeling. Results: We introduce a novel system called NETME, which, starting from a set of full texts obtained from PubMed, through an easy-to-use web interface, interactively extracts biological elements from ontological databases and then synthesizes a network inferring relations among such elements. The results clearly show that our tool is capable of inferring comprehensive and reliable biological networks.

Archivio istituzionale della Ricerca - Scuola Normale Superiore

PubMed Central