11,686 research outputs found

Matching experiments across species using expression values and textual information

Author: A. Wise
Barr
Dovio
Kaletta
Kuo
Moriwaki
Needleman
Netea
Rifkin
Rustici
Stark
Sugimoto
Susztak
Z. Bar-Joseph
Z. N. Oltvai
Zinman
Publication venue: Oxford University Press

Motivation: With the vast increase in the number of gene expression datasets deposited in public databases, novel techniques are required to analyze and mine this wealth of data. Similar to the way BLAST enables cross-species comparison of sequence data, tools that enable cross-species expression comparison will allow us to better utilize these datasets: cross-species expression comparison enables us to address questions in evolution and development, and further allows the identification of disease-related genes and pathways that play similar roles in humans and model organisms. Unlike sequence, which is static, expression data changes over time and under different conditions. Thus, a prerequisite for performing cross-species analysis is the ability to match experiments across species.

Crossref

PubMed Central

Large-scale event extraction from literature with multi-level gene normalization

Author: Ananiadou Sophia
Bjorne Jari
Ginter Filip
Hakala Kai
Kao Hung-Yu
Lu Zhiyong
Pyysalo Sampo
Salakoski Tapio
Van de Peer Yves
Van Landeghem Sofie
Wei Chih-Hsuan
Publication venue: Public Library of Science (PLoS)
Publication date: 01/01/2013

Text mining for the life sciences aims to aid database curation, knowledge summarization and information retrieval through the automated processing of biomedical texts. To provide comprehensive coverage and enable full integration with existing biomolecular database records, it is crucial that text mining tools scale up to millions of articles and that their analyses can be unambiguously linked to information recorded in resources such as UniProt, KEGG, BioGRID and NCBI databases. In this study, we investigate how fully automated text mining of complex biomolecular events can be augmented with a normalization strategy that identifies biological concepts in text, mapping them to identifiers at varying levels of granularity, ranging from canonicalized symbols to unique genes and proteins and broad gene families. To this end, we have combined two state-of-the-art text mining components, previously evaluated on two community-wide challenges, and have extended and improved upon these methods by exploiting their complementary nature. Using these systems, we perform normalization and event extraction to create a large-scale resource that is publicly available, unique in semantic scope, and covers all 21.9 million PubMed abstracts and 460 thousand PubMed Central open access full-text articles. This dataset contains 40 million biomolecular events involving 76 million gene/protein mentions, linked to 122 thousand distinct genes from 5032 species across the full taxonomic tree. Detailed evaluations and analyses reveal promising results for application of this data in database and pathway curation efforts. The main software components used in this study are released under an open-source license. Further, the resulting dataset is freely accessible through a novel API, providing programmatic and customized access (http://www.evexdb.org/api/v001/). Finally, to allow for large-scale bioinformatic analyses, the entire resource is available for bulk download from http://evexdb.org/download/, under the Creative Commons Attribution-ShareAlike (CC BY-SA) license.

Crossref

Ghent University Academic Bibliography

Directory of Open Access Journals

PubMed Central

The University of Manchester - Institutional Repository

FigShare

On the Feasibility of Automated Detection of Allusive Text Reuse

Author: Kestemont Mike
Long Brian
Manjavacas Enrique
Publication date: 01/01/2019

The detection of allusive text reuse is particularly challenging due to the sparse evidence on which allusive references rely, commonly based on few or no shared words. Arguably, lexical semantics can help, since uncovering semantic relations between words has the potential to increase the support underlying the allusion and alleviate the lexical sparsity. A further obstacle is the lack of evaluation benchmark corpora, largely due to the highly interpretative character of the annotation process. In the present paper, we aim to elucidate the feasibility of automated allusion detection. We approach the matter from an Information Retrieval perspective in which referencing texts act as queries and referenced texts as relevant documents to be retrieved, and estimate the difficulty of benchmark corpus compilation by a novel inter-annotator agreement study on query segmentation. Furthermore, we investigate to what extent the integration of lexical semantic information derived from distributional models and ontologies can aid in retrieving cases of allusive reuse. The results show that (i) despite low agreement scores, using manual queries considerably improves retrieval performance with respect to a windowing approach, and that (ii) retrieval performance can be moderately boosted with distributional semantics.

arXiv.org e-Print Archive

Crossref

Institutional Repository Universiteit Antwerpen

Adapting a relation extraction pipeline for the BioCreAtIvE II task

Author: Grover Claire
Haddow Barry
Klein Ewan
Matthews Michael
Nielsen Leif Arda
Tobin Richard
Wang Xinglong
Publication date: 01/01/2007

Edinburgh Research Explorer

Content-based microarray search using differential expression profiles

Author: Altman Russ B
Butte Atul J
Chen Rong
Dudley Joel T
Engreitz Jesse M
Morgan Alexander A
Thathoo Rahul
Publication venue: BioMed Central
Publication date: 01/01/2010

Background: With the expansion of public repositories such as the Gene Expression Omnibus (GEO), we are rapidly cataloging cellular transcriptional responses to diverse experimental conditions. Methods that query these repositories based on gene expression content, rather than textual annotations, may enable more effective experiment retrieval as well as the discovery of novel associations between drugs, diseases, and other perturbations. Results: We develop methods to retrieve gene expression experiments that differentially express the same transcriptional programs as a query experiment. Avoiding thresholds, we generate differential expression profiles that include a score for each gene measured in an experiment. We use existing and novel dimension reduction and correlation measures to rank relevant experiments in an entirely data-driven manner, allowing emergent features of the data to drive the results. A combination of matrix decomposition and p-weighted Pearson correlation proves the most suitable for comparing differential expression profiles. We apply this method to index all GEO DataSets, and demonstrate the utility of our approach by identifying pathways and conditions relevant to transcription factors Nanog and FoxO3. Conclusions: Content-based gene expression search generates relevant hypotheses for biological inquiry. Experiments across platforms, tissue types, and protocols inform the analysis of new datasets.
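As a rough illustration of the kind of weighted correlation the abstract mentions, the sketch below computes a generic weighted Pearson correlation between two per-gene differential expression profiles. The abstract does not define how the "p-weighting" is derived, so the function name `weighted_pearson` and the idea of passing in arbitrary non-negative per-gene weights are assumptions made for illustration, not the authors' method.

```python
import numpy as np


def weighted_pearson(x, y, w):
    """Weighted Pearson correlation of two equal-length score vectors.

    x, y : per-gene differential expression scores for two experiments.
    w    : non-negative per-gene weights (hypothetical; how the paper
           derives its p-based weights is not stated in the abstract).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    w = np.asarray(w, dtype=float)
    if not (x.shape == y.shape == w.shape):
        raise ValueError("x, y and w must have the same shape")
    if np.any(w < 0) or w.sum() == 0:
        raise ValueError("weights must be non-negative and not all zero")

    w = w / w.sum()                      # normalize weights to sum to 1
    mx = np.sum(w * x)                   # weighted means
    my = np.sum(w * y)
    cov = np.sum(w * (x - mx) * (y - my))
    var_x = np.sum(w * (x - mx) ** 2)
    var_y = np.sum(w * (y - my) ** 2)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / np.sqrt(var_x * var_y)


if __name__ == "__main__":
    query = [1.2, -0.4, 2.1, 0.0, -1.7]
    candidate = [0.9, -0.1, 1.8, 0.3, -1.2]
    weights = [3.0, 0.5, 4.0, 0.2, 2.5]  # made-up illustrative weights
    print(weighted_pearson(query, candidate, weights))
```

With uniform weights this reduces to the ordinary Pearson correlation; larger weights let the more reliable genes dominate the comparison.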

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

eScholarship - University of California

A proposal for a coordinated effort for the determination of brainwide neuroanatomical connectivity in model organisms at a mesoscopic scale

Author: A MacKenzie-Graham
A Reiner
A Vercelli
A Visel
Allan Jones
AM Hattox
Arthur W. Toga
AW Toga
AY Hardan
B Egaas
B Horwitz
BL Davidson
Brett D. Mensh
Bruce W. Stillman
C Gustafson
C Kobbert
Caizhi Wu
CL Veenman
Claus C. Hilgetag
Clifford B. Saper
CR Gerfen
D Atasoy
DA Benson
Daniel G. Herrera
David C. Van Essen
David Kleinfeld
DC Van Essen
DL Sparks
E Miyashita
ED Jarvis
Edward G. Jones
EM Callaway
ES Lein
ET Bullmore
F Castelli
F Crick
G Aston-Jones
H Markram
Hans C. Breiter
Harvey J. Karten
HC Breiter
Helen Barbas
Hemant Bokil
Henry A. Lester
Hollis T. Cline
IR Wickersham
J DeFalco
J Dejerine
J Panksepp
Jaak Panksepp
James D. Watson
Jason W. Bohland
JD Schmahmann
Jeremy D. Schmahmann
JF Démonet
JG Bjaalie
JG White
JL Lanciego
JM Lin
John C. Doyle
John M. Lin
Joseph L. Price
Joseph Safdieh
K Oishi
K Wernicke
Karel Svoboda
KE Stephan
L Ng
L Stein
Larry W. Swanson
LM Coolen
M Bota
M Murias
MA Just
MD Johnson
MI Ekstrand
Michael Hawrylycz
Mihail Bota
MJ Swift
N Geschwind
Nicholas D. Schiff
O Sporns
Olaf Sporns
Partha P. Mitra
Peter J. Freed
PH Luppi
PJ Broser
R Kotter
Ralph J. Greenspan
RH Güting
RM Kelly
Rolf Kötter
RW Baughman
S Folstein
S Lillehaug
S Mikula
Shawn Mikula
Suzanne N. Haber
U Burgel
U Frith
V Grinevich
Z. Josh Huang
Publication venue: Public Library of Science (PLoS)
Publication date: 01/01/2009

In this era of complete genomes, our knowledge of neuroanatomical circuitry remains surprisingly sparse. Such knowledge is, however, critical for both basic and clinical research into brain function. Here we advocate for a concerted effort to fill this gap, through systematic, experimental mapping of neural circuits at a mesoscopic scale of resolution suitable for comprehensive, brain-wide coverage, using injections of tracers or viral vectors. We detail the scientific and medical rationale and briefly review existing knowledge and experimental techniques. We define a set of desiderata, including brain-wide coverage; validated and extensible experimental techniques suitable for standardization and automation; a centralized, open-access data repository; compatibility with existing resources; and tractability with current informatics technology. We discuss a hypothetical but tractable plan for mouse, additional efforts for the macaque, and technique development for human. We estimate that the mouse connectivity project could be completed within five years with a comparatively modest budget.

Cold Spring Harbor Laboratory Institutional Repository

Boston University Institutional Repository (OpenBU)

Directory of Open Access Journals

Caltech Authors

arXiv.org e-Print Archive

CiteSeerX

Public Library of Science (PLOS)

Crossref

Harvard University - DASH

PubMed Central