559 research outputs found

A scalable machine-learning approach to recognize chemical names within large text databases

Author: A Zamora
CH Davis
E Charniak
G Nenadic
I Donaldson
J Finkel
JD Wren
Jonathan D Wren
L Hirschman
LR Rabiner
M Krauthammer
M Narayanaswamy
MA Drake
MD Yandell
PAV Hall
S Albert
S Raychaudhuri
U Leser
WJ Wilbur
WR Pearson
Publication venue: BioMed Central
Publication date: 01/01/2006

MOTIVATION: The use or study of chemical compounds permeates almost every scientific field and in each of them, the amount of textual information is growing rapidly. There is a need to accurately identify chemical names within text for a number of informatics efforts such as database curation, report summarization, tagging of named entities and keywords, or the development/curation of reference databases. RESULTS: A first-order Markov Model (MM) was evaluated for its ability to distinguish chemical names from words, yielding ~93% recall in recognizing chemical terms and ~99% precision in rejecting non-chemical terms on smaller test sets. However, because total false-positive events increase with the number of words analyzed, the scalability of name recognition was measured by processing 13.1 million MEDLINE records. The method yielded precision ranges from 54.7% to 100%, depending upon the cutoff score used, averaging 82.7% for approximately 1.05 million putative chemical terms extracted. Extracted chemical terms were analyzed to estimate the number of spelling variants per term, which correlated with the total number of times the chemical name appeared in MEDLINE. This variability in term construction was found to affect both information retrieval and term mapping when using PubMed and Ovid
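The idea of scoring a token with a first-order Markov model and accepting it above a cutoff can be illustrated with a minimal sketch. It assumes character-level transitions, add-one smoothing, a log-likelihood ratio against a second model trained on ordinary words, and tiny made-up training lists; none of these details are taken from the paper, and the names `train`, `log_prob`, `score`, `CHEMICAL`, and `PLAIN` are hypothetical.

```python
import math
from collections import defaultdict

# Assumed symbol set used only to size the smoothing term (hypothetical).
ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789-,()[]"
START, END = "^", "$"


def train(tokens):
    """Count character-to-character transitions, including start/end markers."""
    counts = defaultdict(lambda: defaultdict(int))
    for token in tokens:
        chars = [START] + list(token.lower()) + [END]
        for prev, cur in zip(chars, chars[1:]):
            counts[prev][cur] += 1
    return counts


def log_prob(counts, prev, cur):
    """Add-one smoothed log P(cur | prev) under a first-order model."""
    vocab = len(ALPHABET) + 1  # every symbol plus the end marker
    total = sum(counts[prev].values())
    return math.log((counts[prev][cur] + 1) / (total + vocab))


def score(token, chem, plain):
    """Mean log-likelihood ratio per transition; positive favours 'chemical'."""
    chars = [START] + list(token.lower()) + [END]
    pairs = list(zip(chars, chars[1:]))
    llr = sum(log_prob(chem, p, c) - log_prob(plain, p, c) for p, c in pairs)
    return llr / len(pairs)


# Toy training data, invented for illustration only.
CHEMICAL = ["2-propanol", "acetylsalicylic", "methylbenzene",
            "1,2-dichloroethane", "cyclohexanone", "trimethylamine"]
PLAIN = ["patient", "analysis", "results", "between",
         "clinical", "treatment", "however"]

chem_model = train(CHEMICAL)
plain_model = train(PLAIN)

for word in ["dimethylbenzene", "patients"]:
    print(word, round(score(word, chem_model, plain_model), 3))
```

Raising the cutoff on `score` above zero trades recall for precision, which is the kind of trade-off the reported 54.7% to 100% precision range reflects.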

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Proceedings of the Second Annual Conference of the MidSouth Computational Biology and Bioinformatics Society

Author: AA Ptitsyn
E Marshall
H Fang
H Hong
JD Wren
Jonathan D Wren
L Shi
NR Garge
PK Tan
Q Xie
RL Frank
RR Delongchamp
SF Jennings
William Slikker
Y Tang
Z Xu
Publication venue: BioMed Central
Publication date: 01/07/2005

The MCBIOS 2004 conference brought together regional researchers and students in biology, computer science and bioinformatics on October 7th-9th 2004 to present their latest work. This editorial describes the conference itself and introduces the twelve peer-reviewed manuscripts accepted for publication in the Proceedings of the MCBIOS 2004 Conference. These manuscripts included new methods for analysis of high-throughput gene expression experiments, EST clustering, analysis of mass spectrometry data and genomic analysi

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

On the persistence of supplementary resources in biomedical publications

Author: C Santos
GA Petsko
JD Wren
Nicholas R Anderson
Peter Tarczy-Hornoch
Roger E Bumgarner
S Lawrence
Publication venue: BioMed Central
Publication date: 01/01/2006

BACKGROUND: Providing for long-term and consistent public access to scientific data is a growing concern in biomedical research. One aspect of this problem can be demonstrated by evaluating the persistence of supplementary data associated with published biomedical papers. METHODS: We manually evaluated 655 supplementary data links extracted from PubMed abstracts published 1998–2005 (Method 1) as well as a further focused subset of 162 full-text manuscripts published within three representative high-impact biomedical journals between September and December 2004 (Method 2). RESULTS: For Method 1 we found that since 2001, only 71–92% of supplementary data were still accessible via the links provided, with 93% of the inaccessible links occurring where supplementary data were not stored with the publishing journal. Of the manuscripts evaluated in Method 2, we found that only 83% of the links were available approximately a year after publication, with 55% of the inaccessible links at locations outside the journal of publication. CONCLUSION: We conclude that if supplemental data are required to support the publication, journal policies must take on the responsibility to accept and store such data or require that they be maintained with a credible independent institution or under the terms of a strategic data storage plan specified by the authors. We further recommend that publishers provide automated systems to ensure that supplementary links remain persistent, and that granting bodies such as the NIH develop policies and funding mechanisms to maintain long-term persistent access to these data

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

eScholarship - University of California

Large-scale directional relationship extraction and resolution

Author: A Culotta
A Gladki
A Koike
A Yuryev
AB Clegg
C Rodriguez-Penagos
CM Topinka
Cory B Giles
D Zhou
F Rinaldi
H Chen
H Jang
H Kim
I Donaldson
IK Ruf
J Ding
J Jiang
JA Mitchell
JC Park
JD Kim
JD Wren
Jonathan D Wren
JP Vaque
K Fundel
K Sagae
LM Juliano
M Bundschus
M Chagoyen
M Huang
M Lease
M Wang
M-C de Marneffe
N Daraselia
P Zweigenbaum
R Bunescu
R Kuffner
RC Bunescu
RT Tsai
S Kim
S Novichkova
TK Jenssen
W Pratt
Publication venue: BioMed Central
Publication date: 01/01/2008

Crossref

Springer - Publisher Connector

PubMed Central

Proceedings of the Third Annual Conference of the MidSouth Computational Biology and Bioinformatics Society

Author: A Kel
A Ptitsyn
Andrey Ptitsyn
H Fang
H Hong
H Sun
JD Wren
Jonathan D Wren
L Guo
L Shi
LA Nahum
N Mei
NR Garge
Q Xie
R Delongchamp
R Loganantharaj
RL Frank
RR Delongchamp
RT Iqbal
S Winters-Hilt
SF Jennings
Stephen Winters-Hilt
T Chen
T Han
TG Smolinski
V Nagarajan
V Thodima
Y Ding
Yuriy Gusev
Z Xu
Publication venue: BioMed Central
Publication date: 01/01/2006

Crossref

Springer - Publisher Connector

PubMed Central

Access to Scientific Publications: The Scientist's Perspective

Author: Alan Bernstein
Askar Myrzahmetov
JD Wren
JM Panitch
K Simiyu
K.T. Jeang
M Chan
M Laakso
S Harnad
S Rerks-Ngarm
Yegor Voronin
Publication venue: Public Library of Science
Publication date: 17/11/2011

BACKGROUND: Scientific publishing is undergoing significant changes due to the growth of online publications, increases in the number of open access journals, and policies of funders and universities requiring authors to ensure that their publications become publicly accessible. Most studies of the impact of these changes have focused on the growth of articles available through open access or the number of open-access journals. Here, we investigated access to publications at a number of institutes and universities around the world, focusing on publications in HIV vaccine research, an area of biomedical research with special importance to the developing world. METHODS AND FINDINGS: We selected research papers in the HIV vaccine research field, creating 1) a first set of the 50 most recently published papers with the keywords "HIV vaccine" and 2) a second set of 200 articles randomly selected from those cited in the first set. Access to the majority (80%) of the recently published articles required subscription, while cited literature was much more accessible (67% freely available online). Subscriptions at a number of institutions around the world were assessed for providing access to subscription-only articles from the two sets. The access levels varied widely, ranging among institutions from 20% to 90%. Through the WHO-supported HINARI program, institutes in low-income countries had access comparable to that of institutes in the North. Finally, we examined the response rates for reprint requests sent to corresponding authors, a method commonly used before internet access became widespread. Contacting corresponding authors with requests for electronic copies of articles by email resulted in a 55-60% success rate, although in some cases it took up to 1.5 months to get a response. CONCLUSIONS: While research articles are increasingly available on the internet in open access format, institutional subscriptions continue to play an important role. However, subscriptions do not provide access to the full range of HIV vaccine research literature. Access to papers through subscriptions is complemented by a variety of other means, including emailing corresponding authors, joint affiliations, use of someone else's login information and posting requests on message boards. This complex picture makes it difficult to assess the real ability of scientists to access literature, but the observed differences in access levels between institutions suggest an unlevel playing field, in which some researchers have to expend more effort than others to obtain the same information

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Automating Genomic Data Mining via a Sequence-based Matrix Format and Associative Rule Set

Author: BFJ Manly
CI Castillo-Davis
David Johnson
DB Searls
DD Womble
E Badidi
F Antequera
J Krueger
J Theilhaber
JD Wren
JF Costello
JM Claverie
Jonathan D Wren
JR Quinlan
K Davies
K Nakai
L Stein
Le Gruenwald
LV Zhang
M Ashburner
M Gardiner-Garden
M Safran
P Clark
RS Michalski
S Foissac
S Muggleton
SP Shah
TV Venkatesh
V Bajic
W Frawley
WM Shui
Y Liu
Publication venue: BioMed Central
Publication date: 01/01/2005

There is an enormous amount of information encoded in each genome – enough to create living, responsive and adaptive organisms. Raw sequence data alone is not enough to understand function, mechanisms or interactions. Changes in a single base pair can lead to disease, such as sickle-cell anemia, while some large megabase deletions have no apparent phenotypic effect. Genomic features are varied in their data types and annotation of these features is spread across multiple databases. Herein, we develop a method to automate exploration of genomes by iteratively exploring sequence data for correlations and building upon them. First, to integrate and compare different annotation sources, a sequence matrix (SM) is developed to contain position-dependent information. Second, a classification tree is developed for matrix row types, specifying how each data type is to be treated with respect to other data types for analysis purposes. Third, correlative analyses are developed to analyze features of each matrix row in terms of the other rows, guided by the classification tree as to which analyses are appropriate. A prototype was developed and successfully detected coinciding genomic features among genes, exons, repetitive elements and CpG islands

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

BibGlimpse: The case for a light-weight reprint manager in distributed literature research

Author: Alexandra Graf
BP Suomela
D Giustini
D Rebholz-Schuhmann
David P Kreil
DP Corney
E Postma
G Velez
Golda Velez
HM Müller
J Bockhorst
J Natarajan
J Saric
JD Kim
JD Wren
K Cohen
L Hunter
LJ Jensen
M Lee
S Ananiadou
S Ray
T Kuhn
Thomas Tüchler
WJ Wilbur
Publication venue: BioMed Central
Publication date: 01/01/2008

BACKGROUND: While text-mining and distributed annotation systems both aim at capturing knowledge and presenting it in a standardized form, there have been few attempts to investigate potential synergies between these two fields. For instance, distributed annotation would be very well suited for providing topic-focused text corpora enriched with expert knowledge. A key limitation for this approach is the availability of literature annotation systems that can be routinely used by groups of collaborating researchers on a day-to-day basis without distracting from the main focus of their work. RESULTS: For this purpose, we have designed BibGlimpse. Features like drop-to-file, SVM-based automated retrieval of PubMed bibliography for PDF reprints, and annotation support make BibGlimpse an efficient, light-weight reprint manager that facilitates distributed literature research for work groups. Building on an established open search engine, full-text search and structured queries are supported, while at the same time making shared collections of annotated reprints accessible to literature classification and text-mining tools. CONCLUSION: BibGlimpse offers scientists a tool that enhances their own literature management. Moreover, it may be used to create content-enriched, annotated text corpora for research in text-mining

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Publikationsserver der Universitätsbibliothek Bodenkultur Wien

Publikationsserver der Fachhochschule (FH) Campus Wien

Warwick Research Archives Portal Repository

Effectiveness of neonatal pulse oximetry screening for detection of critical congenital heart disease in daily clinical routine—results from a prospective multicenter study

Author: A Meberg
A Wahl Granelli de
AF Bakr
AH Schultz
Andreas Möckel
BA Kao
BJ Byrne
C Wren
CM Glazener
Cornelia Wörner
CPF O’Donnell
DM Sendelbach
E Rosati
Frank Thomas Riede
FT Riede
I Griebsch
Ingo Dähnert
JD Malkin
JD Reich
KL Brown
KS Kuehl
M Mellander
Martin Kostelka
P Valmari
Peter Schneider
PG Hetzel
R Arlettaz
RI Koppel
RK Chang
S Ainsworth
S Richmond
S Thangaratinam
TR Hoke
WT Mahle
Publication venue: Springer-Verlag
Publication date: 01/01/2010

Pulse oximetry screening (POS) has been proposed as an effective, noninvasive, inexpensive tool allowing earlier diagnosis of critical congenital heart disease (cCHD). Our aim was to test the hypothesis that POS can reduce the diagnostic gap in cCHD in daily clinical routine in the setting of tertiary, secondary and primary care centres. We conducted a prospective multicenter trial in Saxony, Germany. POS was performed in healthy term and post-term newborns at the age of 24–72 h. If an oxygen saturation (SpO2) of ≤95% was measured on lower extremities and confirmed after 1 h, complete clinical examination and echocardiography were performed. POS was defined as false-negative when a diagnosis of cCHD was made after POS in the participating hospitals/at our centre. From July 2006 to June 2008, 42,240 newborns from 34 institutions were included. Seventy-two children were excluded due to prenatal diagnosis (n = 54) or clinical signs of cCHD (n = 18) before POS. Seven hundred ninety-five newborns did not receive POS, mainly due to early discharge after birth (n = 727; 91%). In 41,445 newborns, POS was performed. POS was true positive in 14, false positive in 40, true negative in 41,384 and false negative in four children (three had been excluded for violation of study protocol). Sensitivity, specificity, positive and negative predictive value were 77.78%, 99.90%, 25.93% and 99.99%, respectively. With POS as an adjunct to prenatal diagnosis, physical examination and clinical observation, the percentage of newborns with late diagnosis of cCHD was 4.4%. POS can substantially reduce the postnatal diagnostic gap in cCHD, and false-positive results leading to unnecessary examinations of healthy newborns are rare. POS should be implemented in routine postnatal care
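The reported sensitivity, specificity and predictive values follow directly from the four counts given in the abstract (14 true positives, 40 false positives, 41,384 true negatives, 4 false negatives). The short sketch below reproduces that arithmetic using the standard textbook definitions; the function name `diagnostic_metrics` is a hypothetical helper, not something from the study.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Standard screening-test metrics from a 2x2 confusion table."""
    return {
        "sensitivity": tp / (tp + fn),  # affected newborns correctly flagged
        "specificity": tn / (tn + fp),  # unaffected newborns correctly cleared
        "ppv": tp / (tp + fp),          # positive predictive value
        "npv": tn / (tn + fn),          # negative predictive value
    }


# Counts as stated in the abstract.
metrics = diagnostic_metrics(tp=14, fp=40, tn=41_384, fn=4)

for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
# sensitivity: 77.78%
# specificity: 99.90%
# ppv: 25.93%
# npv: 99.99%
```

The four counts sum to 41,442 rather than 41,445, consistent with the three children the abstract says were excluded for protocol violations.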

Crossref

Springer - Publisher Connector

PubMed Central

Ethnicity-specific epigenetic variation in naïve CD4+ T cells and the susceptibility to autoimmunity

Author: A Alonso
Amr H. Sawalha
C Goda
DL Huffman
EC Somers
Elizabeth Gensterblum
FF Zhang
FJ Diez-Guerra
G Pan
GS Cooper
H Heyn
J Lewin
JD Wren
Jonathan D. Wren
Kathleen Maksimowicz-McKinnon
LR Devireddy
MF Fraga
Mikhail Ognenovski
MJ Aryee
P Coit
Patrick Coit
R Shoemaker
S Li
SA Chung
SE Lofgren
SH Kim
TH Bestor
TR Rebbeck
V Dhir
VR Moulton
W da Huang
X Luo
Publication venue: Springer Science and Business Media LLC

Crossref