
Directory of Open Access Journals

MBA: a literature mining system for extracting biomedical abbreviations

Author: AM Cohen
AS Schwartz
H Ao
H Yu
HL Fred
J Pustejovsky
JT Chang
K Taghva
LJ Jensen
LS Larkey
M Torii
N Okazaki
S Needleman
TF Smith
W Zhou
Y Park
YiMing Lei
Yu Xue
Yun Xu
YuZhong Zhao
ZhiHao Wang
Publication venue: BioMed Central
Publication date: 01/01/2009

Abstract Background: The exploding growth of the biomedical literature presents many challenges for biological researchers. One such challenge arises from the heavy use of abbreviations. Extracting abbreviations and their definitions accurately is very helpful to biologists and also facilitates biomedical text analysis. Existing approaches fall into four broad categories: rule-based, machine-learning-based, text-alignment-based and statistically based. State-of-the-art methods either focus exclusively on acronym-type abbreviations or cannot recognize rare abbreviations. We propose a systematic method to extract abbreviations effectively. First, a scoring method is used to classify the abbreviations into acronym-type and non-acronym-type abbreviations; their corresponding definitions are then identified by two different methods: a text alignment algorithm for the former and a statistical method for the latter. Results: A literature mining system, MBA, was constructed to extract both acronym-type and non-acronym-type abbreviations. An abbreviation-tagged literature corpus, the Medstract gold standard corpus, was used to evaluate the system. MBA achieved a recall of 88% at a precision of 91% on the Medstract gold standard evaluation corpus. Conclusion: We present a new literature mining system, MBA, for extracting biomedical abbreviations. Our evaluation demonstrates that the MBA system performs better than the others. It can identify the definitions of not only acronym-type abbreviations, including slightly irregular ones (e.g., <CNS1, cyclophilin seven suppressor>), but also non-acronym-type abbreviations (e.g., <Fas, CD95>).
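To make the acronym-type case concrete, the following is a minimal sketch of a simple right-to-left character-matching heuristic, in the spirit of the Schwartz and Hearst approach. It is not MBA's scoring method or its alignment algorithm, which the abstract does not specify. The function name `find_long_form` and the "hidden Markov model" example are illustrative assumptions, not taken from the paper.

```python
from typing import Optional


def find_long_form(short_form: str, candidate: str) -> Optional[str]:
    """Return the shortest word-aligned suffix of `candidate` whose characters
    cover the letters and digits of `short_form` in order, or None.

    Hypothetical helper: a simple character-matching heuristic, not MBA's method.
    """
    l_idx = len(candidate) - 1
    for s_idx in range(len(short_form) - 1, -1, -1):
        ch = short_form[s_idx].lower()
        if not ch.isalnum():
            continue  # ignore punctuation inside the short form
        # Scan left for `ch`; the first short-form character must also
        # sit at the start of a word in the candidate.
        while l_idx >= 0 and (
            candidate[l_idx].lower() != ch
            or (s_idx == 0 and l_idx > 0 and candidate[l_idx - 1].isalnum())
        ):
            l_idx -= 1
        if l_idx < 0:
            return None  # ran out of candidate text before matching everything
        l_idx -= 1
    # Extend the match leftward to the beginning of the word it starts in.
    start = candidate.rfind(" ", 0, l_idx + 1) + 1
    return candidate[start:]


if __name__ == "__main__":
    print(find_long_form("HMM", "a hidden Markov model"))
    print(find_long_form("CNS1", "cyclophilin seven suppressor"))
    print(find_long_form("Fas", "CD95"))
```

Under this heuristic the regular acronym is matched, while the two pairs quoted in the abstract, <CNS1, cyclophilin seven suppressor> and <Fas, CD95>, both return `None`. Pure character matching misses irregular and non-acronym-type abbreviations, the cases the abstract says MBA is designed to handle.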

Directory of Open Access Journals

EnzyMiner: automatic identification of protein level mutations and their impact on target enzymes from PubMed abstracts

Author: A Bairoch
A Chang
A Fleischmann
AK McCallum
C Cleverdon
CJO Baker
CT Porter
D Hanisch
D Rebholz-Schuhmann
F Horn
F Sebastiani
G Szarvas
GL Holliday
J Barthelmes
JG Caporaso
JT Chang
K Hult
K Rajaraman
LC Lee
LS Larkey
M Erdogmus
N Gövert
N Nagano
O Zamir
R Caspi
R Witte
RN Goldberg
SK Dwivedi
Süveyda Yeniterzi
T Karopka
Uğur Sezerman
V Renugopalakrishnan
Y Tsuruoka
Publication venue: BioMed Central
Publication date: 01/01/2009

BACKGROUND: A better understanding of the mechanisms of an enzyme's functionality and stability, as well as knowledge of mutations and their impact, is crucial for researchers working with enzymes. Although several enzyme databases are currently available, the scientific literature remains the main up-to-date source for learning the effects of a mutation on an enzyme. However, going through vast amounts of scientific documents to extract information on a desired mutation has always been a time-consuming process. In this paper, therefore, we describe a unique method, termed EnzyMiner, which automatically identifies the PubMed abstracts that contain information on the impact of a protein-level mutation on the stability and/or the activity of a given enzyme. RESULTS: We present an automated system which identifies the abstracts that contain an amino-acid-level mutation and then classifies them according to the mutation's effect on the enzyme. For mutation identification, MuGeX, an automated mutation-gene extraction system, has an accuracy of 93.1% with a 91.5 F-measure. For impact analysis, document classification is performed to identify the abstracts that report a change in the enzyme's stability or activity resulting from the mutation. The system was trained on lipases and tested on amylases with an accuracy of 85%. CONCLUSION: EnzyMiner identifies the abstracts that contain a protein mutation for a given enzyme and checks whether each abstract is related to a disease, with the help of information extraction and machine learning techniques. For disease-related abstracts, the mutation list and direct links to the abstracts are retrieved from the system and displayed on the Web. Abstracts that are not disease-related are, in addition to receiving a mutation list, categorized into two groups. These two groups indicate whether the mutation has an effect on the enzyme's stability or functionality, and the results are likewise displayed on the Web.
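As an illustration of what identifying an amino-acid-level mutation in an abstract can involve, here is a minimal regular-expression sketch. It is an assumption for illustration only and does not reproduce MuGeX, whose method the abstract does not describe. The names `PATTERN` and `find_mutations`, the two notations covered (one-letter such as S146A, three-letter such as Asp176Asn) and the sample sentence are all hypothetical.

```python
import re

# The 20 standard amino acids in one-letter and three-letter notation.
ONE = "ACDEFGHIKLMNPQRSTVWY"
THREE = ("Ala|Arg|Asn|Asp|Cys|Gln|Glu|Gly|His|Ile|Leu|Lys|Met|"
         "Phe|Pro|Ser|Thr|Trp|Tyr|Val")

# Wild-type residue, position, mutant residue, e.g. "S146A" or "Asp176Asn".
# Illustrative pattern only; real text uses many more notations.
PATTERN = re.compile(
    rf"\b(?:([{ONE}])(\d+)([{ONE}])|({THREE})-?(\d+)-?({THREE}))\b"
)


def find_mutations(text):
    """Return (wild_type, position, mutant) tuples for point mutations in text."""
    hits = []
    for m in PATTERN.finditer(text):
        if m.group(1):  # one-letter form matched
            wild, pos, mutant = m.group(1), m.group(2), m.group(3)
        else:           # three-letter form matched
            wild, pos, mutant = m.group(4), m.group(5), m.group(6)
        hits.append((wild, int(pos), mutant))
    return hits


if __name__ == "__main__":
    sample = "The S146A and Asp176Asn variants lost activity."
    print(find_mutations(sample))
```

A pattern this simple also matches strings that are not mutations (for example, some cell-line or strain names). A practical system would therefore need additional filtering, such as checking the position against the enzyme's sequence.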

Sabanci University Research Database

Participant recruitment and retention in a pilot program to prevent weight gain in low-income overweight and obese mothers

Author: AA Stone
AK Yancey
B Lohse
BE Ainsworth
CL Graffagnino
D Damron
D Watson
DF Tate
DH Ryan
DS Blumenthal
EM Yass-Reed
G Godin
J Hintze
JS Gavaler
K McManus
K Resnicow
KA Robinson
L Katzer
LH Zayas
LK Larkey
LS Radloff
M Chang
M Coday
Mei-Wei Chang
ML Dansinger
MN Fouad
MW Kreuter
P Davis Martin
PA Arean
PD Martin
PJ Teixeira
PS Berger
R Dalle Grave
Roger Brown
RW Jeffery
S Cohen
S Havas
S Levkoff
Susan Nitzke
TA Hammad
Publication venue: BioMed Central
Publication date: 01/01/2009

Abstract Background: Recruitment and retention are key functions for programs promoting nutrition and other lifestyle behavioral changes in low-income populations. This paper describes strategies for recruitment and retention and presents predictors of early (two months post intervention) and late (eight months post intervention) dropout (non-retention) and overall retention among young, low-income overweight and obese mothers participating in a community-based randomized pilot trial called Mothers In Motion. Methods: Low-income overweight and obese African American and white mothers aged 18 to 34 were recruited from the Special Supplemental Nutrition Program for Women, Infants, and Children in southern Michigan. Participants (n = 129) were randomly assigned to an intervention (n = 64) or control (n = 65) group according to a stratification procedure to equalize representation in two racial groups (African American and white) and three body mass index categories (25.0-29.9 kg/m², 30.0-34.9 kg/m², and 35.0-39.9 kg/m²). The 10-week theory-based, culturally sensitive intervention focused on healthy eating, physical activity, and stress management messages that were delivered via an interactive DVD and reinforced by five peer-support group teleconferences. Forward stepwise multiple logistic regression was performed to examine whether dietary fat, fruit and vegetable intake behaviors, physical activity, perceived stress, positive and negative affect, depression, and race predicted dropout, using data collected two months and eight months after the active intervention phase. Results: Trained personnel were successful in recruiting subjects. Increased level of depression was a predictor of early dropout (odds ratio = 1.04; 95% CI = 1.00, 1.08; p = 0.03). Greater stress predicted late dropout (odds ratio = 0.20; 95% CI = 0.00, 0.37; p = 0.01). Dietary fat, fruit, and vegetable intake behaviors, physical activity, positive and negative affect, and race were not associated with either early or late dropout. Less negative affect was a marginal predictor of participant retention (odds ratio = 0.57; 95% CI = 0.31, 1.03; p = 0.06). Conclusion: Dropout rates in this study were higher for participants who reported higher levels of depression and stress. Trial registration: Current Controlled Trials NCT00944060

Directory of Open Access Journals

A Survey of Text Classification Algorithms

Author: A Blum
A Dayanik
A McCallum
A Weigand
AP Dempster
B Liu
C Apte
C Cortes
CC Aggarwal
D Boley
D Chickering
D Hardin
D Hull
D Jensen
D Johnson
D Lewis
G-R Xue
H Drucker
H Li
H Raghavan
H Schutze
J Zhang
JR Quinlan
K Myers
K Nigam
L Breiman
L Cai
LS Larkey
M Aizerman
M Craven
M Ruiz
N Littlestone
N Slonim
P Domingos
P Howland
P Long
R Bekkerman
R El-Yaniv
R Fisher
R Iyer
R Schapire
S Basu
S Chakrabarti
S Chakraborti
S Deerwester
S Dumais
S Gopal
S Lam
S Zhu
SA Macskassy
SE Robertson
SM Weiss
T Salles
TM Cover
V Castelli
V Sindhwani
V Vapnik
W Cohen
W Cooper
W Lam
Y Li
Y Yang
Publication venue: Springer Science and Business Media LLC