Search CORE

10,056 research outputs found

Extracting adverse drug reactions and their context using sequence labelling ensembles in TAC2017

Author: Belousov Maksim
Dixon William
Milosevic Nikola
Nenadic Goran
Publication venue
Publication date: 01/01/2018
Field of study

Adverse drug reactions (ADRs) are unwanted or harmful effects experienced after the administration of a certain drug or a combination of drugs, presenting a challenge for drug development and drug administration. In this paper, we present a set of taggers for extracting adverse drug reactions and related entities, including factors, severity, negations, drug class and animal. The systems used a mix of rule-based, machine learning (CRF) and deep learning (BLSTM with word2vec embeddings) methodologies in order to annotate the data. The systems were submitted to adverse drug reaction shared task, organised during Text Analytics Conference in 2017 by National Institute for Standards and Technology, archiving F1-scores of 76.00 and 75.61 respectively.Comment: Paper describing submission for TAC ADR shared tas

arXiv.org e-Print Archive

The University of Manchester - Institutional Repository

Recognition and Evaluation of Clinical Section Headings in Clinical Documents Using Token-Based Formulation with Conditional Random Fields

Author
Publication venue: 'Hindawi Limited'
Publication date: 01/01/2015
Field of study

Crossref

Identification and Progression of Heart Disease Risk Factors in Diabetic Patients from Longitudinal Electronic Health Records

Author
Publication venue: 'Hindawi Limited'
Publication date: 01/01/2015
Field of study

Crossref

Statistical interaction modeling of bovine herd behaviors

Author: Andonovic I.
Bell M.
Dwyer C.
Hyslop J.
Kwong K.H.
Michie C.
Ross D.
Stephen B.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/11/2011
Field of study

While there has been interest in modeling the group behavior of herds or flocks, much of this work has focused on simulating their collective spatial motion patterns which have not accounted for individuality in the herd and instead assume a homogenized role for all members or sub-groups of the herd. Animal behavior experts have noted that domestic animals exhibit behaviors that are indicative of social hierarchy: leader/follower type behaviors are present as well as dominance and subordination, aggression and rank order, and specific social affiliations may also exist. Both wild and domestic cattle are social species, and group behaviors are likely to be influenced by the expression of specific social interactions. In this paper, Global Positioning System coordinate fixes gathered from a herd of beef cows tracked in open fields over several days at a time are utilized to learn a model that focuses on the interactions within the herd as well as its overall movement. Using these data in this way explores the validity of existing group behavior models against actual herding behaviors. Domain knowledge, location geography and human observations, are utilized to explain the causes of these deviations from this idealized behavior

University of Strathclyde Institutional Repository

Machine Learning and Clinical Text. Supporting Health Information Flow

Author: Suominen Hanna
Publication venue: Turku Centre for Computer Science
Publication date: 15/12/2009
Field of study

Fluent health information flow is critical for clinical decision-making. However, a considerable part of this information is free-form text and inabilities to utilize it create risks to patient safety and cost-effective hospital administration. Methods for automated processing of clinical text are emerging. The aim in this doctoral dissertation is to study machine learning and clinical text in order to support health information flow.First, by analyzing the content of authentic patient records, the aim is to specify clinical needs in order to guide the development of machine learning applications.The contributions are a model of the ideal information flow,a model of the problems and challenges in reality, and a road map for the technology development. Second, by developing applications for practical cases,the aim is to concretize ways to support health information flow. Altogether five machine learning applications for three practical cases are described: The first two applications are binary classification and regression related to the practical case of topic labeling and relevance ranking.The third and fourth application are supervised and unsupervised multi-class classification for the practical case of topic segmentation and labeling.These four applications are tested with Finnish intensive care patient records.The fifth application is multi-label classification for the practical task of diagnosis coding. It is tested with English radiology reports.The performance of all these applications is promising. Third, the aim is to study how the quality of machine learning applications can be reliably evaluated.The associations between performance evaluation measures and methods are addressed,and a new hold-out method is introduced.This method contributes not only to processing time but also to the evaluation diversity and quality. The main conclusion is that developing machine learning applications for text requires interdisciplinary, international collaboration. Practical cases are very different, and hence the development must begin from genuine user needs and domain expertise. The technological expertise must cover linguistics,machine learning, and information systems. Finally, the methods must be evaluated both statistically and through authentic user-feedback.Siirretty Doriast

UTUPub

Automatic de-identification of textual documents in the electronic health record: a review of recent research

Author: B Wellner
BA Beckwith
Brett R South
C Friedman
D Gupta
DA Dorr
E Aramaki
EM Fielstein
F Jeffrey Friedlin
FJ Friedlin
FP Morrison
G Szarvas
G Szarvas
GPO U.S
GPO U.S
H Cunningham
I Neamatullah
J Gardner
JJ Berman
K Atkinson
K Hara
L Sweeney
Matthew H Samore
NCI
NLM
NLM
NLM
O Uzuner
O Uzuner
O Uzuner
O Uzuner
P Ruch
RK Taira
Shuying Shen
SM Meystre
SM Thomas
SM Thomas
Stephane M Meystre
Y Guo
Publication venue: BioMed Central
Publication date: 01/01/2010
Field of study

Abstract Background In the United States, the Health Insurance Portability and Accountability Act (HIPAA) protects the confidentiality of patient data and requires the informed consent of the patient and approval of the Internal Review Board to use data for research purposes, but these requirements can be waived if data is de-identified. For clinical data to be considered de-identified, the HIPAA "Safe Harbor" technique requires 18 data elements (called PHI: Protected Health Information) to be removed. The de-identification of narrative text documents is often realized manually, and requires significant resources. Well aware of these issues, several authors have investigated automated de-identification of narrative text documents from the electronic health record, and a review of recent research in this domain is presented here. Methods This review focuses on recently published research (after 1995), and includes relevant publications from bibliographic queries in PubMed, conference proceedings, the ACM Digital Library, and interesting publications referenced in already included papers. Results The literature search returned more than 200 publications. The majority focused only on structured data de-identification instead of narrative text, on image de-identification, or described manual de-identification, and were therefore excluded. Finally, 18 publications describing automated text de-identification were selected for detailed analysis of the architecture and methods used, the types of PHI detected and removed, the external resources used, and the types of clinical documents targeted. All text de-identification systems aimed to identify and remove person names, and many included other types of PHI. Most systems used only one or two specific clinical document types, and were mostly based on two different groups of methodologies: pattern matching and machine learning. Many systems combined both approaches for different types of PHI, but the majority relied only on pattern matching, rules, and dictionaries. Conclusions In general, methods based on dictionaries performed better with PHI that is rarely mentioned in clinical text, but are more difficult to generalize. Methods based on machine learning tend to perform better, especially with PHI that is not mentioned in the dictionaries used. Finally, the issues of anonymization, sufficient performance, and "over-scrubbing" are discussed in this publication.</p

Crossref

IUPUIScholarWorks

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Text Mining for Translational Bioinformatics

Author
Publication venue: 'Hindawi Limited'
Publication date: 01/01/2015
Field of study

Crossref

Relaxation Penalties and Priors for Plausible Modeling of Nonidentified Bias Sources

Author: Greenland Sander
Publication venue: 'Institute of Mathematical Statistics'
Publication date: 15/01/2010
Field of study

In designed experiments and surveys, known laws or design feat ures provide checks on the most relevant aspects of a model and identify the target parameters. In contrast, in most observational studies in the health and social sciences, the primary study data do not identify and may not even bound target parameters. Discrepancies between target and analogous identified parameters (biases) are then of paramount concern, which forces a major shift in modeling strategies. Conventional approaches are based on conditional testing of equality constraints, which correspond to implausible point-mass priors. When these constraints are not identified by available data, however, no such testing is possible. In response, implausible constraints can be relaxed into penalty functions derived from plausible prior distributions. The resulting models can be fit within familiar full or partial likelihood frameworks. The absence of identification renders all analyses part of a sensitivity analysis. In this view, results from single models are merely examples of what might be plausibly inferred. Nonetheless, just one plausible inference may suffice to demonstrate inherent limitations of the data. Points are illustrated with misclassified data from a study of sudden infant death syndrome. Extensions to confounding, selection bias and more complex data structures are outlined.Comment: Published in at http://dx.doi.org/10.1214/09-STS291 the Statistical Science (http://www.imstat.org/sts/) by the Institute of Mathematical Statistics (http://www.imstat.org

arXiv.org e-Print Archive

Crossref

Evaluating indoor positioning systems in a shopping mall : the lessons learned from the IPIN 2018 competition

Author: Ali Muhammad Usman
Ben-Moshe Boaz
Chien Ying-Ren
Cho Eunyoung
Ding Zhenxing
Fang Shih-Hau
Hacohen Shlomi
Han Jaeseung
Hur Soojung
Jeong Hyeongyo
Jun Sungwoo
Knauth Stefan
Kronenwett Nikolai
Kuang Jian
Landa Vlad
Landau Yael
Lee Changeun
Lee Keumryeol
Lee Soyeon
Lee Yonghyun
Li Xianghong
Li Yu
Lu Chuanhua
Lungenstrass Tomas
Marbel Revital
Martin Mendoza-Silva German
Niu Xiaoji
Opiela Miroslav
Ortiz Miguel
Pablo Morales Juan
Park Chan Gook
Park Changjun
Park Sangjoon
Park So Young
Park Yongwan
Perez-Navarro Antoni
Perul Johan
Pipelidis Georgios
Plets David
Ramon Jimenez Antonio
Renaudin Valerie
Rew Jehyeok
Seco Fernando
Shimada Atsushi
Shvalb Nir
Taniguchi Rin-Ichiro
Thomas Diego
Torres-Sospedra Joaquin
Trogh Jens
Tsao Yu
Tsiamitros Nikolaos
Uchiyama Hideaki
Vladimirov Blagovest
Wei Dongyan
Xu Feng
Yang Shi-Shen
Ye Feng
Ye Shih-Jyun
Zhang Wenchao
Zhang Ying
Zheng Xingyu
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2019
Field of study

The Indoor Positioning and Indoor Navigation (IPIN) conference holds an annual competition in which indoor localization systems from different research groups worldwide are evaluated empirically. The objective of this competition is to establish a systematic evaluation methodology with rigorous metrics both for real-time (on-site) and post-processing (off-site) situations, in a realistic environment unfamiliar to the prototype developers. For the IPIN 2018 conference, this competition was held on September 22nd, 2018, in Atlantis, a large shopping mall in Nantes (France). Four competition tracks (two on-site and two off-site) were designed. They consisted of several 1 km routes traversing several floors of the mall. Along these paths, 180 points were topographically surveyed with a 10 cm accuracy, to serve as ground truth landmarks, combining theodolite measurements, differential global navigation satellite system (GNSS) and 3D scanner systems. 34 teams effectively competed. The accuracy score corresponds to the third quartile (75th percentile) of an error metric that combines the horizontal positioning error and the floor detection. The best results for the on-site tracks showed an accuracy score of 11.70 m (Track 1) and 5.50 m (Track 2), while the best results for the off-site tracks showed an accuracy score of 0.90 m (Track 3) and 1.30 m (Track 4). These results showed that it is possible to obtain high accuracy indoor positioning solutions in large, realistic environments using wearable light-weight sensors without deploying any beacon. This paper describes the organization work of the tracks, analyzes the methodology used to quantify the results, reviews the lessons learned from the competition and discusses its future

Ghent University Academic Bibliography

Digital.CSIC