
633 research outputs found

ExaCT: automatic extraction of clinical trial characteristics from journal publications

Author: Carini Simona
de Bruijn Berry
Kiritchenko Svetlana
Martin Joel
Sim Ida
Publication venue: BioMed Central
Publication date: 01/01/2010
Field of study

Background: Clinical trials are one of the most important sources of evidence for guiding evidence-based practice and the design of new trials. However, most of this information is available only in free text - e.g., in journal publications - which is labour intensive to process for systematic reviews, meta-analyses, and other evidence synthesis studies. This paper presents an automatic information extraction system, called ExaCT, that assists users with locating and extracting key trial characteristics (e.g., eligibility criteria, sample size, drug dosage, primary outcomes) from full-text journal articles reporting on randomized controlled trials (RCTs). Methods: ExaCT consists of two parts: an information extraction (IE) engine that searches the article for text fragments that best describe the trial characteristics, and a web browser-based user interface that allows human reviewers to assess and modify the suggested selections. The IE engine uses a statistical text classifier to locate those sentences that have the highest probability of describing a trial characteristic. Then, the IE engine's second stage applies simple rules to these sentences to extract text fragments containing the target answer. The same approach is used for all 21 trial characteristics selected for this study. Results: We evaluated ExaCT using 50 previously unseen articles describing RCTs. The text classifier (first stage) was able to recover 88% of relevant sentences among its top five candidates (top5 recall) with the topmost candidate being relevant in 80% of cases (top1 precision). Precision and recall of the extraction rules (second stage) were 93% and 91%, respectively. Together, the two stages of the extraction engine were able to provide (partially) correct solutions in 992 out of 1050 test tasks (94%), with a majority of these (696) representing fully correct and complete answers. Conclusions: Our experiments confirmed the applicability and efficacy of ExaCT. Furthermore, they demonstrated that combining a statistical method with 'weak' extraction rules can identify a variety of study characteristics. The system is flexible and can be extended to handle other characteristics and document types (e.g., study protocols).
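
The two-stage design described in the abstract above (rank sentences, then apply a weak rule to the top candidates) can be illustrated with a minimal Python sketch. This is not ExaCT's implementation: the cue-word scorer `keyword_score` merely stands in for the statistical classifier, and the regular expression `SAMPLE_SIZE_RULE`, the function names, and the sample sentences are all hypothetical, chosen only to show the shape of the pipeline for one characteristic (sample size).

```python
import re

# Hypothetical cue words; a stand-in for the statistical sentence classifier.
CUES = ("randomized", "enrolled", "patients", "participants", "total")

# Hypothetical "weak" rule: a number directly followed by a population noun.
SAMPLE_SIZE_RULE = re.compile(
    r"\b(\d[\d,]*)\s+(?:patients|participants|subjects)\b", re.IGNORECASE
)


def keyword_score(sentence):
    """Stage 1 scorer (assumed): fraction of cue words present in the sentence."""
    text = sentence.lower()
    return sum(cue in text for cue in CUES) / len(CUES)


def rank_sentences(sentences, score, k=5):
    """Stage 1: keep the k highest-scoring candidate sentences."""
    return sorted(sentences, key=score, reverse=True)[:k]


def extract_sample_size(candidates):
    """Stage 2: apply the rule to candidates in rank order; None if no match."""
    for sentence in candidates:
        match = SAMPLE_SIZE_RULE.search(sentence)
        if match:
            return int(match.group(1).replace(",", ""))
    return None


sentences = [
    "Background therapy was continued throughout.",
    "A total of 1,204 patients were enrolled and randomized.",
    "The primary outcome was all-cause mortality.",
]
candidates = rank_sentences(sentences, keyword_score, k=2)
sample_size = extract_sample_size(candidates)

# The reported totals are consistent: 21 characteristics x 50 articles = 1050 tasks.
total_tasks = 21 * 50
partial_or_better_rate = 992 / total_tasks
```

Keeping the rule deliberately simple mirrors the abstract's point that the ranking stage does most of the work of narrowing the search, so the rule only has to succeed on a handful of likely sentences.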

NRC Publications Archive

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

eScholarship - University of California

A Taxonomy of Academic Abstract Sentence Classification Modelling

Author: Busch Peter
Smith Stephen
Stead Connor
Vatanasakdakul Savanid
Publication venue: AIS Electronic Library (AISeL)
Publication date: 01/01/2021
Field of study

Background: Abstract sentence classification modelling has the potential to advance literature discovery capability for the array of academic literature information systems; however, no artefact exists that categorises known models and identifies their key characteristics. Aims: To systematically categorise known abstract sentence classification models and make this knowledge readily available to future researchers and professionals concerned with abstract sentence classification model development and deployment. Method: An information systems taxonomy development methodology was adopted after a literature review to categorise 23 abstract sentence classification models identified from the literature. Corresponding dimensions and characteristics were derived from this process with the resulting taxonomy presented. Results: Abstract sentence classification modelling has evolved significantly with state-of-the-art models now leveraging neural networks to achieve high-performance sentence classification. The resulting taxonomy provides a novel means to observe the development of this research field and enables us to consider how such models can be further improved or deployed in real-world applications.

AIS Electronic Library (AISeL)

Are decision trees a feasible knowledge representation to guide extraction of critical information from randomized controlled trial reports?

Author: A Aguirre-Junco
A Geissbuhler
A Keech
A Taddio
Ad Hoc working group for Critical Appraisal of the Medical Literature
AD Oxman
C Orasan
CD Mulrow
D Demner-Fushman
DG Altman
DG Covell
DL Sackett
DM D'Alessandro
E Coiera
Enrico Coiera
F Salager-Meyer
G Georg
Grace Y Chung
GY Cheng
HS Sacks
I Sim
J Cohen
J Hartley
J Swales
JJ Cimino
JW Ely
K Fozi
KA L'Abbe
L McKnight
M Clarke
M Dawes
M Fiszman
M Hunink
MC Weinstein
MH Ebell
ML Chambliss
MY Tsay
N Elhadad
NC Ide
PJ Devereaux
R Xu
RB Haynes
RL Kane
S Teufel
SP Balasubramanian
W Hersh
WS Richardson
Y Niu
Publication venue: BioMed Central
Publication date: 01/01/2008
Field of study

Background: This paper proposes the use of decision trees as the basis for automatically extracting information from published randomized controlled trial (RCT) reports. An exploratory analysis of RCT abstracts is undertaken to investigate the feasibility of using decision trees as a semantic structure. Quality-of-paper measures are also examined. Methods: A subset of 455 abstracts (randomly selected from a set of 7620 retrieved from Medline from 1998 – 2006) are examined for the quality of RCT reporting, the identifiability of RCTs from abstracts, and the completeness and complexity of RCT abstracts with respect to key decision tree elements. Abstracts were manually assigned to 6 sub-groups distinguishing whether they were primary RCTs versus other design types. For primary RCT studies, we analyzed and annotated the reporting of intervention comparison, population assignment and outcome values. To measure completeness, the frequencies by which complete intervention, population and outcome information are reported in abstracts were measured. A qualitative examination of the reporting language was conducted. Results: Decision tree elements are manually identifiable in the majority of primary RCT abstracts. 73.8% of a random subset was primary studies with a single population assigned to two or more interventions. 68% of these primary RCT abstracts were structured. 63% contained pharmaceutical interventions. 84% reported the total number of study subjects. In a subset of 21 abstracts examined, 71% reported numerical outcome values. Conclusion: The manual identifiability of decision tree elements in the abstract suggests that decision trees could be a suitable construct to guide machine summarisation of RCTs. The presence of decision tree elements could also act as an indicator for RCT report quality in terms of completeness and uniformity.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Macquarie University ResearchOnline

Making decisions based on context: models and applications in cognitive sciences and natural language processing

Author: Zhu Henghui
Publication venue
Publication date: 29/01/2020
Field of study

It is known that humans are capable of making decisions based on context and generalizing what they have learned. This dissertation considers two related problem areas and proposes different models that take context information into account. By including the context, the proposed models exhibit strong performance in each of the problem areas considered. The first problem area focuses on a context association task studied in cognitive science, which evaluates the ability of a learning agent to associate specific stimuli with an appropriate response in particular spatial contexts. Four neural circuit models are proposed to model how the stimulus and context information are processed to produce a response. The neural networks are trained by modifying the strength of neural connections (weights) using principles of Hebbian learning. Such learning is considered biologically plausible, in contrast to back propagation techniques that do not have a solid neurophysiological basis. A series of theoretical results for the neural circuit models are established, guaranteeing convergence to an optimal configuration when all the stimulus-context pairs are provided during training. Among all the models, a specific model based on ideas from recommender systems trained with a primal-dual update rule, achieves perfect performance in learning and generalizing the mapping from context-stimulus pairs to correct responses. The second problem area considered in the thesis focuses on clinical natural language processing (NLP). A particular application is the development of deep-learning models for analyzing radiology reports. Four NLP tasks are considered including anatomy named entity recognition, negation detection, incidental finding detection, and clinical concept extraction. A hierarchical Recurrent Neural Network (RNN) is proposed for anatomy named entity recognition, which is then used to produce a set of features for incidental finding detection of pulmonary nodules. 
A clinical context word embedding model is obtained, which is used with an RNN to model clinical concept extraction. Finally, feature-enriched RNN and transformer-based models with contextual word embedding are proposed for negation detection. All these models take the (clinical) context information into account. The models are evaluated on different datasets and are shown to achieve strong performance, largely outperforming the state of the art.

Boston University Institutional Repository (OpenBU)

Preface

Author: Press Vilnius University
Publication venue: Vilnius University Press
Publication date: 01/01/2018
Field of study

DAMSS-2018 is the jubilee 10th international workshop on data analysis methods for software systems, organized in Druskininkai, Lithuania, at the end of the year, in the same place and at the same time every year. Ten years have passed since the first workshop: its history starts in 2009 with 16 presentations. The idea of such a workshop came up at the Institute of Mathematics and Informatics, and the Lithuanian Academy of Sciences and the Lithuanian Computer Society supported it. The idea gained approval both in the Lithuanian research community and abroad. This year there are 81 presentations and 113 registered participants from 13 countries. In 2010, the Institute of Mathematics and Informatics became a member of Vilnius University, the largest university in Lithuania. In 2017, the institute changed its name to the Institute of Data Science and Digital Technologies, a name that reflects its recent activities. The renewed institute has eight research groups: Cognitive Computing, Image and Signal Analysis, Cyber-Social Systems Engineering, Statistics and Probability, Global Optimization, Intelligent Technologies, Education Systems, Blockchain Technologies. The main goal of the workshop is to introduce the research undertaken at Lithuanian and foreign universities in the fields of data science and software engineering. Holding the workshop annually allows the fast exchange of new ideas among the research community. As many as 11 companies supported the workshop this year, which means that the topics of the workshop are relevant to business, too. Topics of the workshop cover big data, bioinformatics, data science, blockchain technologies, deep learning, digital technologies, high-performance computing, visualization methods for multidimensional data, machine learning, medical informatics, ontological engineering, optimization in data science, business rules, and software engineering.
Seeking to facilitate relations between science and business, a special session and panel discussion is organized this year about topical business problems that may be solved together with the research community. This book gives an overview of all presentations of DAMSS-2018.

Crossref

Vilnius University Proceedings

Archivio istituzionale della ricerca - Università di Ferrara

pHealth 2021. Proc. of the 18th Internat. Conf. on Wearable Micro and Nano Technologies for Personalised Health, 8-10 November 2021, Genoa, Italy

Author
Publication venue: IOS Press
Publication date: 01/01/2021
Field of study

Smart mobile systems – microsystems, smart textiles, smart implants, sensor-controlled medical devices – together with related body, local and wide-area networks up to cloud services, have become important enablers for telemedicine and the next generation of healthcare services. The multilateral benefits of pHealth technologies offer enormous potential for all stakeholder communities, not only in terms of improvements in medical quality and industrial competitiveness, but also for the management of healthcare costs and, last but not least, the improvement of patient experience. This book presents the proceedings of pHealth 2021, the 18th in a series of conferences on wearable micro and nano technologies for personalized health with personal health management systems, hosted by the University of Genoa, Italy, and held as an online event from 8 – 10 November 2021. The conference focused on digital health ecosystems in the transformation of healthcare towards personalized, participative, preventive, predictive precision medicine (5P medicine). The book contains 46 peer-reviewed papers (1 keynote, 5 invited papers, 33 full papers, and 7 poster papers). Subjects covered include the deployment of mobile technologies, micro-nano-bio smart systems, bio-data management and analytics, autonomous and intelligent systems, the Health Internet of Things (HIoT), as well as potential risks for security and privacy, and the motivation and empowerment of patients in care processes. Providing an overview of current advances in personalized health and health management, the book will be of interest to all those working in the field of healthcare today

University of Regensburg Publication Server

The Study on Automatic Annotation using Structural/Linguistic Characteristics of biomedical documents

Author: 남세진
Publication venue: Seoul National University Graduate School (서울대학교 대학원)
Publication date: 01/08/2015
Field of study

Thesis (Ph.D.) -- Seoul National University Graduate School: Department of Dental Science, Healthcare Management and Informatics major, August 2015. 김홍기.

Research on automatic annotation is important because it provides the foundation for searching the rapidly growing body of biomedical papers and clinical documents more accurately and for extracting only the information that is needed. This study focuses on two activities: searching for papers, which is essential to research, and writing clinical forms, which is essential for recording the diagnosis, tests, and prescriptions for a patient's disease. It investigates the annotation techniques that these activities require. Both activities take place routinely on the representative documents of the biomedical field, namely papers and clinical forms, and making them more efficient is significant for the field.

First, for research papers in text form, the study targets abstracts, which play an important role in setting the direction of research, and investigates automatic tagging into IMRAD (Introduction, Methods, Results, and Discussion), the structure mainly used in the biomedical field. Building on results obtained in linguistics for biomedical papers and on results from computer science, the study proposes and develops a new automatic tagging system that achieves high performance at low computational cost. With the proposed method, unstructured abstracts could be classified with an accuracy of 77.0 to 90.3% using only 17 features extracted from sentences. When used together with the features from previous studies, the method reached a maximum accuracy of 91.7%.

For clinical documents, most are generated through clinical forms in environments that use an EMR (Electronic Medical Record) system, so automatic tagging was attempted on clinical forms. Unlike research abstracts, clinical forms already have a structured format, so this study aimed to tag the expert knowledge embedded in that structure. To this end, a new knowledge model was developed, together with STEP (Smart Clinical Document Template Editing and Production System), a system that uses the model to support the authoring of clinical forms. To verify the usability of STEP, a clinical form authoring tool was developed, showing that a knowledge base built through the knowledge model can improve the writing of clinical forms.

The results show researchers in the biomedical field that the large volume of biomedical papers, and the clinical documents continuously produced in clinical practice, can be searched and reused more accurately. These results are important in that they can improve researchers' activities across the biomedical field. Finally, so that other researchers can make use of the outcomes of this study, the linguistic resources extracted during the research and a system for checking the results were made publicly available on the web.

Table of contents:
Abstract (Korean)
Table of contents
I. Introduction: 1. Research background; 2. Research objectives; 3. Organization of the thesis
II. Extraction of linguistic features from structured abstracts: 1. Research background; 2. Research objectives; 3. Related work; 4. Methods (4.1. Data corpus; 4.2. Section normalization; 4.3. Section mapping; 4.4. Linguistic feature extraction); 5. Results (5.1. Usage of verbs and verb phrases by section; 5.2. Usage of N-grams by section; 5.3. Usage of nouns and noun phrases by section; 5.4. Section-discriminating power of the linguistic features); 6. Conclusion
III. Abstract sentence classification using linguistic features: 1. Research background; 2. Research objectives; 3. Related work; 4. Methods (4.1. Feature set construction; 4.2. Test document set; 4.3. Training and evaluation with SVM); 5. Results (5.1. Performance by linguistic feature; 5.2. Performance by feature-group combination); 6. Discussion
IV. Automatic tagging system for biomedical abstract sentences: 1. System overview; 2. Service components (2.1. INTRODUCTION; 2.2. LEXICAL FEATURES; 2.3. RESULTS; 2.4. ONLINE DEMO); 3. Use Cases
V. Tagging clinical forms using structural features: 1. Research background; 2. Research goals; 3. Knowledge model for tagging clinical forms (3.1. Ontology; 3.2. Conceptual model; 3.3. CDT ontology); 4. Tagging clinical forms with the CDT ontology; 5. Conclusion
VI. Form-authoring support system based on a clinical form knowledge base: 1. System overview; 2. System architecture (2.1. Knowledge base management module; 2.2. Core module; 2.3. Web user interface; 2.4. Web Services interface); 3. Use Case; 4. Conclusion
VII. Conclusion
VIII. Limitations and recommendations
References
Appendix
Abstract

SNU Open Repository and Archive

Mining the Medical and Patent Literature to Support Healthcare and Pharmacovigilance

Author: Gurulingappa Harsha
Publication venue: Universitäts- und Landesbibliothek Bonn
Publication date
Field of study

Recent advancements in healthcare practices and the increasing use of information technology in the medical domain have led to the rapid generation of free-text data in the form of scientific articles, e-health records, patents, and document inventories. This has driven the development of sophisticated information retrieval and information extraction technologies. A fundamental requirement for the automatic processing of biomedical text is the identification of information-carrying units such as concepts or named entities. In this context, this work focuses on the identification of medical disorders (such as diseases and adverse effects), which denote an important category of concepts in medical text. Two methodologies were investigated in this regard: dictionary-based and machine learning-based approaches. Furthermore, the capabilities of the concept recognition techniques were systematically exploited to build a semantic search platform for the retrieval of e-health records and patents. The system facilitates conventional text search as well as semantic and ontological searches. Performance of the adapted retrieval platform for e-health records and patents was evaluated within open assessment challenges (i.e. TRECMED and TRECCHEM respectively) wherein the system was best rated in comparison to several other competing information retrieval platforms. Finally, from the medico-pharma perspective, a strategy for the identification of adverse drug events from medical case reports was developed. Qualitative evaluation as well as an expert validation of the developed system's performance showed robust results. In conclusion, this thesis presents approaches for efficient information retrieval and information extraction from various biomedical literature sources in support of healthcare and pharmacovigilance. The applied strategies have the potential to enhance the literature searches performed by biomedical, healthcare, and patent professionals. This can promote literature-based knowledge discovery, improve the safety and effectiveness of medical practices, and drive research and development in the medical and healthcare arena.

bonndoc – Der Publikationsserver der Universität Bonn

Designing m-Learning for Junior Registrars: Activation of a Theoretical Model of Clinical Knowledge

Author: Boye Niels
Kanstrup Anne Marie
Nøhr Christian
Publication venue: IOS Press
Publication date: 01/01/2007
Field of study

VBN

Diffusion of Electronic Health Records: Six Years of Empirical Data

Author: Andersen Stig Kjær
Bernstein Knut
Bruun-Rasmussen Morten
Nøhr Christian
Vingtoft Søren
Publication venue: IOS Press
Publication date: 01/01/2007
Field of study

VBN