633 research outputs found

    ExaCT: automatic extraction of clinical trial characteristics from journal publications

    Background: Clinical trials are one of the most important sources of evidence for guiding evidence-based practice and the design of new trials. However, most of this information is available only in free text (e.g., in journal publications), which is labour intensive to process for systematic reviews, meta-analyses, and other evidence synthesis studies. This paper presents an automatic information extraction system, called ExaCT, that assists users with locating and extracting key trial characteristics (e.g., eligibility criteria, sample size, drug dosage, primary outcomes) from full-text journal articles reporting on randomized controlled trials (RCTs). Methods: ExaCT consists of two parts: an information extraction (IE) engine that searches the article for text fragments that best describe the trial characteristics, and a web browser-based user interface that allows human reviewers to assess and modify the suggested selections. The IE engine uses a statistical text classifier to locate those sentences that have the highest probability of describing a trial characteristic. Then, the IE engine's second stage applies simple rules to these sentences to extract text fragments containing the target answer. The same approach is used for all 21 trial characteristics selected for this study. Results: We evaluated ExaCT using 50 previously unseen articles describing RCTs. The text classifier (first stage) was able to recover 88% of relevant sentences among its top five candidates (top5 recall) with the topmost candidate being relevant in 80% of cases (top1 precision). Precision and recall of the extraction rules (second stage) were 93% and 91%, respectively. Together, the two stages of the extraction engine were able to provide (partially) correct solutions in 992 out of 1050 test tasks (94%), with a majority of these (696) representing fully correct and complete answers. Conclusions: Our experiments confirmed the applicability and efficacy of ExaCT. Furthermore, they demonstrated that combining a statistical method with 'weak' extraction rules can identify a variety of study characteristics. The system is flexible and can be extended to handle other characteristics and document types (e.g., study protocols).
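    The two-stage idea described in the ExaCT abstract (rank sentences, then apply a simple rule to the best candidates) can be illustrated with a minimal sketch. This is not the ExaCT implementation: the keyword weights below are a hypothetical stand-in for the trained statistical classifier, and the regular expression is an assumed example of a 'weak' rule for one characteristic (sample size).

```python
import re
from typing import Dict, List, Optional, Tuple

# Hypothetical cue weights standing in for a trained statistical classifier.
SAMPLE_SIZE_CUES: Dict[str, float] = {
    "enrolled": 2.0,
    "participants": 1.5,
    "patients": 1.0,
    "randomized": 1.0,
}

# Assumed 'weak' rule: a number directly followed by a population noun.
SAMPLE_SIZE_RULE = re.compile(
    r"\b(\d[\d,]*)\s+(?:patients|participants|subjects)\b", re.IGNORECASE
)


def score_sentence(sentence: str, cues: Dict[str, float]) -> float:
    """Stage 1 scoring: sum the weights of cue words found in the sentence."""
    words = re.findall(r"[a-z]+", sentence.lower())
    return sum(cues.get(word, 0.0) for word in words)


def top_k_sentences(sentences: List[str], cues: Dict[str, float], k: int = 5) -> List[str]:
    """Stage 1: return the k highest-scoring candidate sentences."""
    ranked = sorted(sentences, key=lambda s: score_sentence(s, cues), reverse=True)
    return ranked[:k]


def extract_sample_size(sentences: List[str], k: int = 5) -> Optional[Tuple[str, str]]:
    """Stage 2: apply the rule to each candidate and return (value, sentence)."""
    for sentence in top_k_sentences(sentences, SAMPLE_SIZE_CUES, k):
        match = SAMPLE_SIZE_RULE.search(sentence)
        if match:
            return match.group(1), sentence
    return None


article = [
    "Background: hypertension is common.",
    "A total of 120 patients were enrolled and randomized.",
    "The primary outcome was blood pressure at 12 weeks.",
]
result = extract_sample_size(article)
```

    In the workflow the abstract describes, a suggestion like this would go to a human reviewer for confirmation or correction rather than being accepted automatically.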

    A Taxonomy of Academic Abstract Sentence Classification Modelling

    Background: Abstract sentence classification modelling has the potential to advance literature discovery capability for the array of academic literature information systems; however, no artefact exists that categorises known models and identifies their key characteristics. Aims: To systematically categorise known abstract sentence classification models and make this knowledge readily available to future researchers and professionals concerned with abstract sentence classification model development and deployment. Method: An information systems taxonomy development methodology was adopted after a literature review to categorise 23 abstract sentence classification models identified from the literature. Corresponding dimensions and characteristics were derived from this process, and the resulting taxonomy is presented. Results: Abstract sentence classification modelling has evolved significantly, with state-of-the-art models now leveraging neural networks to achieve high-performance sentence classification. The resulting taxonomy provides a novel means to observe the development of this research field and enables us to consider how such models can be further improved or deployed in real-world applications.

    Are decision trees a feasible knowledge representation to guide extraction of critical information from randomized controlled trial reports?

    Background: This paper proposes the use of decision trees as the basis for automatically extracting information from published randomized controlled trial (RCT) reports. An exploratory analysis of RCT abstracts is undertaken to investigate the feasibility of using decision trees as a semantic structure. Quality-of-paper measures are also examined. Methods: A subset of 455 abstracts (randomly selected from a set of 7620 retrieved from Medline from 1998-2006) are examined for the quality of RCT reporting, the identifiability of RCTs from abstracts, and the completeness and complexity of RCT abstracts with respect to key decision tree elements. Abstracts were manually assigned to 6 sub-groups distinguishing whether they were primary RCTs versus other design types. For primary RCT studies, we analyzed and annotated the reporting of intervention comparison, population assignment and outcome values. To measure completeness, the frequencies by which complete intervention, population and outcome information are reported in abstracts were measured. A qualitative examination of the reporting language was conducted. Results: Decision tree elements are manually identifiable in the majority of primary RCT abstracts. 73.8% of a random subset was primary studies with a single population assigned to two or more interventions. 68% of these primary RCT abstracts were structured. 63% contained pharmaceutical interventions. 84% reported the total number of study subjects. In a subset of 21 abstracts examined, 71% reported numerical outcome values. Conclusion: The manual identifiability of decision tree elements in the abstract suggests that decision trees could be a suitable construct to guide machine summarisation of RCTs. The presence of decision tree elements could also act as an indicator for RCT report quality in terms of completeness and uniformity.

    Making decisions based on context: models and applications in cognitive sciences and natural language processing

    It is known that humans are capable of making decisions based on context and generalizing what they have learned. This dissertation considers two related problem areas and proposes different models that take context information into account. By including the context, the proposed models exhibit strong performance in each of the problem areas considered. The first problem area focuses on a context association task studied in cognitive science, which evaluates the ability of a learning agent to associate specific stimuli with an appropriate response in particular spatial contexts. Four neural circuit models are proposed to model how the stimulus and context information are processed to produce a response. The neural networks are trained by modifying the strength of neural connections (weights) using principles of Hebbian learning. Such learning is considered biologically plausible, in contrast to back propagation techniques that do not have a solid neurophysiological basis. A series of theoretical results for the neural circuit models are established, guaranteeing convergence to an optimal configuration when all the stimulus-context pairs are provided during training. Among all the models, a specific model based on ideas from recommender systems trained with a primal-dual update rule, achieves perfect performance in learning and generalizing the mapping from context-stimulus pairs to correct responses. The second problem area considered in the thesis focuses on clinical natural language processing (NLP). A particular application is the development of deep-learning models for analyzing radiology reports. Four NLP tasks are considered including anatomy named entity recognition, negation detection, incidental finding detection, and clinical concept extraction. A hierarchical Recurrent Neural Network (RNN) is proposed for anatomy named entity recognition, which is then used to produce a set of features for incidental finding detection of pulmonary nodules. 
A clinical context word embedding model is obtained, which is used with an RNN to model clinical concept extraction. Finally, feature-enriched RNN and transformer-based models with contextual word embedding are proposed for negation detection. All these models take the (clinical) context information into account. The models are evaluated on different datasets and are shown to achieve strong performance, largely outperforming the state of the art.

    Preface

    DAMSS-2018 is the jubilee 10th international workshop on data analysis methods for software systems, held in Druskininkai, Lithuania, at the end of the year, at the same place and the same time every year. Ten years have passed since the first workshop, which took place in 2009 with 16 presentations. The idea for the workshop arose at the Institute of Mathematics and Informatics and was supported by the Lithuanian Academy of Sciences and the Lithuanian Computer Society. It won approval both in the Lithuanian research community and abroad. This year there are 81 presentations and 113 registered participants from 13 countries. In 2010, the Institute of Mathematics and Informatics became part of Vilnius University, the largest university in Lithuania. In 2017, the institute changed its name to the Institute of Data Science and Digital Technologies, a name that reflects its recent activities. The renewed institute has eight research groups: Cognitive Computing, Image and Signal Analysis, Cyber-Social Systems Engineering, Statistics and Probability, Global Optimization, Intelligent Technologies, Education Systems, Blockchain Technologies. The main goal of the workshop is to introduce the research undertaken at Lithuanian and foreign universities in the fields of data science and software engineering. Holding the workshop annually allows the rapid exchange of new ideas within the research community. As many as 11 companies supported the workshop this year, which shows that its topics are relevant to business, too. Topics of the workshop cover big data, bioinformatics, data science, blockchain technologies, deep learning, digital technologies, high-performance computing, visualization methods for multidimensional data, machine learning, medical informatics, ontological engineering, optimization in data science, business rules, and software engineering. 
Seeking to facilitate relations between science and business, a special session and panel discussion is organized this year about topical business problems that may be solved together with the research community. This book gives an overview of all presentations of DAMSS-2018.

    pHealth 2021. Proc. of the 18th Internat. Conf. on Wearable Micro and Nano Technologies for Personalised Health, 8-10 November 2021, Genoa, Italy

    Smart mobile systems (microsystems, smart textiles, smart implants, sensor-controlled medical devices), together with related body, local and wide-area networks up to cloud services, have become important enablers for telemedicine and the next generation of healthcare services. The multilateral benefits of pHealth technologies offer enormous potential for all stakeholder communities, not only in terms of improvements in medical quality and industrial competitiveness, but also for the management of healthcare costs and, last but not least, the improvement of patient experience. This book presents the proceedings of pHealth 2021, the 18th in a series of conferences on wearable micro and nano technologies for personalized health with personal health management systems, hosted by the University of Genoa, Italy, and held as an online event from 8-10 November 2021. The conference focused on digital health ecosystems in the transformation of healthcare towards personalized, participative, preventive, predictive precision medicine (5P medicine). The book contains 46 peer-reviewed papers (1 keynote, 5 invited papers, 33 full papers, and 7 poster papers). Subjects covered include the deployment of mobile technologies, micro-nano-bio smart systems, bio-data management and analytics, autonomous and intelligent systems, the Health Internet of Things (HIoT), as well as potential risks for security and privacy, and the motivation and empowerment of patients in care processes. Providing an overview of current advances in personalized health and health management, the book will be of interest to all those working in the field of healthcare today.

    The Study on Automatic Annotation using Structural/Linguistic Characteristics of biomedical documents

    Doctoral thesis, Seoul National University Graduate School, Department of Dental Science, Healthcare Management and Informatics major, 2015. 8. 김홍기. Research on automatic annotation is important because it provides the basis for searching the rapidly growing body of biomedical papers and clinical documents more accurately and for extracting only the information that is needed. This study focuses on two activities: literature search, which is essential to research work, and the authoring of clinical document templates, which are essential for recording the diagnosis, tests, and prescriptions for a patient's disease. It investigates the annotation techniques these activities require. Both activities take place daily on papers and clinical templates, the representative documents of the biomedical field, so making them more efficient is significant for the field. First, for text-form research papers, the study addresses abstracts, which play an important role in setting the direction of research, and investigates automatic tagging of abstract sentences into the IMRAD (Introduction, Methods, Results, and Discussion) structure commonly used in biomedicine. Building on results obtained in linguistics on biomedical papers and on results from computer science, the study proposes and develops a new automatic tagging system that achieves high performance at low computational cost. With the proposed method, unstructured abstracts could be classified with an accuracy of 77.0 to 90.3% using only 17 features extracted from sentences. When these were combined with features used in previous studies, accuracy reached up to 91.7%. Second, because most clinical documents in environments that use an EMR (Electronic Medical Record) system are generated through clinical templates, automatic tagging was attempted on clinical templates. Unlike research abstracts, clinical templates already have a structured format, so the study aimed to tag the expert knowledge embedded in that structure. To this end, a new knowledge model was developed, together with STEP (Smart Clinical Document Template Editing and Production System), a system built on that model to support clinical template authoring. To verify the usefulness of STEP, a clinical template authoring tool was developed, showing that the knowledge base built with the knowledge model can improve the authoring of clinical templates. The results show biomedical researchers that the large body of biomedical papers, and the clinical documents continuously produced in clinical practice, can be searched more accurately and reused. This matters because it can improve researchers' work across the biomedical field. Finally, so that other researchers can make use of this work, the linguistic resources extracted during the study and a system for viewing the results were made available on the web. Contents: Abstract. Table of contents. I. Introduction (research background; research objectives; organization of the thesis). II. Extraction of linguistic features from structured abstracts (research background; research objectives; related work; methods: data corpus, section normalization, section mapping, linguistic feature extraction; results: usage of verbs and verb phrases by section, usage of N-grams by section, usage of nouns and noun phrases by section, section-discriminating power of the linguistic features; conclusion). III. Classification of abstract sentences using linguistic features (research background; research objectives; related work; methods: Feature Set construction, test document set, training and evaluation with SVM; results: performance by linguistic feature, performance by feature group combination; discussion). IV. Automatic tagging system for biomedical abstract sentences (system overview; service components: INTRODUCTION, LEXICAL FEATURES, RESULTS, ONLINE DEMO; Use Cases). V. Tagging of clinical templates using structural features (research background; research goals; knowledge model for tagging clinical templates: ontology, conceptual model, CDT ontology; tagging clinical templates with the CDT ontology; conclusion). VI. Template authoring support system based on a clinical template knowledge base (system overview; system components: knowledge base management module, core module, web user interface, Web Services interface; Use Case; conclusion). VII. Conclusion. VIII. Limitations and suggestions. References. Appendix. Abstract.

    Mining the Medical and Patent Literature to Support Healthcare and Pharmacovigilance

    Recent advancements in healthcare practices and the increasing use of information technology in the medical domain have led to the rapid generation of free-text data in the form of scientific articles, e-health records, patents, and document inventories. This has spurred the development of sophisticated information retrieval and information extraction technologies. A fundamental requirement for the automatic processing of biomedical text is the identification of information-carrying units such as concepts or named entities. In this context, this work focuses on the identification of medical disorders (such as diseases and adverse effects), which form an important category of concepts in medical text. Two methodologies were investigated in this regard: dictionary-based and machine-learning-based approaches. Furthermore, the capabilities of the concept recognition techniques were systematically exploited to build a semantic search platform for the retrieval of e-health records and patents. The system facilitates conventional text search as well as semantic and ontological searches. Performance of the adapted retrieval platform for e-health records and patents was evaluated within open assessment challenges (i.e. TRECMED and TRECCHEM respectively) wherein the system was best rated in comparison to several other competing information retrieval platforms. Finally, from the medico-pharma perspective, a strategy for the identification of adverse drug events from medical case reports was developed. Qualitative evaluation as well as an expert validation of the developed system's performance showed robust results. In conclusion, this thesis presents approaches for efficient information retrieval and information extraction from various biomedical literature sources in support of healthcare and pharmacovigilance. The applied strategies have the potential to enhance the literature searches performed by biomedical, healthcare, and patent professionals. This can promote literature-based knowledge discovery, improve the safety and effectiveness of medical practices, and drive research and development in the medical and healthcare arena.