    Model dan Metoda Arsitektur pada Sistem Tanya Jawab Medis

    This paper surveys several studies on question answering systems in the medical domain (medical question answering, abbreviated MedQuAn). A MedQuAn system processes a question posed as natural language text and returns a relevant answer. The paper examines the conceptual modules of MedQuAn, observing that a question answering system consists of three distinct core components, together with the methods/approaches used for each. The three core components are question classification, document retrieval, and answer extraction. The outcome of this survey is a contribution to future research in the MedQuAn domain, in particular for medical question answering systems in Indonesian.
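    The three-component architecture above maps naturally onto a simple pipeline. Below is a minimal, hypothetical Python sketch of such a chain; the keyword rules and overlap-based retrieval are illustrative assumptions, not the methods of the surveyed systems.

# Hypothetical sketch of a three-stage medical QA (MedQuAn-style) pipeline:
# question classification -> document retrieval -> answer extraction.
# The component implementations are placeholders, not the surveyed methods.

def classify_question(question: str) -> str:
    """Assign a coarse question type using illustrative keyword rules."""
    q = question.lower()
    if q.startswith(("what is", "what are")):
        return "definition"
    if "treatment" in q or "therapy" in q:
        return "treatment"
    return "other"

def retrieve_documents(question: str, corpus: dict, top_k: int = 3) -> list:
    """Rank documents by simple term overlap with the question."""
    q_terms = set(question.lower().split())
    scored = sorted(
        corpus.items(),
        key=lambda kv: len(q_terms & set(kv[1].lower().split())),
        reverse=True,
    )
    return [doc_id for doc_id, _ in scored[:top_k]]

def extract_answer(question: str, corpus: dict, doc_ids: list) -> str:
    """Return the candidate sentence sharing the most terms with the question."""
    q_terms = set(question.lower().split())
    sentences = [s.strip() for d in doc_ids for s in corpus[d].split(".") if s.strip()]
    return max(sentences, key=lambda s: len(q_terms & set(s.lower().split())), default="")

def answer(question: str, corpus: dict) -> dict:
    q_type = classify_question(question)
    docs = retrieve_documents(question, corpus)
    return {"type": q_type, "documents": docs, "answer": extract_answer(question, corpus, docs)}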

    Extractive Summarisation of Medical Documents

    Background: Evidence Based Medicine (EBM) practice requires practitioners to extract evidence from published medical research when answering clinical queries. Due to the time-consuming nature of this practice, there is a strong motivation for systems that can automatically summarise medical documents and help practitioners find relevant information. Aim: The aim of this work is to propose an automatic query-focused, extractive summarisation approach that selects informative sentences from medical documents. Method: We use a corpus that is specifically designed for summarisation in the EBM domain. We use approximately half the corpus for deriving important statistics associated with the best possible extractive summaries. We take into account factors such as sentence position, length, sentence content, and the type of the query posed. Using the statistics from the first set, we evaluate our approach on a separate set. Evaluation of the quality of the generated summaries is performed automatically using ROUGE, which is a popular tool for evaluating automatic summaries. Results: Our summarisation approach outperforms all baselines (best baseline score: 0.1594; our score: 0.1653). Further improvements are achieved when query types are taken into account. Conclusion: The quality of extractive summarisation in the medical domain can be significantly improved by incorporating domain knowledge and statistics derived from a specialised corpus. Such techniques can therefore be applied for content selection in end-to-end summarisation systems.
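    As a rough illustration of the feature-based sentence selection described in the Method, the sketch below scores sentences by query overlap, position, and length; the feature weights are invented for illustration and are not the corpus-derived statistics used in this work.

# Illustrative query-focused extractive scorer; the weights are made up,
# not the statistics derived from the EBM summarisation corpus.

def score_sentence(sentence: str, position: int, total: int, query: str) -> float:
    words = sentence.lower().split()
    q_terms = set(query.lower().split())
    overlap = len(q_terms & set(words)) / (len(q_terms) or 1)  # query relevance
    pos_score = 1.0 - position / max(total, 1)                 # favour earlier sentences
    len_score = min(len(words), 30) / 30.0                     # prefer reasonably long sentences
    return 0.5 * overlap + 0.3 * pos_score + 0.2 * len_score

def summarise(document: str, query: str, n_sentences: int = 3) -> str:
    sentences = [s.strip() for s in document.split(".") if s.strip()]
    ranked = sorted(
        enumerate(sentences),
        key=lambda item: score_sentence(item[1], item[0], len(sentences), query),
        reverse=True,
    )
    chosen = sorted(ranked[:n_sentences])  # restore original document order
    return ". ".join(s for _, s in chosen) + "."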

    Are decision trees a feasible knowledge representation to guide extraction of critical information from randomized controlled trial reports?

    Background: This paper proposes the use of decision trees as the basis for automatically extracting information from published randomized controlled trial (RCT) reports. An exploratory analysis of RCT abstracts is undertaken to investigate the feasibility of using decision trees as a semantic structure. Quality-of-paper measures are also examined. Methods: A subset of 455 abstracts (randomly selected from a set of 7620 retrieved from Medline from 1998–2006) is examined for the quality of RCT reporting, the identifiability of RCTs from abstracts, and the completeness and complexity of RCT abstracts with respect to key decision tree elements. Abstracts were manually assigned to 6 sub-groups distinguishing whether they were primary RCTs versus other design types. For primary RCT studies, we analyzed and annotated the reporting of intervention comparison, population assignment, and outcome values. To measure completeness, we measured the frequencies with which complete intervention, population, and outcome information are reported in abstracts. A qualitative examination of the reporting language was conducted. Results: Decision tree elements are manually identifiable in the majority of primary RCT abstracts. 73.8% of a random subset were primary studies with a single population assigned to two or more interventions. 68% of these primary RCT abstracts were structured. 63% contained pharmaceutical interventions. 84% reported the total number of study subjects. In a subset of 21 abstracts examined, 71% reported numerical outcome values. Conclusion: The manual identifiability of decision tree elements in the abstract suggests that decision trees could be a suitable construct to guide machine summarisation of RCTs. The presence of decision tree elements could also act as an indicator for RCT report quality in terms of completeness and uniformity.
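    To make the notion of decision tree elements concrete, here is a small, hypothetical data structure for the population, intervention, and outcome information a machine summariser would need to fill in from an RCT abstract; the field names and example values are assumptions, not the study's annotation scheme.

# Hypothetical representation of RCT decision-tree elements; the schema and
# the example values are illustrative, not the study's annotation scheme.
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Arm:
    intervention: str                      # e.g. drug name and dose
    n_subjects: Optional[int] = None       # participants assigned to this arm
    outcome_value: Optional[float] = None  # reported numerical outcome

@dataclass
class RCTTree:
    population: str                        # who was randomised
    outcome_measure: str = ""              # what is compared across arms
    arms: list = field(default_factory=list)

# Made-up example, purely to show the shape of the structure.
trial = RCTTree(
    population="adults with type 2 diabetes",
    outcome_measure="HbA1c reduction at 24 weeks",
    arms=[
        Arm("drug A 10 mg daily", n_subjects=120, outcome_value=1.1),
        Arm("placebo", n_subjects=118, outcome_value=0.3),
    ],
)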

    Sentiment classification with case-base approach

    The increasing growth of social networks, blogs, and user review sites makes the Internet a huge source of data, especially about how people think, feel, and act toward different issues. These days, people's opinions play an important role in politics, industry, education, and more. Thus governments, large and small industries, academic institutes, companies, and individuals are looking to investigate automatic techniques for extracting the information they need from large amounts of data. Sentiment analysis is one answer to this need. It is an application of natural language processing and computational linguistics that uses advanced techniques, such as machine learning and language models, to capture positive, negative, or neutral evaluations, with or without their strength, from plain text. In this thesis we study a case-based, cross-domain approach for sentiment analysis at the document level. Our case-based algorithm generates a binary classifier that uses a set of processed cases and five different sentiment lexicons to extract the polarity, along with the corresponding scores, from the reviews. Since sentiment analysis is inherently a domain-dependent task, which makes it difficult and expensive, we use a cross-domain approach, training our classifier on six different domains instead of limiting it to one. To improve the accuracy of the classifier, we add negation detection as part of our algorithm. Moreover, to improve the performance of our approach, some innovative modifications are applied. It is worth mentioning that our approach allows for further development by adding more sentiment lexicons and data sets in the future.
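    A minimal sketch, assuming a generic sentiment lexicon, of how lexicon scores and negation detection can be combined for document-level polarity; the toy lexicon, negation window, and scoring rule below are invented for illustration and are not the five lexicons or the case-based algorithm used in the thesis.

# Toy lexicon-based polarity scorer with a simple negation window.
# The lexicon and rules are illustrative, not the thesis's case-based method.

LEXICON = {"good": 1.0, "great": 2.0, "bad": -1.0, "terrible": -2.0}
NEGATORS = {"not", "no", "never"}

def polarity(text: str, negation_window: int = 3) -> str:
    tokens = text.lower().replace(".", " ").replace(",", " ").split()
    score, flip_left = 0.0, 0
    for tok in tokens:
        if tok in NEGATORS:
            flip_left = negation_window  # invert the next few sentiment words
            continue
        value = LEXICON.get(tok, 0.0)
        if value and flip_left:
            value = -value               # negated sentiment word
        if flip_left:
            flip_left -= 1
        score += value
    return "positive" if score > 0 else "negative" if score < 0 else "neutral"

print(polarity("The plot was not good, but the acting was great"))  # -> positive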

    CREATE: Concept Representation and Extraction from Heterogeneous Evidence

    Traditional information retrieval methodology is guided by the document retrieval paradigm, where relevant documents are returned in response to user queries. This paradigm faces a serious drawback if the desired result is not explicitly present in a single document. The problem becomes more obvious when a user tries to obtain complete information about a real-world entity, such as a person, company, or location. In such cases, various facts about the target entity or concept need to be gathered from multiple document sources. In this work, we present a method to extract information about a target entity based on the concept retrieval paradigm, which focuses on extracting and blending information related to a concept from multiple sources if necessary. The paradigm is built around a generic notion of concept, defined as any item that can be thought of as a topic of interest. Concepts may correspond to any real-world entity, such as a restaurant, person, city, or organization, or any abstract item, such as a news topic, event, or theory. The Web is a heterogeneous collection of data in different forms, such as facts, news, and opinions. We propose different models for different forms of data, all of which work towards the same goal of concept-centric retrieval. We motivate our work based on studies of current trends and demands in information seeking. The framework helps in understanding the intent of content, i.e. opinion versus fact. Our work has been conducted on free-text data in English; nevertheless, our framework can be easily transferred to other languages.
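    As a rough illustration of concept-centric retrieval, the sketch below merges attribute evidence about a single concept from several sources and tags each snippet as fact or opinion; the source fields and the opinion heuristic are assumptions made for this sketch, not the CREATE models.

# Illustrative aggregation of evidence about one concept from multiple
# sources; the fields and the fact/opinion heuristic are assumptions.
from collections import defaultdict

OPINION_CUES = {"think", "believe", "best", "worst", "love", "hate"}

def is_opinion(snippet: str) -> bool:
    return any(cue in snippet.lower().split() for cue in OPINION_CUES)

def build_concept_profile(concept: str, sources: list) -> dict:
    """Each source is a dict like {"url": ..., "attribute": ..., "snippet": ...}."""
    profile = defaultdict(list)
    for src in sources:
        profile[src["attribute"]].append({
            "snippet": src["snippet"],
            "url": src["url"],
            "kind": "opinion" if is_opinion(src["snippet"]) else "fact",
        })
    return {"concept": concept, "attributes": dict(profile)}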

    An NLP Analysis of Health Advice Giving in the Medical Research Literature

    Health advice – clinical and policy recommendations – plays a vital role in guiding medical practice and public health policy. Whether authors should give health advice in medical research publications is a controversial issue. Proponents of actionable research advocate more efficient and effective transmission of scientific evidence into practice. Opponents are concerned about the quality of health advice in individual research papers, especially in observational studies. Arguments both for and against giving advice in individual studies indicate a strong need for identifying and accessing health advice, for either practical use or quality evaluation. However, current information services do not support the direct retrieval of health advice, and compared to other natural language processing (NLP) applications, health advice has not been computationally modeled as a language construct. A new information service for directly accessing health advice could reduce information barriers and provide external assessment in science communication.
    This dissertation built an annotated corpus of scientific claims that distinguishes health advice according to its occurrence and strength, and developed NLP-based prediction models to identify health advice in the PubMed literature. Using the annotated corpus and prediction models, the study answered research questions about the practice of advice giving in the medical research literature. To test and demonstrate the potential use of the prediction model, it was applied to retrieve health advice regarding the use of hydroxychloroquine (HCQ) as a treatment for COVID-19 from LitCovid, a large COVID-19 research literature database curated by the National Institutes of Health. An evaluation of sentences extracted from both abstracts and discussions showed that BERT-based pre-trained language models performed well at detecting health advice, and the prediction model may be combined with existing health information services to provide more convenient navigation of a large volume of health literature. Findings also show that researchers are careful not to give advice solely in abstracts, and that they tend to give weaker and less specific advice in abstracts than in discussions. In addition, the study found that health advice has appeared consistently in the abstracts of observational studies over the past 25 years. In the sample, 41.2% of the studies offered health advice in their conclusions, which is lower than earlier estimates based on analyses of much smaller, manually processed samples. In the abstracts of observational studies, lower-impact journals are more likely to give health advice than higher-impact ones, underscoring the role of journals as gatekeepers of science communication.
    For the natural language processing, information science, and public health communities, this work advances knowledge of the automated recognition of health advice in scientific literature. The corpus and code developed for the study have been made publicly available to facilitate future efforts in health advice retrieval and analysis. The study also examines how researchers give health advice in medical research articles, knowledge of which could be an essential step towards curbing potential exaggeration in global science communication, and it contributes to ongoing discussions of the integrity of scientific output. The study calls for caution in advice giving in the medical research literature, especially in abstracts alone, and for open access to medical research publications so that health researchers and practitioners can fully review the advice in scientific outputs and its implications. Journal editors and reviewers, given their gatekeeping role in science communication, also need more evaluative strategies for increasing the overall quality of health advice in research articles.
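    For readers who want to experiment with a similar setup, the sketch below applies a fine-tuned BERT-style classifier to candidate sentences using the Hugging Face transformers pipeline; the model path and label names are placeholders, not the checkpoint or labels released with this dissertation.

# Sketch of sentence-level health-advice detection with a fine-tuned
# BERT-style classifier; the model path and labels are placeholders.
from transformers import pipeline

# Hypothetical path to a checkpoint fine-tuned to label sentences as, e.g.,
# "strong_advice", "weak_advice", or "no_advice".
classifier = pipeline("text-classification", model="path/to/health-advice-bert")

sentences = [
    "Clinicians should consider screening all patients over 50.",
    "We observed a modest association between intake and risk.",
]

for sentence in sentences:
    result = classifier(sentence)[0]  # e.g. {"label": "strong_advice", "score": 0.97}
    print(f"{result['label']:>14}  {result['score']:.2f}  {sentence}")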