A Corpus of Potentially Contradictory Research Claims from Cardiovascular Research Abstracts
Background: Research literature in biomedicine and related fields contains a huge number
of claims, such as the effectiveness of treatments. These claims are not always consistent and
may even contradict each other. Being able to identify contradictory claims is important for
those who rely on the biomedical literature. Automated methods to identify and resolve them
are required to cope with the amount of information available. However, research in this area
has been hampered by a lack of suitable resources. We describe a methodology to develop a
corpus which addresses this gap by providing examples of potentially contradictory claims and
demonstrate how it can be applied to identify these claims from Medline abstracts related to the
topic of cardiovascular disease.
Methods: A set of systematic reviews concerned with four topics in cardiovascular disease were
identified from Medline and analysed to determine whether the abstracts they reviewed contained
contradictory research claims. For each review, annotators were asked to analyse these abstracts
to identify claims within them that answered the question addressed in the review. The annotators
were also asked to indicate how the claim related to that question and the type of the claim.
Results: A total of 259 abstracts associated with 24 systematic reviews were used to form
the corpus. Agreement between the annotators was high, suggesting that the information they
provided is reliable.
Conclusions: The paper describes a methodology for constructing a corpus containing contradictory
research claims from the biomedical literature. The corpus is made available to enable
further research into this area and support the development of automated approaches to contradiction
identification.
The Detection of Contradictory Claims in Biomedical Abstracts
Research claims in the biomedical domain are not always consistent, and may even be contradictory. This thesis explores contradictions between research claims in order to determine whether or not it is possible to develop a solution to automate the detection of such phenomena. Such a solution will help decision-makers, including researchers, to alleviate the effects of contradictory claims on their decisions.
This study develops two methodologies to construct corpora of contradictions. The first methodology utilises systematic reviews to construct a manually-annotated corpus of contradictions. The second methodology uses a different approach to construct a corpus of contradictions which does not rely on human annotation. This methodology is proposed to overcome the limitations of the manual annotation approach.
Moreover, this thesis proposes a pipeline to detect contradictions in abstracts. The pipeline takes a question and a list of research abstracts which may contain answers to it. The output of the pipeline is a list of sentences extracted from abstracts which answer the question, where each sentence is annotated with an assertion value with respect to the question. Claims which feature opposing assertion values are considered as potentially contradictory claims.
The research demonstrates that automating the detection of contradictory claims in research abstracts is a feasible problem.
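The last step of the pipeline described in this abstract, pairing claims whose assertion values oppose each other, can be sketched as follows. This is a minimal illustration and not the thesis's implementation: the `Claim` record, the "yes"/"no" label set, and the function `potential_contradictions` are hypothetical names, the example sentences are made-up placeholders, and the sentence extraction and assertion classification steps are assumed to have run already.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Claim:
    abstract_id: str  # hypothetical identifier of the source abstract
    sentence: str     # sentence extracted as answering the question
    assertion: str    # assumed label set: "yes" or "no" with respect to the question


def potential_contradictions(claims):
    """Return all pairs of claims whose assertion values oppose each other."""
    pairs = []
    for a, b in combinations(claims, 2):
        # Only an explicit yes/no opposition counts; other labels are ignored.
        if {a.assertion, b.assertion} == {"yes", "no"}:
            pairs.append((a, b))
    return pairs


# Placeholder claims for one question (illustrative only, not real data).
claims = [
    Claim("abs-1", "Treatment X reduced the outcome.", "yes"),
    Claim("abs-2", "Treatment X did not reduce the outcome.", "no"),
    Claim("abs-3", "Treatment X lowered the outcome rate.", "yes"),
]

for a, b in potential_contradictions(claims):
    print(a.abstract_id, "<->", b.abstract_id)
```

Pairs flagged this way are only candidates: as the abstract notes, opposing assertion values mark claims as potentially contradictory, not as confirmed contradictions.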
An NLP Analysis of Health Advice Giving in the Medical Research Literature
Health advice – clinical and policy recommendations – plays a vital role in guiding medical practices and public health policies. Whether or not authors should give health advice in medical research publications is a controversial issue. The proponents of actionable research advocate for the more efficient and effective transmission of scientific evidence into practice. The opponents are concerned about the quality of health advice in individual research papers, especially in observational studies. Arguments both for and against giving advice in individual studies indicate a strong need for identifying and accessing health advice, for either practical use or quality evaluation purposes. However, current information services do not support the direct retrieval of health advice. Unlike the targets of other natural language processing (NLP) applications, health advice has also not been computationally modeled as a language construct. A new information service for directly accessing health advice should be able to reduce information barriers and to provide external assessment in science communication.
This dissertation work built an annotated corpus of scientific claims that distinguishes health advice according to its occurrence and strength. The study developed NLP-based prediction models to identify health advice in the PubMed literature. Using the annotated corpus and prediction models, the study answered research questions regarding the practice of advice giving in medical research literature. To test and demonstrate the potential use of the prediction model, it was used to retrieve health advice regarding the use of hydroxychloroquine (HCQ) as a treatment for COVID-19 from LitCovid, a large COVID-19 research literature database curated by the National Institutes of Health.
An evaluation of sentences extracted from both abstracts and discussions showed that BERT-based pre-trained language models performed well at detecting health advice. The health advice prediction model may be combined with existing health information service systems to provide more convenient navigation of a large volume of health literature. Findings from the study also show researchers are careful not to give advice solely in abstracts. They also tend to give weaker and less specific advice in abstracts than in discussions. In addition, the study found that health advice has appeared consistently in the abstracts of observational studies over the past 25 years. In the sample, 41.2% of the studies offered health advice in their conclusions, which is lower than earlier estimates based on analyses of much smaller samples processed manually. In the abstracts of observational studies, journals with a lower impact are more likely to give health advice than those with a higher impact, suggesting that journals play a significant role as gatekeepers of science communication.
For the communities of natural language processing, information science, and public health, this work advances knowledge of the automated recognition of health advice in scientific literature. The corpus and code developed for the study have been made publicly available to facilitate future efforts in health advice retrieval and analysis. Furthermore, this study discusses the ways in which researchers give health advice in medical research articles, knowledge of which could be an essential step towards curbing potential exaggeration in current global science communication. It also contributes to ongoing discussions of the integrity of scientific output.
This study calls for caution in advice-giving in medical research literature, especially in abstracts alone. It also calls for open access to medical research publications, so that health researchers and practitioners can fully review the advice in scientific outputs and its implications. More evaluative strategies that can increase the overall quality of health advice in research articles are needed by journal editors and reviewers, given their gatekeeping role in science communication.
Cross-lingual argument mining in the medical domain
Nowadays the medical domain is receiving more and more attention in applications involving Artificial Intelligence. Clinicians have to deal with an enormous amount of unstructured textual data in their everyday practice to reach conclusions about a patient's health. Argument mining helps to provide a structure to such data by detecting argumentative components in the text and classifying the relations between them. However, as is the case for many tasks in Natural Language Processing in general and in medical text processing in particular, the large majority of the work on computational argumentation has been done only for English. This is also the case with the only dataset available for argumentation in the medical domain, namely, the annotated medical data of abstracts of Randomized Controlled Trials (RCT) from the MEDLINE database. In order to mitigate the lack of annotated data for other languages, we empirically investigate several strategies to perform argument mining and classification in medical texts for a language for which no annotated data is available. This thesis shows that automatically translating and projecting annotations from English to a target language (Spanish) is an effective way to generate annotated data without manual intervention. Furthermore, our experiments demonstrate that the translation and projection approach outperforms zero-shot cross-lingual approaches using a large masked multilingual language model. Finally, we show how the automatically generated data in Spanish can also be used to improve results in the original English evaluation setting.
Citationally Enhanced Semantic Literature Based Discovery
We are living in the age of information. The ever-increasing flow of data and publications poses a monumental bottleneck to scientific progress: despite its amazing abilities, the human mind is woefully inadequate at processing such a vast quantity of multidimensional information. The small bits of flotsam and jetsam that we leverage belie the amount of useful information beneath the surface. It is imperative that automated tools exist to better search, retrieve, and summarize this content. Combinations of document indexing and search engines can quickly find a document whose content best matches a query, provided the information is all contained within a single document. But they do not draw connections, make hypotheses, or find knowledge hidden across multiple documents. Literature-based discovery is an approach that can uncover hidden interrelationships between topics by extracting information from existing published scientific literature. The proposed study utilizes a semantic-based approach that builds a graph of related concepts between two user-specified sets of topics using semantic predications. In addition, the study includes properties of bibliographically related documents and statistical properties of concepts to further enhance the quality of the proposed intermediate terms. Our results show an improvement in precision-recall when incorporating citations.
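The graph-based idea in this abstract, finding intermediate concepts that link two user-specified sets of topics through semantic predications, can be sketched as below. This is a simplified illustration under stated assumptions, not the study's method: predications are assumed to be (subject, predicate, object) triples, the concept and predicate names are invented placeholders, the functions `build_graph` and `intermediate_terms` are hypothetical, and the citation-based and statistical ranking mentioned in the abstract is not modeled.

```python
from collections import defaultdict


def build_graph(predications):
    """Build an undirected adjacency map from (subject, predicate, object) triples."""
    graph = defaultdict(set)
    for subj, _pred, obj in predications:
        graph[subj].add(obj)
        graph[obj].add(subj)
    return graph


def intermediate_terms(predications, sources, targets):
    """Return concepts directly linked to both a source topic and a target topic."""
    graph = build_graph(predications)
    near_sources = set()
    for s in sources:
        near_sources |= graph.get(s, set())
    near_targets = set()
    for t in targets:
        near_targets |= graph.get(t, set())
    # Exclude the start and end topics themselves from the candidates.
    return (near_sources & near_targets) - set(sources) - set(targets)


# Placeholder predications (illustrative only, not extracted from real literature).
predications = [
    ("drug_x", "INHIBITS", "protein_y"),
    ("protein_y", "ASSOCIATED_WITH", "disease_z"),
    ("drug_x", "TREATS", "disease_q"),
]

print(sorted(intermediate_terms(predications, {"drug_x"}, {"disease_z"})))
# prints ['protein_y']
```

Here `protein_y` is proposed because it connects the two topic sets even though no single triple mentions both `drug_x` and `disease_z`; a fuller system would then rank such candidates.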
The classification of gene products in the molecular biology domain: Realism, objectivity, and the limitations of the Gene Ontology
Background: Controlled vocabularies in the molecular biology domain exist to facilitate data integration across database resources. One such tool is the Gene Ontology (GO), a classification designed to act as a universal index for gene products from any species. The Gene Ontology is used extensively in annotating gene products and analysing gene expression data, yet very little research exists from a library and information science perspective exploring the design principles, philosophy and social role of ontologies in biology.
Aim: To explore how molecular biologists, in creating the Gene Ontology, devised guidelines and rules for determining which scientific concepts are included in the ontology, and the criteria for how these concepts are represented.
Methods: A domain analysis approach was used to devise a mixed methodology to study the design of the Gene Ontology. Concept analysis of a GO term and a critical discourse analysis of GO developer mailing list texts were used to test whether ontological realism is a tenable basis for constructing objective ontologies. A comparison of the current GO vocabulary construction guidelines and a study of the reasons why GO terms are removed from the ontology further explored the justifications for the design of the Gene Ontology. Finally, a content analysis of published GO papers examined how authors use and cite GO data and terminology.
Results: Gene Ontology terms can be presented according to different epistemologies for concepts, indicating that ontological realism is not the only way objective ontologies can be designed. Social roles and the exercise of power were found to play an important role in determining ontology content, and poor synonym control, a lack of clear warrant for deciding terminology and arbitrary decisions to delete and invent new terms undermine the objectivity and universal applicability of the Gene Ontology. Authors exhibited poor compliance with GO data citation policies, and in re-wording and misquoting GO terminology, risk exacerbating the semantic problems this controlled vocabulary was designed to solve.
Conclusions: The failure of the Gene Ontology to define what is meant by a molecular function, the exercise of power by GO developers in clearing contentious concepts from the ontology, and the strict adherence to ontological realism, which marginalises social and subjective ways of classifying scientific concepts, limit the utility of the ontology as a tool to unify the molecular biology domain. These limitations to the Gene Ontology design could be overcome with the development of lighter, pluralistic, user-controlled ‘open ontologies’ for gene products that can work alongside more traditional, ‘top-down’ developed vocabularies.
Abstracts 2012: Highlights of Student Research and Creative Endeavors
https://csuepress.columbusstate.edu/abstracts/1004/thumbnail.jp
AI Hallucinations: A Misnomer Worth Clarifying
As large language models continue to advance in Artificial Intelligence (AI),
text generation systems have been shown to suffer from a problematic phenomenon
often termed "hallucination." However, with AI's increasing presence across
various domains including medicine, concerns have arisen regarding the use of
the term itself. In this study, we conducted a systematic review to identify
papers defining "AI hallucination" across fourteen databases. We present and
analyze definitions obtained across all databases, categorize them based on
their applications, and extract key points within each category. Our results
highlight a lack of consistency in how the term is used, but also help identify
several alternative terms in the literature. We discuss implications of these
and call for a more unified effort to bring consistency to an important
contemporary AI issue that can affect multiple domains significantly.
2023 SOARS Conference Program
Program for the 2023 Showcase of Osprey Advancements in Research and Scholarship (SOARS
- …