Evaluating Wikipedia as a source of information for disease understanding
The increasing availability of biological data is improving our understanding
of diseases and providing new insight into their underlying relationships.
Thanks to improvements in both text mining techniques and computational
capacity, the combination of biological data with semantic information obtained
from medical publications has proven to be a very promising path. However,
limited access to these data and their lack of structure pose
challenges to this approach. In this document we propose the use of Wikipedia -
the free online encyclopedia - as a source of accessible textual information
for disease understanding research. To check its validity, we compare its
performance in the determination of relationships between diseases with that of
PubMed, one of the most consulted data sources of medical texts. The
results suggest that the information extracted from Wikipedia is as relevant as
that obtained from PubMed abstracts (i.e., the freely accessible portion of its
articles), although further research is proposed to verify its reliability for
medical studies.
Comment: 6 pages, 5 figures, 5 tables, published at IEEE CBMS 2018, 2018 IEEE
31st International Symposium on Computer-Based Medical Systems (CBMS)
Extracting diagnostic knowledge from MedLine Plus: a comparison between MetaMap and cTAKES Approaches
The development of diagnostic decision support systems (DDSS) requires a reliable and
consistent knowledge base about diseases and their symptoms, signs and diagnostic tests. Physicians are
typically the source of this knowledge, but it is not always possible to obtain all the desired information from
them. Other valuable sources are medical books and articles describing the diagnosis of diseases, but again, extracting this
information is a hard and time-consuming task. In this paper we present the results of our research, in which we have used
Web scraping, natural language processing techniques, a variety of publicly available sources of diagnostic knowledge
and two widely known medical concept identifiers, MetaMap and cTAKES, to extract diagnostic criteria for infectious
diseases from MedLine Plus articles. A performance comparison of MetaMap and cTAKES is also presented.
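
As a rough illustration of how two concept extractors might be compared, the sketch below scores each tool's extracted terms against a reference set using precision, recall and F1. The abstract does not state which metrics the authors used, so the choice of metrics, the function name and all term sets here are assumptions made up for illustration; no MetaMap or cTAKES API is called.

```python
def precision_recall_f1(extracted, reference):
    """Score a set of extracted terms against a reference set of terms."""
    extracted, reference = set(extracted), set(reference)
    true_positives = len(extracted & reference)
    precision = true_positives / len(extracted) if extracted else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical data: a hand-made reference list of findings for one disease
# and the terms two tools might return. None of this comes from the paper.
reference_terms = {"fever", "cough", "headache", "rash"}
tool_a_terms = {"fever", "cough", "nausea"}
tool_b_terms = {"fever", "cough", "headache", "rash", "fatigue"}

scores = {
    "tool_a": precision_recall_f1(tool_a_terms, reference_terms),
    "tool_b": precision_recall_f1(tool_b_terms, reference_terms),
}

for name, (p, r, f1) in scores.items():
    print(f"{name}: precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```

In practice the reference set would come from a validated source of diagnostic criteria, and terms would probably be compared as normalized concept identifiers rather than raw strings.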