Extracting Conceptual Terms from Medical Documents
Automated biomedical concept recognition is important for biomedical document retrieval and text mining research. In this paper, we describe a two-step concept extraction technique for documents in the biomedical domain. Step one is noun phrase extraction, which automatically extracts noun phrases from medical documents. The extracted noun phrases serve as concept term candidates and become the input to the next step. Step two is keyphrase extraction, which automatically identifies important topical terms among the candidate terms. Experiments were conducted to evaluate the results of both steps. The results show that our noun phrase extractor is effective in identifying noun phrases in medical documents, and that the keyphrase extractor is effective in identifying document conceptual terms.
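The two-step flow described in this abstract can be illustrated with a minimal sketch. It is not the authors' extractor: the abstract does not say how either step is implemented, so step one is approximated here by splitting text at stopwords and punctuation (a crude stand-in for noun phrase extraction), and step two by ranking candidates by raw frequency. The names `STOPWORDS`, `extract_candidates`, and `rank_keyphrases` are hypothetical.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list; a real system would
# more likely rely on a part-of-speech tagger or chunker.
STOPWORDS = {"a", "an", "the", "of", "in", "is", "are", "and", "or",
             "for", "to", "with", "by", "on", "was", "were"}


def extract_candidates(text):
    """Step one (stand-in): cut the text into candidate phrases
    at punctuation and stopwords."""
    candidates = []
    # Split on runs of characters that are not word chars, whitespace, or hyphens.
    for fragment in re.split(r"[^\w\s-]+", text.lower()):
        phrase = []
        for token in fragment.split():
            if token in STOPWORDS:
                if phrase:
                    candidates.append(" ".join(phrase))
                phrase = []
            else:
                phrase.append(token)
        if phrase:
            candidates.append(" ".join(phrase))
    return candidates


def rank_keyphrases(candidates, top_k=3):
    """Step two (stand-in): keep the most frequent candidates."""
    counts = Counter(candidates)
    return [phrase for phrase, _ in counts.most_common(top_k)]


if __name__ == "__main__":
    sample = ("Gene expression is altered in breast cancer. "
              "Breast cancer is linked to gene expression.")
    print(rank_keyphrases(extract_candidates(sample), top_k=2))
```

The point of the sketch is only the separation of concerns: candidate generation feeds a separate ranking step, so either stage can be swapped for a stronger method without touching the other.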
Ontologies and Information Extraction
This report argues that, even in the simplest cases, IE is an ontology-driven
process. It is not a mere text filtering method based on simple pattern
matching and keywords, because the extracted pieces of texts are interpreted
with respect to a predefined partial domain model. This report shows that
depending on the nature and the depth of the interpretation to be done for
extracting the information, more or less knowledge must be involved. This
report is mainly illustrated in biology, a domain in which there are critical
needs for content-based exploration of the scientific literature and which
is becoming a major application domain for IE.
A Relation Extraction Approach for Clinical Decision Support
In this paper, we investigate how semantic relations between concepts
extracted from medical documents can be employed to improve the retrieval of
medical literature. Semantic relations explicitly represent relatedness between
concepts and carry high informative power that can be leveraged to improve the
effectiveness of retrieval functionalities of clinical decision support
systems. We present preliminary results and show how relations are able to
provide a sizable increase of the precision for several topics, albeit having
no impact on others. We then discuss some future directions to minimize the
impact of negative results while maximizing the impact of good results.
Comment: 4 pages, 1 figure, DTMBio-KMH 2018, in conjunction with ACM 27th
Conference on Information and Knowledge Management (CIKM), October 22-26
2018, Lingotto, Turin, Italy
Terminology Extraction for and from Communications in Multi-disciplinary Domains
Terminology extraction generally refers to methods and systems for identifying term candidates in a uni-disciplinary and uni-lingual
environment such as engineering, medical, physical and geological sciences, or administration, business and leisure. However, as
human enterprises get more and more complex, it has become increasingly important for teams in one discipline to collaborate with
others from not only a non-cognate discipline but also speaking a different language. Disaster mitigation and recovery, and conflict
resolution are amongst the areas where there is a requirement to use standardised multilingual terminology for communication. This
paper presents a feasibility study conducted to build terminology (and ontology) in the domain of disaster management and is part of the
broader work conducted for the EU project Slándáil (FP7 607691). We have evaluated CiCui (from the Chinese name 词萃, which translates to
words gathered), a corpus-based text analytic system that combines frequency, collocation and linguistic analyses to extract candidate
terminologies from corpora composed of domain texts from diverse sources. CiCui was assessed against four terminology extraction
systems, and the initial results show that it has above-average precision in extracting terms.
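As a rough illustration of how frequency and collocation evidence can be combined when proposing term candidates, the sketch below scores adjacent word pairs by pointwise mutual information (PMI) after a minimum-frequency filter. This is an assumption for illustration only: the abstract does not state which collocation measure CiCui uses, and the function name `bigram_pmi` and the `min_count` parameter are hypothetical. The linguistic analysis step is omitted.

```python
import math
from collections import Counter


def bigram_pmi(tokens, min_count=2):
    """Score adjacent word pairs by PMI, keeping only pairs that
    occur at least `min_count` times (the frequency filter)."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_uni = len(tokens)
    n_bi = max(len(tokens) - 1, 1)
    scores = {}
    for (w1, w2), count in bigrams.items():
        if count < min_count:
            continue
        p_xy = count / n_bi
        p_x = unigrams[w1] / n_uni
        p_y = unigrams[w2] / n_uni
        # PMI > 0 means the pair co-occurs more often than chance predicts.
        scores[(w1, w2)] = math.log2(p_xy / (p_x * p_y))
    return scores


if __name__ == "__main__":
    text = "flood warning issued flood warning lifted river flood warning"
    for pair, score in bigram_pmi(text.split()).items():
        print(" ".join(pair), round(score, 2))
```

On a real corpus, the frequency threshold guards against PMI's known tendency to overrate rare pairs.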
Improving Term Extraction with Terminological Resources
Studies of different term extractors on a corpus of the biomedical domain
revealed decreasing performances when applied to highly technical texts. The
difficulty or impossibility of customising them to new domains is an additional
limitation. In this paper, we propose to use external terminologies to
influence generic linguistic data in order to augment the quality of the
extraction. The tool we implemented exploits testified terms at different steps
of the process: chunking, parsing and extraction of term candidates.
Experiments reported here show that, using this method, more term candidates
can be acquired with a higher level of reliability. We further describe the
extraction process involving endogenous disambiguation implemented in the term
extractor YaTeA.