Knowledge-based Biomedical Data Science 2019
Knowledge-based biomedical data science (KBDS) involves the design and
implementation of computer systems that act as if they knew about biomedicine.
Such systems depend on formally represented knowledge in computer systems,
often in the form of knowledge graphs. Here we survey the progress in the last
year in systems that use formally represented knowledge to address data science
problems in both clinical and biological domains, as well as on approaches for
creating knowledge graphs. Major themes include the relationships between
knowledge graphs and machine learning, the use of natural language processing,
and the expansion of knowledge-based approaches to novel domains, such as
Traditional Chinese Medicine and biodiversity.
Comment: Manuscript 43 pages with 3 tables; supplemental material 43 pages
with 3 tables
A Semantic Approach to Negation Detection and Word Disambiguation with Natural Language Processing
This study demonstrates methods for detecting negations in a sentence by
evaluating the lexical structure of the text via word-sense disambiguation.
The proposed framework examines the unique features of the various
expressions within a text to resolve the contextual usage of all tokens and
decipher the effect of negation on sentiment analysis.
The application of popular expression detectors skips this important step,
thereby neglecting the root words caught in the web of negation and making text
classification difficult for machine learning and sentiment analysis. This
study adopts a Natural Language Processing (NLP) approach to discover and
antonymize negated words for better accuracy in text classification, using a
knowledge base provided by the WordHoard NLP library. Early results show
that our initial analysis improved on traditional sentiment analysis, which
sometimes neglects negations or assigns an inverse polarity score. The
SentiWordNet analyzer was improved by 35%, the Vader analyzer by 20%, and
TextBlob by 6%.
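The core idea above — detect a negator, then replace the negated word with its antonym before scoring — can be illustrated with a minimal sketch. The tiny antonym table and polarity lexicon below are hand-made assumptions standing in for the WordHoard knowledge base and for a real analyzer such as SentiWordNet or Vader; they are not the paper's actual implementation.

```python
# Minimal sketch of negation-aware sentiment scoring via antonymization.
# NEGATORS, ANTONYMS, and POLARITY are toy stand-ins (assumptions), not
# real WordHoard or SentiWordNet data.

NEGATORS = {"not", "no", "never", "n't"}

# Toy antonym knowledge base (stand-in for WordHoard lookups).
ANTONYMS = {"good": "bad", "bad": "good",
            "happy": "sad", "sad": "happy"}

# Toy polarity lexicon (stand-in for SentiWordNet/Vader/TextBlob scores).
POLARITY = {"good": 1.0, "bad": -1.0, "happy": 1.0, "sad": -1.0}

def antonymize(tokens):
    """Replace each word that follows a negator with its antonym,
    dropping the negator, so 'not good' becomes 'bad'."""
    out, negate = [], False
    for tok in tokens:
        if tok in NEGATORS:
            negate = True
            continue
        if negate and tok in ANTONYMS:
            tok = ANTONYMS[tok]
        negate = False
        out.append(tok)
    return out

def sentiment(text):
    """Average polarity of the antonymized token stream."""
    tokens = antonymize(text.lower().split())
    scores = [POLARITY[t] for t in tokens if t in POLARITY]
    return sum(scores) / len(scores) if scores else 0.0

print(sentiment("the movie was not good"))  # negation flips 'good' to 'bad'
```

A plain lexicon scorer would see "good" in "not good" and assign positive polarity; antonymizing first hands the downstream analyzer an unambiguous token, which is what lets the negation survive classification.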
Confucius computer: a philosophical digital agent for intergenerational philosophical play
Confucianism is commonly defined as “... a system of philosophical, ethical and political thought based on the teachings of Confucius,” which originated in the sixth century BCE. It is a way of life, or a philosophy of human nature, that considers human relationships the foundation of society. Confucius' teachings strongly influenced the development of several cultures in Asia, making Confucianism an intangible cultural heritage. In this paper, we re-acquaint users with an intangible heritage that is part of their everyday lives by developing a system that permits experiencing Confucius' teachings virtually and interactively. The system can measure the philosophical intent of the human and generate meaningful philosophical answers. It also aims at intergenerational sharing of Confucius' heritage through a simple interactive process with the virtual sage, making the experience enjoyable and entertaining. Previous research in natural language processing (NLP) has mainly focused on understanding and delivering human natural language accurately. In this research, we explored how to apply NLP to model the knowledge and teachings of Confucius through natural conversation between human and computer. This virtual Confucius, a chat agent that generates outputs based on Confucius' teachings and uses a series of algorithms and techniques to improve the matching accuracy between user input and computer output, introduces a novel way of interacting with intangible cultures. Our user evaluation results revealed a positive correlation between relevance and enjoyment, with participants finding their experiences interacting with virtual Confucius very encouraging. Adults who experienced the virtual Confucius together with their children believed that the system has the potential to improve intergenerational interactions through shared play.
Automatic Concept Extraction in Semantic Summarization Process
The Semantic Web offers a generic infrastructure for the interchange, integration and creative reuse of structured data, which can help to cross some of the boundaries that Web 2.0 is facing. Currently, Web 2.0 offers poor query possibilities apart from searching by keywords or tags. There has been a great deal of interest in the development of semantic-based systems to facilitate knowledge representation and extraction and content integration [1], [2]. A semantic-based approach to retrieving relevant material can be useful for addressing issues such as determining the type or the quality of the information suggested by a personalized environment. In this context, standard keyword search has very limited effectiveness. For example, it cannot filter by the type, level, or quality of information.
Potentially, one of the biggest application areas of content-based exploration is personalized searching frameworks (e.g., [3], [4]). Whereas search engines nowadays provide largely anonymous information, new frameworks might highlight or recommend web pages related to key concepts. We can consider semantic information representation an important step towards wide, efficient manipulation and retrieval of information [5], [6], [7]. In the digital library community, a flat list of attribute/value pairs is often assumed to be available. In the Semantic Web community, annotations are often assumed to be instances of an ontology. Through ontologies, the system expresses key entities and relationships describing resources in a formal, machine-processable representation. An ontology-based knowledge representation can be used for content analysis and object recognition, for reasoning processes, and for enabling user-friendly and intelligent multimedia content search and retrieval.
Text summarization has been an interesting and active research area since the 1960s. The underlying assumption is that a small portion, or several keywords, of the original long document can represent the whole informatively and/or indicatively. Reading or processing this shorter version of the document saves time and other resources [8]. This property is especially valuable at present, given the vast availability of information. A concept-based approach to representing dynamic and unstructured information can be useful for addressing issues such as determining the key concepts and summarizing the information exchanged within a personalized environment.
In this context, a concept is represented with a Wikipedia article. With millions of articles and thousands of contributors, this online repository of knowledge is the largest and fastest growing encyclopedia in existence.
The problem described above can then be divided into three steps:
• Mapping of a series of terms with the most appropriate Wikipedia article (disambiguation).
• Assigning a score for each item identified on the basis of its importance in the given context.
• Extraction of n items with the highest score.
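The three steps above can be sketched as a small pipeline. The miniature "Wikipedia" index, the overlap-based disambiguation, and the frequency-based importance score below are all simplifying assumptions for illustration; the chapter's actual system works against the full Wikipedia article base.

```python
# Illustrative sketch of the three-step concept-extraction pipeline:
# (1) disambiguate terms against candidate articles, (2) score each
# concept, (3) extract the n top-scoring concepts.
# ARTICLES is a toy stand-in (assumption) for a real Wikipedia index.

# Toy index: term -> candidate articles, each with a bag of context
# words used for disambiguation.
ARTICLES = {
    "java": [
        {"title": "Java (programming language)",
         "context": {"code", "class", "compiler", "software"}},
        {"title": "Java (island)",
         "context": {"indonesia", "island", "volcano", "coffee"}},
    ],
    "python": [
        {"title": "Python (programming language)",
         "context": {"code", "interpreter", "software", "script"}},
    ],
}

def disambiguate(term, text_words):
    """Step 1: map a term to the candidate article whose context
    words overlap most with the surrounding text."""
    candidates = ARTICLES.get(term, [])
    if not candidates:
        return None
    return max(candidates,
               key=lambda a: len(a["context"] & text_words))

def score(term, words):
    """Step 2: score a concept by its frequency in the document,
    a crude proxy for importance in the given context."""
    return words.count(term)

def summarize(text, n=2):
    """Step 3: return the titles of the n highest-scoring concepts."""
    words = text.lower().split()
    word_set = set(words)
    concepts = {}
    for term in ARTICLES:
        if term in word_set:
            article = disambiguate(term, word_set)
            if article:
                concepts[article["title"]] = score(term, words)
    ranked = sorted(concepts.items(), key=lambda kv: -kv[1])
    return [title for title, _ in ranked[:n]]

doc = "python and java code compiled by the java compiler"
print(summarize(doc))
```

Note how the same surface term "java" resolves to different articles depending on the surrounding words — this is the disambiguation step that plain keyword matching skips.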
Text summarization can be applied to many fields, from information retrieval to text mining and text display. It could also be very useful in a personalized searching framework.
The chapter is organized as follows: the next Section introduces the personalized searching framework as one of the possible application areas of automatic concept extraction systems. Section three describes the summarization process, providing details on the system architecture, methodology and tools. Section four provides an overview of recently developed document summarization approaches. Section five summarizes a number of real-world applications which might benefit from word-sense disambiguation (WSD). Section six introduces Wikipedia and WordNet as used in our project. Section seven describes the logical structure of the project, detailing its software components and databases. Finally, Section eight provides some concluding considerations.