39,579 research outputs found
Semantically factoid question answering using fuzzy SVM named entity recognition
Named Entity Recognition (NER) and Question Answering (QA) are fundamental tasks and they are the cores of natural language processing (NLP) system. NER, a sub problem of Information Extraction (IE), involves recognizing and extracting name entities like Persons, Locations, Organizations, Dates and Times from electronics resources and text. Question Answering (QA) is a type of Information Retrieval (IR), attempts to deal with a wide range of question. In this paper we propose a semantically Factoid Question Answering model using Fuzzy Support Vector Machine Named Entity Recognizer component called FSVM. In this model we applied the FSVM NE recognizer to filter Question Answering system results have token by IR and return exact expect result to the user. This paper shows how the Fuzzy NER can applied in information retrieval (IR) systems in applications like Question Answering (QA). We show a model to improve precision in QA by semantically NER and reducing Answer Finder input data
Hi, how can I help you?: Automating enterprise IT support help desks
Question answering is one of the primary challenges of natural language
understanding. In realizing such a system, providing complex long answers to
questions is a challenging task as opposed to factoid answering as the former
needs context disambiguation. The different methods explored in the literature
can be broadly classified into three categories namely: 1) classification
based, 2) knowledge graph based and 3) retrieval based. Individually, none of
them address the need of an enterprise wide assistance system for an IT support
and maintenance domain. In this domain the variance of answers is large ranging
from factoid to structured operating procedures; the knowledge is present
across heterogeneous data sources like application specific documentation,
ticket management systems and any single technique for a general purpose
assistance is unable to scale for such a landscape. To address this, we have
built a cognitive platform with capabilities adopted for this domain. Further,
we have built a general purpose question answering system leveraging the
platform that can be instantiated for multiple products, technologies in the
support domain. The system uses a novel hybrid answering model that
orchestrates across a deep learning classifier, a knowledge graph based context
disambiguation module and a sophisticated bag-of-words search system. This
orchestration performs context switching for a provided question and also does
a smooth hand-off of the question to a human expert if none of the automated
techniques can provide a confident answer. This system has been deployed across
675 internal enterprise IT support and maintenance projects.Comment: To appear in IAAI 201
Corpora and evaluation tools for multilingual named entity grammar development
We present an effort for the development of multilingual named entity grammars in a unification-based finite-state formalism (SProUT). Following an extended version of the MUC7 standard, we have developed Named Entity Recognition grammars for German, Chinese, Japanese, French, Spanish, English, and Czech. The grammars recognize person names, organizations, geographical locations, currency, time and date expressions. Subgrammars and gazetteers are shared as much as possible for the grammars of the different languages. Multilingual corpora from the business domain are used for grammar development and evaluation. The annotation format (named entity and other linguistic information) is described. We present an evaluation tool which provides detailed statistics and diagnostics, allows for partial matching of annotations, and supports user-defined mappings between different annotation and grammar output formats
Enriching Knowledge Bases with Counting Quantifiers
Information extraction traditionally focuses on extracting relations between
identifiable entities, such as . Yet, texts
often also contain Counting information, stating that a subject is in a
specific relation with a number of objects, without mentioning the objects
themselves, for example, "California is divided into 58 counties". Such
counting quantifiers can help in a variety of tasks such as query answering or
knowledge base curation, but are neglected by prior work. This paper develops
the first full-fledged system for extracting counting information from text,
called CINEX. We employ distant supervision using fact counts from a knowledge
base as training seeds, and develop novel techniques for dealing with several
challenges: (i) non-maximal training seeds due to the incompleteness of
knowledge bases, (ii) sparse and skewed observations in text sources, and (iii)
high diversity of linguistic patterns. Experiments with five human-evaluated
relations show that CINEX can achieve 60% average precision for extracting
counting information. In a large-scale experiment, we demonstrate the potential
for knowledge base enrichment by applying CINEX to 2,474 frequent relations in
Wikidata. CINEX can assert the existence of 2.5M facts for 110 distinct
relations, which is 28% more than the existing Wikidata facts for these
relations.Comment: 16 pages, The 17th International Semantic Web Conference (ISWC 2018
- …