    A Survey of Paraphrasing and Textual Entailment Methods

    Paraphrasing methods recognize, generate, or extract phrases, sentences, or longer natural language expressions that convey almost the same information. Textual entailment methods, on the other hand, recognize, generate, or extract pairs of natural language expressions, such that a human who reads (and trusts) the first element of a pair would most likely infer that the other element is also true. Paraphrasing can be seen as bidirectional textual entailment and methods from the two areas are often similar. Both kinds of methods are useful, at least in principle, in a wide range of natural language processing applications, including question answering, summarization, text generation, and machine translation. We summarize key ideas from the two areas by considering in turn recognition, generation, and extraction methods, also pointing to prominent articles and resources.
    Comment: Technical Report, Natural Language Processing Group, Department of Informatics, Athens University of Economics and Business, Greece, 201
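
    The survey's observation that paraphrasing amounts to bidirectional textual entailment can be made concrete with a small sketch. The entails() scorer below is a hypothetical placeholder (simple word overlap) standing in for any of the recognition methods the survey covers; it only illustrates the two-directional check, not a method from the paper.

```python
# Minimal sketch of "paraphrase as bidirectional entailment".
# The entails() scorer is a hypothetical placeholder standing in for any
# RTE method (lexical overlap, logic-based, learned models, ...).

def entails(text: str, hypothesis: str) -> float:
    """Toy entailment score: fraction of hypothesis words covered by the text."""
    t, h = set(text.lower().split()), set(hypothesis.lower().split())
    return len(t & h) / len(h) if h else 0.0

def is_paraphrase(a: str, b: str, threshold: float = 0.8) -> bool:
    """Treat a and b as paraphrases if each entails the other."""
    return entails(a, b) >= threshold and entails(b, a) >= threshold

print(is_paraphrase("the cat sat on the mat", "on the mat the cat sat"))  # True
print(is_paraphrase("the cat sat on the mat", "the dog barked loudly"))   # False
```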

    Video on the semantic web: experiences with media streams

    In this paper, we report our experiences with the use of Semantic Web technology for annotating digital video material. Web technology is used to transform a large, existing video ontology embedded in an annotation tool into a commonly accessible format. The recombination of existing video material is then used as an example application, in which the video metadata enables the retrieval of video footage based on both content descriptions and cinematographic concepts, such as establishing and reaction shots. The paper focuses on the practical issues of porting ontological information to the Semantic Web, the multimedia-specific issues of video annotation, and requirements for Semantic Web query and access patterns. It thereby explicitly aims at providing input to the two new W3C Semantic Web Working Groups (Best Practices and Deployment; Data Access).
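
    As a rough illustration of what porting such annotations to the Semantic Web can look like in practice, the sketch below builds a few RDF triples describing a video shot with the rdflib Python library. The namespace, class names (e.g., EstablishingShot), and file name are invented for the example and are not taken from the Media Streams ontology.

```python
# Illustrative sketch (not the actual Media Streams ontology): annotating a
# video segment with RDF so that footage can later be retrieved by
# cinematographic concepts such as establishing shots. Requires rdflib.
from rdflib import Graph, Namespace, Literal, RDF

EX = Namespace("http://example.org/video#")   # hypothetical namespace

g = Graph()
g.bind("ex", EX)

# Describe one shot: its type, source file, and time boundaries.
g.add((EX.shot42, RDF.type, EX.EstablishingShot))
g.add((EX.shot42, EX.fromVideo, Literal("example_footage.mpg")))
g.add((EX.shot42, EX.startSeconds, Literal(12.0)))
g.add((EX.shot42, EX.endSeconds, Literal(19.5)))

# Serialize to Turtle, a common Semantic Web exchange format.
print(g.serialize(format="turtle"))
```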

    Query Expansion Algorithm with Metadata Support for Ontology Matching

    The Semantic Web, the next generation of the Web, stands out from the traditional Web by incorporating meaning into the information that is accessible to users, in effect forming a Web of Data represented through ontologies. Most data on the traditional Web, however, is stored in relational databases. To make it easier to start working with ontologies, this paper proposes a mechanism that efficiently stores an entire ontology as a database. Moving from the traditional Web to the Semantic Web also requires converting existing data into a form that complies with Semantic Web concepts, so the paper further proposes a mechanism to represent a database as an ontology by determining the relationships between the various database components. The proposed system also integrates knowledge from various databases and existing ontologies into a global ontology that can be used in various contexts.
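
    A simplified sketch of the database-to-ontology direction is shown below: each row of a relational table is exposed as a resource with one property per column. The table, namespace, and mapping rules are illustrative assumptions only; the paper's mechanism also derives relationships between database components, which this toy version omits.

```python
# Simplified sketch of exposing a relational table as ontology-style triples
# (one resource per row, one property per column). The table and URIs are
# hypothetical; inter-table relationships are not handled here.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
conn.executemany("INSERT INTO person VALUES (?, ?, ?)",
                 [(1, "Alice", "Athens"), (2, "Bob", "Chennai")])

BASE = "http://example.org/db/"   # hypothetical namespace

def table_to_triples(conn, table):
    cur = conn.execute(f"SELECT * FROM {table}")
    columns = [c[0] for c in cur.description]
    for row in cur:
        subject = f"{BASE}{table}/{row[0]}"            # row identity from primary key
        yield (subject, "rdf:type", f"{BASE}{table}")  # each row is an instance of the table class
        for col, value in zip(columns[1:], row[1:]):
            yield (subject, f"{BASE}{table}#{col}", value)

for triple in table_to_triples(conn, "person"):
    print(triple)
```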

    An ontology for clinical questions about the contents of patient notes

    Objective: Many studies have addressed question classification in the open domain; however, only limited work focuses on the medical domain and, to the best of our knowledge, most existing medical question classifications were designed for literature-based question answering systems. This paper takes a new direction: designing a question processing and classification model for answering clinical questions posed against electronic patient notes.
    Methods: The work has four main steps. First, a relatively large set of clinical questions was collected from staff in an Intensive Care Unit. A clinical question taxonomy was then designed for question answering purposes. Subsequently, an annotation guideline was created and used to annotate the question set. Finally, a multilayer classification model was built to classify the clinical questions.
    Results: Initial classification experiments showed that general features alone could not yield high performance on this small, multi-class data set. An automatic knowledge discovery and knowledge reuse process was therefore designed to boost performance by extracting and expanding question-specific features. In the evaluation, around 90% accuracy was achieved for answerable-subclass classification and for generic question template classification. The machine learning method performs poorly at identifying unanswerable questions, however, owing to their asymmetric class distribution.
    Conclusions: This paper presents a comprehensive study of clinical questions. Its major outcome is the multilayer classification model, which will serve as a central component of a clinical question answering system over patient records as our studies continue. The question collection can also be reused by the research community to improve the efficiency of other question answering systems.
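
    To make the classification setting more tangible, the sketch below shows a generic multi-class question classifier (TF-IDF features with a linear SVM in scikit-learn). It is not the paper's multilayer model and does not use its clinical taxonomy; the example questions and labels are hypothetical.

```python
# Generic multi-class question classification baseline (TF-IDF + linear SVM),
# shown only to illustrate the kind of classifier involved. The example
# questions and labels below are hypothetical.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC
from sklearn.pipeline import make_pipeline

questions = [
    "what was the last recorded blood pressure",
    "is the patient on any anticoagulants",
    "when was the chest x-ray performed",
    "does the patient have a history of diabetes",
]
labels = ["measurement", "medication", "procedure", "history"]

clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LinearSVC())
clf.fit(questions, labels)

print(clf.predict(["is the patient currently taking insulin"]))
```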

    A Logic-based Approach for Recognizing Textual Entailment Supported by Ontological Background Knowledge

    We present the architecture and the evaluation of a new system for recognizing textual entailment (RTE). In RTE the goal is to automatically identify the type of logical relation between two input texts and, in particular, to prove the existence of an entailment between them. We conceive our system as a modular environment that combines high-coverage syntactic and semantic text analysis with logical inference. For the syntactic and semantic analysis we combine a deep semantic analysis with a shallow one supported by statistical models, in order to increase the quality and accuracy of the results. For RTE we use first-order logical inference, employing model-theoretic techniques and automated reasoning tools. The inference is supported by problem-relevant background knowledge extracted automatically and on demand from external sources such as WordNet, YAGO, and OpenCyc, or from other, more experimental sources, e.g., manually defined presupposition resolutions or axiomatized general and common-sense knowledge. The results show that fine-grained and consistent knowledge from diverse sources is a necessary condition for the correctness and traceability of results.
    Comment: 25 pages, 10 figures
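
    The following sketch, under strong simplifying assumptions, illustrates the core idea of logic-based RTE with background knowledge: facts extracted from the text are closed under background rules (here a toy WordNet-style hypernym axiom) and the hypothesis is checked against the closure. The paper itself uses full first-order inference with model-theoretic techniques, not this propositional toy.

```python
# Minimal sketch of logic-based RTE: forward chaining over facts extracted
# from the text plus background-knowledge rules. Illustrative only.

def forward_chain(facts, rules):
    """Close the fact set under rules of the form (premises, conclusion)."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in facts and all(p in facts for p in premises):
                facts.add(conclusion)
                changed = True
    return facts

# Facts read off the text: "A poodle is sleeping."
text_facts = {("poodle", "x1"), ("sleep", "x1")}

# Background knowledge, e.g. from WordNet: every poodle is a dog.
background = [({("poodle", "x1")}, ("dog", "x1"))]

# Hypothesis: "A dog is sleeping."
hypothesis = {("dog", "x1"), ("sleep", "x1")}

closure = forward_chain(text_facts, background)
print("entailed:", hypothesis <= closure)   # True
```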

    Towards a Universal Wordnet by Learning from Combined Evidence

    Get PDF
    Lexical databases are invaluable sources of knowledge about words and their meanings, with numerous applications in areas like NLP, IR, and AI. We propose a methodology for the automatic construction of a large-scale multilingual lexical database where words of many languages are hierarchically organized in terms of their meanings and their semantic relations to other words. This resource is bootstrapped from WordNet, a well-known English-language resource. Our approach extends WordNet with around 1.5 million meaning links for 800,000 words in over 200 languages, drawing on evidence extracted from a variety of resources including existing (monolingual) wordnets, (mostly bilingual) translation dictionaries, and parallel corpora. Graph-based scoring functions and statistical learning techniques are used to iteratively integrate this information and build an output graph. Experiments show that this wordnet has a high level of precision and coverage, and that it can be useful in applied tasks such as cross-lingual text classification
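
    A toy sketch of the evidence-combination step is given below: candidate word-sense links accumulate trust-weighted evidence from several sources and are then normalized per word. The sources, weights, and data are hypothetical, and the paper's graph-based scoring and iterative statistical learning are considerably richer than this one-pass aggregation.

```python
# Toy sketch of combining evidence for candidate word-sense links: each
# candidate (word, synset) link accumulates trust-weighted evidence from
# several sources, then scores are normalized per word. All data hypothetical.
from collections import defaultdict

# evidence[(word, synset)] = list of (source, weight) supporting that link
evidence = {
    ("perro", "dog.n.01"): [("bilingual_dict", 1.0), ("parallel_corpus", 0.6)],
    ("perro", "hotdog.n.01"): [("parallel_corpus", 0.2)],
    ("hund", "dog.n.01"): [("existing_wordnet", 1.0)],
}

source_trust = {"existing_wordnet": 1.0, "bilingual_dict": 0.8, "parallel_corpus": 0.5}

# Raw score: trust-weighted sum of evidence for each candidate link.
raw = {link: sum(source_trust[s] * w for s, w in ev) for link, ev in evidence.items()}

# Normalize scores per word so competing senses can be compared.
totals = defaultdict(float)
for (word, _), score in raw.items():
    totals[word] += score

scores = {link: score / totals[link[0]] for link, score in raw.items()}
for link, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(link, round(score, 3))
```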