17 research outputs found
OntoAna: Domain Ontology for Human Anatomy
Today, we can find many search engines which provide us with information
which is more operational in nature. None of the search engines provide domain
specific information. This becomes very troublesome to a novice user who wishes
to have information in a particular domain. In this paper, we have developed an
ontology which can be used by a domain specific search engine. We have
developed an ontology on human anatomy, which captures information regarding
cardiovascular system, digestive system, skeleton and nervous system. This
information can be used by people working in medical and health care domain.Comment: Proceedings of 5th CSI National Conference on Education and Research.
Organized by Lingayay University, Faridabad. Sponsored by Computer Society of
India and IEEE Delhi Chapter. Proceedings published by Lingayay University
Pres
Leveraging different meronym discovery methods for bridging resolution in French
International audienceThis paper presents a statistical system for resolving bridging descriptions in French, a language for which current lexical resources have a very low overage. The system is similar to that developed for English by Poesio but it was enriched to integrate meronymic information extracted automatically from both web queries and raw text using syntactic patterns. Through various experiments on the DEDE corpus, we show that although still mediocre the performance of our system compare favorably to those obtained by Poesio for English. In addition, our evaluation indicates that the different meronym extraction methods have a cumulative effect, but that the text pattern-based extraction method is more robust and leads to higher accuracy than the web-based approach
Learning Taxonomic Relations from Heterogeneous Evidence
Cimiano P, Schmidt-Thieme L, Pivk A, Staab S. Learning Taxonomic Relations from Heterogeneous Evidence. In: Proceedings of the ECAI 2004 Ontology Learning and Population Workshop. 2004
Gimme The Context: Context-driven automatic semantic annotation with C-PANKOW
Cimiano P, Ladwig G, Staab S. Gimme The Context: Context-driven automatic semantic annotation with C-PANKOW. In: Ellis A, Hagino T, eds. Proceedings of the 14th international conference on World Wide Web, WWW 2005. ACM Press; 2005: 332-341
Comparing knowledge sources for nominal anaphora resolution
We compare two ways of obtaining lexical knowledge for antecedent selection in other-anaphora
and definite noun phrase coreference. Specifically, we compare an algorithm that relies on links
encoded in the manually created lexical hierarchy WordNet and an algorithm that mines corpora
by means of shallow lexico-semantic patterns. As corpora we use the British National
Corpus (BNC), as well as the Web, which has not been previously used for this task. Our
results show that (a) the knowledge encoded in WordNet is often insufficient, especially for
anaphor-antecedent relations that exploit subjective or context-dependent knowledge; (b) for
other-anaphora, the Web-based method outperforms the WordNet-based method; (c) for definite
NP coreference, the Web-based method yields results comparable to those obtained using
WordNet over the whole dataset and outperforms the WordNet-based method on subsets of the
dataset; (d) in both case studies, the BNC-based method is worse than the other methods because
of data sparseness. Thus, in our studies, the Web-based method alleviated the lexical knowledge
gap often encountered in anaphora resolution, and handled examples with context-dependent relations
between anaphor and antecedent. Because it is inexpensive and needs no hand-modelling
of lexical knowledge, it is a promising knowledge source to integrate in anaphora resolution systems
Anaphora resolution for Arabic machine translation :a case study of nafs
PhD ThesisIn the age of the internet, email, and social media there is an increasing need for processing online information, for example, to support education and business. This has led to the rapid development of natural language processing technologies such as computational linguistics, information retrieval, and data mining. As a branch of computational linguistics, anaphora resolution has attracted much interest. This is reflected in the large number of papers on the topic published in journals such as Computational Linguistics. Mitkov (2002) and Ji et al. (2005) have argued that the overall quality of anaphora resolution systems remains low, despite practical advances in the area, and that major challenges include dealing with real-world knowledge and accurate parsing.
This thesis investigates the following research question: can an algorithm be found for the resolution of the anaphor nafs in Arabic text which is accurate to at least 90%, scales linearly with text size, and requires a minimum of knowledge resources? A resolution algorithm intended to satisfy these criteria is proposed. Testing on a corpus of contemporary Arabic shows that it does indeed satisfy the criteria.Egyptian Government
Learning Ontology Relations by Combining Corpus-Based Techniques and Reasoning on Data from Semantic Web Sources
The manual construction of formal domain conceptualizations (ontologies) is labor-intensive. Ontology learning, by contrast, provides (semi-)automatic ontology generation from input data such as domain text. This thesis proposes a novel approach for learning labels of non-taxonomic ontology relations. It combines corpus-based techniques with reasoning on Semantic Web data. Corpus-based methods apply vector space similarity of verbs co-occurring with labeled and unlabeled relations to calculate relation label suggestions from a set of candidates. A meta ontology in combination with Semantic Web sources such as DBpedia and OpenCyc allows reasoning to improve the suggested labels. An extensive formal evaluation demonstrates the superior accuracy of the presented hybrid approach
Towards Multilingual Coreference Resolution
The current work investigates the problems that occur when coreference resolution is considered as a multilingual task. We assess the issues that arise when a framework using the mention-pair coreference resolution model and memory-based learning for the resolution process are used. Along the way, we revise three essential subtasks of coreference resolution: mention detection, mention head detection and feature selection. For each of these aspects we propose various multilingual solutions including both heuristic, rule-based and machine learning methods. We carry out a detailed analysis that includes eight different languages (Arabic, Catalan, Chinese, Dutch, English, German, Italian and Spanish) for which datasets were provided by the only two multilingual shared tasks on coreference resolution held so far: SemEval-2 and CoNLL-2012. Our investigation shows that, although complex, the coreference resolution task can be targeted in a multilingual and even language independent way. We proposed machine learning methods for each of the subtasks that are affected by the transition, evaluated and compared them to the performance of rule-based and heuristic approaches. Our results confirmed that machine learning provides the needed flexibility for the multilingual task and that the minimal requirement for a language independent system is a part-of-speech annotation layer provided for each of the approached languages. We also showed that the performance of the system can be improved by introducing other layers of linguistic annotations, such as syntactic parses (in the form of either constituency or dependency parses), named entity information, predicate argument structure, etc. Additionally, we discuss the problems occurring in the proposed approaches and suggest possibilities for their improvement