Hybrid Approach to English-Hindi Name Entity Transliteration
Machine translation (MT) research in Indian languages is still in its infancy, and little work has been done on the proper transliteration of named entities in this domain. In this paper we address this issue. We use the English-Hindi language pair for our experiments and adopt a hybrid approach: first, a rule-based step extracts individual phonemes from each English word; then, a statistical step converts each English phoneme into its equivalent Hindi phoneme and, in turn, the corresponding Hindi word. Through this approach we attain 83.40% accuracy.
Comment: Proceedings of IEEE Students' Conference on Electrical, Electronics and Computer Sciences 201
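The two-stage pipeline the abstract describes can be sketched as follows. This is an illustrative toy, not the paper's system: the rule list and the probability table are tiny hand-made assumptions, not rules or statistics from the authors' data.

```python
# Hybrid transliteration sketch: a rule-based pass segments an English
# name into phoneme-like units (greedy longest match), then a statistical
# pass maps each unit to its most probable Devanagari equivalent.
RULES = ["sh", "ee", "aa", "r", "a", "m"]  # hypothetical grapheme units

def to_phonemes(word):
    """Greedy longest-match segmentation using the rule list."""
    units, i = [], 0
    while i < len(word):
        for u in sorted(RULES, key=len, reverse=True):
            if word.startswith(u, i):
                units.append(u)
                i += len(u)
                break
        else:                      # unknown character: keep it as a unit
            units.append(word[i])
            i += 1
    return units

# Hand-made stand-in for learned P(hindi_unit | english_unit)
PROB = {
    "r": {"र": 0.9, "ऋ": 0.1},
    "aa": {"ा": 0.8, "आ": 0.2},
    "m": {"म": 1.0},
}

def transliterate(word):
    out = []
    for u in to_phonemes(word.lower()):
        dist = PROB.get(u, {u: 1.0})   # back off to identity if unseen
        out.append(max(dist, key=dist.get))
    return "".join(out)

print(to_phonemes("raam"))    # ['r', 'aa', 'm']
print(transliterate("raam"))  # राम
```

A real system would learn the phoneme-to-Devanagari probabilities from an aligned name-pair corpus rather than hard-coding them.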
Extracting Arabic composite names using genitive principles of Arabic grammar
Named Entity Recognition (NER) is a basic prerequisite for using Natural Language Processing (NLP) in information retrieval. Arabic NER is especially challenging: the language is morphologically rich, its short vowels are usually unwritten, and it has no capitalisation convention. This article presents a novel rule-based approach that uses linguistic, grammar-based techniques to extract Arabic composite names from Arabic text. Our approach uniquely exploits the genitive rules of Arabic grammar, in particular the rules for identifying definite nouns (معرفة) and indefinite nouns (نكرة), to support the extraction of composite names. Based on domain knowledge and Arabic Genitive Rules (AGR), the developed approach formalises a set of syntactic rules and linguistic patterns that first use genitive patterns to classify definiteness within phrases and then extract proper composite names from unstructured text. The approach places no constraint on the length of an Arabic composite name, and our initial experiments demonstrated high recall and precision when the NER algorithm was applied to a financial-domain corpus.
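The core genitive pattern can be illustrated with a minimal sketch. This is not the authors' implementation: the definiteness labels are assumed to come from a prior morphological-analysis step, and here they are simply hand-assigned to a toy token list.

```python
# Sketch of composite-name extraction via a genitive (idafa-style)
# pattern: an indefinite head noun followed by a run of definite nouns,
# e.g. "عبد الرحمن" (Abd al-Rahman). No length limit is imposed on the
# definite run, mirroring the unbounded-length claim in the abstract.
def extract_composite_names(tokens):
    """tokens: list of (word, definiteness) pairs with definiteness in
    {"DEF", "INDEF"}. Returns candidate composite names."""
    names, i = [], 0
    while i < len(tokens):
        word, d = tokens[i]
        if d == "INDEF" and i + 1 < len(tokens) and tokens[i + 1][1] == "DEF":
            j = i + 1
            while j < len(tokens) and tokens[j][1] == "DEF":
                j += 1                 # extend across all definite nouns
            names.append(" ".join(w for w, _ in tokens[i:j]))
            i = j
        else:
            i += 1
    return names

tagged = [("عبد", "INDEF"), ("الرحمن", "DEF"), ("ذهب", "INDEF")]
print(extract_composite_names(tagged))  # ['عبد الرحمن']
```

The full approach layers further syntactic rules and domain knowledge on top of this basic definiteness pattern.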
Adaptive Semantic Annotation of Entity and Concept Mentions in Text
Recent years have seen increasing interest in knowledge repositories that are useful across applications, in contrast to the creation of ad hoc or application-specific databases.
These knowledge repositories figure as a central provider of unambiguous identifiers and semantic relationships between entities. As such, these shared entity descriptions serve as a common vocabulary to exchange and organize information in different formats and for different purposes. Therefore, there has been remarkable interest in systems that are able to automatically tag textual documents with identifiers from shared knowledge repositories so that the content in those documents is described in a vocabulary that is unambiguously understood across applications.
Tagging textual documents according to these knowledge bases is a challenging task. It involves recognizing the entities and concepts mentioned in a particular passage and attempting to resolve any ambiguity of language in order to choose one of many possible meanings for a phrase. There has been substantial work on recognizing and disambiguating entities for specialized applications, or constrained to limited entity types and particular types of text. In the context of shared knowledge bases, since each application has potentially very different needs, systems must have unprecedented breadth and flexibility to ensure their usefulness across applications. Documents may exhibit different language and discourse characteristics, discuss very diverse topics, or require a focus on parts of the knowledge repository that are inherently harder to disambiguate. In practice, for developers looking for a system to support their use case, it is often unclear whether an existing solution is applicable, leading them to trial-and-error and ad hoc usage of multiple systems in an attempt to achieve their objective.
In this dissertation, I propose a conceptual model that unifies related techniques in this space under a common multi-dimensional framework that enables the elucidation of strengths and limitations of each technique, supporting developers in their search for a suitable tool for their needs. Moreover, the model serves as the basis for the development of flexible systems that have the ability to support document tagging for different use cases. I describe such an implementation, DBpedia Spotlight, along with extensions we made to the DBpedia knowledge base to support it. I report evaluations of this tool on several well-known data sets, and demonstrate applications to diverse use cases for further validation.
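The spot-then-disambiguate tagging task the abstract describes can be sketched in a few lines. This is a deliberately simplified illustration, not DBpedia Spotlight's actual algorithm: the knowledge base, surface forms, and descriptions below are all hypothetical.

```python
# Toy disambiguation: given a mention with several candidate entities,
# pick the candidate whose description overlaps most with the mention's
# surrounding context (a crude stand-in for real context models).
from collections import Counter

KB = {  # surface form -> candidate entities with bag-of-words descriptions
    "jaguar": {
        "Jaguar_(animal)": "big cat species native to the americas",
        "Jaguar_Cars": "british luxury car manufacturer vehicle",
    }
}

def disambiguate(mention, context):
    """Score each candidate by word overlap between context and its
    description; return the best-scoring entity identifier."""
    ctx = Counter(context.lower().split())
    best, best_score = None, -1
    for entity, desc in KB[mention].items():
        score = sum(ctx[w] for w in desc.split())
        if score > best_score:
            best, best_score = entity, score
    return best

print(disambiguate("jaguar", "The jaguar is a large cat found in the Americas"))
# -> Jaguar_(animal)
```

Real systems replace the overlap score with statistical models of context, prominence, and topical coherence, but the candidate-then-score structure is the same.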
Cognitive aspects-based short text representation with named entity, concept and knowledge
© 2020 by the authors. Short text is widely seen in applications including the Internet of Things (IoT). The appropriate representation and classification of short text can be severely disrupted by its sparsity and shortness. One important solution is to enrich short text representation with cognitive aspects of text, including semantic concepts, knowledge, and categories. In this paper, we propose an Entity-based Concept Knowledge-Aware (ECKA) representation model which incorporates semantic information into short text representation. ECKA is a multi-level short text semantic representation model that extracts semantic features at the word, entity, concept, and knowledge levels using CNNs. Since the words, entities, concepts, and knowledge entities in the same short text have different cognitive informativeness for short text classification, attention networks are built to capture category-related attentive representations from the multi-level textual features. The final multi-level semantic representation is formed by concatenating all of these level-specific representations, and is used for text classification. Experiments on three tasks demonstrate that our method significantly outperforms the state-of-the-art methods.
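The attend-per-level-then-concatenate structure can be sketched with NumPy. This is a shape-level illustration only, not the authors' code: the level features are random placeholders for CNN outputs, and a single shared attention query stands in for the learned per-category attention networks.

```python
# Multi-level attentive representation sketch: each level's features are
# pooled with softmax attention, and the pooled vectors from the word,
# entity, concept, and knowledge levels are concatenated.
import numpy as np

rng = np.random.default_rng(0)

def attentive_pool(features, query):
    """features: (n, d) per-level features; query: (d,) attention query.
    Returns a (d,) attention-weighted sum of the features."""
    scores = features @ query
    weights = np.exp(scores - scores.max())   # numerically stable softmax
    weights /= weights.sum()
    return weights @ features

d = 8
query = rng.normal(size=d)
levels = {name: rng.normal(size=(5, d))       # 5 items per level, toy data
          for name in ["word", "entity", "concept", "knowledge"]}

representation = np.concatenate(
    [attentive_pool(f, query) for f in levels.values()])
print(representation.shape)  # (32,)
```

The concatenated vector (here 4 levels x 8 dims) is what would feed the final classifier.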
Inter-Personal Relation Extraction Model Based on Bidirectional GRU and Attention Mechanism
Inter-personal relationship extraction is an important part of knowledge extraction and is also foundational for constructing a knowledge graph of relationships between people. Compared with traditional pattern recognition methods, deep learning methods are more prominent in relation extraction (RE) tasks. At present, research on Chinese relation extraction is mainly based on kernel-function methods and Distant Supervision. In this paper, we propose a Chinese relation extraction model based on a bidirectional GRU network and an attention mechanism. To reflect the structural characteristics of the Chinese language, the input is represented as word vectors. To address the problem of context memory, a bidirectional GRU network fuses the input vectors and extracts word-level features from a sentence, and a word-level attention mechanism then aggregates these features into a sentence representation. To verify the feasibility of this method, we use distant supervision to extract data from websites and compare the model with existing relation extraction methods. The experimental results show that the bidirectional GRU model with an attention mechanism can make full use of all the feature information in a sentence, and its accuracy is significantly higher than that of other neural network models without an attention mechanism.
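The BiGRU-plus-attention encoder the abstract describes can be sketched in NumPy. This is a minimal forward-pass illustration under assumed toy dimensions, not the authors' model: weights are random rather than trained, and the classifier on top is omitted.

```python
# BiGRU + word-level attention sketch: run a GRU over the word vectors in
# both directions, concatenate the hidden states, then pool the sequence
# into one sentence vector with softmax attention.
import numpy as np

rng = np.random.default_rng(1)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_layer(X, params):
    """Run a single-direction GRU over X (T, d_in); returns (T, d_h)."""
    Wz, Uz, Wr, Ur, Wh, Uh = params
    h, out = np.zeros(Uz.shape[0]), []
    for x in X:
        z = sigmoid(Wz @ x + Uz @ h)          # update gate
        r = sigmoid(Wr @ x + Ur @ h)          # reset gate
        h_tilde = np.tanh(Wh @ x + Uh @ (r * h))
        h = (1 - z) * h + z * h_tilde
        out.append(h)
    return np.stack(out)

def make_params(d_in, d_h):
    return [rng.normal(scale=0.1, size=s)
            for s in [(d_h, d_in), (d_h, d_h)] * 3]

def bigru_attention(X, fwd, bwd, query):
    """Concatenate forward and (time-aligned) backward GRU states, then
    softmax attention pools the sequence into one sentence vector."""
    H = np.concatenate([gru_layer(X, fwd),
                        gru_layer(X[::-1], bwd)[::-1]], axis=1)  # (T, 2*d_h)
    scores = H @ query
    w = np.exp(scores - scores.max())
    w /= w.sum()
    return w @ H  # (2*d_h,)

T, d_in, d_h = 6, 10, 4   # toy sentence: 6 word vectors of dimension 10
X = rng.normal(size=(T, d_in))
fwd, bwd = make_params(d_in, d_h), make_params(d_in, d_h)
sent = bigru_attention(X, fwd, bwd, rng.normal(size=2 * d_h))
print(sent.shape)  # (8,)
```

In the full model this sentence vector would be fed to a relation classifier and trained end to end on the distantly supervised data.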