    mARC: Memory by Association and Reinforcement of Contexts

    This paper introduces Memory by Association and Reinforcement of Contexts (mARC). mARC is a novel data modeling technology rooted in the second quantization formulation of quantum mechanics. It is an all-purpose, incremental, and unsupervised data storage and retrieval system that can be applied to all types of signal or data, structured or unstructured, textual or not. mARC can be applied to a wide range of information classification and retrieval problems, such as e-Discovery or contextual navigation. It can also be formulated in the artificial life framework, a.k.a. Conway's "Game of Life" theory. In contrast to Conway's approach, the objects evolve in a massively multidimensional space. To start evaluating the potential of mARC, we have built a mARC-based Internet search engine demonstrator with contextual functionality. We compare the behavior of the mARC demonstrator with Google search in terms of both performance and relevance. In the study, we find that the mARC search engine demonstrator outperforms Google search by an order of magnitude in response time while providing more relevant results for some classes of queries.

    Hybrid Approach to English-Hindi Name Entity Transliteration

    Machine translation (MT) research in Indian languages is still in its infancy. Not much work has been done on the proper transliteration of named entities in this domain. In this paper we address this issue. We use the English-Hindi language pair for our experiments and follow a hybrid approach. First, we process English words with a rule-based approach that extracts individual phonemes from the words. We then apply a statistical approach that converts each English phoneme into its equivalent Hindi phoneme and, in turn, into the corresponding Hindi word. Through this approach we have attained 83.40% accuracy. Comment: Proceedings of IEEE Students' Conference on Electrical, Electronics and Computer Sciences 201

    Enhanced sequence labeling based on latent variable conditional random fields

    Natural language processing is a useful technique for processing language data, such as text and speech. Sequence labeling is the upstream task of many natural language processing tasks, such as machine translation, text classification, and sentiment classification. In this paper, the focus is on the sequence labeling task, in which a semantic label is assigned to each unit of a given input sequence. Two frameworks of latent variable conditional random field (CRF) models, called LVCRF-I and LVCRF-II, are proposed, which use the encoding schema as a latent variable to capture the latent structure of the hidden variables and the observed data. Of the two designed models, LVCRF-I works at the sentence level, while LVCRF-II works at the word level, to choose the best encoding schema for a given input sequence automatically without handcrafted features. In the experiments, the two proposed models are verified on four sequence prediction tasks: named entity recognition (NER), chunking, reference parsing, and POS tagging. Without using additional handcrafted features, the proposed frameworks achieve better performance than the conventional CRF model. Moreover, the designed frameworks can be viewed as a substitute for conventional CRF models. In the commonly used LSTM-CRF models, the CRF layer can be replaced with our proposed framework, as they use the same training and inference procedure. The experimental results show that the proposed models exhibit latent variable and provide competitive and robust performance on all three sequence prediction tasks.

    A Machine Learning Based Analytical Framework for Semantic Annotation Requirements

    The Semantic Web is an extension of the current web in which information is given well-defined meaning. The aim of the Semantic Web is to improve the quality and intelligence of the current web by converting its contents into machine-understandable form. Semantic-level information is therefore one of the cornerstones of the Semantic Web. The process of adding semantic metadata to web resources is called Semantic Annotation. There are many obstacles to Semantic Annotation, such as multilinguality, scalability, and issues related to diversity and inconsistency in the content of different web pages. Because of the wide range of domains and the dynamic environments in which Semantic Annotation systems must operate, automating the annotation process is one of the significant challenges in this domain. To overcome this problem, different machine learning approaches have been used, such as supervised learning, unsupervised learning, and more recent ones such as semi-supervised learning and active learning. In this paper we present an inclusive layered classification of Semantic Annotation challenges and discuss the most important issues in this field. We also review and analyze machine learning applications for solving semantic annotation problems. To this end, the article closely studies and categorizes related research, both for better understanding and to reach a framework that can map machine learning techniques onto the Semantic Annotation challenges and requirements.