35,284 research outputs found

    Information Extraction, Data Integration, and Uncertain Data Management: The State of The Art

    Information extraction, data integration, and uncertain data management are distinct research areas that have received considerable attention over the last two decades. Many studies have tackled these areas individually. However, information extraction systems should be integrated with data integration methods to make use of the extracted information. Handling uncertainty in the extraction and integration processes is important for enhancing the quality of the data in such integrated systems. This article presents the state of the art in these research areas, shows their common ground, and describes how to integrate information extraction and data integration under the cover of uncertainty management.

    mARC: Memory by Association and Reinforcement of Contexts

    This paper introduces the memory by Association and Reinforcement of Contexts (mARC). mARC is a novel data modeling technology rooted in the second quantization formulation of quantum mechanics. It is an all-purpose, incremental, and unsupervised data storage and retrieval system that can be applied to all types of signal or data, structured or unstructured, textual or not. mARC can be applied to a wide range of information classification and retrieval problems, such as e-Discovery or contextual navigation. It can also be formulated in the artificial life framework, a.k.a. Conway's "Game of Life" theory. In contrast to Conway's approach, the objects evolve in a massively multidimensional space. To start evaluating the potential of mARC, we have built a mARC-based Internet search engine demonstrator with contextual functionality. We compare the behavior of the mARC demonstrator with Google search in terms of both performance and relevance. In the study, we find that the mARC search engine demonstrator outperforms Google search by an order of magnitude in response time while providing more relevant results for some classes of queries.

    Enhancing hyperspectral image unmixing with spatial correlations

    This paper describes a new algorithm for hyperspectral image unmixing. Most of the unmixing algorithms proposed in the literature do not take into account the possible spatial correlations between the pixels. In this work, a Bayesian model is introduced to exploit these correlations. The image to be unmixed is assumed to be partitioned into regions (or classes) where the statistical properties of the abundance coefficients are homogeneous. A Markov random field is then proposed to model the spatial dependency of the pixels within any class. Conditionally upon a given class, each pixel is modeled by using the classical linear mixing model with additive white Gaussian noise. This strategy is investigated for the well-known linear mixing model. For this model, the posterior distributions of the unknown parameters and hyperparameters allow one to infer the parameters of interest. These parameters include the abundances for each pixel, the means and variances of the abundances for each class, as well as a classification map indicating the classes of all pixels in the image. To overcome the complexity of the posterior distribution of interest, we consider Markov chain Monte Carlo methods that generate samples distributed according to the posterior of interest. The generated samples are then used for parameter and hyperparameter estimation. The accuracy of the proposed algorithms is illustrated on synthetic and real data. Comment: Manuscript accepted for publication in IEEE Trans. Geoscience and Remote Sensing
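The classical linear mixing model named in this abstract can be illustrated with a small forward simulation. The sketch below is only a minimal illustration of that model, not the paper's Bayesian unmixing algorithm: it assumes a pixel spectrum of the form y = M a + n, where the columns of M are endmember spectra, the abundances a are nonnegative and sum to one, and n is white Gaussian noise. The function name `mix_pixel` and the endmember values are hypothetical.

```python
import random


def mix_pixel(endmembers, abundances, noise_std=0.0, rng=None):
    """Return one pixel spectrum y = M a + n under the linear mixing model.

    endmembers: list of R spectra, each a list of L band values (columns of M).
    abundances: list of R nonnegative fractions that sum to one.
    noise_std:  standard deviation of the additive white Gaussian noise.
    """
    if any(a < 0 for a in abundances):
        raise ValueError("abundances must be nonnegative")
    if abs(sum(abundances) - 1.0) > 1e-9:
        raise ValueError("abundances must sum to one")
    rng = rng or random.Random(0)
    n_bands = len(endmembers[0])
    pixel = []
    for band in range(n_bands):
        # Noise-free mixture: abundance-weighted sum of the endmember values.
        clean = sum(a * m[band] for a, m in zip(abundances, endmembers))
        pixel.append(clean + rng.gauss(0.0, noise_std))
    return pixel


# Two made-up endmember spectra over three bands (illustrative values only).
ENDMEMBERS = [[0.1, 0.4, 0.8], [0.9, 0.5, 0.2]]
noiseless = mix_pixel(ENDMEMBERS, [0.25, 0.75])
noisy = mix_pixel(ENDMEMBERS, [0.25, 0.75], noise_std=0.01)
```

Unmixing is the inverse problem: given observed pixels such as `noisy`, estimate the abundances (and, in the abstract's setting, the class labels and class-level hyperparameters).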

    Context and Keyword Extraction in Plain Text Using a Graph Representation

    Document indexing is an essential task performed by archivists or automatic indexing tools. To retrieve documents relevant to a query, the keywords describing each document have to be carefully chosen. Archivists have to identify the right topic of a document before starting to extract the keywords. For an archivist indexing specialized documents, experience plays an important role, but indexing documents on different topics is much harder. This article proposes an innovative method for an indexing support system. This system takes as input an ontology and a plain text document and provides as output contextualized keywords of the document. The method has been evaluated by exploiting Wikipedia's category links as a termino-ontological resource.

    Joint Extraction of Entities and Relations Based on a Novel Tagging Scheme

    Joint extraction of entities and relations is an important task in information extraction. To tackle this problem, we first propose a novel tagging scheme that converts the joint extraction task into a tagging problem. Then, based on our tagging scheme, we study different end-to-end models that extract entities and their relations directly, without identifying entities and relations separately. We conduct experiments on a public dataset produced by a distant supervision method, and the experimental results show that the tagging-based methods outperform most of the existing pipelined and joint learning methods. Moreover, the end-to-end model proposed in this paper achieves the best results on the public dataset.
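The abstract does not spell out the tagging scheme, so the following is only a sketch of how relation triples could be turned into per-token tags. It assumes, as an illustration, that each tag combines a position in the entity (B, I, E, or S for a single-token entity), a relation label, and a role (1 for the first entity, 2 for the second), with "O" for all other tokens. The function name `tag_sentence`, the span format, and the `BORN_IN` label are hypothetical.

```python
def tag_sentence(tokens, triples):
    """Assign one tag per token so that relation triples become a tagging problem.

    triples: list of (head_span, relation_label, tail_span), where each span
    is a (start, end) pair of token indices with end exclusive.
    """
    tags = ["O"] * len(tokens)
    for head_span, relation, tail_span in triples:
        for role, (start, end) in (("1", head_span), ("2", tail_span)):
            length = end - start
            for offset in range(length):
                if length == 1:
                    position = "S"  # single-token entity
                elif offset == 0:
                    position = "B"  # first token of the entity
                elif offset == length - 1:
                    position = "E"  # last token of the entity
                else:
                    position = "I"  # token inside the entity
                tags[start + offset] = f"{position}-{relation}-{role}"
    return tags


TOKENS = ["Marie", "Curie", "was", "born", "in", "Warsaw"]
TAGS = tag_sentence(TOKENS, [((0, 2), "BORN_IN", (5, 6))])
# TAGS == ["B-BORN_IN-1", "E-BORN_IN-1", "O", "O", "O", "S-BORN_IN-2"]
```

With tags of this kind, a single sequence labeling model can predict entities and their relation in one pass, and the triples are recovered by pairing role-1 and role-2 entities that share a relation label.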