Search CORE

35,284 research outputs found

Information Extraction, Data Integration, and Uncertain Data Management: The State of The Art

Author: Habib Mena B.
Keulen Maurice van
Publication venue: Centre for Telematics and Information Technology, University of Twente
Publication date: 01/01/2011
Field of study

Information Extraction, data Integration, and uncertain data management are different areas of research that got vast focus in the last two decades. Many researches tackled those areas of research individually. However, information extraction systems should have integrated with data integration methods to make use of the extracted information. Handling uncertainty in extraction and integration process is an important issue to enhance the quality of the data in such integrated systems. This article presents the state of the art of the mentioned areas of research and shows the common grounds and how to integrate information extraction and data integration under uncertainty management cover

Maastricht University Research Portal

University of Twente Research Information

mARC: Memory by Association and Reinforcement of Contexts

Author: Descourt Patrice
Rimoux Norbert
Publication venue
Publication date: 10/12/2013
Field of study

This paper introduces the memory by Association and Reinforcement of Contexts (mARC). mARC is a novel data modeling technology rooted in the second quantization formulation of quantum mechanics. It is an all-purpose incremental and unsupervised data storage and retrieval system which can be applied to all types of signal or data, structured or unstructured, textual or not. mARC can be applied to a wide range of information clas-sification and retrieval problems like e-Discovery or contextual navigation. It can also for-mulated in the artificial life framework a.k.a Conway "Game Of Life" Theory. In contrast to Conway approach, the objects evolve in a massively multidimensional space. In order to start evaluating the potential of mARC we have built a mARC-based Internet search en-gine demonstrator with contextual functionality. We compare the behavior of the mARC demonstrator with Google search both in terms of performance and relevance. In the study we find that the mARC search engine demonstrator outperforms Google search by an order of magnitude in response time while providing more relevant results for some classes of queries

arXiv.org e-Print Archive

CiteSeerX

Enhancing hyperspectral image unmixing with spatial correlations

Author: Dobigeon Nicolas
Eches Olivier
Tourneret Jean-Yves
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 27/08/2010
Field of study

This paper describes a new algorithm for hyperspectral image unmixing. Most of the unmixing algorithms proposed in the literature do not take into account the possible spatial correlations between the pixels. In this work, a Bayesian model is introduced to exploit these correlations. The image to be unmixed is assumed to be partitioned into regions (or classes) where the statistical properties of the abundance coefficients are homogeneous. A Markov random field is then proposed to model the spatial dependency of the pixels within any class. Conditionally upon a given class, each pixel is modeled by using the classical linear mixing model with additive white Gaussian noise. This strategy is investigated the well known linear mixing model. For this model, the posterior distributions of the unknown parameters and hyperparameters allow ones to infer the parameters of interest. These parameters include the abundances for each pixel, the means and variances of the abundances for each class, as well as a classification map indicating the classes of all pixels in the image. To overcome the complexity of the posterior distribution of interest, we consider Markov chain Monte Carlo methods that generate samples distributed according to the posterior of interest. The generated samples are then used for parameter and hyperparameter estimation. The accuracy of the proposed algorithms is illustrated on synthetic and real data.Comment: Manuscript accepted for publication in IEEE Trans. Geoscience and Remote Sensin

arXiv.org e-Print Archive

CiteSeerX

Crossref

Scientific Publications of the University of Toulouse II Le Mirail

Open Archive Toulouse Archive Ouverte

HAL Descartes

Context and Keyword Extraction in Plain Text Using a Graph Representation

Author: Chahine Carlo Abi
Chaignaud Nathalie
Kotowicz Jean-Philippe
Pécuchet Jean-Pierre
Publication venue
Publication date: 30/11/2008
Field of study

Document indexation is an essential task achieved by archivists or automatic indexing tools. To retrieve relevant documents to a query, keywords describing this document have to be carefully chosen. Archivists have to find out the right topic of a document before starting to extract the keywords. For an archivist indexing specialized documents, experience plays an important role. But indexing documents on different topics is much harder. This article proposes an innovative method for an indexing support system. This system takes as input an ontology and a plain text document and provides as output contextualized keywords of the document. The method has been evaluated by exploiting Wikipedia's category links as a termino-ontological resources

arXiv.org e-Print Archive

HAL - Normandie Université

Crossref

Joint Extraction of Entities and Relations Based on a Novel Tagging Scheme

Author: Bao Hongyun
Hao Yuexing
Wang Feng
Xu Bo
Zheng Suncong
Zhou Peng
Publication venue
Publication date: 01/01/2017
Field of study

Joint extraction of entities and relations is an important task in information extraction. To tackle this problem, we firstly propose a novel tagging scheme that can convert the joint extraction task to a tagging problem. Then, based on our tagging scheme, we study different end-to-end models to extract entities and their relations directly, without identifying entities and relations separately. We conduct experiments on a public dataset produced by distant supervision method and the experimental results show that the tagging based methods are better than most of the existing pipelined and joint learning methods. What's more, the end-to-end model proposed in this paper, achieves the best results on the public dataset

arXiv.org e-Print Archive

Crossref