
    Entity-Relationship Search over the Web

    Entity-Relationship (E-R) Search is a complex case of Entity Search where the goal is to search for multiple unknown entities and the relationships connecting them. We assume that an E-R query can be decomposed into a sequence of sub-queries, each containing keywords related to a specific entity or relationship. We adopt a probabilistic formulation of the E-R search problem. When creating specific representations for entities (e.g. context terms) and for pairs of entities (i.e. relationships), it is possible to create a graph of probabilistic dependencies between sub-queries and entity plus relationship representations. To the best of our knowledge, this represents the first probabilistic model of E-R search. We propose and develop a novel supervised Early Fusion-based model for E-R search, the Entity-Relationship Dependence Model (ERDM). It uses a Markov Random Field to model term dependencies of E-R sub-queries and entity/relationship documents. We performed experiments with more than 800M entity and relationship extractions from ClueWeb-09-B with FACC1 entity linking. We obtained promising results using 3 different query collections comprising 469 E-R queries, with results showing that it is possible to perform E-R search without using fixed and pre-defined entity and relationship types, enabling a wide range of queries to be addressed.
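    For orientation, a Markov Random Field retrieval model of the kind ERDM builds on scores a document D against a (sub-)query Q through feature functions defined over cliques of query terms; a sequential-dependence-style instantiation (shown here only as an illustrative sketch, not the authors' exact ERDM formulation) is:

        P(D \mid Q) \stackrel{rank}{=} \sum_{q_i \in Q} \lambda_T f_T(q_i, D) + \sum_{q_i, q_{i+1}} \lambda_O f_O(q_i, q_{i+1}, D) + \sum_{q_i, q_{i+1}} \lambda_U f_U(q_i, q_{i+1}, D)

    where f_T scores single terms, f_O ordered bigrams, and f_U unordered term windows, with learned weights \lambda; ERDM extends this style of model to entity and relationship representations.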

    Incorporating prior knowledge in medical image segmentation: a survey

    Medical image segmentation, the task of partitioning an image into meaningful parts, is an important step toward automating medical image analysis and is at the crux of a variety of medical imaging applications, such as computer-aided diagnosis, therapy planning and delivery, and computer-aided interventions. However, noise, low contrast, and the complexity of objects in medical images are critical obstacles that stand in the way of achieving an ideal segmentation system. Incorporating prior knowledge into image segmentation algorithms has proven useful for obtaining more accurate and plausible results. This paper surveys the different types of prior knowledge that have been utilized in different segmentation frameworks. We focus our survey on optimization-based methods that incorporate prior information into their frameworks. We review and compare these methods in terms of the types of prior employed, the domain of formulation (continuous vs. discrete), and the optimization techniques (global vs. local). We also created an interactive online database of existing works and categorized them based on the type of prior knowledge they use. Our website is interactive so that researchers can contribute to keep the database up to date. We conclude the survey by discussing different aspects of designing an energy functional for image segmentation, open problems, and future perspectives. Comment: Survey paper, 30 pages.
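    As a minimal sketch of the kind of energy functional such surveys discuss (an assumed example, not taken from the paper), a region-based data term can be combined with a generic prior term weighted by \lambda:

        E(S) = \int_{\Omega} (I(x) - c_{in})^2 \, S(x)\, dx + \int_{\Omega} (I(x) - c_{out})^2 \, (1 - S(x))\, dx + \lambda\, E_{prior}(S)

    where I is the image, S the (relaxed) segmentation label map, c_{in} and c_{out} the mean intensities inside and outside the segmented region, and E_{prior} encodes, for example, a shape or smoothness prior; the minimization can be carried out in either a continuous or a discrete setting.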

    Attributes Coupling based Item Enhanced Matrix Factorization Technique for Recommender Systems

    Recommender systems have attracted much attention since they help users alleviate the information overload problem. Matrix factorization is one of the most widely employed collaborative filtering techniques in recommender systems research due to its effectiveness and efficiency in dealing with very large user-item rating matrices. Recently, based on the intuition that additional information provides useful insights for matrix factorization techniques, several recommendation algorithms have utilized additional information to improve the performance of matrix factorization methods. However, most of them focus on dealing with the cold start user problem and ignore the cold start item problem. In addition, there are few suitable similarity measures for these content-enhanced matrix factorization approaches to compute the similarity between categorical items. In this paper, we propose an attributes coupling based item enhanced matrix factorization method that incorporates item attribute information into the matrix factorization technique and adapts the Coupled Object Similarity to capture the relationships between items. Item attribute information is formed as an item relationship regularization term that regularizes the matrix factorization process. Specifically, the similarity between items is measured by the Coupled Object Similarity, which takes the coupling between items into account. Experimental results on two real data sets show that our proposed method outperforms state-of-the-art recommendation algorithms and can effectively cope with the cold start item problem when more item attribute information is available. Comment: 15 pages.
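    To make the regularization concrete, a plausible form of the objective (an illustrative sketch; the exact weighting and neighborhood definition in the paper may differ) adds an attribute-based item relationship term to the usual regularized squared error:

        \min_{U,V} \sum_{(u,i) \in \mathcal{K}} \big( r_{ui} - U_u^{\top} V_i \big)^2 + \lambda \big( \|U\|_F^2 + \|V\|_F^2 \big) + \alpha \sum_{i} \sum_{j \in \mathcal{N}(i)} s_{ij}\, \|V_i - V_j\|^2

    where s_{ij} is the Coupled Object Similarity computed from item attributes and \mathcal{N}(i) is a set of attribute-similar items, so items with similar attributes are pulled toward nearby latent factors even when they have few ratings.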

    A Probabilistic Calculus of Actions

    We present a symbolic machinery that admits both probabilistic and causal information about a given domain and produces probabilistic statements about the effect of actions and the impact of observations. The calculus admits two types of conditioning operators: ordinary Bayes conditioning, P(y|X = x), which represents the observation X = x, and causal conditioning, P(y|do(X = x)), read as the probability of Y = y conditioned on holding X constant (at x) by deliberate action. Given a mixture of such observational and causal sentences, together with the topology of the causal graph, the calculus derives new conditional probabilities of both types, thus enabling one to quantify the effects of actions (and policies) from partially specified knowledge bases, such as Bayesian networks in which some conditional probabilities may not be available. Comment: Appears in Proceedings of the Tenth Conference on Uncertainty in Artificial Intelligence (UAI 1994).
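    A standard identity derivable in this style of calculus, shown here only for illustration, is the back-door adjustment: if a set of variables Z satisfies the back-door criterion relative to (X, Y) in the causal graph, then

        P(y \mid do(X = x)) = \sum_{z} P(y \mid x, z)\, P(z)

    which rewrites a causal query as a purely observational one that can be estimated from the available conditional probabilities.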

    Improving Statistical Multimedia Information Retrieval Model by using Ontology

    A typical IR system that delivers and stores information is affected by the problem of matching between a user query and the available content on the web. The use of an ontology represents the extracted terms in the form of a network graph consisting of nodes, edges, index terms, etc. The above-mentioned IR approaches provide relevance, thus satisfying the user's query. The paper also emphasizes analyzing multimedia documents and performs calculations on the extracted terms using different statistical formulas. The proposed model reduces the semantic gap and satisfies user needs efficiently.
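    The abstract does not specify which statistical formulas are applied to the extracted terms; a common choice in such statistical retrieval models (assumed here purely as an example) is TF-IDF term weighting:

        w_{t,d} = tf_{t,d} \cdot \log \frac{N}{df_t}

    where tf_{t,d} is the frequency of term t in document d, N is the collection size, and df_t is the number of documents containing t.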

    Gray-box optimization and factorized distribution algorithms: where two worlds collide

    The concept of gray-box optimization, in juxtaposition to black-box optimization, revolves around the idea of exploiting the problem structure to implement more efficient evolutionary algorithms (EAs). Work on factorized distribution algorithms (FDAs), whose factorizations are directly derived from the problem structure, has also helped to show how exploiting the problem structure produces important gains in the efficiency of EAs. In this paper we analyze the general question of using problem structure in EAs, focusing on confronting work done in gray-box optimization with related research accomplished in FDAs. This contrasted analysis helps us to identify, in current studies on the use of problem structure in EAs, two distinct analytical characterizations of how these algorithms work. Moreover, we claim that these two characterizations collide and compete when it comes to providing a coherent framework for investigating this type of algorithm. To illustrate this claim, we present a contrasted analysis of formalisms, questions, and results produced in FDAs and gray-box optimization. Common underlying principles in the two approaches, which are usually overlooked, are identified and discussed. In addition, an extensive review of previous research related to different uses of the problem structure in EAs is presented. The paper also elaborates on some of the questions that arise when extending the use of problem structure in EAs, such as evolvability, high cardinality of the variables and large definition sets, constrained and multi-objective problems, etc. Finally, emergent approaches that exploit neural models to capture the problem structure are covered. Comment: 33 pages, 9 tables, 3 figures. This paper covers some of the topics of the talk "When the gray box was opened, model-based evolutionary algorithms were already there" presented in the Model-Based Evolutionary Algorithms workshop on July 20, 2016, in Denver.
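    For a concrete picture of what a factorization derived from the problem structure looks like (an assumed textbook-style example, not taken from the paper), consider an additively decomposable objective with chain structure, f(x) = \sum_{i=1}^{n-1} f_i(x_i, x_{i+1}); an FDA can then sample new candidate solutions from the exact factorization

        p(x) = p(x_1) \prod_{i=2}^{n} p(x_i \mid x_{i-1})

    with marginals estimated from the selected population, instead of treating the objective as an unstructured black box.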

    Semi-Automatic Terminology Ontology Learning Based on Topic Modeling

    Ontologies provide features like a common vocabulary, reusability, and machine-readable content, and they also allow for semantic search, facilitate agent interaction, and support the ordering and structuring of knowledge for Semantic Web (Web 3.0) applications. However, the challenge in ontology engineering is automatic learning, i.e., there is still a lack of fully automatic approaches that form an ontology from a text corpus or a dataset of various topics using machine learning techniques. In this paper, two topic modeling algorithms are explored, namely LSI & SVD and Mr.LDA, for learning a topic ontology. The objective is to determine the statistical relationship between documents and terms to build a topic ontology and ontology graph with minimum human intervention. Experimental analysis of building a topic ontology and semantically retrieving the corresponding topic ontology for a user's query demonstrates the effectiveness of the proposed approach.
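    A minimal LSI-style sketch of the document-term decomposition step, written here with scikit-learn purely for illustration (the corpus, component count, and pipeline are assumptions, not the paper's setup):

        # Minimal LSI sketch: TF-IDF weighting followed by a truncated SVD.
        from sklearn.feature_extraction.text import TfidfVectorizer
        from sklearn.decomposition import TruncatedSVD

        docs = [
            "ontology engineering for the semantic web",
            "topic modeling with latent semantic indexing",
            "machine learning techniques for text corpora",
        ]

        tfidf = TfidfVectorizer(stop_words="english")
        X = tfidf.fit_transform(docs)             # document-term matrix
        lsi = TruncatedSVD(n_components=2, random_state=0)
        doc_topics = lsi.fit_transform(X)         # document-topic weights
        terms = tfidf.get_feature_names_out()

        # Top terms per latent topic: candidate concepts for the topic ontology graph.
        for k, comp in enumerate(lsi.components_):
            top = comp.argsort()[::-1][:3]
            print(f"topic {k}:", [terms[i] for i in top])

    The topic-term associations recovered this way are the statistical relationships from which concept nodes and edges of a topic ontology can be proposed before any human review.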

    SATNet: Bridging deep learning and logical reasoning using a differentiable satisfiability solver

    Integrating logical reasoning within deep learning architectures has been a major goal of modern AI systems. In this paper, we propose a new direction toward this goal by introducing a differentiable (smoothed) maximum satisfiability (MAXSAT) solver that can be integrated into the loop of larger deep learning systems. Our (approximate) solver is based upon a fast coordinate descent approach to solving the semidefinite program (SDP) associated with the MAXSAT problem. We show how to analytically differentiate through the solution to this SDP and efficiently solve the associated backward pass. We demonstrate that by integrating this solver into end-to-end learning systems, we can learn the logical structure of challenging problems in a minimally supervised fashion. In particular, we show that we can learn the parity function using single-bit supervision (a traditionally hard task for deep networks) and learn how to play 9x9 Sudoku solely from examples. We also solve a "visual Sudoku" problem that maps images of Sudoku puzzles to their associated logical solutions by combining our MAXSAT solver with a traditional convolutional architecture. Our approach thus shows promise in integrating logical structures within deep learning. Comment: Accepted at ICML'19. The code can be found at https://github.com/locuslab/satne
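    For intuition on the relaxation, the SDP associated with MAXSAT can be written (roughly, and only as a sketch; notation and scaling in the paper may differ) by associating a unit vector v_i with each Boolean variable plus one distinguished "truth" vector, and solving

        \min_{V} \; \langle S^{\top} S, \; V^{\top} V \rangle \quad \text{s.t.} \quad \|v_i\| = 1 \;\; \forall i

    where the rows of S encode clause membership and signs; rounding the optimized vectors against the truth direction recovers Boolean assignments, and the coordinate descent mentioned in the abstract operates directly on the low-rank factor V.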

    Literature Review Of Attribute Level And Structure Level Data Linkage Techniques

    Data Linkage is an important step that can provide valuable insights for evidence-based decision making, especially for crucial events. Performing sensible queries across heterogeneous databases containing millions of records is a complex task that requires a complete understanding of each contributing database's schema in order to define the structure of its information. The key aim is to approximate the structure and content of the induced data into a concise synopsis in order to extract and link meaningful data-driven facts. We identify four major research issues in Data Linkage: the costs associated with pair-wise matching, record matching overheads, restrictions on the semantic flow of information, and the limitations of single-order classification. In this paper, we give a literature review of research in Data Linkage. The purpose of this review is to establish a basic understanding of Data Linkage and to discuss the background of the Data Linkage research domain. In particular, we focus on the literature related to recent advancements in Approximate Matching algorithms at the Attribute Level and the Structure Level. Their efficiency, functionality and limitations are critically analysed, and open problems are exposed. Comment: 20 pages.
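    As a toy illustration of attribute-level approximate matching (the field names, threshold, and scoring rule below are assumptions made for this sketch, not the techniques surveyed in the paper), two records can be linked when their per-attribute string similarities are high enough on average:

        # Toy attribute-level approximate matching using only the standard library.
        from difflib import SequenceMatcher

        def attribute_similarity(a: str, b: str) -> float:
            """Normalized similarity of two attribute values, in [0, 1]."""
            return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()

        def match_records(rec1: dict, rec2: dict, fields, threshold: float = 0.85) -> bool:
            """Declare a link when the mean per-attribute similarity exceeds the threshold."""
            scores = [attribute_similarity(str(rec1[f]), str(rec2[f])) for f in fields]
            return sum(scores) / len(scores) >= threshold

        r1 = {"name": "Jonathan Smith", "dob": "1984-03-12"}
        r2 = {"name": "Jon Smith",      "dob": "1984-03-12"}
        print(match_records(r1, r2, fields=["name", "dob"]))   # True for this pair

    Structure-level techniques, by contrast, compare how records relate to one another across the schema rather than only their individual attribute values.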

    Parameterized Neural Network Language Models for Information Retrieval

    Information Retrieval (IR) models need to deal with two difficult issues, vocabulary mismatch and term dependencies. Vocabulary mismatch corresponds to the difficulty of retrieving relevant documents that do not contain exact query terms but semantically related terms. Term dependencies refer to the need to consider the relationships between the words of the query when estimating the relevance of a document. A multitude of solutions has been proposed to solve each of these two problems, but no principled model solves both. In parallel, in the last few years, language models based on neural networks have been used to cope with complex natural language processing tasks like emotion and paraphrase detection. Although they present good abilities to cope with both term dependencies and vocabulary mismatch problems, thanks to the distributed representation of words they are based upon, such models cannot be readily used in IR, where the estimation of one language model per document (or query) is required. This is both computationally unfeasible and prone to over-fitting. Based on recent work that proposed to learn a generic language model that can be modified through a set of document-specific parameters, we explore the use of new neural network models that are adapted to ad-hoc IR tasks. Within the language model IR framework, we propose and study the use of a generic language model as well as a document-specific language model. Both can be used as a smoothing component, but the latter is more adapted to the document at hand and has the potential to be used as a full document language model. We experiment with such models and analyze their results on the TREC-1 to TREC-8 datasets.
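    One way to read "smoothing component" in the language model IR framework (assuming Jelinek-Mercer-style interpolation; the exact mixing used in the paper may differ) is to interpolate the maximum-likelihood document model with the neural language model when scoring a query q against a document d:

        P(q \mid d) = \prod_{w \in q} \big[ (1 - \lambda)\, P_{ml}(w \mid d) + \lambda\, P_{NN}(w \mid d) \big]

    where P_{NN} is either the generic or the document-specific neural model, so semantically related terms receive non-zero probability even when they do not occur in d.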