Similarity of Semantic Relations

Author: Morris Jane
Peter D. Turney
Publication date: 01/01/2006

There are at least two kinds of similarity. Relational similarity is correspondence between relations, in contrast with attributional similarity, which is correspondence between attributes. When two words have a high degree of attributional similarity, we call them synonyms. When two pairs of words have a high degree of relational similarity, we say that their relations are analogous. For example, the word pair mason:stone is analogous to the pair carpenter:wood. This paper introduces Latent Relational Analysis (LRA), a method for measuring relational similarity. LRA has potential applications in many areas, including information extraction, word sense disambiguation, and information retrieval. Recently the Vector Space Model (VSM) of information retrieval has been adapted to measuring relational similarity, achieving a score of 47% on a collection of 374 college-level multiple-choice word analogy questions. In the VSM approach, the relation between a pair of words is characterized by a vector of frequencies of predefined patterns in a large corpus. LRA extends the VSM approach in three ways: (1) the patterns are derived automatically from the corpus, (2) the Singular Value Decomposition (SVD) is used to smooth the frequency data, and (3) automatically generated synonyms are used to explore variations of the word pairs. LRA achieves 56% on the 374 analogy questions, statistically equivalent to the average human score of 57%. On the related problem of classifying semantic relations, LRA achieves similar gains over the VSM
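The VSM approach described above can be illustrated with a minimal sketch: each word pair gets a vector of pattern frequencies, and relational similarity is taken as the cosine between two such vectors. The patterns and counts below are invented for illustration and do not come from the paper, and the sketch omits LRA's automatic pattern derivation, SVD smoothing, and synonym variations.

```python
import math

# Hypothetical joining patterns; index i of each vector counts pattern i.
PATTERNS = ["X cuts Y", "X works with Y", "X eats Y"]

# Hypothetical corpus frequencies for each word pair (not real data).
PAIR_VECTORS = {
    ("mason", "stone"): [12, 30, 0],
    ("carpenter", "wood"): [15, 28, 0],
    ("cat", "mouse"): [0, 1, 22],
}


def cosine(u, v):
    """Cosine of the angle between two equal-length frequency vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def best_analogy(stem, choices):
    """Pick the choice pair whose relation vector is closest to the stem's."""
    stem_vec = PAIR_VECTORS[stem]
    return max(choices, key=lambda pair: cosine(stem_vec, PAIR_VECTORS[pair]))
```

Answering a multiple-choice analogy question then reduces to calling `best_analogy` with the stem pair and the candidate pairs.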

arXiv.org e-Print Archive

CiteSeerX

NRC Publications Archive

Crossref

CogPrints Cognitive Sciences Eprint Archive

Human-Level Performance on Word Analogy Questions by Latent Relational Analysis

Author: Turney Peter D.
Publication date: 01/01/2004

This paper introduces Latent Relational Analysis (LRA), a method for measuring relational similarity. LRA has potential applications in many areas, including information extraction, word sense disambiguation, machine translation, and information retrieval. Relational similarity is correspondence between relations, in contrast with attributional similarity, which is correspondence between attributes. When two words have a high degree of attributional similarity, we call them synonyms. When two pairs of words have a high degree of relational similarity, we say that their relations are analogous. For example, the word pair mason/stone is analogous to the pair carpenter/wood; the relations between mason and stone are highly similar to the relations between carpenter and wood. Past work on semantic similarity measures has mainly been concerned with attributional similarity. For instance, Latent Semantic Analysis (LSA) can measure the degree of similarity between two words, but not between two relations. Recently the Vector Space Model (VSM) of information retrieval has been adapted to the task of measuring relational similarity, achieving a score of 47% on a collection of 374 college-level multiple-choice word analogy questions. In the VSM approach, the relation between a pair of words is characterized by a vector of frequencies of predefined patterns in a large corpus. LRA extends the VSM approach in three ways: (1) the patterns are derived automatically from the corpus (they are not predefined), (2) the Singular Value Decomposition (SVD) is used to smooth the frequency data (it is also used this way in LSA), and (3) automatically generated synonyms are used to explore reformulations of the word pairs. LRA achieves 56% on the 374 analogy questions, statistically equivalent to the average human score of 57%. On the related problem of classifying noun-modifier relations, LRA achieves similar gains over the VSM, while using a smaller corpus

arXiv.org e-Print Archive

NRC Publications Archive

Flexible and efficient IR using array databases

Author: Arjen P. de Vries
Marcin Zukowski
Peter Boncz
Roberto Cornacchia
Sándor Héman
Publication venue: Springer Nature
Publication date: 01/01/2007

The Matrix Framework is a recent proposal by IR researchers to flexibly represent all important information retrieval models in a single multi-dimensional array framework. Computational support for exactly this framework is provided by the array database system SRAM (Sparse Relational Array Mapping) that works on top of a DBMS. Information retrieval models can be specified in its comprehension-based array query language, in a way that directly corresponds to the underlying mathematical formulas. SRAM efficiently stores sparse arrays in (compressed) relational tables and translates and optimizes array queries into relational queries. In this work, we describe a number of array query optimization rules and demonstrate their effect on text retrieval in the TREC TeraByte track (TREC-TB) efficiency task, using the Okapi BM25 model as our example. It turns out that these optimization rules enable SRAM to automatically translate the BM25 array queries into the relational equivalent of inverted list processing including compression, score materialization and quantization, such as employed by custom-built IR systems. The use of the high-performance MonetDB/X100 relational backend, that provides transparent database compression, allows the system to achieve very fast response times with good precision and low resource usage
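For readers unfamiliar with the Okapi BM25 model used as the running example, the following is a minimal sketch in plain Python rather than in SRAM's array query language, which the abstract does not show. The function name, the whitespace tokenizer, the parameter defaults `k1=1.2` and `b=0.75`, and the non-negative idf variant are all assumptions made for illustration.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score every document in `docs` against `query` with a BM25 variant."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(doc) for doc in tokenized) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in tokenized for term in set(doc))

    scores = []
    for doc in tokenized:
        tf = Counter(doc)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            # Assumed idf variant that never goes negative.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term-frequency saturation with document-length normalization.
            denom = tf[term] + k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf[term] * (k1 + 1) / denom
        scores.append(score)
    return scores
```

A custom-built IR system would evaluate the same sum over inverted lists instead of scanning every document, which is the processing strategy the abstract says SRAM derives automatically.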

Springer - Publisher Connector

CWI's Institutional Repository

Flexible and efficient IR using array databases

Author: Boncz P.A. (Peter)
Cornacchia R. (Roberto)
Héman S. (Sándor)
Vries A.P. (Arjen) de
Zukowski M. (Marcin)
Publication venue: Springer Fachmedien Wiesbaden GmbH
Publication date: 01/01/2008

The Matrix Framework is a recent proposal by IR researchers to flexibly represent all important information retrieval models in a single multi-dimensional array framework. Computational support for exactly this framework is provided by the array database system SRAM (Sparse Relational Array Mapping) that works on top of a DBMS. Information retrieval models can be specified in its comprehension-based array query language, in a way that directly corresponds to the underlying mathematical formulas. SRAM efficiently stores sparse arrays in (compressed) relational tables and translates and optimizes array queries into relational queries. In this work, we describe a number of array query optimization rules and demonstrate their effect on text retrieval in the TREC TeraByte track (TREC-TB) efficiency task, using the Okapi BM25 model as our example. It turns out that these optimization rules enable SRAM to automatically translate the BM25 array queries into the relational equivalent of inverted list processing including compression, score materialization and quantization, such as employed by custom-built IR systems. The use of the high-performance MonetDB/X100 relational backend, that provides transparent database compression, allows the system to achieve very fast response times with good precision and low resource usage

CWI's Institutional Repository

Review of performance of various Big Databases

Author: Mallika Wadhwa, Er. Amrit Kaur
Publication venue: Auricle Technologies, Pvt., Ltd.
Publication date: 30/06/2017

Relational databases have been the main model for information data storage, retrieval and administration. A relational database is a table-based data system where there is no scalability, insignificant information duplication, computationally costly table joins and trouble in managing complex information. The greatest inspiration of NoSQL is adaptability. NoSQL information stores are broadly used to store and recover potentially a lot of information. In this paper, we assess four most famous NoSQL databases: Cassandra, MongoDB, and CouchDB

International Journal on Recent and Innovation Trends in Computing and Communication

A Relational Model for Environmental and Water Resources Data

Author: Bandaragoda
Blöschl
Bose
Bouganim
David G. Tarboton
David R. Maidment
Goodall
Gray
Helsel
Ilya Zaslavsky
Jeffery S. Horsburgh
Michener
Pokorný
Tomasic
Publication venue: Hosted by Utah State University Libraries
Publication date: 08/05/2008

Environmental observations are fundamental to hydrology and water resources, and the way these data are organized and manipulated either enables or inhibits the analyses that can be performed. The Observations Data Model presented here provides a new and consistent format for the storage and retrieval of point environmental observations in a relational database designed to facilitate integrated analysis of large data sets collected by multiple investigators. Within this data model, observations are stored with sufficient ancillary information (metadata) about the observations to allow them to be unambiguously interpreted and to provide traceable heritage from raw measurements to useable information. The design is based upon a relational database model that exposes each single observation as a record, taking advantage of the capability in relational database systems for querying based upon data values and enabling cross‐dimension data retrieval and analysis. This paper presents the design principles and features of the Observations Data Model and illustrates how it can be used to enhance the organization, publication, and analysis of point observations data while retaining a simple relational format. The contribution of the data model to water resources is that it represents a new, systematic way to organize and share data that overcomes many of the syntactic and semantic differences between heterogeneous data sets, thereby facilitating an integrated understanding of water resources based on more extensive and fully specified information
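The idea of exposing each single observation as a record, with its metadata reachable through joins, can be sketched with an in-memory SQLite database. The table and column names here are simplified and hypothetical; they are not the published Observations Data Model schema, and the sample values are invented.

```python
import sqlite3


def build_demo_db():
    """Create a tiny, hypothetical observations schema with sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE sites (
            site_id INTEGER PRIMARY KEY,
            site_name TEXT NOT NULL
        );
        CREATE TABLE variables (
            variable_id INTEGER PRIMARY KEY,
            variable_name TEXT NOT NULL,
            units TEXT NOT NULL
        );
        -- One row per observation; metadata lives in the referenced tables.
        CREATE TABLE observations (
            value_id INTEGER PRIMARY KEY,
            site_id INTEGER NOT NULL REFERENCES sites(site_id),
            variable_id INTEGER NOT NULL REFERENCES variables(variable_id),
            observed_at TEXT NOT NULL,
            data_value REAL NOT NULL
        );
    """)
    conn.executemany("INSERT INTO sites VALUES (?, ?)",
                     [(1, "Upstream gauge"), (2, "Downstream gauge")])
    conn.executemany("INSERT INTO variables VALUES (?, ?, ?)",
                     [(1, "Water temperature", "degC"), (2, "Discharge", "m3/s")])
    conn.executemany(
        "INSERT INTO observations (site_id, variable_id, observed_at, data_value) "
        "VALUES (?, ?, ?, ?)",
        [(1, 1, "2008-05-01T00:00", 9.5),
         (2, 1, "2008-05-01T00:00", 11.2),
         (1, 2, "2008-05-01T00:00", 3.4)])
    return conn


def values_above(conn, variable_name, threshold):
    """Query by data value across sites, returning values with their units."""
    cursor = conn.execute("""
        SELECT s.site_name, o.observed_at, o.data_value, v.units
        FROM observations AS o
        JOIN sites AS s ON s.site_id = o.site_id
        JOIN variables AS v ON v.variable_id = o.variable_id
        WHERE v.variable_name = ? AND o.data_value > ?
        ORDER BY o.observed_at, s.site_name
    """, (variable_name, threshold))
    return cursor.fetchall()
```

Because every value is its own row, a single query can filter on the data values themselves and still return the site and units needed to interpret them.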

Crossref

DigitalCommons@USU

Photograph indexing and retrieval using star-graphs

Author: Chiaramella Yves
Martinet Jean
Mulhem Philippe
Ounis Iadh
Publication venue: HAL CCSD
Publication date: 01/01/2003

We present in this paper a relational approach for indexing and retrieving photographs from a collection. Instead of using simple keywords as an indexing language, we propose to use star-graphs as document descriptors. A star-graph is a conceptual graph that contains a single relation, with some concepts linked to it. Star-graphs are elementary pieces of information describing combinations of concepts. We use star-graphs as descriptors, or index terms, for image content representation. This allows for relational indexing and expression of complex user needs, in comparison to classical text retrieval, where simple keywords are generally used as document descriptors. We present a document representation model and a weighting scheme for star-graphs inspired by the tf.idf used in text retrieval. We have applied our model to image retrieval, and show the system evaluation results
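To make the idea of star-graphs as index terms concrete, here is a minimal sketch that treats each star-graph as a hashable tuple and applies plain tf.idf to it. This is the textbook tf.idf formula used as a stand-in; the abstract does not give the authors' actual weighting scheme, and the helper names and example photographs are hypothetical.

```python
import math
from collections import Counter


def star(relation, *concepts):
    """A star-graph: one relation with its ordered, linked concepts."""
    return (relation, tuple(concepts))


def tfidf_weights(docs):
    """Weight each star-graph in each document by plain tf.idf.

    `docs` is a list of documents, each a list of star-graphs.
    """
    n_docs = len(docs)
    # Number of documents in which each star-graph occurs.
    df = Counter(sg for doc in docs for sg in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append({
            sg: (count / len(doc)) * math.log(n_docs / df[sg])
            for sg, count in tf.items()
        })
    return weights
```

With two photographs that both contain `star("above", "sky", "sea")`, that descriptor receives weight zero, while a descriptor found in only one photograph, such as `star("on", "boat", "sea")`, receives a positive weight.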

HAL - Lille 3

CiteSeerX

Hal - Université Grenoble Alpes

INRIA a CCSD electronic archive server

StarSpace: Embed All The Things!

Author: Adams Keith
Bordes Antoine
Chopra Sumit
Fisch Adam
Weston Jason
Wu Ledell
Publication date: 20/11/2017

We present StarSpace, a general-purpose neural embedding model that can solve a wide variety of problems: labeling tasks such as text classification, ranking tasks such as information retrieval/web search, collaborative filtering-based or content-based recommendation, embedding of multi-relational graphs, and learning word, sentence or document level embeddings. In each case the model works by embedding those entities comprised of discrete features and comparing them against each other -- learning similarities dependent on the task. Empirical results on a number of tasks show that StarSpace is highly competitive with existing methods, whilst also being generally applicable to new cases where those methods are not
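The "embed entities made of discrete features, then compare them" step can be sketched as follows. The sketch assumes an entity's embedding is the sum of its feature embeddings and that comparison uses cosine similarity; the two-dimensional vectors are hand-picked for illustration, whereas the real model learns them from data, and no training is shown here.

```python
import math

# Hypothetical, hand-picked feature embeddings (a trained model would learn these).
EMBEDDINGS = {
    "goal": [1.0, 0.0],
    "match": [0.9, 0.1],
    "election": [0.0, 1.0],
    "vote": [0.1, 0.9],
    "#sports": [1.0, 0.1],
    "#politics": [0.1, 1.0],
}
DIM = 2


def embed(features):
    """Embed an entity as the sum of its known feature vectors (assumption)."""
    vec = [0.0] * DIM
    for feature in features:
        for i, x in enumerate(EMBEDDINGS.get(feature, [0.0] * DIM)):
            vec[i] += x
    return vec


def cosine(u, v):
    """Cosine similarity, returning 0.0 when either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank_labels(features, labels):
    """Rank candidate labels by similarity to the embedded input entity."""
    query = embed(features)
    return sorted(labels, key=lambda lab: cosine(query, embed([lab])), reverse=True)
```

Because documents, labels, and other entities share one embedding space, the same ranking function could serve classification, retrieval, or recommendation by changing only what is passed as `features` and `labels`.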

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

Artificial intelligence techniques for modeling database user behavior

Author: Graves Sara J.
Tanner Steve

The design and development of the adaptive modeling system is described. This system models how a user accesses a relational database management system in order to improve its performance by discovering use access patterns. In the current system, these patterns are used to improve the user interface and may be used to speed data retrieval, support query optimization and support a more flexible data representation. The system models both syntactic and semantic information about the user's access and employs both procedural and rule-based logic to manipulate the model

NASA Technical Reports Server

A Query Language for Information Graphs

Author: Betrabet Sangita C.
Chen Qi-Fan
Fox Edward A.
Publication date: 01/01/1993

In this paper we propose a database model and query language for information retrieval systems. The information graph model and Graph Object Access Language (GOAL) allow integrated handling of data, information and knowledge along with a variety of specialized objects (e.g., for geographic or multimedia information systems). There is flexible support for hyperbases, thesauri, lexicons, and both relational and object-oriented types of DBMS applications. In this paper we give the first published account of our new, powerful model and language (GOAL), illustrate their use, and compare them with related work

Computer Science Technical Reports @Virginia Tech