87,632 research outputs found
An Empirical Comparison of Graph Databases.
In recent years, more and more companies provide services that can no longer be delivered efficiently using relational databases. As such, these companies are forced to use alternative database models such as XML databases, object-oriented databases, document-oriented databases and, more recently, graph databases. Graph databases have existed for only a few years. Although there have been some comparison attempts, they mostly focus on certain aspects only. In this paper, we present a distributed graph database comparison framework and the results we obtained by comparing four important players in the graph database market: Neo4j, OrientDB, Titan and DEX.
Estudio y comparación de bases de datos orientadas a grafos (Study and Comparison of Graph-Oriented Databases)
90 p. A graph is basically a set of points (vertices) in space that are connected by a set of lines (edges). As one of the most general forms of data modeling, a graph makes it easy to represent entities, their attributes and their relationships. Graph-oriented Databases (GODB) are characterized by data structures for the schema and instances that are based on graph data models. These models emerged in the eighties and early nineties, alongside object-oriented models. Their influence gradually faded with the emergence of new database models. Recently, the need to manage information through a graph structure, together with the limitations of traditional databases (in particular the relational model) in meeting the needs of current applications, has led to the development of new technologies and has therefore restored the importance of this area. The main objective of this study is to perform a systematic comparison of graph databases.
This work presents a review of current graph databases and compares them according to well-defined data modeling features. The evaluated features include: data storage, representation of entities and relationships, data operation and manipulation (graph query languages and application programming interfaces), and integrity constraints. Additionally, we present an empirical evaluation based on data loading and query tests. This work makes it possible to understand and compare, from both a theoretical and a practical point of view, the modeling and execution capabilities provided by each graph database.
An introduction to Graph Data Management
A graph database is a database where the data structures for the schema
and/or instances are modeled as a (labeled) (directed) graph or generalizations
of it, and where querying is expressed by graph-oriented operations and type
constructors. In this article we present the basic notions of graph databases,
give a historical overview of their main developments, and study the main current
systems that implement them.
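The definition above can be illustrated with a minimal sketch of a labeled, directed graph and two graph-oriented operations on it. The class `LabeledGraph`, its method names, and the sample data are hypothetical and chosen only for illustration; they are not taken from the article or from any particular graph database.

```python
class LabeledGraph:
    """Toy labeled, directed graph (illustrative only, not a real database API)."""

    def __init__(self):
        self.node_labels = {}   # node id -> node label
        self.out_edges = {}     # source id -> list of (edge label, target id)

    def add_node(self, node_id, label):
        self.node_labels[node_id] = label

    def add_edge(self, source, edge_label, target):
        self.out_edges.setdefault(source, []).append((edge_label, target))

    def neighbors(self, source, edge_label):
        """Graph-oriented operation: follow outgoing edges with a given label."""
        return [t for (lbl, t) in self.out_edges.get(source, []) if lbl == edge_label]

    def reachable(self, start, edge_label):
        """Graph-oriented operation: all nodes reachable via edges with a given label."""
        seen, stack = set(), [start]
        while stack:
            node = stack.pop()
            for nxt in self.neighbors(node, edge_label):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return seen


g = LabeledGraph()
for name in ("alice", "bob", "carol"):
    g.add_node(name, "Person")
g.add_node("acme", "Company")
g.add_edge("alice", "KNOWS", "bob")
g.add_edge("bob", "KNOWS", "carol")
g.add_edge("alice", "WORKS_AT", "acme")

direct = g.neighbors("alice", "KNOWS")
transitive = g.reachable("alice", "KNOWS")
```

Here `neighbors` is a single-step adjacency query and `reachable` a path query, the two kinds of operation that distinguish graph-oriented querying from row-oriented lookups.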
Exploiting citation networks for large-scale author name disambiguation
We present a novel algorithm and validation method for disambiguating author
names in very large bibliographic data sets and apply it to the full Web of
Science (WoS) citation index. Our algorithm relies only upon the author and
citation graphs available for the whole period covered by the WoS. A pair-wise
publication similarity metric, which is based on common co-authors,
self-citations, shared references and citations, is established to perform a
two-step agglomerative clustering that first connects individual papers and
then merges similar clusters. This parameterized model is optimized using an
h-index based recall measure, favoring the correct assignment of well-cited
publications, and a name-initials-based precision using WoS metadata and
cross-referenced Google Scholar profiles. Despite the use of limited metadata,
we reach a recall of 87% and a precision of 88% with a preference for
researchers with high h-index values. 47 million articles of WoS can be
disambiguated on a single machine in less than a day. We develop an h-index
distribution model, confirming that the prediction is in excellent agreement
with the empirical data, and yielding insight into the utility of the h-index
in real academic ranking scenarios. Comment: 14 pages, 5 figures
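The two-step clustering described in this abstract can be sketched roughly as follows. This is not the authors' implementation: the unweighted similarity score, both thresholds, the function names, and the sample records are assumptions made for illustration, and the h-index based optimization of parameters is omitted entirely.

```python
from itertools import combinations


def similarity(p, q):
    """Pairwise score from shared co-authors, shared references, and
    citations between the two papers (equal weights are an assumption)."""
    shared_coauthors = len(p["coauthors"] & q["coauthors"])
    shared_references = len(p["references"] & q["references"])
    self_citations = int(p["id"] in q["references"]) + int(q["id"] in p["references"])
    return shared_coauthors + shared_references + self_citations


def cluster_papers(papers, link_threshold=2, merge_threshold=2):
    # Step 1: connect individual papers whose pairwise similarity is high enough.
    clusters = []
    for paper in papers:
        linked, unlinked = [], []
        for cluster in clusters:
            hit = any(similarity(paper, other) >= link_threshold for other in cluster)
            (linked if hit else unlinked).append(cluster)
        merged = [paper] + [p for cluster in linked for p in cluster]
        clusters = unlinked + [merged]

    # Step 2: merge clusters whose summed cross-cluster similarity is high enough.
    merged_any = True
    while merged_any:
        merged_any = False
        for a, b in combinations(range(len(clusters)), 2):
            total = sum(similarity(p, q) for p in clusters[a] for q in clusters[b])
            if total >= merge_threshold:
                clusters[a] = clusters[a] + clusters[b]
                del clusters[b]
                merged_any = True
                break
    return [sorted(p["id"] for p in cluster) for cluster in clusters]


# Hypothetical records that all carry the same ambiguous author name.
papers = [
    {"id": "A", "coauthors": {"x", "y"}, "references": {"r1"}},
    {"id": "B", "coauthors": {"x", "y"}, "references": {"r2"}},
    {"id": "C", "coauthors": {"x"}, "references": {"r3", "r4"}},
    {"id": "D", "coauthors": {"z"}, "references": {"r2", "r3", "r4"}},
    {"id": "E", "coauthors": {"w"}, "references": {"r9"}},
]
result = cluster_papers(papers)
```

In this toy data, step 1 links A with B and C with D; step 2 then merges those two clusters because several weak cross-links add up, while E stays on its own.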
Security Economics: A Guide for Data Availability and Needs
The rapid and accelerating development of security economics has generated great demand for more and better data to accommodate the empirical research agenda. The present paper serves as a guide to policy makers and researchers for security-related databases. The paper focuses on two main issues. Firstly, it takes stock of the existing databases, highlighting their main components and also performs a brief statistical comparison. Secondly, it discusses data shortages and needs that are considered essential for enhancing our understanding of the complex phenomenon of terrorism as well as designing and evaluating policy.
Reference face graph for face recognition
Face recognition has been studied extensively; however, real-world face recognition remains a challenging task. The demand for unconstrained, practical face recognition is rising with the explosion of online multimedia, such as social networks and video surveillance footage, where face analysis is of significant importance. In this paper, we approach face recognition in the context of graph theory. We recognize an unknown face using an external reference face graph (RFG). An RFG is generated, and a given face is recognized by comparing it to the faces in the constructed RFG. Centrality measures are used to identify distinctive faces in the reference face graph. The proposed RFG-based face recognition algorithm is robust to changes in pose and is also alignment-free. RFG recognition is used in conjunction with DCT locality-sensitive hashing for efficient retrieval to ensure scalability. Experiments are conducted on several publicly available databases, and the results show that the proposed approach outperforms state-of-the-art methods without requiring preprocessing such as face alignment. Owing to the richness of the reference set construction, the proposed method can also handle illumination and expression variation.
Social inertia in collaboration networks
This work is a study of the properties of collaboration networks employing
the formalism of weighted graphs to represent their one-mode projection. The
weight of each edge is the number of times that a partnership has been
repeated. This representation allows us to define the concept of "social
inertia" that measures the tendency of authors to keep on collaborating with
previous partners. We use a collection of empirical datasets to analyze several
aspects of the social inertia: 1) its probability distribution, 2) its
correlation with other properties, and 3) the correlations of the inertia
between neighbors in the network. We also contrast these empirical results with
the predictions of a recently proposed theoretical model for the growth of
collaboration networks. Comment: 7 pages, 5 figures
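As a rough illustration of the weighted one-mode projection described in this abstract, the sketch below counts repeated partnerships as edge weights. The abstract does not state how social inertia is computed; the ratio of a node's strength (sum of its edge weights) to its degree (number of distinct partners) used here is an assumption, and the function names and sample data are hypothetical.

```python
from collections import Counter
from itertools import combinations


def collaboration_weights(papers):
    """Edge weight = number of papers on which two authors appear together."""
    weights = Counter()
    for authors in papers:
        for a, b in combinations(sorted(set(authors)), 2):
            weights[(a, b)] += 1
    return weights


def inertia(weights):
    """Assumed measure: strength divided by degree, per author.
    A value of 1.0 means the author never repeated a partnership."""
    strength, degree = Counter(), Counter()
    for (a, b), w in weights.items():
        for node in (a, b):
            strength[node] += w
            degree[node] += 1
    return {node: strength[node] / degree[node] for node in strength}


# Hypothetical author lists, one per paper.
papers = [
    ["ann", "bob"],
    ["ann", "bob"],
    ["ann", "bob"],
    ["ann", "cat"],
]
weights = collaboration_weights(papers)
scores = inertia(weights)
```

Under this assumed measure, an author who keeps publishing with the same partners scores higher than one who spreads the same number of collaborations over many different partners.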