AsterixDB: A Scalable, Open Source BDMS
AsterixDB is a new, full-function BDMS (Big Data Management System) with a
feature set that distinguishes it from other platforms in today's open source
Big Data ecosystem. Its features make it well-suited to applications like web
data warehousing, social data storage and analysis, and other use cases related
to Big Data. AsterixDB has a flexible NoSQL style data model; a query language
that supports a wide range of queries; a scalable runtime; partitioned,
LSM-based data storage and indexing (including B+-tree, R-tree, and text
indexes); support for external as well as natively stored data; a rich set of
built-in types; support for fuzzy, spatial, and temporal types and queries; a
built-in notion of data feeds for ingestion of data; and transaction support
akin to that of a NoSQL store.
Development of AsterixDB began in 2009 and led to a mid-2013 initial open
source release. This paper is the first complete description of the resulting
open source AsterixDB system. Covered herein are the system's data model, its
query language, and its software architecture. Also included are a summary of
the current status of the project and a first glimpse into how AsterixDB
performs when compared to alternative technologies, including a parallel
relational DBMS, a popular NoSQL store, and a popular Hadoop-based SQL data
analytics platform, for things that both technologies can do. Also included is
a brief description of some initial trials that the system has undergone and
the lessons learned (and plans laid) based on those early "customer"
engagements.
Exploring cooperative game mechanisms of scientific coauthorship networks
Scientific coauthorship, generated by collaborations and competitions among
researchers, reflects effective organizations of human resources. Researchers,
their expected benefits through collaborations, and their cooperative costs
constitute the elements of a game. Hence we propose a cooperative game model to
explore the evolution mechanisms of scientific coauthorship networks. The model
generates geometric hypergraphs, where the costs are modelled by space
distances, and the benefits are expressed by node reputations, i.e. geometric
zones that depend on node position in space and time. Modelled cooperative
strategies conditioned on positive benefit-minus-cost reflect the spatial
reciprocity principle in collaborations, and generate high clustering and
degree assortativity, two typical features of coauthorship networks. Modelled
reputations generate the generalized Poisson parts and fat tails that appear in
specific distributions of empirical data, e.g. paper team size distribution.
The combined effect of modelled costs and reputations reproduces the
transitions that emerge in the degree distribution, in the correlation between degree
and local clustering coefficient, etc. The model provides an example of how
individual strategies induce network complexity, as well as an application of
game theory to social affiliation networks.
Cognitive node selection and assignment algorithms for weighted cooperative sensing in radar systems
Geographica: A Benchmark for Geospatial RDF Stores
Geospatial extensions of SPARQL like GeoSPARQL and stSPARQL have recently
been defined and corresponding geospatial RDF stores have been implemented.
However, there is no widely used benchmark for evaluating geospatial RDF stores
which takes into account recent advances to the state of the art in this area.
In this paper, we develop a benchmark, called Geographica, which uses both
real-world and synthetic data to test the offered functionality and the
performance of some prominent geospatial RDF stores.
Technical support for creating an artificial intelligence system for feature extraction and experimental design
Techniques for classifying objects into groups or classes go under many different names including, most commonly, cluster analysis. Mathematically, the general problem is to find a best mapping of objects into an index set consisting of class identifiers. When an a priori grouping of objects exists, the process of deriving the classification rules from samples of classified objects is known as discrimination. When such rules are applied to objects of unknown class, the process is denoted classification. The specific problem addressed involves the group classification of a set of objects that are each associated with a series of measurements (ratio, interval, ordinal, or nominal levels of measurement). Each measurement produces one variable in a multidimensional variable space. Cluster analysis techniques are reviewed and methods for including geographic location, distance measures, and spatial pattern (distribution) as parameters in clustering are examined. For the case of patterning, measures of spatial autocorrelation are discussed in terms of the kind of data (nominal, ordinal, or interval scaled) to which they may be applied.
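As a rough illustration of what a spatial autocorrelation measure computes, the sketch below implements Moran's I, one widely used statistic for interval-scaled data. The abstract does not say which measures the report covers, so the choice of Moran's I, the function name `morans_i`, and the four-site example with a simple neighbour weight matrix are all assumptions made here for illustration.

```python
def morans_i(values, weights):
    """Moran's I for interval-scaled observations.

    values  -- list of n measurements, one per location
    weights -- n x n list of lists; weights[i][j] > 0 when locations
               i and j are treated as neighbours (assumed layout)
    """
    n = len(values)
    mean = sum(values) / n
    deviations = [v - mean for v in values]

    # Sum of all spatial weights.
    total_weight = sum(sum(row) for row in weights)

    # Weighted cross-products of deviations between neighbouring sites.
    cross = sum(
        weights[i][j] * deviations[i] * deviations[j]
        for i in range(n)
        for j in range(n)
    )

    # Sum of squared deviations (n times the variance).
    squared = sum(d * d for d in deviations)

    return (n / total_weight) * (cross / squared)


# Hypothetical example: four sites along a line, each adjacent to the next.
chain = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
clustered = morans_i([1, 1, 5, 5], chain)    # similar values sit together
alternating = morans_i([1, 5, 1, 5], chain)  # neighbours always differ
```

Positive values indicate that similar measurements tend to be neighbours, and negative values indicate that neighbours tend to differ. Nominal or ordinal data would need a different statistic.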