How Should Humble Expressions and Expressions of Consideration Peculiar to Japanese Be Translated into English? Considering Problems That Arise in Japanese-English Interpreting Classes
2011 Sixth International Conference on Digital Information Management (ICDIM), Melbourne, Australia, September 26-28, 2011.
In this paper, we propose a novel approach to private query: the IPP (inner product predicate) method. Private query is a query processing protocol for obtaining the requested tuples without exposing any information about what users request to third parties, including service providers. Existing work on private query, such as PIR, which ensures information-theoretic security, is severely restricted because it supports neither range queries nor tuples that share a value in a queried attribute. Our IPP method, on the other hand, focuses mainly on range queries and allows tuples to share a value in any attribute. The IPP method employs a query transformation by trusted clients (QT) scheme and proposes transformation algorithms that make the correlation between plain queries and transformed queries, and between plain attribute values and transformed attribute values, small enough. The transformed queries and attribute values therefore resist frequency analysis attacks: an attacker who knows their plain distribution cannot compute the plain queries and attribute values from the transformed values. To achieve this property, the IPP method adds perturbations to queries and attribute values and applies a matrix-based encryption to them. Experimental evaluations also confirm that the computational cost on servers is O(n) in the number of tuples n, and that there is virtually no correlation between the distributions of transformed queries and queried attribute values and their plain distributions.
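The abstract does not give the transformation algorithms, so the following is only a toy sketch of the general idea behind an inner product predicate with matrix-based encryption and random perturbation. It is not the authors' IPP method and it is not secure. The key matrix `M`, the helper names, and the use of random positive scaling as the "perturbation" are all assumptions made for illustration. The point it shows is the algebraic identity (Mᵀp)·(M⁻¹q) = p·q, which lets a server test a range predicate on transformed values only.

```python
from fractions import Fraction
import random

# Hypothetical secret key held by the trusted client: an invertible 2x2 matrix
# and its inverse (det(M) = 1, so the inverse has integer entries).
M = [[Fraction(2), Fraction(1)], [Fraction(5), Fraction(3)]]
M_INV = [[Fraction(3), Fraction(-1)], [Fraction(-5), Fraction(2)]]


def mat_vec(m, v):
    """Multiply a 2x2 matrix by a length-2 vector."""
    return [sum(m[i][j] * v[j] for j in range(2)) for i in range(2)]


def transpose(m):
    return [[m[j][i] for j in range(2)] for i in range(2)]


def encrypt_value(v, rng):
    """Encode attribute value v as s*(v, 1) with random s > 0, then apply M^T."""
    s = Fraction(rng.randint(1, 100))  # perturbation (assumed form)
    return mat_vec(transpose(M), [s * v, s])


def encrypt_predicate(a, b, rng):
    """Encode the predicate a*v + b > 0 as t*(a, b) with random t > 0, then apply M^-1."""
    t = Fraction(rng.randint(1, 100))  # perturbation (assumed form)
    return mat_vec(M_INV, [t * a, t * b])


def matches(enc_value, enc_predicate):
    """Server-side test: the inner product equals s*t*(a*v + b), so only its sign matters."""
    return sum(x * y for x, y in zip(enc_value, enc_predicate)) > 0


rng = random.Random(0)
stored = [encrypt_value(v, rng) for v in [3, 7, 7, 12]]
q_low = encrypt_predicate(1, -5, rng)    # v > 5
q_high = encrypt_predicate(-1, 10, rng)  # v < 10
hits = [matches(e, q_low) and matches(e, q_high) for e in stored]
print(hits)
```

The server scans all n stored vectors once per query, which is consistent with the O(n) server cost stated in the abstract. Because each value is scaled by a fresh random factor, two tuples with the same plain value will usually receive different transformed vectors in this sketch.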
Contextual Query Using Bell Tests
Tests are essential in Information Retrieval and Data Mining in order to
evaluate the effectiveness of a query. An automatic measurement tool intended to
exhibit the meaning of words in context has been developed and linked with
Quantum Theory, particularly entanglement. "Quantum-like" experiments were
undertaken on a semantic space based on the Hyperspace Analogue to Language (HAL)
method. A quantum HAL model was implemented using state vectors derived from the
HAL matrix and query observables, testing a wide range of window sizes. The
Bell parameter S, associating measures on two words in a document, was derived
showing peaks for specific window sizes. The peaks show maximum quantum
violation of the Bell inequalities and are document dependent. This new
correlation measure inspired by Quantum Theory could be promising for measuring
query relevance.
Comment: 12 pages, 3 figures
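As a rough illustration of the HAL construction mentioned above, the sketch below builds a co-occurrence matrix with a sliding window in which closer preceding words receive larger weights (weight = window - distance + 1), and then normalizes a row into a unit-length vector. The function names are hypothetical, and using a single normalized row as the "state vector" is an assumption. The abstract does not say how the authors derive state vectors or query observables, and the Bell parameter S is not computed here.

```python
import math
from collections import defaultdict


def hal_matrix(tokens, window):
    """Return hal[word][previous_word] = summed distance-weighted co-occurrence."""
    hal = defaultdict(lambda: defaultdict(int))
    for i, word in enumerate(tokens):
        for distance in range(1, window + 1):
            j = i - distance
            if j < 0:
                break
            # Words closer to `word` contribute more.
            hal[word][tokens[j]] += window - distance + 1
    return hal


def state_vector(hal, word, vocabulary):
    """Normalize the HAL row of `word` to unit length (assumed state-vector form)."""
    row = [hal[word].get(v, 0) for v in vocabulary]
    norm = math.sqrt(sum(x * x for x in row))
    return [x / norm for x in row] if norm else row


tokens = "the horse raced past the barn".split()
hal = hal_matrix(tokens, window=3)
vocabulary = sorted(set(tokens))
print(dict(hal["raced"]))
print(state_vector(hal, "raced", vocabulary))
```

Rerunning `hal_matrix` with different `window` values changes the weights, which is the parameter the abstract reports sweeping.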
A Backend Framework for the Efficient Management of Power System Measurements
Increased adoption and deployment of phasor measurement units (PMU) has
provided valuable fine-grained data over the grid. Analysis over these data can
provide insight into the health of the grid, thereby improving control over
operations. Realizing this data-driven control, however, requires validating,
processing and storing massive amounts of PMU data. This paper describes a PMU
data management system that supports input from multiple PMU data streams,
features an event-detection algorithm, and provides an efficient method for
retrieving archival data. The event-detection algorithm rapidly correlates
multiple PMU data streams, providing details on events occurring within the
power system. The event-detection algorithm feeds into a visualization
component, allowing operators to recognize events as they occur. The indexing
and data retrieval mechanism facilitates fast access to archived PMU data.
Using this method, we achieved over 30x speedup for queries with high
selectivity. With these two components, we have built a
system that allows efficient analysis of multiple time-aligned PMU data
streams.
Comment: Published in Electric Power Systems Research (2016), not available yet
Database Learning: Toward a Database that Becomes Smarter Every Time
In today's databases, previous query answers rarely benefit answering future
queries. For the first time, to the best of our knowledge, we change this
paradigm in an approximate query processing (AQP) context. We make the
following observation: the answer to each query reveals some degree of
knowledge about the answer to another query because their answers stem from the
same underlying distribution that has produced the entire dataset. Exploiting
and refining this knowledge should allow us to answer queries more
analytically, rather than by reading enormous amounts of raw data. Also,
processing more queries should continuously enhance our knowledge of the
underlying distribution, and hence lead to increasingly faster response times
for future queries.
We call this novel idea---learning from past query answers---Database
Learning. We exploit the principle of maximum entropy to produce answers, which
are in expectation guaranteed to be more accurate than existing sample-based
approximations. Empowered by this idea, we build a query engine on top of Spark
SQL, called Verdict. We conduct extensive experiments on real-world query
traces from a large customer of a major database vendor. Our results
demonstrate that Verdict supports 73.7% of these queries, speeding them up by
up to 23.0x for the same accuracy level compared to existing AQP systems.
Comment: This manuscript is an extended report of the work published in ACM
SIGMOD conference 201
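The intuition that a past answer can sharpen a new approximate answer can be sketched in its simplest form: two independent, sample-based estimates of the same aggregate combined by inverse-variance weighting. This is not Verdict's inference procedure, which according to the abstract relies on the principle of maximum entropy and relates answers of different queries. The function names and the synthetic data are assumptions for illustration only.

```python
import random
import statistics


def sample_estimate(population, sample_size, rng):
    """Sample-based AQP estimate of the mean and the variance of that estimate."""
    sample = rng.sample(population, sample_size)
    mean = statistics.mean(sample)
    variance_of_mean = statistics.variance(sample) / sample_size
    return mean, variance_of_mean


def combine(prior_mean, prior_var, new_mean, new_var):
    """Inverse-variance weighting of two independent estimates of one quantity."""
    w_prior, w_new = 1.0 / prior_var, 1.0 / new_var
    mean = (w_prior * prior_mean + w_new * new_mean) / (w_prior + w_new)
    variance = 1.0 / (w_prior + w_new)
    return mean, variance


rng = random.Random(42)
population = [rng.gauss(100.0, 15.0) for _ in range(100_000)]  # synthetic data
past = sample_estimate(population, 500, rng)   # answer kept from an earlier query
fresh = sample_estimate(population, 500, rng)  # answer to the current query
combined = combine(*past, *fresh)
print(past, fresh, combined)
```

The combined variance is always smaller than either input variance, which mirrors, in a much reduced setting, the abstract's claim that answers improve as more queries are processed.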
Fast Algorithms and Efficient Statistics: N-point Correlation Functions
We present here a new algorithm for the fast computation of N-point
correlation functions in large astronomical data sets. The algorithm is based
on kd-trees decorated with cached sufficient statistics, allowing
for orders-of-magnitude speed-ups over the naive non-tree-based implementation
of correlation functions. We further discuss the use of controlled
approximations within the computation, which allow for further acceleration. In
summary, our algorithm now makes it possible to compute exact, all-pairs
measurements of the 2-, 3- and 4-point correlation functions for cosmological
data sets like the Sloan Digital Sky Survey (SDSS; York et al. 2000) and the
next generation of Cosmic Microwave Background experiments (see Szapudi et al.
2000).
Comment: To appear in Proceedings of MPA/MPE/ESO Conference "Mining the Sky",
July 31 - August 4, 2000, Garching, Germany
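To make the role of cached statistics concrete, here is a minimal sketch of exact pair counting within a radius, the basic ingredient of a 2-point correlation estimate. Each kd-tree node caches its point count and bounding box, so a node lying entirely inside the search radius contributes its count without visiting its points, and a node lying entirely outside is skipped. This is a simplified single-tree variant written for illustration, with hypothetical class and function names. It is not the authors' algorithm, and it does not cover 3- or 4-point functions.

```python
import random


class Node:
    """kd-tree node caching a point count and an axis-aligned bounding box."""

    def __init__(self, points, leaf_size=8, depth=0):
        dims = len(points[0])
        self.count = len(points)  # cached sufficient statistic
        self.lo = [min(p[d] for p in points) for d in range(dims)]
        self.hi = [max(p[d] for p in points) for d in range(dims)]
        self.points, self.children = None, ()
        if len(points) <= leaf_size:
            self.points = points
        else:
            axis = depth % dims
            ordered = sorted(points, key=lambda p: p[axis])
            mid = len(ordered) // 2
            self.children = (Node(ordered[:mid], leaf_size, depth + 1),
                             Node(ordered[mid:], leaf_size, depth + 1))


def count_within(node, q, r2):
    """Count points in `node` whose squared distance to q is at most r2."""
    dmin = dmax = 0.0
    for x, lo, hi in zip(q, node.lo, node.hi):
        nearest = min(max(x, lo), hi)
        farthest = lo if x - lo > hi - x else hi
        dmin += (x - nearest) ** 2
        dmax += (x - farthest) ** 2
    if dmin > r2:      # box entirely outside the radius: prune
        return 0
    if dmax <= r2:     # box entirely inside: use the cached count
        return node.count
    if node.points is not None:
        return sum(1 for p in node.points
                   if sum((a - b) ** 2 for a, b in zip(p, q)) <= r2)
    return sum(count_within(child, q, r2) for child in node.children)


def pair_count(points, r):
    """Number of unordered pairs of distinct points separated by at most r."""
    tree = Node(points)
    total = sum(count_within(tree, p, r * r) for p in points)
    return (total - len(points)) // 2  # drop self-matches, halve ordered pairs


rng = random.Random(1)
pts = [(rng.randint(0, 50), rng.randint(0, 50)) for _ in range(200)]
print(pair_count(pts, 10))
```

The result can be checked against a brute-force double loop over all pairs, which is the naive non-tree-based baseline the abstract refers to.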
Knowledge-infused and Consistent Complex Event Processing over Real-time and Persistent Streams
Emerging applications in Internet of Things (IoT) and Cyber-Physical Systems
(CPS) present novel challenges to Big Data platforms for performing online
analytics. Ubiquitous sensors from IoT deployments are able to generate data
streams at high velocity that include information from a variety of domains,
and accumulate to large volumes on disk. Complex Event Processing (CEP) is
recognized as an important real-time computing paradigm for analyzing
continuous data streams. However, existing work on CEP is largely limited to
relational query processing, exposing two distinctive gaps for query
specification and execution: (1) infusing the relational query model with
higher level knowledge semantics, and (2) seamless query evaluation across
temporal spaces that span past, present and future events. Closing these gaps would allow
accessible analytics over data streams having properties from different
disciplines, and help span the velocity (real-time) and volume (persistent)
dimensions. In this article, we introduce a Knowledge-infused CEP (X-CEP)
framework that provides domain-aware knowledge query constructs along with
temporal operators that allow end-to-end queries to span across real-time and
persistent streams. We translate this query model to efficient query execution
over online and offline data streams, proposing several optimizations to
mitigate the overheads introduced by evaluating semantic predicates and in
accessing high-volume historic data streams. The proposed X-CEP query model and
execution approaches are implemented in our prototype semantic CEP engine,
SCEPter. We validate our query model using domain-aware CEP queries from a
real-world Smart Power Grid application, and experimentally analyze the
benefits of our optimizations for executing these queries, using event streams
from a campus-microgrid IoT deployment.
Comment: 34 pages, 16 figures, accepted in Future Generation Computer Systems,
October 27, 201