The Case for Learned Index Structures
Indexes are models: a B-Tree-Index can be seen as a model to map a key to the
position of a record within a sorted array, a Hash-Index as a model to map a
key to a position of a record within an unsorted array, and a BitMap-Index as a
model to indicate if a data record exists or not. In this exploratory research
paper, we start from this premise and posit that all existing index structures
can be replaced with other types of models, including deep-learning models,
which we term learned indexes. The key idea is that a model can learn the sort
order or structure of lookup keys and use this signal to effectively predict
the position or existence of records. We theoretically analyze under which
conditions learned indexes outperform traditional index structures and describe
the main challenges in designing learned index structures. Our initial results
show that by using neural nets we are able to outperform cache-optimized
B-Trees by up to 70% in speed while saving an order of magnitude in memory over
several real-world data sets. More importantly though, we believe that the idea
of replacing core components of a data management system through learned models
has far-reaching implications for future system designs and that this work
just provides a glimpse of what might be possible.
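To make the "index as a model" idea concrete, here is a minimal sketch, not the paper's implementation. It assumes a non-empty list of numeric keys and swaps the neural nets mentioned in the abstract for a single least-squares line. The class name LinearLearnedIndex and its methods are hypothetical. The model predicts a position in the sorted array, and the largest error seen at build time bounds a short binary search around that prediction.

```python
import bisect


class LinearLearnedIndex:
    """Toy learned index: a fitted line maps key -> position in a sorted array."""

    def __init__(self, keys):
        # Assumption: keys is a non-empty collection of numbers.
        self.keys = sorted(keys)
        n = len(self.keys)
        mean_key = sum(self.keys) / n
        mean_pos = (n - 1) / 2
        var = sum((k - mean_key) ** 2 for k in self.keys)
        cov = sum((k - mean_key) * (i - mean_pos) for i, k in enumerate(self.keys))
        # Least-squares fit: position ~ slope * key + intercept.
        self.slope = cov / var if var else 0.0
        self.intercept = mean_pos - self.slope * mean_key
        # Worst-case prediction error over the stored keys bounds the search window.
        self.max_err = max(abs(i - self._predict(k)) for i, k in enumerate(self.keys))

    def _predict(self, key):
        guess = int(round(self.slope * key + self.intercept))
        return min(max(guess, 0), len(self.keys) - 1)  # clamp into the array

    def lookup(self, key):
        """Return the position of key in the sorted array, or None if absent."""
        guess = self._predict(key)
        lo = max(0, guess - self.max_err)
        hi = min(len(self.keys), guess + self.max_err + 1)
        i = bisect.bisect_left(self.keys, key, lo, hi)
        if i < len(self.keys) and self.keys[i] == key:
            return i
        return None


index = LinearLearnedIndex([3, 8, 15, 16, 23, 42, 108])
print(index.lookup(23), index.lookup(5))
```

The stored error bound is what keeps lookups correct: every stored key lies within max_err slots of its predicted position, so the bounded binary search always finds it, while the model itself only needs two floats.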
When private set intersection meets big data : an efficient and scalable protocol
Large-scale data processing brings new challenges to the design of privacy-preserving protocols: how to meet the increasing speed and throughput requirements of modern applications, and how to scale up smoothly when the data being protected is big. Efficiency and scalability become critical criteria for privacy-preserving protocols in the age of Big Data. In this paper, we present a new Private Set Intersection (PSI) protocol that is extremely efficient and highly scalable compared with existing protocols. The protocol is based on a novel approach that we call oblivious Bloom intersection. It has linear complexity and relies mostly on efficient symmetric key operations. It is highly scalable because most operations can be parallelized easily. The protocol has two versions, a basic protocol and an enhanced protocol, whose security is analyzed and proved in the semi-honest model and the malicious model respectively. A prototype of the basic protocol has been built. We report the results of a performance evaluation and compare it against the two previously fastest PSI protocols. Our protocol is orders of magnitude faster than these two protocols. To compute the intersection of two million-element sets, our protocol needs only 41 seconds (80-bit security) and 339 seconds (256-bit security) on moderate hardware in parallel mode.
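The following sketch illustrates only the Bloom filter building block, not the oblivious Bloom intersection protocol itself. It is NOT private: there is no oblivious transfer, no garbled Bloom filter, and no security against either party. The names BloomFilter and approximate_intersection, and the choice of SHA-256 for hashing, are assumptions made for illustration. The sketch shows why the approach is linear in the set sizes: one party inserts each element once, and the other tests each of its elements once.

```python
import hashlib


class BloomFilter:
    """Plain (non-private) Bloom filter with k hash positions per item."""

    def __init__(self, num_bits=1024, num_hashes=4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for readability

    def _positions(self, item):
        # Derive k positions by salting SHA-256 with the hash index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = 1

    def __contains__(self, item):
        return all(self.bits[pos] for pos in self._positions(item))


def approximate_intersection(server_set, client_set, num_bits=1024, num_hashes=4):
    """Return client elements that test positive against the server's filter.

    The result always contains the true intersection; it may also contain
    false positives, with a probability set by num_bits and num_hashes.
    """
    bloom = BloomFilter(num_bits, num_hashes)
    for element in server_set:
        bloom.add(element)
    return {element for element in client_set if element in bloom}


print(approximate_intersection({"alice", "bob", "carol"}, {"bob", "carol", "dave"}))
```

Each insert and each membership test touches an independent set of positions, which is consistent with the abstract's remark that most operations parallelize easily.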
Quality Assessment of Linked Datasets using Probabilistic Approximation
With the increasing application of Linked Open Data, assessing the quality of
datasets by computing quality metrics becomes an issue of crucial importance.
For large and evolving datasets, an exact, deterministic computation of the
quality metrics is too time-consuming or expensive. We employ probabilistic
techniques such as Reservoir Sampling, Bloom Filters and Clustering Coefficient
estimation for implementing a broad set of data quality metrics in an
approximate but sufficiently accurate way. Our implementation is integrated in
the comprehensive data quality assessment framework Luzzu. We evaluated its
performance and accuracy on Linked Open Datasets of broad relevance.
Comment: 15 pages, 2 figures, to appear in ESWC 2015 proceedings
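As an illustration of one technique the abstract names, the sketch below uses reservoir sampling (the classic Algorithm R) to estimate a quality metric from a fixed-size uniform sample of a stream. This is not Luzzu's API: the functions reservoir_sample and estimate_fraction, and the idea of expressing a metric as a boolean predicate over items, are assumptions made for the example.

```python
import random


def reservoir_sample(stream, k, rng=None):
    """Return a uniform random sample of up to k items from an iterable."""
    rng = rng or random.Random()
    reservoir = []
    for i, item in enumerate(stream):
        if i < k:
            reservoir.append(item)  # fill the reservoir first
        else:
            j = rng.randint(0, i)   # inclusive on both ends
            if j < k:
                reservoir[j] = item  # keep item with probability k / (i + 1)
    return reservoir


def estimate_fraction(stream, predicate, k, rng=None):
    """Estimate the share of stream items satisfying predicate from a sample."""
    sample = reservoir_sample(stream, k, rng)
    if not sample:
        return 0.0
    return sum(1 for item in sample if predicate(item)) / len(sample)


# Hypothetical metric: share of even numbers in a stream of 100,000 integers.
print(estimate_fraction(range(100_000), lambda x: x % 2 == 0, k=500))
```

The sample size k fixes the memory cost regardless of how large the dataset grows, which is the trade the abstract describes: an approximate answer in exchange for bounded time and space.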
Efficient and Privacy-Preserving Ride Sharing Organization for Transferable and Non-Transferable Services
Ride-sharing allows multiple persons to share their trips together in one
vehicle instead of using multiple vehicles. This can reduce the number of
vehicles in the street, which consequently can reduce air pollution, traffic
congestion and transportation cost. However, a ride-sharing organization
requires passengers to report sensitive location information about their trips
to a trip organizing server (TOS) which creates a serious privacy issue. In
addition, existing ride-sharing schemes are non-flexible, i.e., they require a
driver and a rider to have exactly the same trip to share a ride. Moreover,
they are non-scalable, i.e., inefficient if applied to large geographic areas.
In this paper, we propose two efficient privacy-preserving ride-sharing
organization schemes for Non-transferable Ride-sharing Services (NRS) and
Transferable Ride-sharing Services (TRS). In the NRS scheme, a rider can share
a ride from source to destination with only one driver, whereas in the TRS
scheme, a rider can transfer between multiple drivers while en route until he
reaches his destination. In both schemes, the ride-sharing area is divided into
a number of small geographic areas, called cells, and each cell has a unique
identifier. Each driver/rider should encrypt his trip's data and send an
encrypted ride-sharing offer/request to the TOS. In the NRS scheme, Bloom filters
are used to compactly represent the trip information before encryption. Then,
the TOS can measure the similarity between the encrypted trip data to organize
shared rides without revealing either the users' identities or the location
information. In the TRS scheme, drivers report their encrypted routes, and then the
TOS builds an encrypted directed graph that is passed to a modified version of
Dijkstra's shortest path algorithm to search for an optimal path of rides that
can achieve a set of preferences defined by the riders.
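For orientation, here is a sketch of the standard, unmodified Dijkstra algorithm on a plaintext directed graph of cells. The paper's version runs on an encrypted graph and is modified to honor rider preferences; neither of those aspects is reproduced here. The graph layout (a dict mapping a cell identifier to a list of (next_cell, cost) pairs) and the function name shortest_ride_path are assumptions made for illustration.

```python
import heapq


def shortest_ride_path(graph, source, destination):
    """Return (total_cost, [cells]) for the cheapest path, or None if unreachable.

    graph maps a cell id to a list of (next_cell, cost) pairs with cost >= 0.
    """
    best = {source: 0}
    previous = {}
    heap = [(0, source)]
    while heap:
        cost, cell = heapq.heappop(heap)
        if cell == destination:
            break
        if cost > best.get(cell, float("inf")):
            continue  # stale heap entry; a cheaper route was already found
        for next_cell, step in graph.get(cell, []):
            new_cost = cost + step
            if new_cost < best.get(next_cell, float("inf")):
                best[next_cell] = new_cost
                previous[next_cell] = cell
                heapq.heappush(heap, (new_cost, next_cell))
    if destination not in best:
        return None
    # Walk the predecessor links back from the destination.
    path = [destination]
    while path[-1] != source:
        path.append(previous[path[-1]])
    path.reverse()
    return best[destination], path


# Hypothetical cells: each edge is a ride segment some driver offers.
rides = {"A": [("B", 2), ("C", 5)], "B": [("C", 1)], "C": []}
print(shortest_ride_path(rides, "A", "C"))
```

In the transferable setting, each hop along the returned path may belong to a different driver, so a path through several cells corresponds to a rider transferring between rides.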
Development of the premixing injector in burner system
Alternative fuels are attracting considerable attention, especially renewable energy
sources such as biodiesel. Biodiesel fuel (BDF) has potential for external combustion. BDF is
one of the hydrocarbon fuels. Palm oil biodiesel is free from sulfur and is produced by the
esterification and transesterification of vegetable oil with a low molecular weight
alcohol, such as ethanol or methanol. The objectives of this research are to design an
injector that mixes fuel and water-fuel emulsion with air for an open burner, and to analyze
the spray formation behavior of the fuels (DF and BDF) and the water-fuel emulsion. The premix
injector is used for external combustion, especially in open burner systems. The disadvantages
of BDF are high toxic emissions, such as NOx, CO and particulate matter (PM), and a
possible reduction in the performance of the burner system. High toxic emissions can be
addressed by a new injector concept that mixes fuel-water emulsion with air. Adding
water to the combustion process can reduce NOx emissions, soot, and the flame
temperature. This research focuses on spray angle, penetration, and flame length with
and without secondary air. CPO biodiesel has a longer penetration length and a larger
spray area than diesel, but a smaller spray angle. The difference between the flame
images of pure fuel and of fuel mixed with water is the flame color: fuel mixed with water
has a brighter color and a shorter flame than pure fuel.