10,388 research outputs found
Forecasting the cost of processing multi-join queries via hashing for main-memory databases (Extended version)
Database management systems (DBMSs) carefully optimize complex multi-join
queries to avoid expensive disk I/O. As servers today feature tens or hundreds
of gigabytes of RAM, a significant fraction of many analytic databases becomes
memory-resident. Even after careful tuning for an in-memory environment, a
linear disk I/O model such as the one implemented in PostgreSQL may make query
response time predictions that are up to 2X slower than the optimal multi-join
query plan over memory-resident data. This paper introduces a memory I/O cost
model to identify good evaluation strategies for complex query plans with
multiple hash-based equi-joins over memory-resident data. The proposed cost
model is carefully validated for accuracy using three different systems,
including an Amazon EC2 instance, to control for hardware-specific differences.
Prior work in parallel query evaluation has advocated right-deep and bushy
trees for multi-join queries due to their greater parallelization and
pipelining potential. A surprising finding is that the conventional wisdom from
shared-nothing disk-based systems does not directly apply to the modern
shared-everything memory hierarchy. As corroborated by our model, the
performance gap between the optimal left-deep and right-deep query plan can
grow to about 10X as the number of joins in the query increases.Comment: 15 pages, 8 figures, extended version of the paper to appear in
SoCC'1
Markov Decision Processes with Applications in Wireless Sensor Networks: A Survey
Wireless sensor networks (WSNs) consist of autonomous and resource-limited
devices. The devices cooperate to monitor one or more physical phenomena within
an area of interest. WSNs operate as stochastic systems because of randomness
in the monitored environments. For long service time and low maintenance cost,
WSNs require adaptive and robust methods to address data exchange, topology
formulation, resource and power optimization, sensing coverage and object
detection, and security challenges. In these problems, sensor nodes are to make
optimized decisions from a set of accessible strategies to achieve design
goals. This survey reviews numerous applications of the Markov decision process
(MDP) framework, a powerful decision-making tool to develop adaptive algorithms
and protocols for WSNs. Furthermore, various solution methods are discussed and
compared to serve as a guide for using MDPs in WSNs
Isolating SDN Control Traffic with Layer-2 Slicing in 6TiSCH Industrial IoT Networks
Recent standardization efforts in IEEE 802.15.4-2015 Time Scheduled Channel
Hopping (TSCH) and the IETF 6TiSCH Working Group (WG), aim to provide
deterministic communications and efficient allocation of resources across
constrained Internet of Things (IoT) networks, particularly in Industrial IoT
(IIoT) scenarios. Within 6TiSCH, Software Defined Networking (SDN) has been
identified as means of providing centralized control in a number of key
situations. However, implementing a centralized SDN architecture in a Low Power
and Lossy Network (LLN) faces considerable challenges: not only is controller
traffic subject to jitter due to unreliable links and network contention, but
the overhead generated by SDN can severely affect the performance of other
traffic. This paper proposes using 6TiSCH tracks, a Layer-2 slicing mechanism
for creating dedicated forwarding paths across TSCH networks, in order to
isolate the SDN control overhead. Not only does this prevent control traffic
from affecting the performance of other data flows, but the properties of
6TiSCH tracks allows deterministic, low-latency SDN controller communication.
Using our own lightweight SDN implementation for Contiki OS, we firstly
demonstrate the effect of SDN control traffic on application data flows across
a 6TiSCH network. We then show that by slicing the network through the
allocation of dedicated resources along a SDN control path, tracks provide an
effective means of mitigating the cost of SDN control overhead in IEEE
802.15.4-2015 TSCH networks
Spatial optimization for land use allocation: accounting for sustainability concerns
Land-use allocation has long been an important area of research in regional science. Land-use patterns are fundamental to the functions of the biosphere, creating interactions that have substantial impacts on the environment. The spatial arrangement of land uses therefore has implications for activity and travel within a region. Balancing development, economic growth, social interaction, and the protection of the natural environment is at the heart of long-term sustainability. Since land-use patterns are spatially explicit in nature, planning and management necessarily must integrate geographical information system and spatial optimization in meaningful ways if efficiency goals and objectives are to be achieved. This article reviews spatial optimization approaches that have been relied upon to support land-use planning. Characteristics of sustainable land use, particularly compactness, contiguity, and compatibility, are discussed and how spatial optimization techniques have addressed these characteristics are detailed. In particular, objectives and constraints in spatial optimization approaches are examined
A Framework for Quality-Driven Delivery in Distributed Multimedia Systems
In this paper, we propose a framework for Quality-Driven Delivery (QDD) in distributed multimedia environments. Quality-driven delivery refers to the capacity of a system to deliver documents, or more generally objects, while considering the users expectations in terms of non-functional requirements. For this QDD framework, we propose a model-driven approach where we focus on QoS information modeling and transformation. QoS information models and meta-models are used during different QoS activities for mapping requirements to system constraints, for exchanging QoS information, for checking compatibility between QoS information and more generally for making QoS decisions. We also investigate which model transformation operators have to be implemented in order to support some QoS activities such as QoS mapping
From Cooperative Scans to Predictive Buffer Management
In analytical applications, database systems often need to sustain workloads
with multiple concurrent scans hitting the same table. The Cooperative Scans
(CScans) framework, which introduces an Active Buffer Manager (ABM) component
into the database architecture, has been the most effective and elaborate
response to this problem, and was initially developed in the X100 research
prototype. We now report on the the experiences of integrating Cooperative
Scans into its industrial-strength successor, the Vectorwise database product.
During this implementation we invented a simpler optimization of concurrent
scan buffer management, called Predictive Buffer Management (PBM). PBM is based
on the observation that in a workload with long-running scans, the buffer
manager has quite a bit of information on the workload in the immediate future,
such that an approximation of the ideal OPT algorithm becomes feasible. In the
evaluation on both synthetic benchmarks as well as a TPC-H throughput run we
compare the benefits of naive buffer management (LRU) versus CScans, PBM and
OPT; showing that PBM achieves benefits close to Cooperative Scans, while
incurring much lower architectural impact.Comment: VLDB201
PlinyCompute: A Platform for High-Performance, Distributed, Data-Intensive Tool Development
This paper describes PlinyCompute, a system for development of
high-performance, data-intensive, distributed computing tools and libraries. In
the large, PlinyCompute presents the programmer with a very high-level,
declarative interface, relying on automatic, relational-database style
optimization to figure out how to stage distributed computations. However, in
the small, PlinyCompute presents the capable systems programmer with a
persistent object data model and API (the "PC object model") and associated
memory management system that has been designed from the ground-up for high
performance, distributed, data-intensive computing. This contrasts with most
other Big Data systems, which are constructed on top of the Java Virtual
Machine (JVM), and hence must at least partially cede performance-critical
concerns such as memory management (including layout and de/allocation) and
virtual method/function dispatch to the JVM. This hybrid approach---declarative
in the large, trusting the programmer's ability to utilize PC object model
efficiently in the small---results in a system that is ideal for the
development of reusable, data-intensive tools and libraries. Through extensive
benchmarking, we show that implementing complex objects manipulation and
non-trivial, library-style computations on top of PlinyCompute can result in a
speedup of 2x to more than 50x or more compared to equivalent implementations
on Spark.Comment: 48 pages, including references and Appendi
- …