
293 research outputs found

Principles of Dataset Versioning: Exploring the Recreation/Storage Tradeoff

Author: Bhattacherjee Souvik; Chavan Amit; Deshpande Amol; Huang Silu; Parameswaran Aditya
Publication date: 19/05/2015

The relative ease of collaborative data science and analysis has led to a proliferation of many thousands or millions of versions of the same datasets in many scientific and commercial domains, acquired or constructed at various stages of data analysis across many users, and often over long periods of time. Managing, storing, and recreating these dataset versions is a non-trivial task. The fundamental challenge here is the storage-recreation trade-off: the more storage we use, the faster it is to recreate or retrieve versions, while the less storage we use, the slower it is to recreate or retrieve versions. Despite the fundamental nature of this problem, there has been surprisingly little work on it. In this paper, we study this trade-off in a principled manner: we formulate six problems under various settings, trading off these quantities in various ways, demonstrate that most of the problems are intractable, and propose a suite of inexpensive heuristics, drawing from techniques in the delay-constrained scheduling and spanning tree literature, to solve these problems. We have built a prototype version management system that aims to serve as a foundation for our DATAHUB system for facilitating collaborative data science. We demonstrate, via extensive experiments, that our proposed heuristics provide efficient solutions in practical dataset versioning scenarios.
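The trade-off described in this abstract can be made concrete with a toy version graph. The sketch below is only an illustration under stated assumptions, not the paper's heuristics: deltas are treated as symmetric, node 0 is a hypothetical dummy root (an edge from it means "store this version in full"), all costs are made up, and the names `EDGES`, `build_tree` and `costs` are invented for this example. It compares two textbook extremes: a minimum spanning tree on storage cost and a shortest-path tree on recreation cost.

```python
import heapq

# Hypothetical instance. Edge (0, v): store version v in full.
# Edge (u, v): store a delta between versions u and v.
# Values are (storage_cost, recreation_cost); all numbers are invented.
EDGES = {
    (0, 1): (100, 100),
    (0, 2): (105, 105),
    (0, 3): (110, 110),
    (1, 2): (10, 20),
    (2, 3): (12, 25),
    (1, 3): (30, 40),
}

def build_tree(edges, root, mode):
    """Return {version: parent}. mode='mst' runs Prim on storage cost,
    mode='spt' runs Dijkstra on recreation cost."""
    adj = {}
    for (u, v), (storage, recreation) in edges.items():
        adj.setdefault(u, []).append((v, storage, recreation))
        adj.setdefault(v, []).append((u, storage, recreation))
    parent, visited = {}, set()
    heap = [(0, root, None)]
    while heap:
        prio, node, par = heapq.heappop(heap)
        if node in visited:
            continue
        visited.add(node)
        if par is not None:
            parent[node] = par
        for nxt, storage, recreation in adj.get(node, []):
            if nxt not in visited:
                # Prim ranks by the edge's storage cost alone; Dijkstra ranks
                # by the accumulated recreation cost from the root.
                key = storage if mode == "mst" else prio + recreation
                heapq.heappush(heap, (key, nxt, node))
    return parent

def costs(edges, parent, root):
    """Return (total storage, sum of per-version recreation costs)."""
    def edge(u, v):
        return edges[(u, v)] if (u, v) in edges else edges[(v, u)]

    def recreation(v):
        total = 0
        while v != root:
            total += edge(parent[v], v)[1]
            v = parent[v]
        return total

    total_storage = sum(edge(p, v)[0] for v, p in parent.items())
    return total_storage, sum(recreation(v) for v in parent)

print(costs(EDGES, build_tree(EDGES, 0, "mst"), 0))  # (122, 365)
print(costs(EDGES, build_tree(EDGES, 0, "spt"), 0))  # (315, 315)
```

On this instance the storage-minimal tree chains the versions through deltas (little storage, slow recreation), while the recreation-minimal tree stores every version in full (fast recreation, much more storage).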

arXiv.org e-Print Archive

Crossref

eScholarship - University of California

The Traveling Salesman Problem Under Squared Euclidean Distances

Author: de Berg Mark; Sitters René; van Nijnatten Fred; Woeginger Gerhard J.; Wolff Alexander
Publication date: 01/01/2010

Let $P$ be a set of points in $\mathbb{R}^d$, and let $\alpha \ge 1$ be a real number. We define the distance between two points $p,q\in P$ as $|pq|^{\alpha}$, where $|pq|$ denotes the standard Euclidean distance between $p$ and $q$. We denote the traveling salesman problem under this distance function by TSP($d,\alpha$). We design a 5-approximation algorithm for TSP(2,2) and generalize this result to obtain an approximation factor of $3^{\alpha-1}+\sqrt{6}^{\alpha}/3$ for $d=2$ and all $\alpha\ge2$. We also study the variant Rev-TSP of the problem where the traveling salesman is allowed to revisit points. We present a polynomial-time approximation scheme for Rev-TSP($2,\alpha$) with $\alpha\ge2$, and we show that Rev-TSP($d,\alpha$) is APX-hard if $d\ge3$ and $\alpha>1$. The APX-hardness proof carries over to TSP($d,\alpha$) for the same parameter ranges. Comment: 12 pages, 4 figures. (v2) Minor linguistic change
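One reason the powered distance needs separate treatment is that it no longer satisfies the triangle inequality once the exponent exceeds 1, which is also why allowing revisits (Rev-TSP) can change the optimum. The following minimal sketch checks this on three collinear points; the helper name `dist_alpha` and the chosen points are assumptions made for illustration only.

```python
import math

def dist_alpha(p, q, alpha):
    """Euclidean distance between p and q raised to the power alpha."""
    return math.dist(p, q) ** alpha

# Three collinear points, chosen only for illustration.
a, b, c = (0.0, 0.0), (1.0, 0.0), (2.0, 0.0)
alpha = 2

direct = dist_alpha(a, c, alpha)                           # 2^2 = 4.0
detour = dist_alpha(a, b, alpha) + dist_alpha(b, c, alpha)  # 1 + 1 = 2.0

# The detour through b is cheaper than the direct hop, so the triangle
# inequality fails for alpha = 2.
print(direct > detour)  # True

# A relaxed inequality with factor 2^(alpha - 1) still holds, by convexity
# of x -> x^alpha for alpha >= 1.
print(direct <= 2 ** (alpha - 1) * detour)  # True
```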

arXiv.org e-Print Archive

Dagstuhl Research Online Publication Server

Clustering and Hybrid Routing in Mobile Ad Hoc Networks

Author: Wang Lan
Publication venue: ODU Digital Commons
Publication date: 01/04/2005

This dissertation focuses on clustering and hybrid routing in Mobile Ad Hoc Networks (MANET). Specifically, we study two different network-layer virtual infrastructures proposed for MANET: the explicit cluster infrastructure and the implicit zone infrastructure. In the first part of the dissertation, we propose a novel clustering scheme based on a number of properties of diameter-2 graphs to provide a general-purpose virtual infrastructure for MANET. Compared to virtual infrastructures with central nodes, our virtual infrastructure is more symmetric and stable, but still light-weight. In our clustering scheme, cluster initialization naturally blends into cluster maintenance, showing the unity between these two operations. We call our algorithm tree-based since cluster merge and split operations are performed based on a spanning tree maintained at some specific nodes. Extensive simulation results have shown the effectiveness of our clustering scheme when compared to other schemes proposed in the literature. In the second part of the dissertation, we propose TZRP (Two-Zone Routing Protocol) as a hybrid routing framework that can balance the tradeoffs between pure proactive, fuzzy proactive, and reactive routing approaches more effectively in a wide range of network conditions. In TZRP, each node maintains two zones: a Crisp Zone for proactive routing and efficient bordercasting, and a Fuzzy Zone for heuristic routing using imprecise locality information. The perimeter of the Crisp Zone is the boundary between pure proactive routing and fuzzy proactive routing, and the perimeter of the Fuzzy Zone is the boundary between proactive routing and reactive routing. By adjusting the sizes of these two zones, a reduced total routing control overhead can be achieved

Old Dominion University

A distributed topology control technique for low interference and energy efficiency in wireless sensor networks

Author: Chiwewe Tapiwa Moses
Publication venue: 'University of Pretoria - Department of Philosophy'
Publication date: 24/02/2011

Wireless sensor networks are used in several multi-disciplinary areas covering a wide variety of applications. They provide distributed computing, sensing and communication in a powerful integration of capabilities. They have great long-term economic potential and have the ability to transform our lives. At the same time, however, they pose several challenges, mostly as a result of their random deployment and non-renewable energy sources. Among the most important issues in wireless sensor networks are energy efficiency and radio interference. Topology control plays an important role in the design of wireless ad hoc and sensor networks; it is capable of constructing networks that have desirable characteristics such as sparser connectivity, lower transmission power and a smaller node degree. In this research a distributed topology control technique is presented that enhances energy efficiency and reduces radio interference in wireless sensor networks. Each node in the network makes local decisions about its transmission power, and the culmination of these local decisions produces a network topology that preserves global connectivity. The topology that is produced consists of a planar graph that is a power spanner; it has lower node degrees and can be constructed using local information. The network lifetime is increased by reducing transmission power, and the use of low node degrees reduces traffic interference. The approach to topology control that is presented in this document has an advantage over previously developed approaches in that it focuses not only on reducing either energy consumption or radio interference, but on reducing both of these obstacles. Results are presented of simulations that demonstrate improvements in performance. Dissertation (MEng)--University of Pretoria, 2010. Electrical, Electronic and Computer Engineering.

UPSpace at the University of Pretoria

Fixed Orientation Interconnection Problems: Theory, Algorithms and Applications

Author: Zachariasen Martin
Publication venue: Museum Tusculanum
Publication date: 01/01/2009

Copenhagen University Research Information System


Greedy Spanners in Euclidean Spaces Admit Sublinear Separators

Author: Le Hung
Than Cuong
Publication venue: ScholarWorks@UMass Amherst
Publication date: 01/01/2023

The greedy spanner in low dimensional Euclidean space is a fundamental geometric construction that has been extensively studied over three decades as it possesses the two most basic properties of a good spanner: constant maximum degree and constant lightness
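For readers unfamiliar with the construction, the greedy spanner is commonly described as follows: scan all point pairs in order of increasing distance and add an edge only when the graph built so far does not already connect the pair within a stretch factor t. The sketch below is a brute-force illustration of that description, not the algorithm analysed in this work; the names `greedy_spanner`, `spanner_distance` and `POINTS`, and the choice t = 1.5, are assumptions made for this example.

```python
import heapq
import itertools
import math

def spanner_distance(adj, source, target):
    """Shortest-path length from source to target in the current graph."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == target:
            return d
        if d > dist.get(u, math.inf):
            continue
        for v, w in adj[u]:
            nd = d + w
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return math.inf

def greedy_spanner(points, t):
    """Return (edges, adjacency) of a greedy t-spanner of the points."""
    adj = {i: [] for i in range(len(points))}
    edges = []
    pairs = sorted(
        itertools.combinations(range(len(points)), 2),
        key=lambda e: math.dist(points[e[0]], points[e[1]]),
    )
    for u, v in pairs:
        w = math.dist(points[u], points[v])
        # Keep the edge only if the current graph stretches this pair too much.
        if spanner_distance(adj, u, v) > t * w:
            adj[u].append((v, w))
            adj[v].append((u, w))
            edges.append((u, v))
    return edges, adj

# Corners of a unit square, chosen only for illustration.
POINTS = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
edges, adj = greedy_spanner(POINTS, 1.5)
print(edges)  # the four sides; both diagonals are skipped
```

With t = 1.5 the two diagonals are dropped because going around two sides costs 2, which is below 1.5 times the diagonal length.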

ScholarWorks@UMass Amherst

On the design of architecture-aware algorithms for emerging applications

Author: Kang Seunghwa
Publication venue: Georgia Institute of Technology
Publication date: 30/01/2011

This dissertation maps various kernels and applications to a spectrum of programming models and architectures and also presents architecture-aware algorithms for different systems. The kernels and applications discussed in this dissertation have widely varying computational characteristics. For example, we consider both dense numerical computations and sparse graph algorithms. This dissertation also covers emerging applications from image processing, complex network analysis, and computational biology. We map these problems to diverse multicore processors and manycore accelerators. We also use new programming models (such as Transactional Memory, MapReduce, and Intel TBB) to address the performance and productivity challenges in the problems. Our experiences highlight the importance of mapping applications to appropriate programming models and architectures. We also find several limitations of current system software and architectures and directions to improve those. The discussion focuses on system software and architectural support for nested irregular parallelism, Transactional Memory, and hybrid data transfer mechanisms. We believe that the complexity of parallel programming can be significantly reduced via collaborative efforts among researchers and practitioners from different domains. This dissertation participates in the efforts by providing benchmarks and suggestions to improve system software and architectures. Ph.D. Committee Chair: Bader, David; Committee Member: Hong, Bo; Committee Member: Riley, George; Committee Member: Vuduc, Richard; Committee Member: Wills, Scot

Scholarly Materials And Research @ Georgia Tech


Bridging the Theory-Practice Gap of Laplacian Linear Solvers

Author: Deweese Kevin
Publication venue: eScholarship, University of California
Publication date: 01/01/2018

Solving Laplacian linear systems is an important task in a variety of practical and theoretical applications. Laplacians of structured graphs, such as two- and three-dimensional meshes, have long been important in finite element analysis and image processing. More recently, solving linear systems on the Laplacians of large graphs without mesh-like structure has emerged as an important computational task in network analysis. A number of theoretical solvers with good asymptotic complexity have been proposed over the past couple of decades, but these ideas have not made their way into practical solvers. Nor is it clear that a class of challenging problems exists which would benefit from asymptotically fast solvers. Yet it seems that one of the following should be true: either existing solvers have tighter Big-O bounds than currently believed, or there are some problems where recent asymptotically fast (but theoretical) algorithms should be useful. This work considers the latter possibility; we aim to bridge the gap between theoretical and practical Laplacian algorithms by experimenting with Laplacian solvers and by searching for difficult test problems. We examine the performance of existing algorithms for solving Laplacian linear systems and identify the strengths and weaknesses of different methods on different test problems. We perform an extensive evaluation of the KOSZ solver, one of the recently proposed Õ(m) Laplacian algorithms. We test various extensions of KOSZ which we propose to try to improve its performance in practice. We introduce heavy path graphs, a novel class of graphs for experimenting with Laplacian solvers. To challenge existing solver implementations, we propose the use of genetic algorithms to create difficult test graphs for existing solvers. At the same time, these algorithms could be used to find graphs with good performance for recently proposed solvers. Searching for graphs which satisfy both objectives could be instrumental towards bridging the theory-practice gap of Laplacian solvers. We demonstrate the successful evolution of graphs which are difficult for conjugate gradient with diagonal scaling, while relatively simple for KOSZ. Such graph evolution techniques could be useful for finding graphs with a variety of combinatorial properties.
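As a point of reference for the baseline mentioned in this abstract, conjugate gradient with diagonal scaling (the Jacobi preconditioner) can be sketched in a few lines for a graph Laplacian system L x = b. This is a minimal pure-Python illustration under stated assumptions, not the implementation evaluated in the dissertation: the graph is a hypothetical 4-node unit-weight path, b sums to zero so the singular system is consistent, and the names `laplacian_matvec`, `pcg_jacobi` and `PATH` are invented here.

```python
def laplacian_matvec(adj, x):
    """Compute L x for the weighted graph given as adjacency lists."""
    return [sum(w * (x[u] - x[v]) for v, w in adj[u]) for u in range(len(x))]

def pcg_jacobi(adj, b, tol=1e-10, max_iter=1000):
    """Conjugate gradient preconditioned by the Laplacian's diagonal."""
    n = len(b)
    diag = [sum(w for _, w in adj[u]) for u in range(n)]  # weighted degrees
    x = [0.0] * n
    r = list(b)                                # residual b - L x for x = 0
    z = [r[i] / diag[i] for i in range(n)]     # diagonal scaling step
    p = list(z)
    rz = sum(r[i] * z[i] for i in range(n))
    for _ in range(max_iter):
        if max(abs(v) for v in r) < tol:
            break
        Lp = laplacian_matvec(adj, p)
        alpha = rz / sum(p[i] * Lp[i] for i in range(n))
        x = [x[i] + alpha * p[i] for i in range(n)]
        r = [r[i] - alpha * Lp[i] for i in range(n)]
        z = [r[i] / diag[i] for i in range(n)]
        rz_new = sum(r[i] * z[i] for i in range(n))
        p = [z[i] + (rz_new / rz) * p[i] for i in range(n)]
        rz = rz_new
    return x

# Hypothetical test graph: unit-weight path 0 - 1 - 2 - 3.
PATH = {
    0: [(1, 1.0)],
    1: [(0, 1.0), (2, 1.0)],
    2: [(1, 1.0), (3, 1.0)],
    3: [(2, 1.0)],
}
b = [1.0, 0.0, 0.0, -1.0]  # one unit injected at node 0, extracted at node 3
x = pcg_jacobi(PATH, b)
print(x[0] - x[3])  # potential drop across three unit-weight edges in series
```

Because the Laplacian is singular, the solution is only defined up to an additive constant, so differences such as `x[0] - x[3]` are the meaningful quantities.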

eScholarship - University of California