2,060 research outputs found

    Faster Clustering via Preprocessing

    We examine the efficiency of clustering a set of points, when the encompassing metric space may be preprocessed in advance. In computational problems of this genre, there is a first stage of preprocessing, whose input is a collection of points $M$; the next stage receives as input a query set $Q \subset M$, and should report a clustering of $Q$ according to some objective, such as 1-median, in which case the answer is a point $a \in M$ minimizing $\sum_{q \in Q} d_M(a,q)$. We design fast algorithms that approximately solve such problems under standard clustering objectives like $p$-center and $p$-median, when the metric $M$ has low doubling dimension. By leveraging the preprocessing stage, our algorithms achieve query time that is near-linear in the query size $n=|Q|$, and is (almost) independent of the total number of points $m=|M|$. Comment: 24 pages
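To make the 1-median objective in this abstract concrete, here is a minimal brute-force sketch. It is not the paper's algorithm: it scans every candidate in M, so its cost grows with |M|·|Q|, which is the dependence on |M| that the preprocessing approach is designed to avoid. The function name `one_median`, the Euclidean metric, and the sample points are assumptions made for illustration.

```python
from math import dist  # Euclidean distance, Python 3.8+


def one_median(M, Q):
    """Return the point a in M minimizing sum(d(a, q) for q in Q).

    Brute force over all candidates in M; Euclidean distance is an
    assumption here, the abstract allows any metric d_M.
    """
    return min(M, key=lambda a: sum(dist(a, q) for q in Q))


# Hypothetical data: M is the full point set, Q the query subset.
M = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (10.0, 0.0)]
Q = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]

center = one_median(M, Q)
print(center)
```

In this example the middle query point has the smallest total distance to Q, so it is returned as the 1-median.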

    Hierarchical Routing over Dynamic Wireless Networks

    Wireless network topologies change over time and maintaining routes requires frequent updates. Updates are costly in terms of consuming throughput available for data transmission, which is precious in wireless networks. In this paper, we ask whether there exist low-overhead schemes that produce low-stretch routes. This is studied by using the underlying geometric properties of the connectivity graph in wireless networks. Comment: 29 pages, 19 figures, a shorter version was published in the proceedings of the 2008 ACM Sigmetrics conference

    The Emergence of Sparse Spanners and Greedy Well-Separated Pair Decomposition

    A spanner graph on a set of points in $R^d$ contains a shortest path between any pair of points with length at most a constant factor of their Euclidean distance. In this paper we investigate new models and aim to interpret why good spanners 'emerge' in reality, when they are clearly built in pieces by agents with their own interests and the construction is not coordinated. Our main result is to show that if edges are built in an arbitrary order but an edge is built if and only if its endpoints are not 'close' to the endpoints of an existing edge, the graph is a $(1+\epsilon)$-spanner with a linear number of edges, constant average degree, and the total edge length as a small logarithmic factor of the cost of the minimum spanning tree. As a side product, we show a simple greedy algorithm for constructing optimal size well-separated pair decompositions that may be of interest on its own.

    Routing and search on large scale networks

    In this thesis, we address two seemingly unrelated problems, namely routing in large wireless ad hoc networks and comparison based search in image databases. However, the underlying problem is in essence similar and we can use the same strategy to attack those two problems. In both cases, the intrinsic complexity of the problem is in some sense low, and we can exploit this fact to design efficient algorithms. A wireless ad hoc network is a communication network consisting of wireless devices such as for instance laptops or cell phones. The network does not have any fixed infrastructure, and hence nodes which cannot communicate directly over the wireless medium must use intermediate nodes as relays. This immediately raises the question of how to select the relay nodes. Ideally, one would like to find a path from the source to the destination which is as short as possible. The length of the found path, also called route, typically depends on how much signaling traffic is generated in order to establish the route. This is the fundamental trade-off that we will investigate in this thesis. As mentioned above, we try and exploit the fact that the communication network is intrinsically low-dimensional, or in other words has low complexity. We show that this is indeed the case for a large class of models and that we can design efficient algorithms for routing that use this property. Low dimensionality implies that we can well embed the network in a low-dimensional space, or build simple hierarchical decompositions of the network. We use both those techniques to design routing algorithms. Comparison based search in image databases is a new problem that can be defined as follows. Given a large database of images, can a human user retrieve an image which he has in mind, or at least an image similar to that image, without going sequentially through all images? More precisely, we ask whether we can search a database of images only by making comparisons between images. 
As a case in point, we ask whether we can find a query image q only by asking questions of the type "does image q look more like image A or image B"? The analogue of signaling traffic for wireless networks would here be the questions we can ask human users in a learning phase prior to the search. In other words, we would like to ask as few questions as possible to pre-process and prepare the database, while guaranteeing a certain quality of the results obtained in the search phase. As the underlying image space is not necessarily metric, this raises new questions on how to search spaces for which only rank information can be obtained. The rank of A with respect to B is k, if A is B's kth nearest neighbor. In this setup, low-dimensionality is analogous to the homogeneity of the image space. As we will see, the homogeneity can be captured by properties of the rank relationships. In turn, homogeneous spaces can be well decomposed hierarchically using comparisons. Further, it allows us to design good hash functions. To design efficient algorithms for these two problems, we can apply the same techniques mutatis mutandis. In both cases, we rely on the intuition that the problem has a low intrinsic complexity, and that we can exploit this fact. Our results come in the form of simulation results and asymptotic bounds.
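The rank definition in this abstract (the rank of A with respect to B is k if A is B's kth nearest neighbor) can be illustrated with a small sketch. The thesis works with spaces where only rank information is available; the sketch below assumes a dissimilarity function `d` purely to generate ranks for the example, and assumes no ties. The names `rank`, `points`, and `d` are hypothetical.

```python
def rank(a, b, points, d):
    """Rank of a with respect to b: k if a is b's k-th nearest neighbor.

    `d` is an assumed dissimilarity used only to derive the ordering;
    ties are assumed not to occur.
    """
    others = sorted((p for p in points if p != b), key=lambda p: d(b, p))
    return others.index(a) + 1


# Hypothetical one-dimensional "database".
points = [0, 1, 3, 7]
d = lambda x, y: abs(x - y)

print(rank(7, 3, points, d))  # 7 is the farthest of 3's three neighbors
print(rank(3, 7, points, d))  # 3 is the nearest neighbor of 7
```

The two calls show that rank is not symmetric: 7 has rank 3 with respect to 3, while 3 has rank 1 with respect to 7.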

    MapReduce and Streaming Algorithms for Diversity Maximization in Metric Spaces of Bounded Doubling Dimension

    Given a dataset of points in a metric space and an integer $k$, a diversity maximization problem requires determining a subset of $k$ points maximizing some diversity objective measure, e.g., the minimum or the average distance between two points in the subset. Diversity maximization is computationally hard, hence only approximate solutions can be hoped for. Although its applications are mainly in massive data analysis, most of the past research on diversity maximization focused on the sequential setting. In this work we present space and pass/round-efficient diversity maximization algorithms for the Streaming and MapReduce models and analyze their approximation guarantees for the relevant class of metric spaces of bounded doubling dimension. Like other approaches in the literature, our algorithms rely on the determination of high-quality core-sets, i.e., (much) smaller subsets of the input which contain good approximations to the optimal solution for the whole input. For a variety of diversity objective functions, our algorithms attain an $(\alpha+\epsilon)$-approximation ratio, for any constant $\epsilon>0$, where $\alpha$ is the best approximation ratio achieved by a polynomial-time, linear-space sequential algorithm for the same diversity objective. This improves substantially over the approximation ratios attainable in Streaming and MapReduce by state-of-the-art algorithms for general metric spaces. We provide extensive experimental evidence of the effectiveness of our algorithms on both real world and synthetic datasets, scaling up to over a billion points. Comment: Extended version of http://www.vldb.org/pvldb/vol10/p469-ceccarello.pdf, PVLDB Volume 10, No. 5, January 201
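As a concrete instance of the "minimum distance between two points in the subset" objective mentioned in this abstract, the sketch below uses a sequential farthest-point greedy heuristic. This is a swapped-in illustrative technique, not the paper's Streaming or MapReduce core-set construction. The function names, the Euclidean metric, and the sample data are assumptions.

```python
from itertools import combinations
from math import dist  # Euclidean distance, Python 3.8+


def min_pairwise_distance(S):
    """Diversity of S measured as the minimum distance between two of its points."""
    return min(dist(p, q) for p, q in combinations(S, 2))


def greedy_diverse(points, k):
    """Pick k points by repeatedly adding the point farthest from those chosen.

    Sequential heuristic for illustration only; starts from points[0].
    """
    chosen = [points[0]]
    while len(chosen) < k:
        candidates = (p for p in points if p not in chosen)
        chosen.append(max(candidates,
                          key=lambda p: min(dist(p, c) for c in chosen)))
    return chosen


# Hypothetical data: a tight cluster near the origin plus three far corners.
points = [(0, 0), (1, 0), (0, 1), (10, 0), (10, 10), (0, 10)]
subset = greedy_diverse(points, 3)
print(subset, min_pairwise_distance(subset))
```

On this input the heuristic skips the two points clustered next to the origin and selects well-spread corners, giving a minimum pairwise distance of 10.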

    A Fog-based Distributed Look-up Service for Intelligent Transportation Systems

    Future intelligent transportation systems and applications are expected to greatly benefit from the integration with a cloud computing infrastructure for service reliability and efficiency. More recently, fog computing has been proposed as a new computing paradigm to support low-latency and location-aware services by moving the execution of application logic on devices at the edge of the network in proximity of the physical systems, e.g. in the roadside infrastructure or directly in the connected vehicles. Such a distributed runtime environment can support low-latency communication with sensors and actuators, thus allowing functions such as real-time monitoring and remote control, e.g. for remote telemetry of public transport vehicles or remote control under emergency situations, respectively. These applications will require support for some basic functionalities from the runtime. Among them, discovery of sensors and actuators will be a significant challenge considering the large variety of sensors and actuators and their mobility. In this paper, a discovery service specifically tailored for fog computing platforms with mobile nodes is proposed. Instead of adopting a centralized approach, we propose an approach based on a distributed hash table to be implemented by fog nodes, exploiting their storage and computation capabilities. The proposed approach supports by design multiple attributes and range queries. A prototype of the proposed service has been implemented and evaluated experimentally.

    Oblivious buy-at-bulk network design algorithms

    Large-scale networks such as the Internet have emerged as arguably the most complex distributed communication network systems. The sheer size of such networks and the various applications that run on them bring a large variety of challenging problems. Similar problems arise in any network (transportation, logistics, oil/gas pipelines, etc.) where efficient paths are needed to route the flow of demands. This dissertation studies the computation of efficient paths from the demand sources to their respective destination(s). We consider the buy-at-bulk network design problem in which we wish to compute efficient paths for carrying demands from a set of source nodes to a set of destination nodes. In designing networks, it is important to realize economies of scale. This can be achieved by aggregating the flow of demands. We want the routing to be oblivious: no matter how many source nodes there are and no matter where they are in the network, the demands from the sources have to be routed in a near-optimal fashion. Moreover, we want the aggregation function f to be unknown, assuming that it is a concave function of the total flow on the edge. The total cost of a solution is determined by the amount of demand routed through each edge. We address questions such as how we can (obliviously) route flows and get competitive algorithms for this problem. We study the approximability of the resulting buy-at-bulk network design problem. Our aim is to find minimum-cost paths for all the demands to the sink(s) under two assumptions: (1) The demand set is unknown, that is, the number of source nodes that have demand to send is unknown. (2) The aggregation cost function at intermediate edges is also unknown. We consider different types of graphs (doubling-dimension, planar and minor-free) and provide approximate solutions for each of them. 
For the case of doubling graphs and minor-free graphs, we construct a single spanning tree for the single-source buy-at-bulk network design problem. For the case of planar graphs, we have built a set of paths with an asymptotically tight competitive ratio.
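The economies of scale described in this abstract can be illustrated with a small cost calculation: when the per-edge cost is a concave function f of the total flow on the edge, routing two demands over a shared edge can be cheaper than routing them over disjoint paths. The sketch below is only an illustration of that cost model, not the dissertation's algorithm. The choice f = sqrt, unit edge lengths, and all node names are assumptions.

```python
from collections import Counter
from math import sqrt


def total_cost(paths, demands, f):
    """Sum f(flow) over all edges, where flow is the demand aggregated per edge.

    Edges are undirected and assumed to have unit length; f is an assumed
    concave cost function of the total flow on an edge.
    """
    flow = Counter()
    for path, demand in zip(paths, demands):
        for u, v in zip(path, path[1:]):
            flow[frozenset((u, v))] += demand
    return sum(f(x) for x in flow.values())


demands = [4, 4]  # hypothetical demands at sources s1 and s2

# Disjoint routing: each source uses its own relay to reach sink t.
disjoint = [["s1", "a", "t"], ["s2", "b", "t"]]
# Aggregated routing: both sources share relay r and the edge r-t.
shared = [["s1", "r", "t"], ["s2", "r", "t"]]

print(total_cost(disjoint, demands, sqrt))
print(total_cost(shared, demands, sqrt))
```

With these numbers the disjoint routing pays for four edges carrying flow 4 each, while the shared routing pays for two such edges plus one edge carrying flow 8, which is cheaper under a concave f.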

    Towards scalable Community Networks topologies

    Community Networks (CNs) are grassroots bottom-up initiatives that build local infrastructures, normally using Wi-Fi technology, to bring broadband networking in areas with inadequate offer of traditional infrastructures such as ADSL, FTTx or wide-band cellular (LTE, 5G). Albeit they normally operate as access networks to the Internet, CNs are ad-hoc networks that evolve based on local requirements and constraints, often including additional local services on top of Internet access. These networks grow in highly decentralized manner that radically deviates from the top-down network planning practiced in commercial mobile networks, depending, on the one hand, on the willingness of people to participate, and, on the other hand, on the feasibility of wireless links connecting the houses of potential participants with each other. In this paper, we present a novel methodology and its implementation into an automated tool, which enables the exercise of (light) centralized control to the dynamic and otherwise spontaneous CN growth process. The goal of the methodology is influencing the choices to connect a new node to the CN so that it can grow with more balance and to a larger size. Input to our methodology are open source resources about the physical terrain of the CN deployment area, such as Open Street Map and very detailed (less than 1 m resolution) LIDAR-based data about buildings layout and height, as well as technical descriptions and pricing data about off-the-shelf networking devices that are made available by manufacturers. Data related to demographics can be easily added to refine the environment description. With these data at hand, the tool can estimate the technical and economic feasibility of adding new nodes to the CN and actively assist new CN users in selecting proper equipment and CN node(s) to connect with to improve the CN scalability. 
We test our methodology in four different areas representing standard territorial characterization categories: urban, suburban, intermediate, and rural. In all four cases our tool shows that CNs scale to much larger size using the assisted, network-aware methodology when compared with de facto practices. Results also show that the CNs deployed with the assisted methodology are more balanced and have a lower per-node cost for the same per-node guaranteed bandwidth. Moreover, this is achieved with fewer devices per node, which means that the network is cheaper to build and easier to maintain. Peer Reviewed. Postprint (author's final draft)

    Correlated multi-streaming in distributed interactive multimedia systems

    Distributed Interactive Multimedia Environments (DIMEs) enable geographically distributed people to interact with each other in a joint media-rich virtual environment for a wide range of activities, such as art performance, medical consultation, sport training, etc. The real-time collaboration is made possible by exchanging a set of multi-modal sensory streams over the network in real time. The characterization and evaluation of such multi-stream interactive environments is challenging because the traditional Quality of Service metrics (e.g., delay, jitter) are limited to a per stream basis. In this work, we present a novel "Bundle of Streams" concept to define correlated multi-streams in DIMEs and present new cyber-physical, spatio-temporal QoS metrics to measure QoS over a bundle of streams. We realize the Bundle of Streams concept by presenting a novel paradigm of Bundle Streaming as a Service (SAS). We propose and develop SAS Kernel, a generic, distributed, modular and highly flexible streaming kernel realizing the SAS concept. We validate the Bundle of Streams model by comparing the QoS performance of a bundle of streams over different transport protocols in a 3D tele-immersive testbed. Also, further experiments demonstrate that the SAS Kernel incurs low overhead in delay, CPU, and bandwidth demands.