
    Fundamentals of Large Sensor Networks: Connectivity, Capacity, Clocks and Computation

    Sensor networks potentially feature large numbers of nodes that can sense their environment over time, communicate with each other over a wireless network, and process information. They differ from data networks in that the network as a whole may be designed for a specific application. We study the theoretical foundations of such large-scale sensor networks, addressing four fundamental issues: connectivity, capacity, clocks, and function computation. To begin with, a sensor network must be connected so that information can indeed be exchanged between nodes. The connectivity graph of an ad-hoc network is modeled as a random graph, and the critical range for asymptotic connectivity is determined, as well as the critical number of neighbors that a node needs to connect to. Next, given connectivity, we address the issue of how much data can be transported over the sensor network. We present fundamental bounds on capacity under several models, as well as architectural implications for how wireless communication should be organized. Temporal information is important both for the applications of sensor networks and for their operation. We present fundamental bounds on the synchronizability of clocks in networks, and also present and analyze algorithms for clock synchronization. Finally, we turn to the issue of gathering relevant information, which is what sensor networks are designed to do. One needs to study optimal strategies for in-network aggregation of data, in order to reliably compute a composite function of sensor measurements, as well as the complexity of doing so. We address the issue of how such computation can be performed efficiently in a sensor network and the algorithms for doing so, for some classes of functions. Comment: 10 pages, 3 figures, Submitted to the Proceedings of the IEE

    Entropy-scaling search of massive biological data

    Many datasets exhibit a well-defined structure that can be exploited to design faster search tools, but it is not always clear when such acceleration is possible. Here, we introduce a framework for similarity search based on characterizing a dataset's entropy and fractal dimension. We prove that searching scales in time with metric entropy (number of covering hyperspheres), if the fractal dimension of the dataset is low, and scales in space with the sum of metric entropy and information-theoretic entropy (randomness of the data). Using these ideas, we present accelerated versions of standard tools, with no loss in specificity and little loss in sensitivity, for use in three domains: high-throughput drug screening (Ammolite, 150x speedup), metagenomics (MICA, 3.5x speedup of DIAMOND [3,700x BLASTX]), and protein structure search (esFragBag, 10x speedup of FragBag). Our framework can be used to achieve "compressive omics," and the general theory can be readily applied to data science problems outside of biology. Comment: Including supplement: 41 pages, 6 figures, 4 tables, 1 bo

    Circuits with arbitrary gates for random operators

    We consider boolean circuits computing n-operators f:{0,1}^n --> {0,1}^n. As gates we allow arbitrary boolean functions; neither fanin nor fanout of gates is restricted. An operator is linear if it computes n linear forms, that is, computes a matrix-vector product y=Ax over GF(2). We prove the existence of n-operators requiring about n^2 wires in any circuit, and linear n-operators requiring about n^2/\log n wires in depth-2 circuits, if either all output gates or all gates on the middle layer are linear. Comment: 7 page
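    As a minimal illustration of the "linear operator" definition in this abstract (not code from the paper), the sketch below computes y=Ax over GF(2) for a small example. The function names and the 3x3 matrix are hypothetical. The wire count shown is only for the naive depth-1 circuit with one XOR gate per output, which uses one wire per nonzero entry of A; it says nothing about the lower bounds the paper proves.

```python
def apply_linear_operator(A, x):
    """Return y = A x over GF(2). A is a list of 0/1 rows, x a list of 0/1 bits."""
    return [sum(a & b for a, b in zip(row, x)) % 2 for row in A]


def naive_wire_count(A):
    """Wires in the trivial circuit: one XOR gate per output, one wire per 1-entry."""
    return sum(sum(row) for row in A)


# Hypothetical example operator on 3 bits.
A = [[1, 1, 0],
     [0, 1, 1],
     [1, 0, 1]]
x = [1, 0, 1]

y = apply_linear_operator(A, x)
wires = naive_wire_count(A)
print(y, wires)
```

    Each output bit is the XOR of the inputs selected by one row of A, so the naive circuit never needs more than n^2 wires.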

    Computational barriers in minimax submatrix detection

    This paper studies the minimax detection of a small submatrix of elevated mean in a large matrix contaminated by additive Gaussian noise. To investigate the tradeoff between statistical performance and computational cost from a complexity-theoretic perspective, we consider a sequence of discretized models which are asymptotically equivalent to the Gaussian model. Under the hypothesis that the planted clique detection problem cannot be solved in randomized polynomial time when the clique size is of smaller order than the square root of the graph size, the following phase transition phenomenon is established: when the size of the large matrix p\to\infty, if the submatrix size k=\Theta(p^{\alpha}) for any \alpha\in(0,2/3), computational complexity constraints can incur a severe penalty on the statistical performance in the sense that any randomized polynomial-time test is minimax suboptimal by a polynomial factor in p; if k=\Theta(p^{\alpha}) for any \alpha\in(2/3,1), minimax optimal detection can be attained within constant factors in linear time. Using Schatten norm loss as a representative example, we show that the hardness of attaining the minimax estimation rate can crucially depend on the loss function. Implications on the hardness of support recovery are also obtained. Comment: Published at http://dx.doi.org/10.1214/14-AOS1300 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org

    Rank Minimization over Finite Fields: Fundamental Limits and Coding-Theoretic Interpretations

    This paper establishes information-theoretic limits in estimating a finite field low-rank matrix given random linear measurements of it. These linear measurements are obtained by taking inner products of the low-rank matrix with random sensing matrices. Necessary and sufficient conditions on the number of measurements required are provided. It is shown that these conditions are sharp and the minimum-rank decoder is asymptotically optimal. The reliability function of this decoder is also derived by appealing to de Caen's lower bound on the probability of a union. The sufficient condition also holds when the sensing matrices are sparse, a scenario that may be amenable to efficient decoding. More precisely, it is shown that if the n\times n sensing matrices contain, on average, \Omega(n\log n) entries, the number of measurements required is the same as that when the sensing matrices are dense and contain entries drawn uniformly at random from the field. Analogies are drawn between the above results and rank-metric codes in the coding theory literature. In fact, we are also strongly motivated by understanding when minimum rank distance decoding of random rank-metric codes succeeds. To this end, we derive distance properties of equiprobable and sparse rank-metric codes. These distance properties provide a precise geometric interpretation of the fact that the sparse ensemble requires as few measurements as the dense one. Finally, we provide a non-exhaustive procedure to search for the unknown low-rank matrix. Comment: Accepted to the IEEE Transactions on Information Theory; Presented at IEEE International Symposium on Information Theory (ISIT) 201
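    The measurement model described in this abstract can be sketched as follows. This is an illustrative sketch under stated assumptions, not the paper's procedure: the field is taken to be a prime field GF(q) so that arithmetic is plain modular arithmetic, the sensing matrices are dense and uniform, and all function names are hypothetical. It shows only how measurements are formed, not how the minimum-rank decoder works.

```python
import random


def inner_product_mod_q(A, X, q):
    """Entrywise inner product <A, X> = sum_ij A[i][j] * X[i][j], reduced mod q."""
    return sum(a * x for row_a, row_x in zip(A, X) for a, x in zip(row_a, row_x)) % q


def random_low_rank(n, r, q, rng):
    """Return an n x n matrix over GF(q) of rank at most r, built as U V mod q."""
    U = [[rng.randrange(q) for _ in range(r)] for _ in range(n)]
    V = [[rng.randrange(q) for _ in range(n)] for _ in range(r)]
    return [[sum(U[i][t] * V[t][j] for t in range(r)) % q for j in range(n)]
            for i in range(n)]


def measure(X, k, q, rng):
    """Draw k dense uniform sensing matrices and return them with the measurements."""
    n = len(X)
    sensing = [[[rng.randrange(q) for _ in range(n)] for _ in range(n)]
               for _ in range(k)]
    return sensing, [inner_product_mod_q(A, X, q) for A in sensing]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
q, n, r, k = 5, 4, 1, 6  # assumed toy parameters; q must be prime here
X = random_low_rank(n, r, q, rng)
sensing, y = measure(X, k, q, rng)
print(y)
```

    A decoder would then search for the lowest-rank matrix consistent with the pairs (sensing[i], y[i]); the abstract's results concern how large k must be for that search to succeed.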

    Towards a complexity theory for the congested clique

    The congested clique model of distributed computing has been receiving attention as a model for densely connected distributed systems. While there has been significant progress on the side of upper bounds, we have very little in terms of lower bounds for the congested clique; indeed, it is now known that proving explicit congested clique lower bounds is as difficult as proving circuit lower bounds. In this work, we use various more traditional complexity-theoretic tools to build a clearer picture of the complexity landscape of the congested clique: -- Nondeterminism and beyond: We introduce the nondeterministic congested clique model (analogous to NP) and show that there is a natural canonical problem family that captures all problems solvable in constant time with nondeterministic algorithms. We further generalise these notions by introducing the constant-round decision hierarchy (analogous to the polynomial hierarchy). -- Non-constructive lower bounds: We lift the prior non-uniform counting arguments to a general technique for proving non-constructive uniform lower bounds for the congested clique. In particular, we prove a time hierarchy theorem for the congested clique, showing that there are decision problems of essentially all complexities, both in the deterministic and nondeterministic settings. -- Fine-grained complexity: We map out relationships between various natural problems in the congested clique model, arguing that a reduction-based complexity theory currently gives us a fairly good picture of the complexity landscape of the congested clique.