153 research outputs found

    Approximate Range Emptiness in Constant Time and Optimal Space

    No full text
    This paper studies the ε-approximate range emptiness problem, where the task is to represent a set S of n points from {0, 
, U−1} and answer emptiness queries of the form "[a; b] ∩ S ≠ ∅?" with some probability of false positives allowed. This generalizes the functionality of Bloom filters from single-point queries to any interval length L. Setting the false positive rate to ε/L and performing L queries, Bloom filters yield a solution to this problem with space O(n lg(L/ε)) bits, false positive probability bounded by ε for intervals of length up to L, and query time O(L lg(L/ε)). Our first contribution is to show that this space/error trade-off cannot be improved asymptotically: any data structure for answering approximate range emptiness queries on intervals of length up to L with false positive probability ε must use Ω(n lg(L/ε)) − O(n) bits of space. On the positive side, we show that the query time can be improved greatly, to constant time, while matching our space lower bound up to a lower-order additive term. This result is achieved through a succinct data structure for (non-approximate, one-dimensional) range emptiness/reporting queries, which may be of independent interest.
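    The Bloom-filter baseline described in the abstract can be sketched in a few lines of Python. This is only an illustration of the naive approach (one membership probe per point of the interval, with the filter's false positive rate set to ε/L so a union bound gives error at most ε per interval); the class and function names are mine, not from the paper, and this is not the paper's constant-time structure:

    ```python
    import hashlib

    class BloomFilter:
        """Minimal Bloom filter; k hash functions simulated by salted SHA-256."""
        def __init__(self, m_bits, k_hashes):
            self.m, self.k = m_bits, k_hashes
            self.bits = bytearray((m_bits + 7) // 8)

        def _positions(self, item):
            for i in range(self.k):
                h = hashlib.sha256(f"{i}:{item}".encode()).digest()
                yield int.from_bytes(h[:8], "big") % self.m

        def add(self, item):
            for p in self._positions(item):
                self.bits[p // 8] |= 1 << (p % 8)

        def __contains__(self, item):
            return all(self.bits[p // 8] & (1 << (p % 8))
                       for p in self._positions(item))

    def range_maybe_nonempty(bf, a, b):
        """Answer '[a; b] ∩ S ≠ ∅?' by probing every point in the interval.
        One-sided error: a 'False' answer is always correct; 'True' may be
        a false positive. Query time is O(L) probes for an interval of length L."""
        return any(x in bf for x in range(a, b + 1))
    ```

    Note that this baseline pays Θ(L) probes per query; the paper's point is that constant query time is achievable in essentially the same space.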

    On the k-Independence Required by Linear Probing and Minwise Independence

    Full text link

    Efficiently Correcting Matrix Products

    Get PDF
    We study the problem of efficiently correcting an erroneous product of two n×n matrices over a ring. Among other things, we provide a randomized algorithm for correcting a matrix product with at most k erroneous entries running in Õ(n² + kn) time, and a deterministic Õ(kn²)-time algorithm for this problem (where the notation Õ suppresses polylogarithmic terms in n and k).
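    The core randomized idea can be illustrated with a Freivalds-style check: multiplying the claimed product by a random 0/1 vector exposes which rows disagree with A·B, and only those rows need recomputing. This is a simplified sketch under my own naming, and recomputing a flagged row naively costs O(n²), so it only reaches the deterministic-style O(kn²) bound, not the paper's Õ(n² + kn) algorithm:

    ```python
    import random

    def mat_vec(M, v):
        """Multiply matrix M by column vector v over the integers."""
        return [sum(m * x for m, x in zip(row, v)) for row in M]

    def correct_product(A, B, C, trials=40):
        """Return C with erroneous rows repaired so that C == A·B.
        Each trial: pick random r in {0,1}^n and compare A(Br) with Cr;
        a row containing an error yields a mismatch with probability >= 1/2,
        so after `trials` rounds it is flagged except with prob. 2^-trials."""
        n = len(A)
        bad = set()
        for _ in range(trials):
            r = [random.randint(0, 1) for _ in range(n)]
            ABr = mat_vec(A, mat_vec(B, r))   # A(Br): two O(n^2) products
            Cr = mat_vec(C, r)
            for i in range(n):
                if ABr[i] != Cr[i]:
                    bad.add(i)
        for i in bad:  # recompute only the flagged rows
            C[i] = [sum(A[i][t] * B[t][j] for t in range(n)) for j in range(n)]
        return C
    ```

    The verification phase costs only O(n²) per trial because A(Br) is evaluated as two matrix-vector products rather than forming A·B.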

    Searching Efficient 3D Architectures with Sparse Point-Voxel Convolution

    Full text link
    Self-driving cars need to understand 3D scenes efficiently and accurately in order to drive safely. Given the limited hardware resources, existing 3D perception models are not able to recognize small instances (e.g., pedestrians, cyclists) very well due to low-resolution voxelization and aggressive downsampling. To this end, we propose Sparse Point-Voxel Convolution (SPVConv), a lightweight 3D module that equips vanilla Sparse Convolution with a high-resolution point-based branch. With negligible overhead, this point-based branch is able to preserve fine details even in large outdoor scenes. To explore the spectrum of efficient 3D models, we first define a flexible architecture design space based on SPVConv, and we then present 3D Neural Architecture Search (3D-NAS) to search for the optimal network architecture over this diverse design space efficiently and effectively. Experimental results validate that the resulting SPVNAS model is fast and accurate: it outperforms the state-of-the-art MinkowskiNet by 3.3%, ranking 1st on the competitive SemanticKITTI leaderboard. It also achieves 8x computation reduction and 3x measured speedup over MinkowskiNet with higher accuracy. Finally, we transfer our method to 3D object detection, and it achieves consistent improvements over the one-stage detection baseline on KITTI. (ECCV 2020. The first two authors contributed equally to this work. Project page: http://spvnas.mit.edu)

    Interactive Learning for Multimedia at Large

    Get PDF
    Interactive learning has been suggested as a key method for addressing analytic multimedia tasks arising in several domains. Until recently, however, methods to maintain interactive performance at the scale of today's media collections have not been addressed. We propose an interactive learning approach that builds on and extends the state of the art in user relevance feedback systems and high-dimensional indexing for multimedia. We report on a detailed experimental study using the ImageNet and YFCC100M collections, containing 14 million and 100 million images respectively. The proposed approach outperforms the relevant state-of-the-art approaches in terms of interactive performance, while improving suggestion relevance in some cases. In particular, even on YFCC100M, our approach requires less than 0.3 s per interaction round to generate suggestions, using a single computing core and less than 7 GB of main memory.

    Sub-logarithmic Distributed Oblivious RAM with Small Block Size

    Get PDF
    Oblivious RAM (ORAM) is a cryptographic primitive that allows a client to securely execute RAM programs over data that is stored in an untrusted server. Distributed Oblivious RAM is a variant of ORAM where the data is stored in m > 1 servers. Extensive research over the last few decades has succeeded in reducing the bandwidth overhead of ORAM schemes, both in the single-server and the multi-server setting, from O(√N) to O(1). However, all known protocols that achieve a sub-logarithmic overhead either require heavy server-side computation (e.g. homomorphic encryption) or a large block size of at least Ω(log³ N). In this paper, we present a family of distributed ORAM constructions that follow the hierarchical approach of Goldreich and Ostrovsky [GO96]. We enhance known techniques, and develop new ones, to take better advantage of the existence of multiple servers. By plugging efficient known hashing schemes into our constructions, we get the following results: 1. For any m ≄ 2, we show an m-server ORAM scheme with O(log N / log log N) overhead and block size Ω(log² N). This scheme is private even against an (m−1)-server collusion. 2. A 3-server ORAM construction with O(ω(1) · log N / log log N) overhead and an almost logarithmic block size, i.e. Ω(log^{1+ε} N). We also investigate a model where the servers are allowed to perform a linear amount of light local computations, and show that constant overhead is achievable in this model through a simple four-server ORAM protocol.

    Efficient counting of k-mers in DNA sequences using a bloom filter

    Get PDF
    Background: Counting k-mers (substrings of length k in DNA sequence data) is an essential component of many methods in bioinformatics, including genome and transcriptome assembly, metagenomic sequencing, and error correction of sequence reads. Although simple in principle, counting k-mers in large modern sequence data sets can easily overwhelm the memory capacity of standard computers. In current data sets, a large fraction (often more than 50%) of the storage capacity may be spent on storing k-mers that contain sequencing errors and which are typically observed only a single time in the data. These singleton k-mers are uninformative for many algorithms without some kind of error correction. Results: We present a new method that identifies all the k-mers that occur more than once in a DNA sequence data set. Our method does this using a Bloom filter, a probabilistic data structure that stores all the observed k-mers implicitly in memory with greatly reduced memory requirements. We then make a second sweep through the data to provide exact counts of all nonunique k-mers. For example data sets, we report up to 50% savings in memory usage compared to current software, with modest costs in computational speed. This approach may reduce memory requirements for any algorithm that starts by counting k-mers in sequence data with errors. Conclusions: A reference implementation of this methodology, BFCounter, is written in C++ and is GPL licensed. It is available for free download at http://pritch.bsd.uchicago.edu/bfcounter.html
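    The two-pass scheme described above is easy to sketch: pass 1 uses a Bloom filter so that only k-mers seen at least twice (plus a few false positives) are promoted to an exact table; pass 2 counts only those candidates, so the reported counts are exact. This is an illustrative toy in Python, not BFCounter's C++ implementation, and all names are mine:

    ```python
    import hashlib
    from collections import Counter

    class BloomFilter:
        """Minimal Bloom filter; k hash functions simulated by salted SHA-256."""
        def __init__(self, m_bits, k_hashes):
            self.m, self.k = m_bits, k_hashes
            self.bits = bytearray((m_bits + 7) // 8)

        def _positions(self, item):
            for i in range(self.k):
                h = hashlib.sha256(f"{i}:{item}".encode()).digest()
                yield int.from_bytes(h[:8], "big") % self.m

        def add(self, item):
            for p in self._positions(item):
                self.bits[p // 8] |= 1 << (p % 8)

        def __contains__(self, item):
            return all(self.bits[p // 8] & (1 << (p % 8))
                       for p in self._positions(item))

    def count_nonunique_kmers(reads, k, bloom_bits=1 << 16, hashes=4):
        """Return exact counts of all k-mers occurring more than once.
        Pass 1: a k-mer already in the Bloom filter is a repeat candidate.
        Pass 2: count candidates exactly; Bloom false positives only add
        singleton candidates, which the final c > 1 filter discards."""
        bf = BloomFilter(bloom_bits, hashes)
        candidates = set()
        for read in reads:
            for i in range(len(read) - k + 1):
                kmer = read[i:i + k]
                if kmer in bf:
                    candidates.add(kmer)
                else:
                    bf.add(kmer)
        counts = Counter()
        for read in reads:
            for i in range(len(read) - k + 1):
                kmer = read[i:i + k]
                if kmer in candidates:
                    counts[kmer] += 1
        return {km: c for km, c in counts.items() if c > 1}
    ```

    The memory saving comes from never storing singleton k-mers explicitly: they live only as bits in the filter, while the exact table holds just the (much smaller) set of repeated k-mers.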

    Extracellular vesicles and intercellular communication within the nervous system

    Get PDF
    Extracellular vesicles (EVs, including exosomes) are implicated in many aspects of nervous system development and function, including regulation of synaptic communication, synaptic strength, and nerve regeneration. They mediate the transfer of packets of information in the form of nonsecreted proteins and DNA/RNA protected within a membrane compartment. EVs are essential for the packaging and transport of many cell-fate proteins during development as well as many neurotoxic misfolded proteins during pathogenesis. This form of communication provides another dimension of cellular crosstalk, with the ability to assemble a “kit” of directional instructions made up of different molecular entities and address it to specific recipient cells. This multidimensional form of communication has special significance in the nervous system. How EVs help to orchestrate the wiring of the brain while allowing for the plasticity associated with learning and memory, and how they contribute to regeneration and degeneration, are all under investigation. Because they carry specific disease-related RNAs and proteins, practical applications of EVs include potential uses as biomarkers and therapeutics. This Review describes our current understanding of EVs and serves as a springboard for future advances, which may reveal new important mechanisms by which EVs coordinate brain and body function and dysfunction.
    • 

    corecore