    Decoding billions of integers per second through vectorization

    In many important applications -- such as search engines and relational database systems -- data is stored in the form of arrays of integers. Encoding and, most importantly, decoding of these arrays consumes considerable CPU time. Therefore, substantial effort has been made to reduce the costs associated with compression and decompression. In particular, researchers have exploited the superscalar nature of modern processors and SIMD instructions. Nevertheless, we introduce a novel vectorized scheme called SIMD-BP128 that improves over previously proposed vectorized approaches. It is nearly twice as fast as the previously fastest schemes on desktop processors (varint-G8IU and PFOR). At the same time, SIMD-BP128 saves up to 2 bits per integer. For even better compression, we propose another new vectorized scheme (SIMD-FastPFOR) that has a compression ratio within 10% of a state-of-the-art scheme (Simple-8b) while being two times faster during decoding. Software: https://github.com/lemire/FastPFor; data: http://boytsov.info/datasets/clueweb09gap
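    The core layout idea behind schemes like SIMD-BP128 is binary packing of delta-coded blocks: each block of 128 integers is stored as gaps, all packed at the smallest fixed bit width that block needs. A minimal scalar Python sketch of that layout follows; the SIMD packing and unpacking that give the scheme its speed are not shown, and everything beyond the block size, delta coding, and per-block bit width is illustrative.

```python
# Scalar sketch of the binary-packing layout behind SIMD-BP128.
# The real scheme processes blocks of 128 integers with SIMD
# instructions; this code only illustrates the storage format.

def pack_block(sorted_ids, block_size=128):
    """Delta-code a block of sorted integers and pack the gaps
    into the minimal fixed bit width for that block."""
    assert len(sorted_ids) == block_size
    gaps = [sorted_ids[0]] + [
        sorted_ids[i] - sorted_ids[i - 1] for i in range(1, block_size)
    ]
    bit_width = max(g.bit_length() for g in gaps) or 1
    packed = 0
    for i, g in enumerate(gaps):
        packed |= g << (i * bit_width)   # concatenate fixed-width fields
    return bit_width, packed

def unpack_block(bit_width, packed, block_size=128):
    """Recover the original sorted integers via a prefix sum of the gaps."""
    mask = (1 << bit_width) - 1
    gaps = [(packed >> (i * bit_width)) & mask for i in range(block_size)]
    out, running = [], 0
    for g in gaps:
        running += g
        out.append(running)
    return out

if __name__ == "__main__":
    doc_ids = list(range(10, 10 + 4 * 128, 4))   # 128 sorted IDs, small gaps
    width, words = pack_block(doc_ids)
    assert unpack_block(width, words) == doc_ids
```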

    The Case for Learned Index Structures

    Indexes are models: a B-Tree-Index can be seen as a model that maps a key to the position of a record within a sorted array, a Hash-Index as a model that maps a key to the position of a record within an unsorted array, and a BitMap-Index as a model that indicates whether a data record exists or not. In this exploratory research paper, we start from this premise and posit that all existing index structures can be replaced with other types of models, including deep-learning models, which we term learned indexes. The key idea is that a model can learn the sort order or structure of lookup keys and use this signal to effectively predict the position or existence of records. We theoretically analyze under which conditions learned indexes outperform traditional index structures and describe the main challenges in designing learned index structures. Our initial results show that, by using neural nets, we are able to outperform cache-optimized B-Trees by up to 70% in speed while saving an order of magnitude in memory over several real-world data sets. More importantly, we believe that the idea of replacing core components of a data management system with learned models has far-reaching implications for future system designs, and that this work provides just a glimpse of what might be possible.
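    As a rough illustration of the learned-index idea, the sketch below replaces the paper's staged neural models with a single least-squares line that maps a key to an estimated position in a sorted array, then corrects the guess with a bounded search inside the model's worst-case error. The class name and structure are assumptions for illustration, not the paper's implementation.

```python
# Minimal learned-index sketch: a model predicts a key's position in a
# sorted array, and a bounded search corrects the prediction.
import bisect

class LearnedIndex:
    def __init__(self, sorted_keys):
        self.keys = sorted_keys
        n = len(sorted_keys)
        # Fit position ~ slope * key + intercept by ordinary least squares.
        mean_k = sum(sorted_keys) / n
        mean_p = (n - 1) / 2
        cov = sum((k - mean_k) * (i - mean_p) for i, k in enumerate(sorted_keys))
        var = sum((k - mean_k) ** 2 for k in sorted_keys) or 1.0
        self.slope = cov / var
        self.intercept = mean_p - self.slope * mean_k
        # Worst-case prediction error bounds the correction search.
        self.max_err = max(
            abs(self._predict(k) - i) for i, k in enumerate(sorted_keys)
        )

    def _predict(self, key):
        return int(round(self.slope * key + self.intercept))

    def lookup(self, key):
        """Return the position of key in the array, or -1 if absent."""
        guess = self._predict(key)
        lo = max(0, guess - self.max_err)
        hi = min(len(self.keys), guess + self.max_err + 1)
        pos = bisect.bisect_left(self.keys, key, lo, hi)
        return pos if pos < len(self.keys) and self.keys[pos] == key else -1

if __name__ == "__main__":
    idx = LearnedIndex(list(range(0, 1000, 7)))
    assert idx.lookup(49) == 7
    assert idx.lookup(50) == -1
```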

    On controllability of neuronal networks with constraints on the average of control gains

    Control gains play an important role in the control of a natural or a technical system, since they reflect how much resource is required to optimize a given control objective. This paper is concerned with the controllability of neuronal networks subject to constraints on the average value of the control gains injected at the driver nodes, a setting consistent with engineering and biological considerations. To handle the constraints on the control gains, the controllability problem is transformed into a constrained optimization problem (COP). Introducing constraints on the control gains unavoidably makes it substantially harder both to find feasible solutions and to refine them. We therefore develop a modified dynamic hybrid framework (MDyHF) to solve this COP, based on an adaptive differential evolution and the concept of Pareto dominance. By comparing with statistical methods and several recently reported constrained optimization evolutionary algorithms (COEAs), we show that the proposed MDyHF is competitive and promising for studying the controllability of neuronal networks. Using the MDyHF, we then characterize the controlling regions under different levels of constraints. It is revealed that the control gains should be allocated economically when strong constraints are imposed. In addition, we find that as the constraints become more restrictive, the driver nodes are more likely to be selected from the nodes with large degree. The results and methods presented in this paper provide useful insights for developing new techniques to control realistic complex networks efficiently.
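    The sketch below illustrates, under stated assumptions, how an average-gain constraint can be folded into a plain differential-evolution loop with a feasibility-first selection rule; the paper's MDyHF additionally adapts the DE parameters and uses Pareto dominance within a dynamic hybrid framework, none of which is reproduced here. The objective, bounds, and parameter values are hypothetical, and a controllability cost would be supplied as `objective`.

```python
# Plain DE/rand/1/bin with a feasibility-first rule for an average-gain
# constraint. Purely illustrative; not the paper's MDyHF.
import random

def constraint_violation(gains, avg_bound):
    """How far the average control gain exceeds the allowed bound."""
    return max(0.0, sum(gains) / len(gains) - avg_bound)

def better(cand, incumbent, objective, avg_bound):
    """Prefer lower constraint violation, then lower objective value."""
    cv_c, cv_i = (constraint_violation(x, avg_bound) for x in (cand, incumbent))
    if cv_c != cv_i:
        return cv_c < cv_i
    return objective(cand) < objective(incumbent)

def constrained_de(objective, dim, avg_bound, pop_size=30, gens=200,
                   f=0.5, cr=0.9, lo=0.0, hi=10.0):
    pop = [[random.uniform(lo, hi) for _ in range(dim)] for _ in range(pop_size)]
    for _ in range(gens):
        for i in range(pop_size):
            a, b, c = random.sample([p for j, p in enumerate(pop) if j != i], 3)
            j_rand = random.randrange(dim)
            trial = [
                min(hi, max(lo, a[k] + f * (b[k] - c[k])))
                if (random.random() < cr or k == j_rand) else pop[i][k]
                for k in range(dim)
            ]
            if better(trial, pop[i], objective, avg_bound):
                pop[i] = trial
    return min(pop, key=lambda x: (constraint_violation(x, avg_bound), objective(x)))

if __name__ == "__main__":
    # Hypothetical placeholder objective standing in for a controllability cost.
    cost = lambda gains: sum((g - 3.0) ** 2 for g in gains)
    best_gains = constrained_de(cost, dim=8, avg_bound=2.5)
```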