226 research outputs found

    A Novel Architecture for Mobile Distributed Trie Hashing System

    Get PDF
    International audienceScalable and Distributed Data Structures (SDDS) are a class of data structures completely dedicated to distributed environments. They allow the management of large amounts of data while maintaining steady and optimum performances. Several families of SDDS have been proposed: LH*, RP*, DRT*, CTH*. None of these SDDS deals with the mobile environment. In this paper we present a novel architecture that uses a scalable and distributed data structure to manage insert/find/range query operations for mobile clients. We describe the design and the implementation of a mobile CTH* prototype. Our experimental results prove the validity of the design choices and show interesting access performances. The capabilities of the mobile CTH* platform offer new perspectives for high performance and ubiquitous data intensive applications

    A DHT-Based Discovery Service for the Internet of Things

    Get PDF
    Current trends towards the Future Internet are envisaging the conception of novel services endowed with context-aware and autonomic capabilities to improve end users' quality of life. The Internet of Things paradigm is expected to contribute towards this ambitious vision by proposing models and mechanisms enabling the creation of networks of "smart things" on a large scale. It is widely recognized that efficient mechanisms for discovering available resources and capabilities are required to realize such vision. The contribution of this work consists in a novel discovery service for the Internet of Things. The proposed solution adopts a peer-to-peer approach for guaranteeing scalability, robustness, and easy maintenance of the overall system. While most existing peer-to-peer discovery services proposed for the IoT support solely exact match queries on a single attribute (i.e., the object identifier), our solution can handle multiattribute and range queries. We defined a layered approach by distinguishing three main aspects: multiattribute indexing, range query support, peer-to-peer routing. We chose to adopt an over-DHT indexing scheme to guarantee ease of design and implementation principles. We report on the implementation of a Proof of Concept in a dangerous goods monitoring scenario, and, finally, we discuss test results for structural properties and query performance evaluation

    The Case for Learned Index Structures

    Full text link
    Indexes are models: a B-Tree-Index can be seen as a model to map a key to the position of a record within a sorted array, a Hash-Index as a model to map a key to a position of a record within an unsorted array, and a BitMap-Index as a model to indicate if a data record exists or not. In this exploratory research paper, we start from this premise and posit that all existing index structures can be replaced with other types of models, including deep-learning models, which we term learned indexes. The key idea is that a model can learn the sort order or structure of lookup keys and use this signal to effectively predict the position or existence of records. We theoretically analyze under which conditions learned indexes outperform traditional index structures and describe the main challenges in designing learned index structures. Our initial results show, that by using neural nets we are able to outperform cache-optimized B-Trees by up to 70% in speed while saving an order-of-magnitude in memory over several real-world data sets. More importantly though, we believe that the idea of replacing core components of a data management system through learned models has far reaching implications for future systems designs and that this work just provides a glimpse of what might be possible

    A DHT-Based Discovery Service for the Internet of Things

    Get PDF

    ULTRA-FAST AND MEMORY-EFFICIENT LOOKUPS FOR CLOUD, NETWORKED SYSTEMS, AND MASSIVE DATA MANAGEMENT

    Get PDF
    Systems that process big data (e.g., high-traffic networks and large-scale storage) prefer data structures and algorithms with small memory and fast processing speed. Efficient and fast algorithms play an essential role in system design, despite the improvement of hardware. This dissertation is organized around a novel algorithm called Othello Hashing. Othello Hashing supports ultra-fast and memory-efficient key-value lookup, and it fits the requirements of the core algorithms of many large-scale systems and big data applications. Using Othello hashing, combined with domain expertise in cloud, computer networks, big data, and bioinformatics, I developed the following applications that resolve several major challenges in the area. Concise: Forwarding Information Base. A Forwarding Information Base is a data structure used by the data plane of a forwarding device to determine the proper forwarding actions for packets. The polymorphic property of Othello Hashing the separation of its query and control functionalities, which is a perfect match to the programmable networks such as Software Defined Networks. Using Othello Hashing, we built a fast and scalable FIB named \textit{Concise}. Extensive evaluation results on three different platforms show that Concise outperforms other FIB designs. SDLB: Cloud Load Balancer. In a cloud network, the layer-4 load balancer servers is a device that acts as a reverse proxy and distributes network or application traffic across a number of servers. We built a software load balancer with Othello Hashing techniques named SDLB. SDLB is able to accomplish two functionalities of the SDLB using one Othello query: to find the designated server for packets of ongoing sessions and to distribute new or session-free packets. MetaOthello: Taxonomic Classification of Metagenomic Sequences. Metagenomic read classification is a critical step in the identification and quantification of microbial species sampled by high-throughput sequencing. Due to the growing popularity of metagenomic data in both basic science and clinical applications, as well as the increasing volume of data being generated, efficient and accurate algorithms are in high demand. We built a system to support efficient classification of taxonomic sequences using its k-mer signatures. SeqOthello: RNA-seq Sequence Search Engine. Advances in the study of functional genomics produced a vast supply of RNA-seq datasets. However, how to quickly query and extract information from sequencing resources remains a challenging problem and has been the bottleneck for the broader dissemination of sequencing efforts. The challenge resides in both the sheer volume of the data and its nature of unstructured representation. Using the Othello Hashing techniques, we built the SeqOthello sequence search engine. SeqOthello is a reference-free, alignment-free, and parameter-free sequence search system that supports arbitrary sequence query against large collections of RNA-seq experiments, which enables large-scale integrative studies using sequence-level data

    Split keyword fuzzy and synonym search over encrypted cloud data

    Get PDF
    A substitute solution for various organizations of data owners to store their data in the cloud using storage as a service(SaaS). The outsourced sensitive data is encrypted before uploading into the cloud to achieve data privacy. The encrypted data is search based on keywords and retrieve interested files by data user using a lot of traditional Search scheme. Existing search schemes supports exact keyword match or fuzzy keyword search, but synonym based multi-keyword search are not supported. In the real world scenario, cloud users may not know the exact keyword for searching and they might give synonym of the keyword as the input for search instead of exact or fuzzy keyword due to lack of appropriate knowledge of data. In this paper, we describe an efficient search approach for encrypted data called as Split Keyword Fuzzy and Synonym Search (SKFS). Multi-keyword ranked search with accurate keyword and Fuzzy search supports synonym queries are a major contribution of SKFS. The wildcard Technique is used to store the keywords securely within the index tree. Index tree helps to search faster, accurate and low storage cost. Extensive experimental results on real-time data sets shows, the proposed solution is effective and efficient for multi-keyword ranked search and synonym queries Fuzzy based search over encrypted cloud data. © 2017 Springer Science+Business Media, LL
    corecore