227 research outputs found

    Bridging the gap between algorithmic and learned index structures

    Index structures such as B-trees and Bloom filters are the well-established petrol engines of database systems. However, these structures do not fully exploit patterns in the data distribution. To address this, researchers have suggested using machine learning models as electric engines that can entirely replace index structures. Such a paradigm shift in data system design, however, opens many unsolved design challenges: more research is needed to understand the theoretical guarantees and to design efficient support for insertion and deletion. In this thesis, we adopt a different position: index algorithms are good enough, and instead of going back to the drawing board to fit data systems with learned models, we should develop lightweight hybrid engines that build on the benefits of both algorithmic and learned index structures. The indexes that we suggest provide the theoretical performance guarantees and updatability of algorithmic indexes while using position prediction models to leverage the data distribution and thereby improve the performance of the index structure. We investigate the potential for minimal modifications to algorithmic indexes such that they can leverage the data distribution similarly to learned indexes. In this regard, we propose and explore the use of helping models that boost classical index performance using techniques from machine learning. Our suggested approach inherits performance guarantees from its algorithmic baseline index while considering the data distribution to improve performance considerably. We study single-dimensional range indexes, spatial indexes, and stream indexing, and show that the suggested approach yields range indexes that outperform algorithmic indexes and perform comparably to read-only, fully learned indexes, and hence can be reliably used as a default index structure in a database engine. In addition, we consider the updatability of the indexes and suggest solutions for updating the index, notably when the data distribution changes drastically over time (e.g., when indexing data streams). In particular, we propose a specific learning-augmented index for indexing a sliding window with timestamps in a data stream. Additionally, we highlight the limitations of learned indexes for low-latency lookup on real-world data distributions. To tackle this issue, we suggest adding an algorithmic enhancement layer to a learned model to correct the prediction error with a small memory latency. This approach enables efficient modelling of the data distribution and resolves the local biases of a learned model at the cost of roughly one memory lookup.
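    The lookup pattern described here can be made concrete with a small sketch. The following is illustrative only (the class name, the trivial linear model, and the error bookkeeping are assumptions, not the thesis's actual design): a model predicts a key's position, and an algorithmic correction step, a binary search confined to the model's worst-case error window, restores the exactness guarantee.

```python
import bisect

class HybridIndex:
    """Sketch of a learning-augmented index: a linear model predicts a
    position, and a bounded binary search corrects the prediction error.
    Names and the linear model are illustrative, not the thesis's design."""

    def __init__(self, keys):
        self.keys = sorted(keys)
        n = len(self.keys)
        # Fit a trivial linear CDF model: position ~ slope * key + intercept.
        k0, k1 = self.keys[0], self.keys[-1]
        self.slope = (n - 1) / (k1 - k0) if k1 != k0 else 0.0
        self.intercept = -self.slope * k0
        # Record the worst-case prediction error: the algorithmic guarantee.
        self.max_err = max(abs(self._predict(k) - i)
                           for i, k in enumerate(self.keys))

    def _predict(self, key):
        return int(self.slope * key + self.intercept)

    def lookup(self, key):
        pos = self._predict(key)
        # Correct the model within [pos - max_err, pos + max_err]:
        # cost is O(log max_err) after prediction, independent of data size.
        lo = max(0, pos - self.max_err)
        hi = min(len(self.keys), pos + self.max_err + 1)
        i = bisect.bisect_left(self.keys, key, lo, hi)
        return i if i < len(self.keys) and self.keys[i] == key else None
```

    Because the search window depends only on the model's maximum error, the correction cost stays bounded even when the model fits the distribution poorly, which is the guarantee the thesis attributes to the algorithmic baseline.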

    Fast simulation methods to predict wireless sensor network performance


    Analysis domain model for shared virtual environments

    The field of shared virtual environments, which also encompasses online games and social 3D environments, has a system landscape consisting of multiple solutions with great functional overlap, yet little interoperability between them. A shared virtual environment has an associated problem domain of high complexity, which raises difficult challenges for the development process, starting with the architectural design of the underlying system. This paper makes two main contributions. The first is a broad domain analysis of shared virtual environments, which enables developers to understand the whole rather than only the part(s). The second is a reference domain model for discussing and describing solutions: the Analysis Domain Model.

    AirIndex: Versatile Index Tuning Through Data and Storage

    The end-to-end lookup latency of a hierarchical index -- such as a B-tree or a learned index -- is determined by its structure: the number of layers, the kinds of branching functions appearing in each layer, the amount of data fetched from each layer, and so on. Our primary observation is that by optimizing those structural parameters (or designs) specifically for a target system's I/O characteristics (e.g., latency, bandwidth), we can offer faster lookups than unoptimized designs. Can we develop a systematic method for finding those optimal design parameters? Ideally, the method should be able to generate almost any existing index, or a novel combination of them, for the fastest possible lookup. In this work, we present a new data- and I/O-aware index builder (called AirIndex) that can find high-speed hierarchical index designs in a principled way. Specifically, AirIndex minimizes an objective function expressing the end-to-end latency in terms of various designs -- the number of layers, types of layers, and more -- for given data and a storage profile, using a graph-based optimization method purpose-built to address the computational challenges arising from the inter-dependencies among index layers and the exponentially many candidate parameters in a large search space. Our empirical studies confirm that AirIndex can find optimal index designs, build optimal indexes within times comparable to existing methods, and deliver up to 4.1x faster lookup than a lightweight B-tree library (LMDB), 3.3x--46.3x faster than state-of-the-art learned indexes (RMI/CDFShop, PGM-Index, ALEX/APEX, PLEX), and 2.0x faster than Data Calculator's suggestion across various dataset and storage settings.
    Comment: 13 pages, 3 appendices, 19 figures, to appear at SIGMOD 202
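    The abstract only states that AirIndex minimizes an end-to-end latency objective over candidate designs; the toy sketch below illustrates the flavor of that optimization under assumed simplifications (uniform fanout per layer, an affine storage model latency = base + bytes/bandwidth, and made-up device numbers). It is not AirIndex's actual objective or search method.

```python
import math

def storage_latency(nbytes, base_s, bandwidth_bps):
    # Affine I/O model: one round trip plus transfer time.
    return base_s + nbytes / bandwidth_bps

def lookup_latency(n_keys, n_layers, entry_size, base_s, bandwidth_bps):
    # With n_layers uniform layers, each node holds ~n_keys**(1/n_layers)
    # entries; a lookup fetches one node per layer.
    fanout = math.ceil(n_keys ** (1.0 / n_layers))
    return n_layers * storage_latency(fanout * entry_size, base_s, bandwidth_bps)

def best_design(n_keys, entry_size, base_s, bandwidth_bps, max_layers=8):
    # Exhaustively score candidate layer counts and keep the cheapest,
    # standing in for AirIndex's graph-based search over a far larger space.
    return min(range(1, max_layers + 1),
               key=lambda L: lookup_latency(n_keys, L, entry_size,
                                            base_s, bandwidth_bps))

# Example: 100M keys, 16-byte entries, on hypothetical NVMe vs. cloud profiles.
for name, base_s, bw in [("nvme", 100e-6, 2e9), ("cloud", 10e-3, 100e6)]:
    print(name, "-> layers:", best_design(100_000_000, 16, base_s, bw))
```

    Even this toy version reproduces the paper's premise: the cheapest layer count differs between a low-latency NVMe profile (more, smaller layers) and a high-latency cloud-storage profile (fewer round trips).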

    THash: A Practical Network Optimization Scheme for DHT-based P2P Applications

    P2P platforms have been criticized for the heavy strain they can inflict on the costly inter-domain links of network operators. It is therefore essential to develop network optimization schemes that control the load a P2P platform generates on an operator network. While much research exists on centralized, tracker-based systems, in recent years multiple DHT-based P2P platforms have been widely deployed as commercial services due to their scalability and fault tolerance. Network optimization for DHT-based P2P applications therefore has potentially large practical impact. In this paper, we present THash, a simple scheme that implements distributed and effective network optimization for DHT systems. THash uses standard DHT put/get semantics and a triple hash method to guide DHT clients to choose their sharing peers in proper domains. We have implemented THash in a major commercial P2P system (PPLive), using the standard ALTO/P4P protocol as the network information source. We conducted experiments over this network in real operation and observed that, compared with native DHT, THash reduced inter-PID and inter-AS traffic by 47.4% and 67.7% respectively, while reducing the average downloading time by 14.6% to 24.5%.
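    The abstract does not spell out the triple hash construction, so the sketch below is one plausible reading (an in-memory dict stands in for the DHT, and the key derivation is hypothetical): peers are published under three keys of decreasing locality, so standard put/get operations naturally return same-PID peers first, then same-AS peers, then global ones.

```python
import hashlib

def h(*parts):
    # Stable DHT key derived from the concatenated parts (sketch only).
    return hashlib.sha1("/".join(str(p) for p in parts).encode()).hexdigest()

def publish(dht, content_id, peer, pid, asn):
    # Announce the peer under three keys of decreasing locality.
    for key in (h(content_id, "pid", pid), h(content_id, "as", asn), h(content_id)):
        dht.setdefault(key, []).append(peer)

def select_peers(dht, content_id, pid, asn, want=20):
    peers = []
    # Prefer intra-PID candidates, then intra-AS, then global ones,
    # so most sharing traffic stays inside the operator's domain.
    for key in (h(content_id, "pid", pid), h(content_id, "as", asn), h(content_id)):
        for p in dht.get(key, []):
            if p not in peers:
                peers.append(p)
            if len(peers) >= want:
                return peers
    return peers
```

    In this reading, the PID and AS labels for each client would come from an information source such as ALTO/P4P, consistent with the deployment the paper describes.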

    The Single Event Effect Characteristics of the 486-DX4 Microprocessor

    This research describes the development of an experimental radiation testing environment to investigate the single event effect (SEE) susceptibility of the 486-DX4 microprocessor. SEEs are caused by radiation particles that disrupt the logic state of an operating semiconductor, and include single event upsets (SEU) and single event latchup (SEL). This work applies directly to digital devices used in spaceflight computer systems. The 486-DX4 is a powerful commercial microprocessor currently under consideration for use in several spaceflight systems. As part of its selection process, it must be rigorously tested to determine its overall reliability in the space environment, including its radiation susceptibility. The goal of this research is to experimentally test and characterize the single event effects of the 486-DX4 microprocessor using a cyclotron facility as the fault-injection source. The test philosophy focuses on "operational susceptibility": executing real software and monitoring for errors while the device is under irradiation. This research encompasses both experimental and analytical techniques, and yields a characterization of the 486-DX4's behavior for different operating modes. Additionally, the test methodology can accommodate a wide range of digital devices, such as microprocessors, microcontrollers, ASICs, and memory modules, for future testing. The goals were achieved by testing with three heavy-ion species to provide different linear energy transfer rates; a total of six microprocessor parts from two different vendors were tested. A consistent set of error modes was identified, indicating the manner in which errors were detected in the processor. Upset cross-section curves were calculated for each error mode, and the SEU threshold and saturation levels were identified for each processor. Results show a distinct difference in upset rate for different configurations of the on-chip cache, and show that one vendor is superior to the other in terms of latchup susceptibility. Results from this testing were also used to provide a mean-time-between-failure estimate for the 486-DX4 operating in the radiation environment of the International Space Station.
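    For readers unfamiliar with the reported quantities: an upset cross-section is the number of observed upsets normalized by particle fluence, and the threshold and saturation levels are conventionally read off a Weibull fit of cross-section versus LET. A sketch of those standard relations follows (the function and parameter names are illustrative, not taken from this work):

```python
import math

def upset_cross_section(n_upsets, fluence_per_cm2):
    # Device SEU cross-section (cm^2): upsets per unit particle fluence.
    return n_upsets / fluence_per_cm2

def weibull_cross_section(let, sigma_sat, let_threshold, width, shape):
    # Standard Weibull form for cross-section vs. LET curves:
    # sigma_sat is the saturation cross-section, let_threshold the onset LET.
    if let <= let_threshold:
        return 0.0
    return sigma_sat * (1 - math.exp(-((let - let_threshold) / width) ** shape))
```

    Fitting the Weibull parameters to the measured per-error-mode cross-sections is what yields the threshold and saturation levels the abstract reports.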

    WAIT: Selective Loss Recovery for Multimedia Multicast.

    Recently the Internet has been increasingly used for multi-party applications such as video-conferencing, video-on-demand, and shared white-boards. Multicast extensions to IP that support multi-party applications are best effort, often resulting in packet loss within the network. Since some multicast applications cannot tolerate packet loss, most existing reliable multicast schemes recover each and every lost packet. Multimedia applications, however, can tolerate a certain amount of packet loss and are sensitive to long recovery delays. We propose a new loss recovery technique that selectively repairs lost packets based on the amount of packet loss and the delay expected for the repair. When a loss is detected, our technique sends a special WAIT message down the multicast tree to reduce the number of retransmission requests. We also propose an efficient sender-initiated multicast trace-route mechanism for determining the multicast topology, and a mechanism to deliver the topology information to the multicast session participants. We evaluate our proposed technique using an event-driven network simulator, comparing it with two popular reliable multicast protocols, SRM and PGM. We conclude that our proposed WAIT protocol can reduce the overhead of a multicast session as well as improve the average end-to-end latency of the session.
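    A minimal sketch of the receiver-side behavior as described (the class name, timing constants, and callback interface are assumptions): a loss is repaired only if the expected repair can beat the playback deadline, and a WAIT message from upstream suppresses a pending retransmission request.

```python
import time

class WaitReceiver:
    """Sketch of WAIT-style selective loss recovery at a receiver."""

    def __init__(self, send_nack, playout_slack_s, est_repair_delay_s):
        self.send_nack = send_nack            # callback up the multicast tree
        self.slack = playout_slack_s          # how long playback can wait
        self.repair_delay = est_repair_delay_s
        self.hold_until = {}                  # seq -> suppress-until timestamp

    def on_loss_detected(self, seq):
        if self.repair_delay > self.slack:
            return  # selective recovery: the repair would arrive too late
        if time.monotonic() < self.hold_until.get(seq, 0.0):
            return  # a WAIT message said a repair is already under way
        self.send_nack(seq)

    def on_wait_message(self, seq, hold_s):
        # An upstream node is recovering seq: hold our own NACK for hold_s,
        # which is how WAIT cuts the volume of retransmission requests.
        self.hold_until[seq] = time.monotonic() + hold_s
```

    The SRM-style randomized NACK backoff that normally accompanies such suppression is omitted here for brevity.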

    Development of Uclinux Platform for Computer Vision Algorithm in FPGA Devices

    This paper describes the use of the Xilinx MicroBlaze 32-bit, soft-core processor in a series of senior design projects. The MicroBlaze was implemented on a commercial off-the-shelf, FPGA-based single board computer. The FPGA is pre-configured with the MicroBlaze running a version of Linux called uClinux. Using this platform, students can develop custom hardware that interfaces to the MicroBlaze over the OPB bus, and custom software that runs as a thread (or threads) on uClinux. A computer vision algorithm based on a Sobel filter is implemented on the MicroBlaze system to measure the resources used by the application.
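    As a host-side reference for the kind of kernel these projects run on the MicroBlaze, here is a straightforward Sobel gradient in Python (the actual student implementations target C on uClinux and custom OPB hardware; this sketch only shows the arithmetic):

```python
import numpy as np

# 3x3 Sobel kernels for horizontal and vertical gradients.
KX = np.array([[-1, 0, 1],
               [-2, 0, 2],
               [-1, 0, 1]], dtype=np.int32)
KY = KX.T

def sobel(gray):
    """Gradient magnitude of a 2-D grayscale image (uint8 in, uint8 out)."""
    h, w = gray.shape
    out = np.zeros((h, w), dtype=np.int32)
    g = gray.astype(np.int32)
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            win = g[y - 1:y + 2, x - 1:x + 2]
            gx = int((win * KX).sum())
            gy = int((win * KY).sum())
            # |gx| + |gy| is the usual cheap stand-in for sqrt(gx^2 + gy^2)
            # on processors without fast floating point.
            out[y, x] = min(255, abs(gx) + abs(gy))
    return out.astype(np.uint8)
```

    The absolute-value approximation of the gradient magnitude is the common choice on soft-core processors, since it avoids multiplies and square roots.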

    High Availability and Scalability of Mainframe Environments using System z and z/OS as example

    Mainframe computers are the backbone of industrial and commercial computing, hosting the most relevant and critical data of businesses. One of the most important mainframe environments is IBM System z with the operating system z/OS. This book introduces the mainframe technology of System z and z/OS with respect to high availability and scalability. It highlights their presence at different levels within the hardware and software stack to satisfy the needs of large IT organizations.