
    Continuous Nearest Neighbor Queries over Sliding Windows

    Abstract—This paper studies continuous monitoring of nearest neighbor (NN) queries over sliding window streams. According to this model, data points continuously stream in the system, and they are considered valid only while they belong to a sliding window that contains 1) the W most recent arrivals (count-based) or 2) the arrivals within a fixed interval W covering the most recent time stamps (time-based). The task of the query processor is to constantly maintain the result of long-running NN queries among the valid data. We present two processing techniques that apply to both count-based and time-based windows. The first one adapts conceptual partitioning, the best existing method for continuous NN monitoring over update streams, to the sliding window model. The second technique reduces the problem to skyline maintenance in the distance-time space and precomputes the future changes in the NN set. We analyze the performance of both algorithms and extend them to variations of NN search. Finally, we compare their efficiency through a comprehensive experimental evaluation. The skyline-based algorithm achieves lower CPU cost, at the expense of slightly larger space overhead. Index Terms—Location-dependent and sensitive, spatial databases, query processing, nearest neighbors, data streams, sliding windows.
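    The reduction to skyline maintenance in the distance-time space can be illustrated for the simplest case: a single nearest neighbor over a count-based window. This is a minimal sketch under stated assumptions, not the paper's algorithm: the class name `SlidingWindowNN` and its methods are hypothetical, distances are Euclidean, and only 1-NN is handled. A point is dominated when some later arrival is at least as close to the query, because that later arrival outlives it and always beats it. The surviving points form a skyline whose oldest member is the current NN.

```python
import math
from collections import deque


class SlidingWindowNN:
    """1-NN monitoring over the W most recent arrivals (count-based window).

    Hypothetical illustration: keeps only the skyline of valid points in the
    (arrival index, distance) space.
    """

    def __init__(self, query, window_size):
        self.query = query
        self.window_size = window_size
        self.arrivals = 0
        # Entries are (arrival_index, distance, point). From front to back,
        # arrival indices and distances are both strictly increasing.
        self.skyline = deque()

    def insert(self, point):
        """Process one arrival and return the current nearest neighbor."""
        dist = math.dist(self.query, point)
        idx = self.arrivals
        self.arrivals += 1

        # Older points that are at least as far as the new one are dominated:
        # they expire earlier and can never become the NN again.
        while self.skyline and self.skyline[-1][1] >= dist:
            self.skyline.pop()
        self.skyline.append((idx, dist, point))

        # Drop skyline points that have slid out of the window.
        while self.skyline[0][0] <= idx - self.window_size:
            self.skyline.popleft()

        # The front entry is the closest valid point.
        return self.skyline[0][2]


if __name__ == "__main__":
    monitor = SlidingWindowNN(query=(0.0, 0.0), window_size=3)
    for p in [(5.0, 0.0), (1.0, 0.0), (3.0, 0.0), (4.0, 0.0), (6.0, 0.0)]:
        print(p, "->", monitor.insert(p))
```

    The remaining skyline entries behind the front are the precomputed future changes: when the current NN expires, the next entry takes over without rescanning the window.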

    Performances of Two Prototypes of Log Extraction Techniques Using the Skyline System

    Timber extraction from the felling area to the roadside is not an easy job. The activity faces a number of difficulties, particularly due to geo-biophysical conditions such as steep terrain, uphill and/or downhill slopes, valleys or rivers to be crossed, and slippery roads, as well as the size of the timber and low accessibility. To address those obstacles, two engineering designs of the skyline system were tried: Expo-2000 Generation-1 (G-1), using a 6 HP gasoline engine, and Expo-2000 Generation-3 (G-3), using a 12 HP diesel engine. The G-1 model was tested in Cimeong and Rancaparang in 2011. The G-3 model was examined in Cibatu Canjur and Cibaliung Banten in 2013. This paper evaluates the modification of the skyline system for steep terrain and compares the performance of the two modified skyline systems in terms of productivity and cost. The data collected included working time, log volume extracted, log extraction distance, and fuel used. Data were analyzed to obtain the average productivity and cost of operation. Results show that prototype G-3, with logs in a horizontal position at a distance of 130-430 m, can extract logs averaging 1.72 m³/hr at a cost of about Rp 80,346/m³, while prototype G-1, with logs in a vertical position at a distance of about 50-320 m, could extract only about 0.85 m³/hr at a cost of about Rp 156,351/m³. This suggests that the Expo-2000 G-3 prototype is more effective for log extraction in steep terrain.

    Integrating OLAP and Ranking: The Ranking-Cube Methodology

    Recent years have witnessed an enormous growth of data in business, industry, and Web applications. Database search often returns a large collection of results, which poses challenges to both efficient query processing and effective digestion of the query results. To address this problem, ranked search has been introduced to database systems. We study the problem of On-Line Analytical Processing (OLAP) of ranked queries, where ranked queries are conducted on an arbitrary subset of data defined by multi-dimensional selections. While pre-computation and multi-dimensional aggregation are the standard solution for OLAP, materializing dynamic ranking results is unrealistic because the ranking criteria are not known until query time. To overcome this difficulty, in this thesis we develop a new ranking cube method that performs semi-online materialization and semi-online computation. Its complete life cycle, including cube construction, incremental maintenance, and query processing, is also discussed. We further extend the ranking cube in three directions: first, how to answer queries over high-dimensional data; second, how to answer queries that involve joins over multiple relations; third, how to answer general preference queries beyond ranked queries, such as skyline queries. Our performance studies show that the ranking cube is orders of magnitude faster than previous approaches.

    Investigating Lactococcus lactis MG1363 response to phage p2 infection at the proteome level

    Phages are viruses that specifically infect and eventually kill their bacterial hosts. Bacterial fermentation and biotechnology industries see them as enemies; however, they are also investigated as antibacterial agents for the treatment or prevention of bacterial infections in various sectors. They also play key ecological roles in all ecosystems. Despite decades of research, some aspects of phage biology are still poorly understood. In this study, we used label-free quantitative proteomics to reveal the proteotypes of Lactococcus lactis MG1363 during infection by the virulent phage p2, a model for studying the biology of phages infecting Gram-positive bacteria. Our approach resulted in the high-confidence detection and quantification of 59% of the theoretical bacterial proteome, including 226 bacterial proteins detected only during phage infection and 6 proteins unique to uninfected bacteria. We also identified many bacterial proteins of differing abundance during the infection. Using these high-throughput proteomic datasets, we selected specific bacterial genes for inactivation using CRISPR-Cas9 to investigate their involvement in phage replication. One knockout mutant lacking gene llmg_0219 showed resistance to phage p2 because of a deficiency in phage adsorption. Furthermore, we detected and quantified 78% of the theoretical phage proteome and identified many proteins of phage p2 that had not been previously detected. Among others, we uncovered a conserved small phage protein (pORFN1) coded by an unannotated gene. We also applied a targeted approach to achieve greater sensitivity and identify undetected phage proteins that were expected to be present. This allowed us to follow the fate of pORF46, a small phage protein of low abundance. In summary, this work offers a unique view of the virulent phages' takeover of bacterial cells and provides novel information on phage-host interactions.

    Dynamic Geometric Data Structures via Shallow Cuttings

    We present new results on a number of fundamental problems about dynamic geometric data structures: 1) We describe the first fully dynamic data structures with sublinear amortized update time for maintaining (i) the number of vertices or the volume of the convex hull of a 3D point set, (ii) the largest empty circle for a 2D point set, (iii) the Hausdorff distance between two 2D point sets, (iv) the discrete 1-center of a 2D point set, (v) the number of maximal (i.e., skyline) points in a 3D point set. The update times are near n^{11/12} for (i) and (ii), n^{7/8} for (iii) and (iv), and n^{2/3} for (v). Previously, sublinear bounds were known only for restricted "semi-online" settings [Chan, SODA 2002]. 2) We slightly improve previous fully dynamic data structures for answering extreme point queries for the convex hull of a 3D point set and nearest neighbor search for a 2D point set. The query time is O(log^2 n), and the amortized update time is O(log^4 n) instead of O(log^5 n) [Chan, SODA 2006; Kaplan et al., SODA 2017]. 3) We also improve previous fully dynamic data structures for maintaining the bichromatic closest pair between two 2D point sets and the diameter of a 2D point set. The amortized update time is O(log^4 n) instead of O(log^7 n) [Eppstein 1995; Chan, SODA 2006; Kaplan et al., SODA 2017].

    Supporting Multi-Criteria Decision Support Queries over Disparate Data Sources

    In the era of the big data revolution, marked by an exponential growth of information, extracting value from data enables analysts and businesses to address challenging problems such as drug discovery, fraud detection, and earthquake prediction. Multi-Criteria Decision Support (MCDS) queries are at the core of big-data analytics, resulting in several classes of MCDS queries such as OLAP, Top-K, Pareto-optimal, and nearest neighbor queries. The intuitive nature of specifying multi-dimensional preferences has made Pareto-optimal queries, also known as skyline queries, popular. Existing skyline algorithms, however, do not address several crucial issues such as performing skyline evaluation over disparate sources, progressively generating skyline results, or robustly handling workloads with multiple skyline-over-join queries. In this dissertation, we thoroughly investigate topics in the area of skyline-aware query evaluation. We first propose a novel execution framework called SKIN that treats skylines over joins as first-class citizens during query processing. This is in contrast to existing techniques that treat skylines as an add-on, loosely integrated with query processing by being placed on top of the query plan. SKIN is effective in exploiting the skyline characteristics of the tuples within individual data sources as well as across disparate sources. This enables SKIN to significantly reduce two primary costs, namely the cost of generating the join results and the cost of skyline comparisons to compute the final results. Second, we address the crucial business need to report results early, as soon as they are generated, so that users can formulate competitive decisions in near real-time. On top of SKIN, we built a progressive query evaluation framework, ProgXe, that makes the execution of queries involving skyline over joins non-blocking, i.e., results are generated progressively, early and often. By exploiting SKIN's principle of processing queries at multiple levels of abstraction, ProgXe is able to: (1) extract the output dependencies in the output space by analyzing both the input and output space, and (2) exploit this knowledge of abstract-level relationships to guarantee correctness of early output. Third, real-world applications handle query workloads with diverse Quality of Service (QoS) requirements, also referred to as contracts. Time-sensitive queries, such as fraud detection, require results to be output progressively with minimal delay, while ad-hoc and reporting queries can tolerate delay. Building on the principles of ProgXe, we propose the Contract-Aware Query Execution (CAQE) framework to address the open problem of contract-driven multi-query processing. CAQE employs an adaptive execution strategy to continuously monitor the run-time satisfaction of queries and aggressively take corrective steps whenever the contracts are not being met. Lastly, to demonstrate the portability of the core principle of this dissertation, namely reasoning and query processing at different levels of data abstraction, we apply it to an orthogonal research question: auto-generating recommendation queries that help users explore a complex database system. User queries are often too strict or too broad, requiring a frustrating trial-and-error refinement process to meet the desired result cardinality while preserving the original query semantics. Based on the principles of SKIN, we propose CAPRI to automatically generate refined queries that: (1) attain the desired cardinality and (2) minimize changes to the original query intentions. In our comprehensive experimental study of each part of this dissertation, we demonstrate the superiority of the proposed strategies over state-of-the-art techniques in both efficiency and resource consumption.
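    For readers unfamiliar with the Pareto-optimal (skyline) queries this abstract builds on, the following is a minimal sketch of the basic operator over a single in-memory relation. It is a naive nested-loop style computation, not SKIN, ProgXe, or any framework from the dissertation. The function names `dominates` and `skyline`, the sample data, and the convention that smaller values are preferred in every dimension are assumptions made for illustration.

```python
def dominates(a, b):
    """Return True if tuple a dominates tuple b.

    Assumes smaller is better in every dimension: a dominates b when it is
    no worse everywhere and strictly better in at least one dimension.
    """
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def skyline(points):
    """Return the points not dominated by any other point, in input order."""
    result = []
    for p in points:
        # Skip p if a point already in the window dominates it.
        if any(dominates(s, p) for s in result):
            continue
        # Otherwise p evicts every window point it dominates, then joins.
        result = [s for s in result if not dominates(p, s)]
        result.append(p)
    return result


if __name__ == "__main__":
    # Hypothetical (price, distance) pairs; lower is better for both.
    hotels = [(50, 8), (80, 2), (60, 9), (90, 1), (80, 3)]
    print(skyline(hotels))
```

    Each tuple may be compared against every current skyline member, which is the skyline-comparison cost the abstract names as one of the two primary costs, alongside the cost of generating join results.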

    Description of remote control cable yarding systems and an evaluation of the Forestral Remote Control Grapple Yarding System
