
    Content-Aware DataGuides for Indexing Large Collections of XML Documents

    XML is well-suited for modelling structured data with textual content. However, most indexing approaches perform structure and content matching independently, combining the retrieved path and keyword occurrences in a third step. This paper shows that retrieval in XML documents can be accelerated significantly by processing text and structure simultaneously during all retrieval phases. To this end, the Content-Aware DataGuide (CADG) enhances the well-known DataGuide with (1) simultaneous keyword and path matching and (2) a precomputed content/structure join. Extensive experiments prove the CADG to be 50-90% faster than the DataGuide for various sorts of queries and documents, including difficult cases such as poorly structured queries and recursive document paths. A new query classification scheme identifies precise query characteristics with a predominant influence on the performance of the individual indices. The experiments show that the CADG is applicable to many real-world applications, in particular large collections of heterogeneously structured XML documents.
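    As a rough illustration of the idea, the sketch below pairs a DataGuide-like path summary with per-node keyword postings, so that path and keyword matching happen in a single traversal and the content/structure join is materialized at indexing time. Class and method names are hypothetical; this is a minimal sketch, not the paper's implementation.

    ```python
    from collections import defaultdict

    class PathNode:
        def __init__(self):
            self.children = {}                # tag -> PathNode
            self.postings = defaultdict(set)  # keyword -> document ids

    class ContentAwareGuide:
        """Toy path summary with a precomputed content/structure join."""
        def __init__(self):
            self.root = PathNode()

        def index(self, doc_id, path, text):
            # Walk (and extend) the path summary, then attach the keywords
            # of this text node directly to the path's summary node.
            node = self.root
            for tag in path:
                node = node.children.setdefault(tag, PathNode())
            for word in text.lower().split():
                node.postings[word].add(doc_id)

        def query(self, path, keyword):
            # Path and keyword are matched together; no separate join step.
            node = self.root
            for tag in path:
                node = node.children.get(tag)
                if node is None:
                    return set()
            return node.postings.get(keyword.lower(), set())

    guide = ContentAwareGuide()
    guide.index("doc1", ["book", "title"], "XML indexing in practice")
    print(guide.query(["book", "title"], "indexing"))  # {'doc1'}
    ```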

    Approximate MIMO Iterative Processing with Adjustable Complexity Requirements

    Always targeting the best achievable bit error rate (BER) performance in iterative receivers operating over multiple-input multiple-output (MIMO) channels may waste significant resources, especially when the achievable BER is orders of magnitude better than the target performance (e.g., under good channel conditions and at high signal-to-noise ratio (SNR)). In contrast to typical iterative schemes, a practical iterative decoding framework is proposed that approximates the soft-information exchange, allowing reduced-complexity sphere and channel decoding that is adjustable to the transmission conditions and the required bit error rate. With the proposed approximate soft-information exchange, the performance of exact soft-information processing can still be reached, with significant complexity gains. Comment: The final version of this paper appears in IEEE Transactions on Vehicular Technology.
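    For intuition only, a standard way to trade soft-information fidelity for complexity is the max-log approximation of a bit LLR, which keeps only the best candidate on each bit hypothesis instead of summing over all of them. The sketch below shows that generic technique, not the specific approximation scheme proposed in the paper.

    ```python
    import numpy as np

    def llr_exact(metrics_bit0, metrics_bit1):
        """Exact bit LLR via log-sum-exp over candidate branch metrics."""
        return (np.log(np.sum(np.exp(-metrics_bit0)))
                - np.log(np.sum(np.exp(-metrics_bit1))))

    def llr_maxlog(metrics_bit0, metrics_bit1):
        """Max-log LLR: retain only the best candidate per hypothesis."""
        return np.min(metrics_bit1) - np.min(metrics_bit0)

    m0 = np.array([1.2, 3.4, 2.8])  # metrics of candidates with bit = 0
    m1 = np.array([0.9, 4.1, 5.0])  # metrics of candidates with bit = 1
    print(llr_exact(m0, m1), llr_maxlog(m0, m1))  # approximation tracks exact
    ```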

    Multidimensional Range Queries on Modern Hardware

    Range queries over multidimensional data are an important part of database workloads in many applications. Their execution may be accelerated by using multidimensional index structures (MDIS), such as kd-trees or R-trees. As for most index structures, the usefulness of this approach depends on the selectivity of the queries, and common wisdom held that a simple scan beats MDIS for queries accessing more than 15-20% of a dataset. However, this wisdom is largely based on evaluations that are almost two decades old, performed on data held on disk, using IO-optimized data structures and single-core systems. The question is whether this rule of thumb still holds when multidimensional range queries (MDRQ) are performed on modern architectures with large main memories holding all data, multi-core CPUs, and data-parallel instruction sets. In this paper, we study whether, and by how much, modern hardware changes the performance ratio between index structures and scans for MDRQ. To this end, we conservatively adapted three popular MDIS, namely the R*-tree, the kd-tree, and the VA-file, to exploit features of modern servers and compared their performance to different flavors of parallel scans using multiple (synthetic and real-world) analytical workloads over multiple (synthetic and real-world) datasets of varying size, dimensionality, and skew. We find that all approaches benefit considerably from using main memory and parallelization, yet to varying degrees. Our evaluation indicates that, on current machines, scanning should be favored over parallel versions of classical MDIS even for very selective queries.
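    The "scan" baseline in such comparisons is easy to picture: one data-parallel pass that evaluates every dimension predicate with vectorized comparisons. A minimal sketch, using NumPy as a stand-in for SIMD execution; the dataset and query box are synthetic.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    data = rng.random((1_000_000, 4))    # 1M points, 4 dimensions

    lo = np.array([0.2, 0.2, 0.2, 0.2])  # query box lower corner
    hi = np.array([0.4, 0.4, 0.4, 0.4])  # query box upper corner

    # A single pass over all rows: every dimension predicate is checked
    # with data-parallel comparisons and folded into one selection mask.
    mask = np.all((data >= lo) & (data <= hi), axis=1)
    print(mask.sum(), "matching points")
    ```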

    Search algorithms as a framework for the optimization of drug combinations

    Combination therapies are often needed for effective clinical outcomes in the management of complex diseases, but at present they are generally based on empirical clinical experience. Here we suggest a novel application of search algorithms, originally developed for digital communication, modified to optimize combinations of therapeutic interventions. In biological experiments measuring the restoration of the age-related decline in heart function and exercise capacity in Drosophila melanogaster, we found that search algorithms correctly identified optimal combinations of four drugs with only one third of the tests performed in a fully factorial search. In experiments identifying combinations of three doses of up to six drugs for selective killing of human cancer cells, search algorithms resulted in a highly significant enrichment of selective combinations compared with random searches. In simulations using a network model of cell death, we found that the search algorithms identified the optimal combinations of 6-9 interventions in 80-90% of tests, compared with 15-30% for an equivalent random search. These findings suggest that modified search algorithms from information theory have the potential to enhance the discovery of novel therapeutic drug combinations. This report also helps to frame a biomedical problem that will benefit from an interdisciplinary effort, and suggests a general strategy for its solution. Comment: 36 pages, 10 figures, revised version.
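    To make the efficiency argument concrete, the sketch below runs a generic greedy coordinate search over dose levels, improving one drug at a time; after a couple of rounds it has evaluated far fewer combinations than a fully factorial sweep. The response function is a made-up stand-in for a biological assay, and this is not the specific search algorithm used in the paper.

    ```python
    import random

    DRUGS, DOSES = 4, 3  # 4 drugs, 3 dose levels each

    def evaluate(combo):
        # Hypothetical response surface standing in for a wet-lab assay.
        target = (2, 0, 1, 2)
        return -sum((c - t) ** 2 for c, t in zip(combo, target))

    def coordinate_search(rounds=2):
        combo = [random.randrange(DOSES) for _ in range(DRUGS)]
        tests = 0
        for _ in range(rounds):
            for drug in range(DRUGS):
                options = [combo[:drug] + [d] + combo[drug + 1:]
                           for d in range(DOSES)]
                tests += len(options)
                combo = max(options, key=lambda c: evaluate(tuple(c)))
        return tuple(combo), tests

    best, tests = coordinate_search()
    print(best, "found in", tests, "tests vs", DOSES ** DRUGS, "factorial")
    ```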

    The STRESS Method for Boundary-point Performance Analysis of End-to-end Multicast Timer-Suppression Mechanisms

    Evaluation of Internet protocols usually uses random scenarios or scenarios based on designers' intuition. Such an approach may be useful for average-case analysis but does not cover boundary-point (worst- or best-case) scenarios. To synthesize boundary-point scenarios, a more systematic approach is needed. In this paper, we present a method for automatic synthesis of worst- and best-case scenarios for protocol boundary-point evaluation. Our method uses a fault-oriented test generation (FOTG) algorithm for searching the protocol and system state space to synthesize these scenarios. The algorithm is based on a global finite state machine (FSM) model. We extend the algorithm with timing semantics to handle end-to-end delays and address performance criteria. We introduce the notion of a virtual LAN to represent delays of the underlying multicast distribution tree. The algorithms used in our method employ implicit backward search, using branch-and-bound techniques and starting from given target events, which aims to reduce the search complexity drastically. As a case study, we use our method to evaluate variants of the timer-suppression mechanism, used in various multicast protocols, with respect to two performance criteria: overhead of response messages and response time. Simulation results for reliable multicast protocols show that our method provides a scalable way of synthesizing worst-case scenarios automatically. Results obtained using stress scenarios differ dramatically from those obtained through average-case analyses. We hope that our method will serve as a model for applying systematic scenario generation to other multicast protocols. Comment: 24 pages, 10 figures, IEEE/ACM Transactions on Networking (ToN) [To appear].
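    As a toy illustration of backward search from a target event, the sketch below enumerates event sequences that drive a small, hypothetical protocol FSM into a target state, with a depth bound standing in for branch-and-bound pruning. It is a minimal sketch of the search style, not the paper's FOTG algorithm.

    ```python
    from collections import deque

    # Reversed edges of a toy protocol FSM: for each state, the
    # (event, predecessor) pairs that could have led into it.
    reverse_edges = {
        "timeout":   [("lose_ack", "waiting")],
        "waiting":   [("send_request", "idle"), ("suppress", "listening")],
        "listening": [("hear_request", "idle")],
        "idle":      [],
    }

    def scenarios_leading_to(target, start="idle", max_depth=4):
        """Enumerate event sequences driving the FSM into `target`."""
        results, queue = [], deque([(target, [])])
        while queue:
            state, suffix = queue.popleft()
            if state == start and suffix:
                results.append(list(reversed(suffix)))
            if len(suffix) >= max_depth:
                continue  # bound: prune branches that grow too deep
            for event, pred in reverse_edges[state]:
                queue.append((pred, suffix + [event]))
        return results

    for scenario in scenarios_leading_to("timeout"):
        print(" -> ".join(scenario))
    ```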