1,905 research outputs found

    Expressions for Batched Searching of Sequential and Hierarchical Files

    Batching yields significant savings in access costs in sequential, tree-structured, and random files. A direct and simple expression is developed for computing the average number of records/pages accessed to satisfy a batched query of a sequential file. The advantages of batching for sequential and random files are discussed. A direct equation is provided for the number of nodes accessed in unbatched queries of hierarchical files. An exact recursive expression is developed for node accesses in batched queries of hierarchical files. In addition to the recursive relationship, good, closed-form upper- and lower-bound approximations are provided for the case of batched queries of hierarchical files.
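
    The abstract does not reproduce the expressions themselves; as a rough illustration of what such a formula looks like, the sketch below (Python) uses the classical Yao-style estimate of the expected number of pages touched by a batched query of a sequential file, assuming n records spread evenly over m pages and a uniformly random batch of k requested records. The function name and parameters are illustrative and may differ from the paper's exact expression.

        # A minimal sketch, not the paper's exact expression: expected number of
        # pages accessed when k of n records, stored n/m per page, are requested
        # in one batched query (Yao-style estimate under uniformity assumptions).
        from math import comb

        def expected_pages_accessed(n: int, m: int, k: int) -> float:
            records_per_page = n // m
            # Probability that a given page holds none of the k requested records.
            p_miss = comb(n - records_per_page, k) / comb(n, k)
            return m * (1.0 - p_miss)

        # Example: 10,000 records on 100 pages, a batch of 50 requests touches
        # roughly 39.5 pages on average, versus up to 50 pages unbatched.
        print(expected_pages_accessed(10_000, 100, 50))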

    Signature Files: An Integrated Access Method for Formatted and Unformatted Databases

    The signature file approach is one of the most powerful information storage and retrieval techniques used for finding the data objects that are relevant to user queries. The main idea of all signature-based schemes is to reflect the essence of the data items into bit patterns (descriptors or signatures) and store them in a separate file which acts as a filter to eliminate the non-qualifying data items for an information request. It provides an integrated access method for both formatted and unformatted databases. A comparative overview and discussion of the proposed signature generation methods and the major signature file organization schemes are presented. Applications of the signature techniques to formatted and unformatted databases, single- and multiterm query cases, serial and parallel architectures, and static and dynamic environments are provided, with special emphasis on multimedia databases, where the pioneering prototype systems using signatures yield highly encouraging results.
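
    To make the filtering idea concrete, the following sketch (Python; all names and parameters such as F and BITS_PER_TERM are illustrative, not taken from the surveyed schemes) builds superimposed-coding signatures by hashing each term of a data item onto a fixed-width bit pattern and answers a query by testing whether the query signature is contained in a stored signature. Items that pass the test are only candidates and must still be verified against the actual data, since false drops are possible.

        # A minimal sketch of superimposed-coding signatures, one common
        # signature generation method; width F and BITS_PER_TERM are assumptions.
        import hashlib

        F = 64              # signature width in bits
        BITS_PER_TERM = 3   # bit positions set per term

        def term_signature(term: str) -> int:
            # Hash one term onto BITS_PER_TERM positions in [0, F).
            sig = 0
            for i in range(BITS_PER_TERM):
                digest = hashlib.sha1(f"{i}:{term}".encode()).digest()
                sig |= 1 << (int.from_bytes(digest[:4], "big") % F)
            return sig

        def item_signature(terms):
            # Superimpose (bitwise OR) the term signatures of a data item.
            sig = 0
            for t in terms:
                sig |= term_signature(t)
            return sig

        def may_match(query_terms, stored_sig):
            # Filter test: all query bits must be set in the stored signature.
            q = item_signature(query_terms)
            return stored_sig & q == q

        # The signature file is kept separately from the data (here, a list).
        items = [["database", "signature", "filter"], ["vascular", "access", "survey"]]
        signature_file = [item_signature(t) for t in items]
        print([i for i, s in enumerate(signature_file) if may_match(["signature"], s)])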

    Identifying critically important vascular access outcomes for trials in haemodialysis: an international survey with patients, caregivers and health professionals

    BACKGROUND: Vascular access outcomes reported across haemodialysis (HD) trials are numerous, heterogeneous and not always relevant to patients and clinicians. This study aimed to identify critically important vascular access outcomes. METHODS: Outcomes derived from a systematic review, a multi-disciplinary expert panel and patient input were included in a multilanguage online survey. Participants rated the absolute importance of outcomes on a 9-point Likert scale (7-9 being critically important); relative importance was determined with a best-worst scale analysed by multinomial logistic regression. Open-text responses were analysed thematically. RESULTS: The survey was completed by 873 participants [224 (26%) patients/caregivers and 649 (74%) health professionals] from 58 countries. Vascular access function was considered the most important outcome (mean score 7.8 for patients and caregivers/8.5 for health professionals, with 85%/95% rating it critically important, and top ranked on the best-worst scale), followed by infection (mean 7.4/8.2, 79%/92% rating it critically important, second rank on the best-worst scale). Health professionals rated all outcomes of equal or higher importance than patients/caregivers, except for aneurysms. We identified six themes: necessity for HD; applicability across vascular access types; frequency and severity of debilitation; minimizing the risk of hospitalization and death; optimizing technical competence and adherence to best practice; and direct impact on appearance and lifestyle. CONCLUSIONS: Vascular access function was the most critically important outcome among patients/caregivers and health professionals. Consistent reporting of this outcome across trials in HD will strengthen their value in supporting vascular access practice and shared decision making in patients requiring HD.

    Sampling Algorithms for Evolving Datasets

    Perhaps the most flexible synopsis of a database is a uniform random sample of the data; such samples are widely used to speed up the processing of analytic queries and data-mining tasks, to enhance query optimization, and to facilitate information integration. Most of the existing work on database sampling focuses on how to create or exploit a random sample of a static database, that is, a database that does not change over time. The assumption of a static database, however, severely limits the applicability of these techniques in practice, where data is often not static but continuously evolving. In order to maintain the statistical validity of the sample, any changes to the database have to be appropriately reflected in the sample. In this thesis, we study efficient methods for incrementally maintaining a uniform random sample of the items in a dataset in the presence of an arbitrary sequence of insertions, updates, and deletions. We consider instances of the maintenance problem that arise when sampling from an evolving set, from an evolving multiset, from the distinct items in an evolving multiset, or from a sliding window over a data stream. Our algorithms completely avoid any accesses to the base data and can be several orders of magnitude faster than algorithms that do rely on such expensive accesses. The improved efficiency of our algorithms comes at virtually no cost: the resulting samples are provably uniform and only a small amount of auxiliary information is associated with the sample. We show that the auxiliary information not only facilitates efficient maintenance, but it can also be exploited to derive unbiased, low-variance estimators for counts, sums, averages, and the number of distinct items in the underlying dataset. In addition to sample maintenance, we discuss methods that greatly improve the flexibility of random sampling from a system's point of view. More specifically, we initiate the study of algorithms that resize a random sample upwards or downwards. Our resizing algorithms can be exploited to dynamically control the size of the sample when the dataset grows or shrinks; they facilitate resource management and help to avoid under- or oversized samples. Furthermore, in large-scale databases with data being distributed across several remote locations, it is usually infeasible to reconstruct the entire dataset for the purpose of sampling. To address this problem, we provide efficient algorithms that directly combine the local samples maintained at each location into a sample of the global dataset. We also consider a more general problem, where the global dataset is defined as an arbitrary set or multiset expression involving the local datasets, and provide efficient solutions based on hashing.
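
    The maintenance algorithms developed in the thesis (which also cover updates, deletions, multisets, distinct-item sampling, sliding windows and resizing) are not spelled out in the abstract; as a baseline illustration of incrementally maintaining a bounded uniform sample without re-reading the base data, the sketch below implements classical reservoir sampling for an insertion-only stream in Python. The class name and sample size are illustrative.

        # A minimal baseline sketch: reservoir sampling keeps a uniform random
        # sample of fixed size k over an insertion-only stream.  Deletions and
        # updates, which the thesis handles as well, are not covered here.
        import random

        class ReservoirSample:
            def __init__(self, k, seed=None):
                self.k = k               # target sample size (illustrative)
                self.n = 0               # items seen so far
                self.sample = []         # current uniform sample
                self.rng = random.Random(seed)

            def insert(self, item):
                # After n insertions, every item is in the sample with probability k/n.
                self.n += 1
                if len(self.sample) < self.k:
                    self.sample.append(item)
                else:
                    j = self.rng.randrange(self.n)   # uniform in [0, n)
                    if j < self.k:
                        self.sample[j] = item

        # Example: maintain a 5-element uniform sample of a growing dataset.
        rs = ReservoirSample(k=5, seed=42)
        for x in range(1_000):
            rs.insert(x)
        print(rs.sample)   # a uniform random subset of range(1_000)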
    • …