16,598 research outputs found

    TPQL7: THE EFFECT OF ORDER OF ADMINISTRATION OF GENERIC AND DISEASE-SPECIFIC QUALITY OF LIFE QUESTIONNAIRES


    Reducing UK-means to k-means

    This paper proposes an optimisation to the UK-means algorithm, which generalises the k-means algorithm to handle objects whose locations are uncertain. The location of each object is described by a probability density function (pdf). The UK-means algorithm needs to compute expected distances (EDs) between each object and the cluster representatives. Evaluating an ED from first principles is a very costly operation, because the pdfs are arbitrary and differ from object to object, and UK-means needs to evaluate a large number of EDs. This is the major performance burden of the algorithm. In this paper, we derive a formula for evaluating EDs efficiently. This tremendously reduces the execution time of UK-means, as demonstrated by our preliminary experiments. We also illustrate that this optimised formula effectively reduces the UK-means problem to the traditional clustering problem addressed by the k-means algorithm. © 2007 IEEE. The 7th IEEE International Conference on Data Mining (ICDM) Workshops 2007, Omaha, NE, 28-31 October 2007. In Proceedings of the 7th ICDM, 2007, p. 483-48
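
    The speed-up rests on a standard identity for squared Euclidean distance: E[||X - c||^2] = ||E[X] - c||^2 + trace(Cov(X)). Because the trace term does not depend on the representative c, each object can be assigned to the representative nearest its mean, which is exactly the k-means step. The following Python sketch (not the paper's code; the sample-based pdf representation and the function names are illustrative assumptions) contrasts the brute-force ED with the closed-form shortcut:

    ```python
    import numpy as np

    def expected_sq_dist(samples, c):
        """Brute-force E[||X - c||^2], averaged over samples of the object's pdf."""
        return np.mean(np.sum((samples - c) ** 2, axis=1))

    def expected_sq_dist_fast(samples, c):
        """Same quantity via E[||X - c||^2] = ||E[X] - c||^2 + trace(Cov(X)).
        The trace term is a per-object constant, so the argmin over cluster
        representatives depends only on the object's mean."""
        mu = samples.mean(axis=0)
        spread = samples.var(axis=0).sum()  # trace of the covariance matrix
        return np.sum((mu - c) ** 2) + spread

    rng = np.random.default_rng(0)
    obj = rng.normal(loc=[1.0, 2.0], scale=0.5, size=(10_000, 2))  # one object's pdf, sampled
    c = np.array([0.0, 0.0])
    print(expected_sq_dist(obj, c), expected_sq_dist_fast(obj, c))  # nearly equal
    ```

    Since the variance term is fixed per object, running k-means on the objects' mean locations yields the same assignments as UK-means under squared ED, which is the reduction the title refers to.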

    Clustering uncertain data using Voronoi diagrams and R-tree index

    We study the problem of clustering uncertain objects whose locations are described by probability density functions (pdfs). We show that the UK-means algorithm, which generalizes the k-means algorithm to handle uncertain objects, is very inefficient. The inefficiency comes from the fact that UK-means computes expected distances (EDs) between objects and cluster representatives. For arbitrary pdfs, expected distances are computed by numerical integration, which is a costly operation. We propose pruning techniques based on Voronoi diagrams to reduce the number of expected distance calculations. These techniques are analytically proven to be more effective than the basic bounding-box-based technique previously known in the literature. We then introduce an R-tree index to organize the uncertain objects so as to reduce pruning overheads. We conduct experiments to evaluate the effectiveness of our novel techniques. We show that our techniques are additive and, when used in combination, significantly outperform previously known methods. © 2006 IEEE.
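
    The bounding-box baseline that the Voronoi techniques improve on can be sketched concretely. Each uncertain object is summarized by its minimum bounding rectangle (MBR); a representative whose minimum possible distance to the box exceeds another representative's maximum possible distance can never be the closest, so its ED need not be computed. The sketch below is an illustration of that min-max test (the function names and MBR representation are assumptions, not the paper's code):

    ```python
    import numpy as np

    def min_dist(lo, hi, c):
        """Smallest distance from point c to the axis-aligned box [lo, hi]."""
        gap = np.maximum(lo - c, 0.0) + np.maximum(c - hi, 0.0)
        return float(np.linalg.norm(gap))

    def max_dist(lo, hi, c):
        """Largest distance from point c to any point of the box [lo, hi]."""
        reach = np.maximum(np.abs(c - lo), np.abs(c - hi))
        return float(np.linalg.norm(reach))

    def candidate_reps(lo, hi, reps):
        """Indices of representatives that might be closest for some location
        in the box; EDs need be computed only for these survivors."""
        mins = np.array([min_dist(lo, hi, c) for c in reps])
        maxs = np.array([max_dist(lo, hi, c) for c in reps])
        return np.nonzero(mins <= maxs.min())[0]

    lo, hi = np.array([0.0, 0.0]), np.array([1.0, 1.0])
    reps = np.array([[0.5, 0.5], [2.0, 2.0], [9.0, 9.0]])
    print(candidate_reps(lo, hi, reps))  # [0]: only the representative inside the box survives
    ```

    The Voronoi-based rules prune strictly more: for instance, if the MBR lies entirely inside one representative's Voronoi cell, that representative is closest for every possible location and no ED at all is needed.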

    Distributed and scalable XML document processing architecture for E-commerce systems

    XML has become a very important emerging standard for E-commerce because of its flexibility and universality. Many software designers are actively developing new systems to handle information in XML formats. We propose a generic architecture for processing XML. We have designed an XML processing system using the latest technologies, such as XML, XSLT (Extensible Stylesheet Language Transformations), HTTP and Java servlets. Our design is generic, flexible, scalable, and extensible, and it is also suitable for distributed network environments. A main application of the architecture and the system is to support data exchange in E-commerce systems.
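
    The heart of such a pipeline is applying a stylesheet to an incoming document. The original system is built with Java servlets; purely as an illustration (and assuming the lxml library), the sketch below shows the equivalent XSLT step in Python:

    ```python
    from lxml import etree

    # A toy order document and a stylesheet that renders it as an HTML list.
    # Both are inlined here; a real system would receive the XML over HTTP
    # and load stylesheets from disk or a cache.
    xml_doc = etree.XML(
        "<order><item sku='A1' qty='2'/><item sku='B7' qty='1'/></order>")
    xslt_doc = etree.XML("""\
    <xsl:stylesheet version="1.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
      <xsl:template match="/order">
        <ul>
          <xsl:for-each select="item">
            <li><xsl:value-of select="@qty"/> x <xsl:value-of select="@sku"/></li>
          </xsl:for-each>
        </ul>
      </xsl:template>
    </xsl:stylesheet>""")

    transform = etree.XSLT(xslt_doc)  # compile the stylesheet once
    result = transform(xml_doc)       # apply it per request
    print(str(result))
    ```

    Compiling the stylesheet once and reusing it per request mirrors the servlet design: presentation logic lives in stylesheets that can be swapped without redeploying code.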

    Efficient mining of frequent item sets on large uncertain databases

    The data handled in emerging applications such as location-based services, sensor monitoring systems, and data integration are often inexact in nature. In this paper, we study the important problem of extracting frequent item sets from a large uncertain database, interpreted under the Possible World Semantics (PWS). This issue is technically challenging, since an uncertain database contains an exponential number of possible worlds. By observing that the mining process can be modeled as a Poisson binomial distribution, we develop an approximate algorithm which can efficiently and accurately discover frequent item sets in a large uncertain database. We also study the important issue of maintaining the mining result for a database that is evolving (e.g., by inserting a tuple). Specifically, we propose incremental mining algorithms that enable Probabilistic Frequent Item Set (PFI) results to be refreshed. This reduces the need to re-execute the whole mining algorithm on the new database, which is often expensive and unnecessary. We examine how an existing algorithm that extracts exact item sets, as well as our approximate algorithm, can support incremental mining. All our approaches support both tuple and attribute uncertainty, which are the two common uncertain database models. We also perform extensive evaluation on real and synthetic data sets to validate our approaches. © 1989-2012 IEEE.
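
    To make the Poisson binomial observation concrete: under tuple uncertainty, an item set's support count is a sum of independent Bernoulli indicators, one per tuple, i.e. a Poisson binomial variable with mean sum(p_i) and variance sum(p_i(1 - p_i)). A PFI test then asks whether P(support >= minsup) meets a probability threshold. The sketch below approximates that tail with a Normal distribution; this is an illustrative stand-in, not the paper's specific approximation, and the names are assumptions:

    ```python
    import math

    def is_probabilistic_frequent(probs, minsup, tau):
        """Approximate PFI test.  'probs' holds, per tuple, the probability
        that the item set occurs in it, so the support count is Poisson
        binomial with mean sum(p) and variance sum(p * (1 - p))."""
        mu = sum(probs)
        var = sum(p * (1 - p) for p in probs)
        if var == 0:               # all probabilities 0 or 1: support is exactly mu
            return mu >= minsup
        z = (minsup - 0.5 - mu) / math.sqrt(var)    # continuity correction
        p_freq = 0.5 * math.erfc(z / math.sqrt(2))  # Normal tail P(support >= minsup)
        return p_freq >= tau

    # An item set appearing with probability 0.9 in 40 tuples and 0.2 in 60 others:
    probs = [0.9] * 40 + [0.2] * 60
    print(is_probabilistic_frequent(probs, minsup=45, tau=0.9))
    ```

    In this sketch, refreshing after a tuple insertion amounts to adding one term to each running sum, which hints at why incremental maintenance can avoid re-executing the whole mining algorithm.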
    • …