17,187 research outputs found

    Using Fuzzy Linguistic Representations to Provide Explanatory Semantics for Data Warehouses

    Get PDF
    A data warehouse integrates large amounts of extracted and summarized data from multiple sources for direct querying and analysis. While it provides decision makers with easy access to such historical and aggregate data, the real meaning of the data has been ignored. For example, "whether a total sales amount 1,000 items indicates a good or bad sales performance" is still unclear. From the decision makers' point of view, the semantics rather than raw numbers which convey the meaning of the data is very important. In this paper, we explore the use of fuzzy technology to provide this semantics for the summarizations and aggregates developed in data warehousing systems. A three layered data warehouse semantic model, consisting of quantitative (numerical) summarization, qualitative (categorical) summarization, and quantifier summarization, is proposed for capturing and explicating the semantics of warehoused data. Based on the model, several algebraic operators are defined. We also extend the SQL language to allow for flexible queries against such enhanced data warehouses

    Approximate Minimum Diameter

    Full text link
    We study the minimum diameter problem for a set of inexact points. By inexact, we mean that the precise location of the points is not known. Instead, the location of each point is restricted to a contineus region (\impre model) or a finite set of points (\indec model). Given a set of inexact points in one of \impre or \indec models, we wish to provide a lower-bound on the diameter of the real points. In the first part of the paper, we focus on \indec model. We present an O(21ϵdϵ2dn3)O(2^{\frac{1}{\epsilon^d}} \cdot \epsilon^{-2d} \cdot n^3 ) time approximation algorithm of factor (1+ϵ)(1+\epsilon) for finding minimum diameter of a set of points in dd dimensions. This improves the previously proposed algorithms for this problem substantially. Next, we consider the problem in \impre model. In dd-dimensional space, we propose a polynomial time d\sqrt{d}-approximation algorithm. In addition, for d=2d=2, we define the notion of α\alpha-separability and use our algorithm for \indec model to obtain (1+ϵ)(1+\epsilon)-approximation algorithm for a set of α\alpha-separable regions in time O(21ϵ2.n3ϵ10.sin(α/2)3)O(2^{\frac{1}{\epsilon^2}}\allowbreak . \frac{n^3}{\epsilon^{10} .\sin(\alpha/2)^3} )

    Privacy through uncertainty in location-based services

    Get PDF
    Location-Based Services (LBS) are becoming more prevalent. While there are many benefits, there are also real privacy risks. People are unwilling to give up the benefits - but can we reduce privacy risks without giving up on LBS entirely? This paper explores the possibility of introducing uncertainty into location information when using an LBS, so as to reduce privacy risk while maintaining good quality of service. This paper also explores the current uses of uncertainty information in a selection of mobile applications

    Structural learning for large scale image classification

    Get PDF
    To leverage large-scale collaboratively-tagged (loosely-tagged) images for training a large number of classifiers to support large-scale image classification, we need to develop new frameworks to deal with the following issues: (1) spam tags, i.e., tags are not relevant to the semantic of the images; (2) loose object tags, i.e., multiple object tags are loosely given at the image level without their locations in the images; (3) missing object tags, i.e. some object tags are missed due to incomplete tagging; (4) inter-related object classes, i.e., some object classes are visually correlated and their classifiers need to be trained jointly instead of independently; (5) large scale object classes, which requires to limit the computational time complexity for classifier training algorithms as well as the storage spaces for intermediate results. To deal with these issues, we propose a structural learning framework which consists of the following key components: (1) cluster-based junk image filtering to address the issue of spam tags; (2) automatic tag-instance alignment to address the issue of loose object tags; (3) automatic missing object tag prediction; (4) object correlation network for inter-class visual correlation characterization to address the issue of missing tags; (5) large-scale structural learning with object correlation network for enhancing the discrimination power of object classifiers. To obtain enough numbers of labeled training images, our proposed framework leverages the abundant web images and their social tags. To make those web images usable, tag cleansing has to be done to neutralize the noise from user tagging preferences, in particularly junk tags, loose tags and missing tags. Then a discriminative learning algorithm is developed to train a large number of inter-related classifiers for achieving large-scale image classification, e.g., learning a large number of classifiers for categorizing large-scale images into a large number of inter-related object classes and image concepts. A visual concept network is first constructed for organizing enumorus object classes and image concepts according to their inter-concept visual correlations. The visual concept network is further used to: (a) identify inter-related learning tasks for classifier training; (b) determine groups of visually-similar object classes and image concepts; and (c) estimate the learning complexity for classifier training. A large-scale discriminative learning algorithm is developed for supporting multi-class classifier training and achieving accurate inter-group discrimination and effective intra-group separation. Our discriminative learning algorithm can significantly enhance the discrimination power of the classifiers and dramatically reduce the computational cost for large-scale classifier training
    corecore