
    Privacy and Confidentiality in an e-Commerce World: Data Mining, Data Warehousing, Matching and Disclosure Limitation

    The growing expanse of e-commerce and the widespread availability of online databases raise many fears regarding loss of privacy and many statistical challenges. Even with encryption and other nominal forms of protection for individual databases, we still need to protect against the violation of privacy through linkages across multiple databases. These issues parallel those that have arisen and received some attention in the context of homeland security. Following the events of September 11, 2001, there has been heightened attention in the United States and elsewhere to the use of multiple government and private databases for the identification of possible perpetrators of future attacks, as well as an unprecedented expansion of federal government data mining activities, many involving databases containing personal information. We present an overview of some proposals that have surfaced for the search of multiple databases which supposedly do not compromise possible pledges of confidentiality to the individuals whose data are included. We also explore their link to the related literature on privacy-preserving data mining. In particular, we focus on the matching problem across databases and the concept of "selective revelation" and their confidentiality implications. Comment: Published at http://dx.doi.org/10.1214/088342306000000240 in the Statistical Science (http://www.imstat.org/sts/) by the Institute of Mathematical Statistics (http://www.imstat.org

    Subspace clustering of dimensionality-reduced data

    Subspace clustering refers to the problem of clustering unlabeled high-dimensional data points into a union of low-dimensional linear subspaces, assumed unknown. In practice one may have access to dimensionality-reduced observations of the data only, resulting, e.g., from "undersampling" due to complexity and speed constraints on the acquisition device. More pertinently, even if one has access to the high-dimensional data set, it is often desirable to first project the data points into a lower-dimensional space and to perform the clustering task there; this reduces storage requirements and computational cost. The purpose of this paper is to quantify the impact of dimensionality reduction through random projection on the performance of the sparse subspace clustering (SSC) and the thresholding-based subspace clustering (TSC) algorithms. We find that for both algorithms dimensionality reduction down to the order of the subspace dimensions is possible without incurring significant performance degradation. The mathematical engine behind our theorems is a result quantifying how the affinities between subspaces change under random dimensionality-reducing projections. Comment: ISIT 201
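
The pipeline described in this abstract (project the data at random, then cluster in the reduced space) can be illustrated with a minimal sketch. It is not the paper's SSC or TSC implementation: it only builds a simplified thresholding-style affinity (each point keeps its q largest absolute correlations) and measures how often those neighbours come from the same subspace. All function names, dimensions, and the Gaussian projection are assumptions chosen for illustration.

```python
import numpy as np


def sample_union_of_subspaces(rng, ambient_dim, subspace_dim,
                              points_per_subspace, num_subspaces):
    """Draw points from random linear subspaces (hypothetical test data)."""
    points, labels = [], []
    for k in range(num_subspaces):
        # Orthonormal basis of a random subspace via reduced QR.
        basis, _ = np.linalg.qr(
            rng.standard_normal((ambient_dim, subspace_dim)))
        coeffs = rng.standard_normal((subspace_dim, points_per_subspace))
        points.append(basis @ coeffs)
        labels += [k] * points_per_subspace
    return np.hstack(points), np.array(labels)


def random_projection(rng, data, reduced_dim):
    """Project the columns of `data` with an i.i.d. Gaussian matrix (assumed)."""
    ambient_dim = data.shape[0]
    proj = rng.standard_normal((reduced_dim, ambient_dim)) / np.sqrt(reduced_dim)
    return proj @ data


def thresholded_affinity(data, q):
    """Keep, for each point, its q largest absolute correlations."""
    unit = data / np.linalg.norm(data, axis=0, keepdims=True)
    sim = np.abs(unit.T @ unit)
    np.fill_diagonal(sim, 0.0)          # a point is not its own neighbour
    n = sim.shape[0]
    top = np.argsort(-sim, axis=1)[:, :q]
    rows = np.repeat(np.arange(n), q)
    affinity = np.zeros_like(sim)
    affinity[rows, top.ravel()] = sim[rows, top.ravel()]
    return np.maximum(affinity, affinity.T)  # symmetrise


def same_subspace_fraction(affinity, labels):
    """Fraction of nonzero affinity entries linking points of one subspace."""
    i, j = np.nonzero(affinity)
    return float(np.mean(labels[i] == labels[j]))


rng = np.random.default_rng(0)
X, labels = sample_union_of_subspaces(rng, ambient_dim=200, subspace_dim=3,
                                      points_per_subspace=40, num_subspaces=2)
Y = random_projection(rng, X, reduced_dim=20)   # 200 -> 20 dimensions
A = thresholded_affinity(Y, q=3)
print(same_subspace_fraction(A, labels))
```

A full clustering step would pass the affinity matrix to a spectral clustering routine; that step is omitted here to keep the sketch dependent on NumPy only.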

    Specific resources as bases for the differentiation and innovation of tourist destinations

    Because there is no single type of tourist, and tourists pursue different strategies to reach the "extraordinary" they seek on holiday, tourist destinations have windows of opportunity: they can make differentiated offers and adopt a flexibility that opposes uniformity and makes room for variety and difference. Assuming that destinations do not develop along a single standard path, but that their development is instead embedded in the historical, cultural, institutional and natural matrices of the regions in which they are anchored, the specific resources of a place can serve as the basic inputs for differentiating the tourist destination and diversifying its tourist offers. Given the exceptional nature of the tourist product as an experience, associated with an integrated experience offer, an idiographic perspective on a destination requires, for its specific resources to be valued, that tourist service providers not only act as agents who facilitate the stay and mobility of tourists, but also become ambassadors for all the services of the destination and for the region itself. Such a tourist destination generates change. Because it generates differentiated strategies at the regional level and is based on co-operation and networks, these strategies make the environment favourable to the dissemination of knowledge and innovation. Innovation, in turn, generates difference, which strengthens the identity of the region and, potentially, of the tourist destination.
Such differentiation strategies, within a sustainable development framework, can be the turning point towards a more selective tourist industry in which all can win: the local communities, the tourists, the tourist agents, and the environment. Keywords: specific resources; idiographic approach; innovation; tourist destination; sustainability; regional development