
    New Fundamental Technologies in Data Mining

    The progress of data mining technology and its broad public popularity establish a need for a comprehensive text on the subject. The series of books entitled "Data Mining" addresses this need by presenting in-depth descriptions of novel mining algorithms and many useful applications. In addition to helping readers understand each section deeply, the two books present useful hints and strategies for solving problems in the following chapters. The contributing authors have highlighted many future research directions that will foster multi-disciplinary collaborations and hence lead to significant development in the field of data mining

    Security and privacy for data mining of RFID-enabled product supply chains

    The e-Pedigree used for verifying the authenticity of products in RFID-enabled supply chains plays a very important role in product anti-counterfeiting and risk management, but it is also vulnerable to malicious attacks and privacy leakage. While radio frequency identification (RFID) technology has merits such as automatic wireless identification without line-of-sight contact, its security, for example against tag data tampering and cloning, has been one of the main concerns in recent research. Moreover, privacy leakage of the partners along the supply chain may lead to complete compromise of the whole system, and in consequence all authenticated products may be replaced by counterfeits. Unlike conventional databases, datasets in supply chain scenarios are temporally correlated, and every party of the system can only be semi-trusted. In this paper, a system that combines the merits of secure multi-party computing and differential privacy is proposed to address the security and privacy issues. It focuses on the vulnerability analysis of data mining over distributed, temporally related EPCIS datasets of e-pedigrees under multiple range and aggregate queries in typical supply chain scenarios, and on the related algorithms. Theoretical analysis shows that the proposed system meets the preset design goals, while some other problems are left for future research

    Differential Privacy for Industrial Internet of Things: Opportunities, Applications and Challenges

    The development of the Internet of Things (IoT) brings new changes to various fields. In particular, the industrial Internet of Things (IIoT) is promoting a new round of industrial revolution. With more applications of IIoT, privacy protection issues are emerging. Specifically, some common algorithms in IIoT technology, such as deep models, rely strongly on data collection, which leads to the risk of privacy disclosure. Recently, differential privacy has been used to protect user-terminal privacy in IIoT, so in-depth research on this topic is necessary. In this paper, we conduct a comprehensive survey on the opportunities, applications and challenges of differential privacy in IIoT. We first review related papers on IIoT and on privacy protection. Then we focus on the metrics of industrial data privacy, and analyze the tension between data utilization for deep models and individual privacy protection. Several valuable problems are summarized and new research ideas are put forward. In conclusion, this survey aims to provide a comprehensive summary and lay a foundation for follow-up research on industrial differential privacy

    From Social Data Mining to Forecasting Socio-Economic Crisis

    Socio-economic data mining has a great potential in terms of gaining a better understanding of problems that our economy and society are facing, such as financial instability, shortages of resources, or conflicts. Without large-scale data mining, progress in these areas seems hard or impossible. Therefore, a suitable, distributed data mining infrastructure and research centers should be built in Europe. It also appears appropriate to build a network of Crisis Observatories. They can be imagined as laboratories devoted to the gathering and processing of enormous volumes of data on both natural systems such as the Earth and its ecosystem, as well as on human techno-socio-economic systems, so as to gain early warnings of impending events. Reality mining provides the chance to adapt more quickly and more accurately to changing situations. Further opportunities arise by individually customized services, which however should be provided in a privacy-respecting way. This requires the development of novel ICT (such as a self-organizing Web), but most likely new legal regulations and suitable institutions as well. As long as such regulations are lacking on a world-wide scale, it is in the public interest that scientists explore what can be done with the huge data available. Big data do have the potential to change or even threaten democratic societies. The same applies to sudden and large-scale failures of ICT systems. Therefore, dealing with data must be done with a large degree of responsibility and care. Self-interests of individuals, companies or institutions have limits, where the public interest is affected, and public interest is not a sufficient justification to violate human rights of individuals. Privacy is a high good, as confidentiality is, and damaging it would have serious side effects for society. Comment: 65 pages, 1 figure, Visioneer White Paper, see http://www.visioneer.ethz.c

    DataSHIELD – new directions and dimensions

    In disciplines such as biomedicine and social sciences, sharing and combining sensitive individual-level data is often prohibited by ethical-legal or governance constraints and other barriers such as the control of intellectual property or the huge sample sizes. DataSHIELD (Data Aggregation Through Anonymous Summary-statistics from Harmonised Individual-levEL Databases) is a distributed approach that allows the analysis of sensitive individual-level data from one study, and the co-analysis of such data from several studies simultaneously without physically pooling them or disclosing any data. Following initial proof of principle, a stable DataSHIELD platform has now been implemented in a number of epidemiological consortia. This paper reports three new applications of DataSHIELD including application to post-publication sensitive data analysis, text data analysis and privacy protected data visualisation. Expansion of DataSHIELD analytic functionality and application to additional data types demonstrate the broad applications of the software beyond biomedical sciences

    Geospatial Data Preservation Primer

    This primer is one in a series of Operational Policy documents being developed by GeoConnections. It is intended to inform Canadian Geospatial Data Infrastructure (CGDI) stakeholders about the nature and scope of digital geospatial data archiving and preservation and the realities, challenges and good practices of related operational policies. Burgeoning growth of online geospatial applications and the deluge of data, combined with the growing complexity of archiving and preserving digital data, have revealed a significant gap in the operational policy coverage for the CGDI. Currently there is no commonly accepted guidance for CGDI stakeholders wishing or mandated to preserve their geospatial data assets for long-term access and use. More specifically, there is little or no guidance available to inform operational policy decisions on how to manage, preserve and provide access to a digital geospatial data collection. The preservation of geospatial data over a period of time is especially important when datasets are required to inform modeling applications such as climate change impact predictions, flood forecasts and land use management. Furthermore, data custodians may have both a legal and moral responsibility to implement effective archiving and preservation programs. Based on research and analysis of the Canadian legislative framework and current international practices in digital data archiving and preservation, this primer provides guidance on the factors to be considered and the steps to be taken in planning and implementing a data archiving and preservation program. It describes an approach to establishing a geospatial data archives based on good practices from the literature and Canadian case studies. This primer will provide CGDI stakeholders with information on how to incorporate archiving and preservation considerations into an effective data management process that covers the entire life cycle (DCC, 2013) (LAC, 2006) of their geospatial data assets (i.e., creation and receipt, distribution, use, maintenance, and disposition). It is intended to inform CGDI stakeholders on the importance of long-term data preservation, and provide them with the information and tools required to make policy decisions for creating an archives and preserving digital geospatial data

    State of the Art in Privacy Preserving Data Mining

    Privacy is one of the most important properties an information system must satisfy. A relatively new trend shows that classical access control techniques are not sufficient to guarantee privacy when data mining techniques are used. Such a trend, especially in the context of public databases or of sensitive information related to critical infrastructures, nowadays represents a non-negligible threat. Privacy Preserving Data Mining (PPDM) algorithms have recently been introduced with the aim of modifying the database in such a way as to prevent the discovery of sensitive information. This is a very complex task, and the scientific literature contains several different approaches to the problem. In this work we present a survey of the current PPDM methodologies that seem promising for the future. JRC.G.6-Sensors, radar technologies and cybersecurit

    DP-starJ: A Differential Private Scheme towards Analytical Star-Join Queries

    The star-join query is a fundamental task in data warehousing and has wide applications in On-line Analytical Processing (OLAP) scenarios. Because of the large number of foreign key constraints and the asymmetric effect of fact and dimension tables on neighboring instances, even the latest differential privacy (DP) efforts specifically designed for joins suffer from extremely large estimation errors and expensive computational cost when applied directly to star-join queries. In this paper, we are thus motivated to propose DP-starJ, a novel Differentially Private framework for star-Join queries. DP-starJ consists of a series of strategies tailored to specific features of star-join: 1) we unveil the different effects of fact and dimension tables on neighboring database instances, and accordingly revisit the definitions for the different cases of star-join; 2) we propose the Predicate Mechanism (PM), which uses predicate perturbation to inject noise into the join procedure instead of the results; 3) to further improve robustness, we propose a DP-compliant star-join algorithm, based on PM, for various types of star-join tasks. We provide both theoretical analysis and an empirical study, which demonstrate the superiority of the proposed methods over state-of-the-art solutions in terms of accuracy, efficiency, and scalability
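
The asymmetry between fact and dimension tables mentioned in this abstract can be illustrated with a small sketch. It is not the DP-starJ algorithm or its Predicate Mechanism: it only shows, on a hypothetical toy schema, how much a star-join COUNT can change when one fact row is removed versus one dimension row, and how conventional output perturbation (Laplace noise on the result) scales with an assumed sensitivity bound. All names and data below are illustrative assumptions.

```python
import random

# Hypothetical toy star schema: each fact row references one dimension key.
FACTS = [("sale1", "storeA"), ("sale2", "storeA"),
         ("sale3", "storeA"), ("sale4", "storeB")]
DIM_KEYS = {"storeA", "storeB"}


def join_count(facts, dim_keys):
    """COUNT(*) of the fact table joined to the dimension table on the key."""
    return sum(1 for _, key in facts if key in dim_keys)


def max_change_fact_neighbor(facts, dim_keys):
    """Largest drop in the count when a single fact row is removed."""
    base = join_count(facts, dim_keys)
    return max(base - join_count(facts[:i] + facts[i + 1:], dim_keys)
               for i in range(len(facts)))


def max_change_dim_neighbor(facts, dim_keys):
    """Largest drop in the count when a single dimension row is removed."""
    base = join_count(facts, dim_keys)
    return max(base - join_count(facts, dim_keys - {k}) for k in dim_keys)


def laplace_noise(scale, rng):
    """Laplace(0, scale) sample: difference of two i.i.d. exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_count(facts, dim_keys, epsilon, sensitivity_bound, rng):
    """Output perturbation: true count plus Laplace noise.

    sensitivity_bound is an assumption here: it must be a data-independent
    upper bound. The per-instance changes computed above only illustrate
    the asymmetry and are not a valid bound by themselves.
    """
    scale = sensitivity_bound / epsilon
    return join_count(facts, dim_keys) + laplace_noise(scale, rng)


print(max_change_fact_neighbor(FACTS, DIM_KEYS))  # 1
print(max_change_dim_neighbor(FACTS, DIM_KEYS))   # 3
print(noisy_count(FACTS, DIM_KEYS, epsilon=1.0, sensitivity_bound=3,
                  rng=random.Random(0)))
```

Removing one dimension row affects every fact row that references it, so the bound needed for dimension-level neighbors grows with the foreign-key fan-out, and the noise added to the result grows with it.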