A Review on: Association Rule Mining Using Privacy for Partitioned Database
Association rule mining and frequent itemset mining are two prominent data analysis techniques widely used in a variety of applications. Conventional systems focus separately on vertically partitioned databases or on horizontally partitioned databases; on this basis, we present a framework that handles both horizontally and vertically partitioned databases cooperatively, with a privacy-preserving mechanism. Data owners want to learn the frequent itemsets or association rules from a collective data set while disclosing as little as possible about their raw data to other data owners and third parties. To guarantee data privacy, a symmetric encryption technique is used to obtain better results. A cloud-aided frequent itemset mining solution is used to build an association rule mining solution. The resulting solutions are designed for outsourced databases that allow multiple data owners to efficiently and securely share their data without compromising data privacy. Data privacy is one of the key concerns when outsourcing data to external users. Traditionally, the Fast Distributed Mining (FDM) algorithm was proposed for secure mining over distributed data. This work addresses the problem of securely mining association rules over data partitioned both horizontally and vertically. A frequent itemset algorithm and a distributed association rule mining algorithm are used to carry out this method effectively on partitioned data, including management of the data in the outsourcing process for distributed databases. The work maintains efficient privacy over both the vertical and horizontal views in secure mining applications.
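The abstract does not spell out an algorithm, but the core idea — multiple owners disguising their transactions under a shared symmetric key so that an untrusted miner can count itemset supports without seeing raw items — can be sketched roughly as follows. This is a minimal illustration, not the paper's scheme: the HMAC-based deterministic pseudonymization (a stand-in for the symmetric encryption technique), the `owners` data, and the support threshold are all assumptions for the example.

```python
import hmac, hashlib
from itertools import combinations
from collections import Counter

SHARED_KEY = b"demo-key-agreed-by-data-owners"  # assumption: owners share a key

def pseudonymize(item: str) -> str:
    # Deterministic keyed pseudonym (toy stand-in for the paper's
    # symmetric scheme; real deterministic schemes leak frequencies).
    return hmac.new(SHARED_KEY, item.encode(), hashlib.sha256).hexdigest()[:12]

# Horizontally partitioned data: each owner holds complete transactions.
owners = {
    "owner_a": [{"bread", "milk"}, {"bread", "butter"}],
    "owner_b": [{"bread", "milk", "butter"}, {"milk"}],
}

# Each owner uploads only pseudonymized transactions to the miner.
uploaded = [frozenset(pseudonymize(i) for i in t)
            for txns in owners.values() for t in txns]

# The miner counts itemset supports over pseudonyms, never item names.
support = Counter()
for t in uploaded:
    for k in (1, 2):
        for subset in combinations(sorted(t), k):
            support[subset] += 1

min_support = 2  # assumption for the toy dataset
frequent = {s: c for s, c in support.items() if c >= min_support}
print(frequent)  # owners map pseudonyms back to items locally
```

A deterministic mapping like this still reveals co-occurrence patterns to the miner, which is presumably part of what the paper's privacy-preserving mechanism is designed to limit.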
A Secure and Verifiable Computation for k-Nearest Neighbor Queries in Cloud
The popularity of cloud computing has increased significantly in the last few years due to scalability, cost efficiency, resiliency, and quality of service. Organizations are more interested in outsourcing the database and DBMS functionalities to the cloud owing to the tremendous growth of big data and on-demand access requirements. As the data is outsourced to untrusted parties, security has become a key consideration to achieve the confidentiality and integrity of data. Therefore, data owners must transform and encrypt the data before outsourcing. In this paper, we focus on the Secure and Verifiable Computation for k-Nearest Neighbor (SVC-kNN) problem. The existing verifiable computation approaches for the kNN problem delegate the verification task solely to a single semi-trusted party. We show that these approaches are unreliable in terms of security, as the verification server could be either dishonest or compromised. To address these issues, we propose a novel solution to the SVC-kNN problem that utilizes the random-splitting approach in conjunction with homomorphic properties under a two-cloud model. Specifically, the clouds generate and send verification proofs to end-users, allowing them to verify the computation results efficiently. Our solution is highly efficient from the data owners' and query issuers' perspective, as it significantly reduces the encryption cost and pre-processing time. Furthermore, we demonstrate the correctness of our solution using a proof-by-induction methodology for the Euclidean distance verification.
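As a rough illustration of the random-splitting idea under a two-cloud model (a sketch under assumptions, not the SVC-kNN protocol itself, which works over encrypted data with homomorphic properties): a value can be split into two random shares held by different clouds, so neither cloud alone learns it, and each cloud can commit to its share so the end-user can check the recombined result. The modulus, nonces, and the pre-computed distance below are invented for the example.

```python
import hashlib
import secrets

MOD = 2**61 - 1  # assumption: modulus for additive secret sharing

def split(value: int) -> tuple[int, int]:
    # Random splitting: each cloud holds one share; neither share
    # alone reveals the value.
    r = secrets.randbelow(MOD)
    return r, (value - r) % MOD

def commit(share: int, nonce: bytes) -> str:
    # Toy hash commitment standing in for the paper's homomorphic
    # verification proofs.
    return hashlib.sha256(nonce + share.to_bytes(8, "big")).hexdigest()

# Suppose the protocol has produced a squared Euclidean distance to a
# candidate neighbor, held as shares by cloud 1 and cloud 2.
dist_sq = 1769
s1, s2 = split(dist_sq)

# Phase 1: each cloud publishes a commitment to its share.
n1, n2 = secrets.token_bytes(16), secrets.token_bytes(16)
proof1, proof2 = commit(s1, n1), commit(s2, n2)

# Phase 2: shares are revealed; the end-user checks them against the
# earlier commitments, then recombines the result.
assert commit(s1, n1) == proof1 and commit(s2, n2) == proof2
print("verified squared distance:", (s1 + s2) % MOD)
```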
Authenticated Outlier Mining for Outsourced Databases
The Data-Mining-as-a-Service (DMaS) paradigm is becoming a focus of research, as it allows a data owner (client) who lacks expertise and/or computational resources to outsource their data and mining needs to a third-party service provider (server). Outsourcing, however, raises issues about result integrity: how can the client verify that the mining results returned by the server are both sound and complete? In this paper, we focus on outlier mining, an important mining task. Previous verification techniques use an authenticated data structure (ADS) for correctness authentication, which may incur considerable space and communication cost. In this paper, we propose a novel solution that returns a probabilistic result integrity guarantee with much cheaper verification cost. The key idea is to insert a set of artificial records (ARs) into the dataset, from which the client constructs a set of artificial outliers (AOs) and artificial non-outliers (ANOs). The AOs and ANOs are used by the client to detect any incomplete and/or incorrect mining results with a probabilistic guarantee. The main challenge that we address is how to construct ARs so that they do not change the (non-)outlierness of the original records, while guaranteeing that the client can identify ANOs and AOs without executing the mining itself. Furthermore, we build a strategic game and show that a Nash equilibrium exists only when the server returns correct outliers. Our implementation and experiments demonstrate that our verification solution is efficient and lightweight.
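A stripped-down version of the artificial-record idea can be sketched as follows (an illustration under assumptions; the paper's actual construction of ARs that provably preserve the (non-)outlierness of original records is more involved). The client plants known artificial outliers and non-outliers, then checks whether the server's returned outlier set classifies them correctly; the 1-D data and the server's naive detection rule are invented for the example.

```python
import random

random.seed(0)

# Original 1-D dataset (stand-in for the client's real records).
data = [random.gauss(50, 5) for _ in range(200)]

# Artificial non-outliers (ANOs) sit inside the dense region;
# artificial outliers (AOs) sit far outside it.
anos = [random.gauss(50, 1) for _ in range(5)]
aos = [500 + random.random() for _ in range(5)]
outsourced = data + anos + aos  # the server sees no labels

def server_mine(points):
    # Untrusted server's job: return the outliers. A lazy or dishonest
    # server might instead return a truncated or wrong answer.
    mean = sum(points) / len(points)
    return {p for p in points if abs(p - mean) > 100}

result = server_mine(outsourced)

# Client-side probabilistic verification: every AO must be reported
# (completeness) and no ANO may be reported (soundness).
complete = all(a in result for a in aos)
sound = not any(a in result for a in anos)
print("verification passed:", complete and sound)
```

The guarantee is probabilistic because a cheating server can only pass by correctly classifying every planted record, which it cannot distinguish from genuine ones.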
Secure Protocols for Privacy-preserving Data Outsourcing, Integration, and Auditing
As the amount of data available from a wide range of domains has increased tremendously in recent years, the demand for data sharing and integration has also risen. The cloud computing paradigm provides great flexibility to data owners with respect to computation and storage capabilities, which makes it a suitable platform for them to share their data. Outsourcing person-specific data to the cloud, however, raises serious concerns about the confidentiality of the outsourced data, the privacy of the individuals referenced in the data, as well as the confidentiality of the queries processed over the data. Data integration is another form of data sharing, where data owners jointly perform the integration process, and the resulting dataset is shared between them. Integrating related data from different sources enables individuals, businesses, organizations and government agencies to perform better data analysis, make better-informed decisions, and provide better services. Designing distributed, secure, and privacy-preserving protocols for integrating person-specific data, however, poses several challenges, including how to prevent each party from inferring sensitive information about individuals during the execution of the protocol, how to guarantee an effective level of privacy on the released data while maintaining utility for data mining, and how to support public auditing such that anyone at any time can verify that the integration was executed correctly and no participants deviated from the protocol.
In this thesis, we address the aforementioned concerns by presenting secure protocols for privacy-preserving data outsourcing, integration and auditing. First, we propose a secure cloud-based data outsourcing and query processing framework that simultaneously preserves the confidentiality of the data and the query requests, while providing differential privacy guarantees on the query results. Second, we propose a publicly verifiable protocol for integrating person-specific data from multiple data owners, while providing differential privacy guarantees and maintaining an effective level of utility on the released data for the purpose of data mining. Next, we propose a privacy-preserving multi-party protocol for high-dimensional data mashup with guaranteed LKC-privacy on the output data.
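The first contribution mentions differential privacy guarantees on query results; the standard building block for that is the Laplace mechanism, sketched below. This is a generic illustration, not the thesis's specific framework; the count query, the toy records, and the epsilon value are assumptions.

```python
import random

def laplace_noise(scale: float) -> float:
    # The difference of two i.i.d. exponentials with mean `scale`
    # is distributed as Laplace(0, scale).
    return random.expovariate(1 / scale) - random.expovariate(1 / scale)

def private_count(records, predicate, epsilon=0.5):
    # A counting query has sensitivity 1: adding or removing one person
    # changes the count by at most 1, so adding Laplace(1/epsilon) noise
    # satisfies epsilon-differential privacy.
    true_count = sum(1 for r in records if predicate(r))
    return true_count + laplace_noise(1.0 / epsilon)

ages = [23, 35, 41, 29, 62, 55, 34, 47]  # toy person-specific data
print(private_count(ages, lambda a: a >= 40))  # noisy answer, epsilon = 0.5
```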
Finally, we apply the theory to the real-world problem of solvency in Bitcoin. More specifically, we propose a privacy-preserving and publicly verifiable cryptographic proof-of-solvency scheme for Bitcoin exchanges, such that no information is revealed about an exchange's customer holdings, the value of the exchange's total holdings is kept secret, and multiple exchanges performing the same proof of solvency can contemporaneously prove that they are not colluding.
Strategy and methodology for enterprise data warehouse development. Integrating data mining and social networking techniques for identifying different communities within the data warehouse.
Data warehouse technology has been successfully integrated into the information infrastructure of major organizations as a potential solution for eliminating redundancy and providing comprehensive data integration. Recognizing the importance of a data warehouse as the main data repository within an organization, this dissertation addresses different aspects of data warehouse architecture and performance.
Many data warehouse architectures have been presented by industry analysts and research organizations. These architectures vary from independent, physical, business-unit-centric data marts to the centralised two-tier hub-and-spoke data warehouse. The operational data store is a third tier, offered later to address business requirements for intra-day data loading. While the industry-available architectures are all valid, I found them to be suboptimal in efficiency (cost) and effectiveness (productivity).
In this dissertation, I advocate a new architecture (the Hybrid Architecture) which encompasses the industry-advocated architectures. The hybrid architecture demands the acquisition, loading and consolidation of enterprise atomic and detailed data into a single integrated enterprise data store (the Enterprise Data Warehouse), in which business-unit-centric Data Marts and Operational Data Stores (ODS) are built in the same instance of the Enterprise Data Warehouse.
To highlight the role of data warehouses in different applications, we describe an effort to develop a data warehouse for a geographical information system (GIS). We further study the importance of data practices, quality and governance for financial institutions by commenting on the RBC Financial Group case.
The development and deployment of the Enterprise Data Warehouse based on the Hybrid Architecture spawned its own issues and challenges. Organic data growth and business requirements to load additional new data will significantly increase the amount of stored data, and the number of users will increase accordingly. Enterprise data warehouse obesity, performance degradation and navigation difficulties are chief amongst these issues and challenges.
Association rule mining and social networks have been adopted in this thesis to address the above-mentioned issues and challenges. We describe an approach that uses frequent pattern mining and social network techniques to discover different communities within the data warehouse. These communities include sets of tables frequently accessed together, sets of tables retrieved together most of the time, and sets of attributes that mostly appear together in queries. We concentrate on tables in the discussion; however, the model is general enough to discover other communities. We first build a frequent pattern mining model by considering each query as a transaction and the tables as items. Then, we mine closed frequent itemsets of tables; these itemsets include tables that are mostly accessed together and hence should be treated as one unit in storage and retrieval for better overall performance. We utilize social network construction and analysis to find maximum-sized sets of related tables; this is more robust than taking a union of overlapping itemsets. We derive the Jaccard distance between the closed itemsets and construct the social network of tables by adding links that represent distance above a given threshold. The constructed network is analyzed to discover communities of tables that are mostly accessed together. The reported test results are promising and demonstrate the applicability and effectiveness of the developed approach.
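A compact sketch of the pipeline just described (an illustrative toy under assumptions: the query log, support threshold, and distance threshold are invented; the thesis presumably operates on a real workload): treat each query as a transaction of tables, mine closed frequent itemsets, link itemsets by Jaccard distance, and read communities off the resulting graph.

```python
from itertools import combinations

# Toy query log: each query is the set of tables it touches (assumption).
queries = [
    {"orders", "customers"}, {"orders", "customers", "products"},
    {"orders", "customers"}, {"products", "suppliers"},
    {"products", "suppliers"},
]
min_support = 2

# 1. Frequent itemsets: table sets appearing in >= min_support queries.
tables = sorted(set().union(*queries))
freq = {}
for k in range(1, len(tables) + 1):
    for cand in combinations(tables, k):
        support = sum(1 for q in queries if set(cand) <= q)
        if support >= min_support:
            freq[frozenset(cand)] = support

# 2. Closed itemsets: no proper superset has the same support.
closed = [i for i in freq
          if not any(i < j and freq[j] == freq[i] for j in freq)]

# 3. Build the itemset network: link pairs whose Jaccard distance
#    clears the chosen relatedness threshold.
def jaccard_distance(a, b):
    return 1 - len(a & b) / len(a | b)

threshold = 0.7  # assumption
edges = [(a, b) for a, b in combinations(closed, 2)
         if jaccard_distance(a, b) < threshold]

# 4. Communities = connected components (union-find) of the graph.
parent = {i: i for i in closed}
def find(x):
    while parent[x] != x:
        x = parent[x]
    return x
for a, b in edges:
    parent[find(a)] = find(b)

communities = {}
for i in closed:
    communities.setdefault(find(i), set()).update(i)
print([sorted(c) for c in communities.values()])
```

On this toy log the output groups {customers, orders} and {products, suppliers} into separate communities, the kind of co-access units the thesis targets for storage and retrieval.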
Are You Ready? A Proposed Framework For The Assessment Of Digital Forensic Readiness
This dissertation develops a framework to assess Digital Forensic Readiness (DFR) in organizations. DFR is the state of preparedness to obtain, understand, and present digital evidence when needed. This research collects indicators of digital forensic readiness from a systematic literature review. More than one thousand indicators were found and semantically analyzed to identify the dimensions to which they belong. These dimensions were subjected to a q-sort test and validated using association rules, producing a preliminary framework of DFR for practitioners. By classifying these indicators into dimensions, it was possible to distill them into 71 variables, further classified as either extant or perceptual variables. Factor analysis was used to identify latent factors within the two groups of variables. A statistically based framework to assess DFR is presented, wherein the extant indicators are used as a proxy for the real DFR status and the perceptual factors as the perception of this status.
To Cheat or Not to Cheat - A Game-Theoretic Analysis of Outsourced Computation Verification
In the cloud computing era, many organizations tend to outsource their computations to third-party cloud servers in order to avoid computational burdens. To protect service quality, the integrity of computation results needs to be guaranteed. In this paper, we develop a game-theoretic framework which helps the outsourcer maximize its payoff while ensuring the desired level of integrity for the outsourced computation. We define two Stackelberg games and analyze the sensitivity of the optimal setting to the parameters of the model.
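The abstract does not spell out the games, but the standard shape of such an outsourcing-audit model can be sketched numerically (all payoffs and costs below are invented for illustration, not taken from the paper): the outsourcer, as Stackelberg leader, commits to an audit probability; the server, as follower, best-responds by cheating or computing honestly; and the leader picks the cheapest audit rate that still deters cheating.

```python
# Toy Stackelberg audit game (all parameters are assumptions).
COST_HONEST = 10.0   # server's cost of computing correctly
COST_CHEAT = 1.0     # server's cost of returning a bogus result
FINE = 100.0         # penalty if an audit catches the server cheating
AUDIT_COST = 2.0     # outsourcer's cost per unit of audit probability
DAMAGE = 50.0        # outsourcer's loss from an undetected wrong result

def server_best_response(p_audit: float) -> str:
    # Follower: cheat only if expected cheating cost beats honest cost.
    cheat_cost = COST_CHEAT + p_audit * FINE
    return "cheat" if cheat_cost < COST_HONEST else "honest"

def outsourcer_loss(p_audit: float) -> float:
    # Leader's expected loss given the follower's best response.
    if server_best_response(p_audit) == "cheat":
        return AUDIT_COST * p_audit + (1 - p_audit) * DAMAGE
    return AUDIT_COST * p_audit

# Leader commits first: scan audit probabilities for the minimal loss.
best = min((p / 1000 for p in range(1001)), key=outsourcer_loss)
print(f"optimal audit probability ~ {best:.3f}, "
      f"server plays {server_best_response(best)}")
```

With these numbers, the leader audits just often enough (p = 0.09) that cheating stops paying, which is exactly the sensitivity to model parameters the paper says it analyzes.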
Innovation in manufacturing through digital technologies and applications: Thoughts and Reflections on Industry 4.0
The rapid pace of developments in digital technologies offers many opportunities to increase the efficiency, flexibility and sophistication of manufacturing processes; including the potential for easier customisation, lower volumes and rapid changeover of products within the same manufacturing cell or line. A number of initiatives on this theme have been proposed around the world to support national industries under names such as Industry 4.0 (Industrie 4.0 in Germany, Made-in-China in China and Made Smarter in the UK).
This book presents an overview of the state of the art and upcoming developments in digital technologies pertaining to manufacturing. The starting point is an introduction to Industry 4.0 and its potential for enhancing the manufacturing process. It then moves to the design of smart (that is, digitally driven) business processes, which rely on sensing all relevant parameters; gathering, storing and processing the data from these sensors; and applying computing power and intelligence at the most appropriate points in the digital workflow, including edge computing and parallel processing.
A key component of this workflow is the application of Artificial Intelligence, and particularly Machine Learning techniques, to derive actionable information from this data, be it real-time automated responses (such as actuating transducers), prompts for human operators to follow specified standard operating procedures, or management data for operational and strategic planning. Consideration also needs to be given to the properties and behaviours of the particular machines that are controlled and the materials that are transformed during the manufacturing process; this is sometimes referred to as Operational Technology (OT), as opposed to IT. The digital capture of these properties and behaviours can then be used to define so-called Cyber-Physical Systems.
Given the power of these digital technologies, it is of paramount importance that they operate safely and are not vulnerable to malicious interference. Industry 4.0 brings unprecedented cybersecurity challenges to manufacturing and the overall industrial sector, and the case is made here that new codes of practice are needed for the combined Information Technology and Operational Technology worlds, within a framework that is native to Industry 4.0. Current computing technologies can also go in directions other than supporting the digital 'sense to action' process described above. One of these is using digital technologies to enhance the abilities of the human operators who remain essential within the manufacturing process. One such technology, which has recently become accessible for widespread adoption, is Augmented Reality, providing operators with real-time additional information, in situ with the machines they interact with in their workspace, in a hands-free mode.
Finally, two linked chapters discuss the specific application of digital technologies to High Pressure Die Casting (HPDC) of magnesium components. Optimizing the HPDC process is a key task for increasing productivity and reducing defective parts, and the first chapter provides an overview of the HPDC process, with attention to the most common defects and their sources. It does this by first looking at real-time process control mechanisms, understanding the various process variables, and assessing their impact on end-product quality. This understanding drives the choice of sensing methods and the associated smart digital workflow that allow real-time control and mitigation of variation in the identified variables. Data from this workflow can also be captured and used for the design of optimised dies and associated processes.
Monte Carlo Method with Heuristic Adjustment for Irregularly Shaped Food Product Volume Measurement
Volume measurement plays an important role in the production and processing of food products. Various methods have been proposed to measure the volume of irregularly shaped food products based on 3D reconstruction. However, 3D reconstruction comes with a high computational cost, and some volume measurement methods based on 3D reconstruction have low accuracy. Another approach measures object volume using the Monte Carlo method, which works with random points: it only requires information on whether random points fall inside or outside an object and does not require a 3D reconstruction. This paper proposes volume measurement for irregularly shaped food products using a computer vision system, without 3D reconstruction, based on the Monte Carlo method with heuristic adjustment. Five images of each food product were captured using five cameras and processed to produce binary images. Monte Carlo integration with heuristic adjustment was performed to measure the volume based on the information extracted from the binary images. The experimental results show that the proposed method provides high accuracy and precision compared to the water displacement method. In addition, the proposed method is more accurate and faster than the space carving method.
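The core Monte Carlo idea is easy to illustrate (a generic sketch, not the paper's camera-based pipeline or its heuristic adjustment: here the inside/outside test is an analytic sphere rather than a test of point projections against five binary images): sample random points in a bounding box and scale the box volume by the fraction that lands inside the object.

```python
import random

random.seed(42)

def monte_carlo_volume(inside, bbox, n_samples=200_000):
    """Estimate the volume of an object from an inside/outside test.

    inside: point -> bool, the only information Monte Carlo needs
    bbox:   ((xmin, xmax), (ymin, ymax), (zmin, zmax)) bounding box
    """
    hits = 0
    for _ in range(n_samples):
        p = tuple(random.uniform(lo, hi) for lo, hi in bbox)
        hits += inside(p)
    box_volume = 1.0
    for lo, hi in bbox:
        box_volume *= hi - lo
    return box_volume * hits / n_samples

# Stand-in object: a sphere of radius 3 cm, so the estimate can be
# checked against the exact volume. In the paper, inside() would
# instead test a point against the five binary camera images.
def in_sphere(p):
    return sum(c * c for c in p) <= 9.0

est = monte_carlo_volume(in_sphere, ((-3, 3), (-3, 3), (-3, 3)))
print(f"estimated {est:.2f} cm^3 vs exact {4/3 * 3.14159 * 27:.2f} cm^3")
```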