5 research outputs found

    Forest Tree- An Efficient Proposal Approach for Data Mining

    Data Mining (DM) is a way of examining the models, summaries, and derived values obtained from a collected data set. DM searches for analytical information in large databases. One example of a predictive problem is targeted marketing. Many factors influence the performance of mining on large data sets. In this paper we use the forest tree technique to improve the performance of data retrieval; once implemented, it is expected to outperform previous work based on the existing decision tree algorithm.

    Forest Tree Algorithm- An Efficient Approach of Data Mining Over Decision Tree

    Data Mining (DM) is a way of presenting the models, summaries, and values derived from a collected data set. DM searches for analytical information across large numbers of available databases. One example of a predictive problem is targeted marketing. Many factors affect data mining performance on large data sets. In this article we use the forest tree technique to improve the performance of data search; once implemented, it is expected to outperform previous work based on the existing decision tree algorithm.

    Secure Algorithm for File Sharing Using Clustering Technique of K-Means Clustering

    In the current scenario, security is of utmost importance when transferring files over networks. This thesis designs a new algorithm for transferring data securely over a network. The k-means clustering algorithm, introduced by MacQueen in 1967, is a widely used scheme for solving the clustering problem. It partitions a given set of n data points in m-dimensional space into k clusters whose centers are given by the centroids. The privacy issue examined is that the data is distributed among various parties and the distributed information must be safeguarded. In this thesis, chunks (parts) of a file are created using the k-means clustering algorithm, and each part is encrypted with a key shared between sender and receiver. Further, the clustered records are encrypted using the AES encryption algorithm with a private key secretly shared between the involved parties, which gives a better level of security.
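
    The clustering step described in this abstract can be sketched as follows. This is a minimal illustration of the standard k-means (Lloyd's) iteration, not the thesis's implementation: the abstract does not say how file contents are mapped to data points, and the AES encryption step is omitted. The function name `kmeans`, the sample points, and the choice of initial centroids are assumptions made for the example.

```python
import math


def kmeans(points, centroids, max_iter=100):
    """Lloyd's algorithm on tuples of floats; returns (centroids, labels)."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point joins the cluster of its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its assigned points.
        new_centroids = []
        for j, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                mean = tuple(sum(axis) / len(members) for axis in zip(*members))
                new_centroids.append(mean)
            else:
                new_centroids.append(old)  # keep an empty cluster's centroid
        if new_centroids == centroids:  # converged: no centroid moved
            break
        centroids = new_centroids
    return centroids, labels


# Illustrative data: n = 6 points in m = 2 dimensions, k = 2 clusters.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centroids, labels = kmeans(points, centroids=[points[0], points[3]])
print(labels)
```

    Passing the initial centroids explicitly keeps the sketch deterministic; practical implementations usually pick them at random or with a seeding heuristic.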

    Non-linear dimensionality reduction for privacy-preserving data classification

    Many techniques have been proposed to protect the privacy of data outsourced for analysis by external parties. However, most of these techniques distort the underlying data properties and therefore hinder data mining algorithms from discovering patterns. The aim of Privacy-Preserving Data Mining (PPDM) is to generate a data-friendly transformation that maintains both the privacy and the utility of the data. We have proposed a novel privacy-preserving framework based on non-linear dimensionality reduction (i.e. non-metric multidimensional scaling) to perturb the original data. The perturbed data exhibited good utility in terms of distance preservation between objects. This was tested on a clustering task with good results. In this paper, we test our novel PPDM approach on a classification task using a k-Nearest Neighbour (k-NN) classification algorithm. We compare the classification results obtained from the original and the perturbed data and find them to be much the same, particularly for the few lower dimensions. We show that, for distance-based classification, our approach preserves the utility of the data while hiding the private details.
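
    The reason distance preservation matters for k-NN can be shown with a small sketch. It does not implement the paper's non-metric multidimensional scaling: as a plainly named stand-in, it perturbs the data with a 2-D rotation, which hides the original coordinates while preserving every pairwise distance exactly (non-metric MDS also reduces dimensionality and only approximately preserves the rank order of distances). The helper names `knn_predict` and `rotate`, the angle, and the toy data are all assumptions made for illustration.

```python
import math
from collections import Counter


def knn_predict(train, labels, query, k=3):
    """Majority vote among the k training points nearest to the query."""
    nearest = sorted(range(len(train)),
                     key=lambda i: math.dist(train[i], query))[:k]
    return Counter(labels[i] for i in nearest).most_common(1)[0][0]


def rotate(point, theta):
    """Stand-in perturbation: rotate a 2-D point about the origin."""
    x, y = point
    return (x * math.cos(theta) - y * math.sin(theta),
            x * math.sin(theta) + y * math.cos(theta))


# Illustrative toy data: two well-separated classes.
train = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0),
         (6.0, 6.0), (6.0, 7.0), (7.0, 6.0)]
labels = ["a", "a", "a", "b", "b", "b"]
queries = [(2.0, 2.0), (5.0, 5.0), (0.0, 3.0)]

theta = 0.7  # arbitrary angle, known only to the data owner
train_p = [rotate(p, theta) for p in train]
queries_p = [rotate(q, theta) for q in queries]

original = [knn_predict(train, labels, q) for q in queries]
perturbed = [knn_predict(train_p, labels, q) for q in queries_p]
print(original, perturbed)
```

    Because k-NN depends only on distances between points, any transformation that keeps neighbours in the same order leaves the predictions unchanged, which is the kind of utility the abstract reports for its perturbed data.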

    Non-Metric Multi-Dimensional Scaling for Distance-Based Privacy-Preserving Data Mining

    Recent advances in the field of data mining have led to major concerns about privacy. Sharing data with external parties for analysis puts private information at risk. The original data are often perturbed before external release to protect private information. However, data perturbation can decrease the utility of the output. A good perturbation technique requires a balance between privacy and utility. This study proposes a new method for data perturbation in the context of distance-based data mining. We propose the use of non-metric multi-dimensional scaling (MDS) as a suitable technique to perturb data that are intended for distance-based data mining. The basic premise of this approach is to transform the original data into a lower-dimensional space and generate new data that protect private details while maintaining good utility for distance-based data mining analysis. We investigate the extent to which the perturbed data are able to preserve useful statistics for distance-based analysis and to provide protection against malicious attacks. We demonstrate that our method provides an adequate alternative to data randomisation approaches and other dimensionality reduction approaches. Testing is conducted on a wide range of benchmark datasets and against some existing perturbation methods. The results confirm that our method has very good overall performance, is competitive with other techniques, and produces clustering and classification results at least as good as, and in some cases better than, the results obtained from the original data.