    Using Empirical Recurrence Rates Ratio For Time Series Data Similarity

    Several methods exist in the classification literature to quantify the similarity between two time series data sets. These methods range from the traditional Euclidean-type metric to the more advanced Dynamic Time Warping metric. Most of them adequately address structural similarity but fall short of goals outside it. For example, a tool that is excellent at identifying the seasonal similarity between two time series vectors might prove inadequate in the presence of outliers. In this paper, we propose a unifying measure for binary classification that performs well while accounting for several aspects of dissimilarity. This statistic is gaining prominence in various fields, such as geology and finance, and is crucial in time series database formation and clustering studies.
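
    As a point of reference for the two baseline metrics named above, the following minimal sketch contrasts a lock-step Euclidean distance with textbook Dynamic Time Warping. It does not implement the empirical recurrence rates ratio, which the abstract does not define; the function names and the sample series are hypothetical.

```python
import math


def euclidean(a, b):
    """Lock-step distance between two equal-length series."""
    if len(a) != len(b):
        raise ValueError("Euclidean distance needs equal-length series")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw(a, b):
    """Textbook dynamic time warping with |x - y| as the local cost."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = cheapest alignment of a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # a advances alone
                                 cost[i][j - 1],      # b advances alone
                                 cost[i - 1][j - 1])  # both advance
    return cost[n][m]


# Hypothetical data: the same bump, shifted by one time step.
x = [0, 1, 2, 1, 0, 0]
y = [0, 0, 1, 2, 1, 0]
print(euclidean(x, y))  # 2.0
print(dtw(x, y))        # 0.0
```

    The shifted bump is penalized by the Euclidean metric but not by DTW, which is the kind of structural similarity the abstract says existing measures already handle.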

    Deep Learning for Link Prediction in Dynamic Networks using Weak Estimators

    Link prediction is the task of evaluating the probability that an edge exists in a network, and it has useful applications in many domains. Traditional approaches rely on measuring the similarity between two nodes in a static context. Recent research has focused on extending link prediction to a dynamic setting, predicting the creation and destruction of links in networks that evolve over time. Though the task is difficult, deep learning techniques have been shown to notably improve prediction accuracy. To this end, we propose the novel application of weak estimators, in addition to traditional similarity metrics, to inexpensively build an effective feature vector for a deep neural network. Weak estimators have been used in a variety of machine learning algorithms to improve model accuracy, owing to their capacity to estimate changing probabilities in dynamic systems. Experiments indicate that our approach results in increased prediction accuracy on several real-world dynamic networks.
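
    The abstract does not say which weak estimator is used. As an illustration only, the sketch below assumes an exponentially weighted update of the kind used by stochastic learning weak estimators, applied to the presence or absence of one edge across network snapshots. The names `weak_update` and `edge_probability_track`, the forgetting factor `lam`, and the sample history are all hypothetical.

```python
def weak_update(p, event, lam=0.9):
    """One step of an exponentially weighted estimate of P(event).

    Assumed update rule (not taken from the paper):
    p <- lam * p + (1 - lam) * indicator(event)
    """
    return lam * p + (1.0 - lam) * (1.0 if event else 0.0)


def edge_probability_track(history, lam=0.9, p0=0.5):
    """Running estimate of an edge's presence probability per snapshot."""
    p = p0
    track = []
    for present in history:
        p = weak_update(p, present, lam)
        track.append(p)
    return track


# Hypothetical edge: present for 30 snapshots, then absent for 30.
history = [1] * 30 + [0] * 30
track = edge_probability_track(history)
print(round(track[29], 3), round(track[-1], 3))
```

    Because old observations are forgotten geometrically, the estimate follows the change in the edge's behaviour instead of converging to the long-run average, which is what makes such a value usable as a feature for a non-stationary network.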

    Privacy-Preserving Sequential Pattern Mining Over Vertically Partitioned Data

    Privacy-preserving data mining in distributed environments is an important issue in the field of data mining. In this paper, we study how to conduct sequential pattern mining, which is one of the data mining computations, on private data in the following scenario: multiple parties, each having a private data set, want to jointly conduct sequential pattern mining. Since no party wants to disclose its private data to other parties, a secure method needs to be provided to make such a computation feasible. We develop a practical solution to the above problem in this paper.

    Privacy-Preserving Decision Tree Classification over Horizontally Partitioned Data

    Protection of privacy is one of the important problems in data mining. Parties' unwillingness to share their data frequently causes collaborative data mining to fail. This paper studies how to build a decision tree classifier under the following scenario: a database is horizontally partitioned into multiple pieces, with each piece owned by a particular party. All the parties want to build a decision tree classifier based on such a database, but due to privacy constraints, none of them wants to disclose its private piece. We build a privacy-preserving system, including a set of secure protocols, that allows the parties to construct such a classifier. We guarantee that the private data are securely protected.

    Bayesian Network Induction with Incomplete Private Data

    A Bayesian network is a graphical model for representing probabilistic relationships among a set of variables. It is an important model for business analysis. Bayesian network learning methods have been applied to business analysis where data privacy is not considered. However, learning a Bayesian network over private data presents a much greater challenge. In this paper, we develop an approach to tackle the problem of Bayesian network induction on private data that may contain missing values. The basic idea of our proposed approach is to combine a randomization technique with the Expectation Maximization (EM) algorithm. The purpose of using randomization is to disguise the raw data. The EM algorithm is applied to handle missing values in the private data set. We also present a method to conduct Bayesian network construction, which is one of the data mining computations, from the disguised data.
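
    The abstract does not specify the randomization scheme. The sketch below assumes a simple randomized-response style disguise for a binary attribute (keep the true bit with probability `theta`, flip it otherwise) and shows how an aggregate rate can still be estimated from the disguised data. It covers only the disguising step, not the EM treatment of missing values, and all names and numbers are hypothetical.

```python
import random


def randomize(bits, theta, rng):
    """Keep each bit with probability theta, flip it otherwise."""
    return [b if rng.random() < theta else 1 - b for b in bits]


def estimate_true_rate(disguised, theta):
    """Recover the true rate pi of 1-bits from disguised data.

    Inverts E[observed] = theta * pi + (1 - theta) * (1 - pi).
    """
    if theta == 0.5:
        raise ValueError("theta = 0.5 destroys all information")
    observed = sum(disguised) / len(disguised)
    return (observed - (1.0 - theta)) / (2.0 * theta - 1.0)


# Hypothetical private column: 30% ones among 10,000 records.
rng = random.Random(0)
private_bits = [1] * 3000 + [0] * 7000
disguised_bits = randomize(private_bits, theta=0.8, rng=rng)
estimate = estimate_true_rate(disguised_bits, theta=0.8)
print(round(estimate, 3))  # close to 0.3; exact value depends on the seed
```

    No individual record can be trusted after disguising, yet marginal and joint frequencies of the kind a Bayesian network learner consumes remain estimable.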

    Privacy-Preserving Naive Bayesian Classification Over Vertically Partitioned Data

    Protection of privacy is a critical problem in data mining. Preserving data privacy in distributed data mining is even more challenging. In this paper, we consider the problem of privacy-preserving naive Bayesian classification over vertically partitioned data. The problem is one of the important issues in privacy-preserving distributed data mining. Our approach is based on homomorphic encryption. The scheme is very efficient in terms of computation and communication cost.
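
    The abstract names homomorphic encryption but not a specific cryptosystem or protocol. To illustrate the property such protocols rely on, the sketch below assumes a Paillier-style additively homomorphic scheme with toy primes: multiplying ciphertexts adds the underlying plaintexts. The key sizes are far too small for real use, the fixed `r` values stand in for fresh random numbers, and the per-party values are hypothetical.

```python
from math import gcd


def keygen(p, q):
    """Toy Paillier-style key pair from two small primes (insecure sizes)."""
    n = p * q
    lam = (p - 1) * (q - 1) // gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # modular inverse; valid when the generator is n + 1
    return n, (lam, mu)


def encrypt(n, m, r):
    """Encrypt integer m < n using randomness r coprime to n."""
    n2 = n * n
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2


def decrypt(n, priv, c):
    lam, mu = priv
    u = pow(c, lam, n * n)
    return ((u - 1) // n * mu) % n


def add_encrypted(n, c1, c2):
    """Ciphertext product decrypts to the sum of the plaintexts (mod n)."""
    return (c1 * c2) % (n * n)


# Hypothetical scenario: three parties each contribute one private integer.
n, priv = keygen(47, 59)
c_a = encrypt(n, 12, r=17)
c_b = encrypt(n, 30, r=23)
c_c = encrypt(n, 7, r=31)
total = add_encrypted(n, add_encrypted(n, c_a, c_b), c_c)
print(decrypt(n, priv, total))  # 49
```

    Only the holder of the private key learns the sum, and nobody sees the individual contributions in the clear.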

    Privacy-Preserving Support Vector Machines Learning

    This paper addresses the problem of sharing data among multiple parties without disclosing the data between the parties. We focus on the sharing of data among parties involved in a data mining task. We study how to share private or confidential data in the following scenario: without disclosing their private data to each other, multiple parties, each having a private data set, want to collaboratively construct support vector machines using a linear, polynomial or sigmoid kernel function. To tackle this problem, we develop a secure protocol for multiple parties to conduct the desired computation. The solution is distributed, i.e., there is no central, trusted party having access to all the data. Instead, we define a protocol using homomorphic encryption techniques to exchange the data while keeping it private. We analyze the protocol in the context of mistakes and malicious attacks, and show its robustness against such attacks. All the parties are treated symmetrically: they all participate in the encryption and in the computation involved in learning support vector machines.

    Privacy-Preserving Collaborative Association Rule Mining

    In recent times, the development of privacy technologies has accelerated research on privacy-preserving collaborative data mining. Researchers have borrowed ideas from secure multi-party computation and developed secure multi-party protocols to deal with privacy-preserving collaborative data mining problems. Random perturbation has also been identified as an efficient estimation technique for solving these problems. Both secure multi-party protocols and random perturbation techniques have their advantages and shortcomings. In this paper, we develop a new approach that combines existing techniques in such a way that it gains the advantages of both.

    WaveNets: Wavelet Channel Attention Networks

    Channel Attention reigns supreme as an effective technique in the field of computer vision. However, the channel attention proposed by SENet suffers from information loss in feature learning caused by the use of Global Average Pooling (GAP) to represent channels as scalars. Thus, designing effective channel attention mechanisms requires finding a way to better preserve features when modeling channel inter-dependencies. In this work, we utilize wavelet transform compression as a solution to the channel representation problem. We first test the wavelet transform as an Auto-Encoder model equipped with a conventional channel attention module. Next, we test the wavelet transform as a standalone channel compression method. We prove that global average pooling is equivalent to the recursive approximate Haar wavelet transform. With this proof, we generalize channel attention using wavelet compression and name it WaveNet. Our method can be embedded within existing channel attention methods with a couple of lines of code. We test our proposed method on the ImageNet dataset for the image classification task. Our method outperforms the baseline SENet and achieves state-of-the-art results. Our code implementation is publicly available at https://github.com/hady1011/WaveNet-C. Comment: IEEE BigData2022 conference
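
    The equivalence stated above can be checked numerically on a small example. The sketch below is not taken from the linked repository; it assumes a square channel whose side is a power of two and uses the averaging (unnormalized) form of the Haar low-pass filter, so each level replaces every 2x2 block by its mean. With the orthonormal Haar filter the result would differ only by a constant scale factor. All function names and the sample channel are hypothetical.

```python
def haar_approx_2d(x):
    """One level of the averaging 2D Haar low-pass: mean of each 2x2 block."""
    h, w = len(x), len(x[0])
    return [[(x[i][j] + x[i][j + 1] + x[i + 1][j] + x[i + 1][j + 1]) / 4.0
             for j in range(0, w, 2)]
            for i in range(0, h, 2)]


def recursive_haar_scalar(x):
    """Apply the approximation step until a single coefficient remains.

    Assumes a square input whose side length is a power of two.
    """
    while len(x) > 1:
        x = haar_approx_2d(x)
    return x[0][0]


def global_average_pool(x):
    """Mean over all spatial positions of one channel."""
    return sum(sum(row) for row in x) / (len(x) * len(x[0]))


# Hypothetical 4x4 channel holding the values 0..15.
channel = [[4 * r + c for c in range(4)] for r in range(4)]
print(recursive_haar_scalar(channel))  # 7.5
print(global_average_pool(channel))    # 7.5
```

    Stopping the recursion one or more levels early keeps more than a single scalar per channel, which is the direction in which the abstract generalizes channel attention.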