
    Supervised Intrusions Detection System Using KNN

    This paper presents an implementation of an intrusion detection system (IDS) using the k-nearest neighbors (kNN) algorithm in the R language. The dataset used is KDD Cup 1999, a well-known benchmark for IDS. The kNN machine learning algorithm is used for the detection and classification of known attacks. The experimental results are obtained using the R programming language.
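
    The workflow the abstract describes, kNN classification of KDD Cup 1999 records, can be sketched in a few lines. The paper works in R; the Python/scikit-learn stand-in below is only illustrative, and the subsampling, k=5, and preprocessing choices are assumptions rather than the authors' settings.

```python
# Minimal sketch, assuming scikit-learn's packaged copy of KDD Cup 1999.
from sklearn.datasets import fetch_kddcup99
from sklearn.model_selection import train_test_split
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import classification_report

# 10% subset of KDD Cup 1999; labels are attack types plus "normal."
X, y = fetch_kddcup99(percent10=True, return_X_y=True, as_frame=True)
X = X.sample(n=20_000, random_state=0)              # subsample for a quick run
y = y.loc[X.index]

categorical = ["protocol_type", "service", "flag"]  # symbolic features
numeric = [c for c in X.columns if c not in categorical]
X[numeric] = X[numeric].astype(float)

preprocess = ColumnTransformer([
    ("cat", OneHotEncoder(handle_unknown="ignore"), categorical),
    ("num", StandardScaler(), numeric),
])
model = make_pipeline(preprocess, KNeighborsClassifier(n_neighbors=5))

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3,
                                                    random_state=0)
model.fit(X_train, y_train)
print(classification_report(y_test, model.predict(X_test), zero_division=0))
```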

    Epileptic Seizure Detection in EEGs by Using Random Tree Forest, Naïve Bayes and KNN Classification

    Epilepsy is a disease that attacks the nervous system. To detect epilepsy, the results of an EEG test must be analyzed. In this study, we compared the naïve Bayes, random tree forest, and k-nearest neighbour (KNN) classification algorithms for detecting epilepsy. The raw EEG data were pre-processed before feature extraction. We then trained the three algorithms: KNN classification, naïve Bayes classification, and random tree forest. The last step was validation of the trained models. Comparing the three classifiers, we calculated accuracy, sensitivity, specificity, and precision. The best trained classifier was the KNN classifier (accuracy: 92.7%), ahead of random tree forest (accuracy: 86.6%) and the naïve Bayes classifier (accuracy: 55.6%). In terms of precision, KNN classification also gave the best result (82.5%), compared with naïve Bayes classification (25.3%) and random tree forest (68.2%). For sensitivity, however, naïve Bayes classification was best at 80.3%, compared with KNN (73.2%) and random tree forest (42.2%). For specificity, KNN classification gave 96.7%, followed by random tree forest (95.9%) and naïve Bayes (50.4%). The training time of naïve Bayes was 0.166030 s, that of random tree forest was 2.4094 s, and KNN was the slowest to train at 4.789 s. Overall, KNN classification gave better performance than the naïve Bayes and random tree forest classifiers.
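
    A minimal sketch of the comparison step, assuming the EEG recordings have already been reduced to a feature matrix X with binary seizure labels y: synthetic data stands in for the real features, scikit-learn's RandomForestClassifier stands in for the "random tree forest", and sensitivity/specificity are recovered from the confusion matrix. All of these substitutions are assumptions for illustration only.

```python
import time
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix

# Placeholder for extracted EEG features (1 = seizure, 0 = non-seizure).
X, y = make_classification(n_samples=2000, n_features=20, weights=[0.8, 0.2],
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

classifiers = {
    "kNN": KNeighborsClassifier(n_neighbors=5),
    "naive Bayes": GaussianNB(),
    "random forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

for name, clf in classifiers.items():
    start = time.perf_counter()
    clf.fit(X_tr, y_tr)
    train_time = time.perf_counter() - start
    tn, fp, fn, tp = confusion_matrix(y_te, clf.predict(X_te)).ravel()
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    sensitivity = tp / (tp + fn)                    # recall on the seizure class
    specificity = tn / (tn + fp)
    print(f"{name}: acc={accuracy:.3f} prec={precision:.3f} "
          f"sens={sensitivity:.3f} spec={specificity:.3f} "
          f"train={train_time:.3f}s")
```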

    Secure k-ish Nearest Neighbors Classifier

    In machine learning, classifiers are used to predict the class of a given query based on an existing (classified) database. Given a database S of n d-dimensional points and a d-dimensional query q, the k-nearest neighbors (kNN) classifier assigns q the majority class of its k nearest neighbors in S. In the secure version of kNN, S and q are owned by two different parties that do not want to share their data. Unfortunately, all known solutions for secure kNN either require a large communication complexity between the parties or are very inefficient to run. In this work we present a classifier based on kNN that can be implemented efficiently with homomorphic encryption (HE). The efficiency of our classifier comes from a relaxation we make on kNN, where we allow it to consider κ nearest neighbors for κ ≈ k with some probability. We therefore call our classifier k-ish Nearest Neighbors (k-ish NN). The success probability of our solution depends on the distribution of the distances from q to S, and increases as its statistical distance to a Gaussian decreases. To implement our classifier we introduce the concept of a doubly-blinded coin toss, in which both the success probability and the output of the toss are encrypted. We use this coin toss to efficiently approximate the average and variance of the distances from q to S. We believe these two techniques may be of independent interest. When implemented with HE, the k-ish NN has a circuit depth that is independent of n, making it scalable. We also implemented our classifier in an open-source library based on HElib and tested it on a breast tumor database. The accuracy of our classifier (F_1 score) was 98% and classification took less than 3 hours, compared to an estimated several weeks in current HE implementations.
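
    The statistical relaxation behind k-ish NN can be illustrated in plaintext: approximate the mean and variance of the distances from q to S, derive a threshold under which roughly k points would fall if the distances were Gaussian, and take the majority class inside that threshold. The sketch below shows only that plaintext idea; in the actual protocol the moments are approximated under HE using doubly-blinded coin tosses, and the toy data and fallback rule here are assumptions.

```python
import numpy as np
from scipy.stats import norm
from collections import Counter

def k_ish_nn(S, labels, q, k):
    d = np.linalg.norm(S - q, axis=1)              # distances from q to S
    mu, sigma = d.mean(), d.std()                  # moments (approximated under HE in the paper)
    t = norm.ppf(k / len(S), loc=mu, scale=sigma)  # threshold with ~k points below it
    near = d <= t                                  # the "kappa ~ k" nearest neighbors
    if not near.any():                             # fall back if the threshold is too tight
        near = d <= d.min()
    return Counter(labels[near]).most_common(1)[0][0]

rng = np.random.default_rng(0)
S = rng.normal(size=(500, 8))                      # toy database (placeholder)
labels = (S[:, 0] > 0).astype(int)                 # toy classes
print(k_ish_nn(S, labels, rng.normal(size=8), k=9))
```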

    AffinityNet: semi-supervised few-shot learning for disease type prediction

    While deep learning has achieved great success in computer vision and many other fields, it currently does not work very well on patient genomic data with the "big p, small N" problem (i.e., a relatively small number of samples with high-dimensional features). In order to make deep learning work with a small amount of training data, we have to design new models that facilitate few-shot learning. Here we present the Affinity Network Model (AffinityNet), a data-efficient deep learning model that can learn from a limited number of training examples and generalize well. The backbone of the AffinityNet model consists of stacked k-nearest-neighbor (kNN) attention pooling layers. The kNN attention pooling layer is a generalization of the Graph Attention Model (GAM) and can be applied not only to graphs but to any set of objects, regardless of whether a graph is given. As a new deep learning module, kNN attention pooling layers can be plugged into any neural network model just like convolutional layers. As a simple special case of the kNN attention pooling layer, the feature attention layer can directly select important features that are useful for classification tasks. Experiments on both synthetic data and cancer genomic data from TCGA projects show that our AffinityNet model has better generalization power than conventional neural network models with little training data. The code is freely available at https://github.com/BeautyOfWeb/AffinityNet . (Comment: 14 pages, 6 figures)
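
    A rough PyTorch sketch of what a kNN attention pooling layer of this kind might look like: each sample is re-represented as an attention-weighted average over its k most similar samples. The cosine affinity, single linear projection, and layer name below are illustrative assumptions; the authors' actual implementation is in the linked repository.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class KNNAttentionPooling(nn.Module):
    def __init__(self, in_dim, out_dim, k=5):
        super().__init__()
        self.proj = nn.Linear(in_dim, out_dim)
        self.k = k

    def forward(self, x):                          # x: (n_samples, in_dim)
        h = self.proj(x)                           # transformed features
        hn = F.normalize(h, dim=1)
        sim = hn @ hn.t()                          # cosine affinity between samples
        topk_sim, topk_idx = sim.topk(self.k, dim=1)   # k most similar (self included)
        attn = F.softmax(topk_sim, dim=1)              # attention weights
        neighbors = h[topk_idx]                        # (n, k, out_dim)
        return (attn.unsqueeze(-1) * neighbors).sum(dim=1)  # pooled representation

layer = KNNAttentionPooling(in_dim=30, out_dim=16, k=5)
pooled = layer(torch.randn(100, 30))               # e.g. 100 samples, 30 features
print(pooled.shape)                                # torch.Size([100, 16])
```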

    Exemplar-Centered Supervised Shallow Parametric Data Embedding

    Metric learning methods for dimensionality reduction in combination with k-nearest neighbors (kNN) have been extensively deployed in many classification, data embedding, and information retrieval applications. However, most of these approaches involve pairwise training-data comparisons and thus have quadratic computational complexity with respect to the size of the training set, preventing them from scaling to fairly big datasets. Moreover, during testing, comparing test data against all the training data points is also expensive in terms of both computational cost and resources required. Furthermore, previous metrics are either too constrained or too expressive to be well learned. To effectively solve these issues, we present an exemplar-centered supervised shallow parametric data embedding model, using a Maximally Collapsing Metric Learning (MCML) objective. Our strategy learns a shallow high-order parametric embedding function and compares training/test data only with learned or precomputed exemplars, resulting in a cost function with linear computational complexity for both training and testing. We also empirically demonstrate, using several benchmark datasets, that for classification in a two-dimensional embedding space, our approach not only speeds up kNN by hundreds of times, but also outperforms state-of-the-art supervised embedding approaches. (Comment: accepted to IJCAI201)
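
    Under some simplifying assumptions, the exemplar-centered training loop might be sketched as follows: a small shallow network embeds the data, a fixed set of exemplars replaces full pairwise comparisons, and an MCML-style loss collapses each sample onto the exemplars of its own class via a softmax over negative squared distances. Everything here (random exemplars instead of per-class k-means centers, the network shape, the exact loss form) is illustrative, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def mcml_exemplar_loss(z, exemplar_z, exemplar_labels, y):
    # p(exemplar j | sample i) ∝ exp(-||z_i - e_j||^2); pull each sample
    # toward exemplars of its own class.
    d2 = torch.cdist(z, exemplar_z).pow(2)                 # (n, m) squared distances
    log_p = F.log_softmax(-d2, dim=1)
    same_class = (y.unsqueeze(1) == exemplar_labels.unsqueeze(0)).float()
    return -(log_p * same_class).sum(dim=1).mean()

embed = nn.Sequential(nn.Linear(50, 64), nn.ReLU(), nn.Linear(64, 2))  # 2-D embedding
opt = torch.optim.Adam(embed.parameters(), lr=1e-3)

# Placeholder data and exemplars (in practice: k-means centers per class).
X, y = torch.randn(512, 50), torch.randint(0, 3, (512,))
E, E_y = torch.randn(30, 50), torch.arange(30) % 3

for step in range(100):
    opt.zero_grad()
    loss = mcml_exemplar_loss(embed(X), embed(E), E_y, y)
    loss.backward()
    opt.step()
```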

    Adaptive Preferential Attached kNN Graph With Distribution-Awareness

    Graph-based kNN algorithms have garnered widespread popularity for machine learning tasks due to their simplicity and effectiveness. However, the conventional kNN graph's reliance on a fixed value of k can hinder its performance, especially in scenarios involving complex data distributions. Moreover, as with other classification models, the presence of ambiguous samples along decision boundaries often presents a challenge, as they are more prone to incorrect classification. To address these issues, we propose the Preferential Attached k-Nearest Neighbors Graph (paNNG), which combines adaptive kNN with distribution-based graph construction. By incorporating distribution information, paNNG can significantly improve performance on ambiguous samples by "pulling" them towards their original classes, enabling enhanced overall accuracy and generalization capability. Through rigorous evaluations on diverse benchmark datasets, paNNG outperforms state-of-the-art algorithms, showcasing its adaptability and efficacy across various real-world scenarios.
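
    As an illustration of the adaptive-k idea only (not the paper's actual paNNG construction rule), the sketch below builds a kNN graph in which nodes in denser regions, judged by their mean distance to a fixed number of neighbors, receive more edges; the k_min/k_max bounds and the density heuristic are assumptions.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def adaptive_knn_graph(X, k_min=3, k_max=15):
    nn = NearestNeighbors(n_neighbors=k_max + 1).fit(X)
    dist, idx = nn.kneighbors(X)                   # column 0 is each point itself
    local_scale = dist[:, 1:].mean(axis=1)         # mean distance to k_max neighbors
    global_scale = np.median(local_scale)
    edges = []
    for i in range(len(X)):
        # Denser neighborhoods (small local scale) get more edges.
        k_i = int(np.clip(round(k_max * global_scale / local_scale[i]),
                          k_min, k_max))
        edges.extend((i, j) for j in idx[i, 1:k_i + 1])
    return edges

X = np.random.default_rng(0).normal(size=(200, 5))  # placeholder data
print(len(adaptive_knn_graph(X)))
```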