Classification hardness for supervised learners on 20 years of intrusion detection data
This article consolidates the analysis of established (NSL-KDD) and newer (ISCXIDS2012, CICIDS2017, CICIDS2018) intrusion detection datasets using supervised machine learning (ML) algorithms. Because the analysis procedure is uniform, the obtained results can be compared directly, which also provides a stronger foundation for conclusions about the efficacy of supervised learners on the main classification task in network security. This research is motivated in part by the lack of adoption of these modern datasets. The scope is deliberately broad, covering classification by algorithms from different families on both established and new datasets, in order to expand the existing foundation and reveal the most opportune avenues for further inquiry. After obtaining baseline results, the classification task was made more difficult by reducing the data available to learn from, both horizontally and vertically. This data reduction serves as a stress test to verify whether the very high baseline results hold up under increasingly harsh constraints. Ultimately, this work contains the most comprehensive set of results on the topic of intrusion detection through supervised machine learning. Researchers working on algorithmic improvements can compare their results to this collection, knowing that all results reported here were gathered through a uniform framework. This work's main contributions are the outstanding classification results on the current state-of-the-art datasets for intrusion detection and the conclusion that these methods show remarkable resilience in classification performance even when the amount of data to learn from is aggressively reduced.
Worst-case Optimal Submodular Extensions for Marginal Estimation
Submodular extensions of an energy function can be used to efficiently
compute approximate marginals via variational inference. The accuracy of the
marginals depends crucially on the quality of the submodular extension. To
identify the best possible extension, we show an equivalence between the
submodular extensions of the energy and the objective functions of linear
programming (LP) relaxations for the corresponding MAP estimation problem. This
allows us to (i) establish the worst-case optimality of the submodular
extension for Potts model used in the literature; (ii) identify the worst-case
optimal submodular extension for the more general class of metric labeling; and
(iii) efficiently compute the marginals for the widely used dense CRF model
with the help of a recently proposed Gaussian filtering method. Using synthetic
and real data, we show that our approach provides comparable upper bounds on
the log-partition function to those obtained using tree-reweighted message
passing (TRW) in cases where the latter is computationally feasible.
Importantly, unlike TRW, our approach provides the first practical algorithm to
compute an upper bound on the dense CRF model.
Comment: Accepted to AISTATS 201
DMT Optimality of LR-Aided Linear Decoders for a General Class of Channels, Lattice Designs, and System Models
The work identifies the first general, explicit, and non-random MIMO
encoder-decoder structures that guarantee optimality with respect to the
diversity-multiplexing tradeoff (DMT), without employing a computationally
expensive maximum-likelihood (ML) receiver. Specifically, the work establishes
the DMT optimality of a class of regularized lattice decoders, and more
importantly the DMT optimality of their lattice-reduction (LR)-aided linear
counterparts. The results hold for all channel statistics, for all channel
dimensions, and most interestingly, irrespective of the particular lattice-code
applied. As a special case, it is established that the LLL-based LR-aided
linear implementation of the MMSE-GDFE lattice decoder facilitates DMT optimal
decoding of any lattice code at a worst-case complexity that grows at most
linearly in the data rate. This represents a fundamental reduction in the
decoding complexity when compared to ML decoding whose complexity is generally
exponential in rate.
The generality of the results makes them applicable to a plethora of pertinent
communication scenarios such as quasi-static MIMO, MIMO-OFDM, ISI,
cooperative-relaying, and MIMO-ARQ channels, in all of which the DMT optimality
of the LR-aided linear decoder is guaranteed. The adopted approach yields
insight, and motivates further study, into joint transceiver designs with an
improved SNR gap to ML decoding.
Comment: 16 pages, 1 figure (3 subfigures), submitted to the IEEE Transactions
on Information Theor
Feature Selection via Binary Simultaneous Perturbation Stochastic Approximation
Feature selection (FS) has become an indispensable task in dealing with
today's highly complex pattern recognition problems with massive numbers of
features. In this study, we propose a new wrapper approach for FS based on
binary simultaneous perturbation stochastic approximation (BSPSA). This
pseudo-gradient descent stochastic algorithm starts with an initial feature
vector and moves toward the optimal feature vector via successive iterations.
In each iteration, the current feature vector's individual components are
perturbed simultaneously by random offsets from a qualified probability
distribution. We present computational experiments on datasets with numbers of
features ranging from a few dozen to thousands using three widely used
classifiers as wrappers: nearest neighbor, decision tree, and linear support
vector machine. We compare our methodology against the full set of features as
well as a binary genetic algorithm and sequential FS methods using
cross-validated classification error rate and AUC as the performance criteria.
Our results indicate that features selected by BSPSA compare favorably to
alternative methods in general and BSPSA can yield superior feature sets for
datasets with tens of thousands of features by examining an extremely small
fraction of the solution space. We are not aware of any other wrapper FS
methods that are computationally feasible with good convergence properties for
such large datasets.
Comment: This is the Istanbul Sehir University Technical Report
#SHR-ISE-2016.01. A short version of this report has been accepted for
publication at Pattern Recognition Letter
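The iteration described in this abstract (perturb every component of the current feature vector at once, evaluate the wrapper loss at the two perturbed points, then step along the resulting pseudo-gradient) can be sketched as below. This is a minimal illustration of the general simultaneous-perturbation idea, not the report's implementation. The function names, the symmetric ±1 perturbation, the constant gains `a` and `c`, the 0.5 rounding threshold, and the toy loss that stands in for cross-validated classification error are all assumptions made for the example.

```python
import random


def toy_loss(mask, relevant):
    """Hypothetical stand-in for a wrapper's cross-validated error:
    the fraction of positions where the mask disagrees with a known target."""
    return sum(m != r for m, r in zip(mask, relevant)) / len(mask)


def round_mask(w):
    """Map a real-valued vector in [0, 1] to a binary feature mask."""
    return [1 if x >= 0.5 else 0 for x in w]


def clip(x):
    """Keep a coordinate inside [0, 1]."""
    return min(1.0, max(0.0, x))


def bspsa_sketch(loss, n_features, iterations=300, a=0.002, c=0.05, seed=0):
    """Binary SPSA-style search; gains a and c are illustrative assumptions."""
    rng = random.Random(seed)
    w = [0.5] * n_features               # start undecided on every feature
    best_mask = round_mask(w)
    best_loss = loss(best_mask)
    for _ in range(iterations):
        # Perturb all components simultaneously with random +/-1 offsets.
        delta = [rng.choice((-1, 1)) for _ in range(n_features)]
        w_plus = [clip(x + c * d) for x, d in zip(w, delta)]
        w_minus = [clip(x - c * d) for x, d in zip(w, delta)]
        # Two loss evaluations per iteration, regardless of dimension.
        y_plus = loss(round_mask(w_plus))
        y_minus = loss(round_mask(w_minus))
        # Pseudo-gradient estimate, one entry per feature.
        grad = [(y_plus - y_minus) / (2 * c * d) for d in delta]
        w = [clip(x - a * g) for x, g in zip(w, grad)]
        mask = round_mask(w)
        current = loss(mask)
        if current < best_loss:          # remember the best mask seen so far
            best_mask, best_loss = mask, current
    return best_mask, best_loss


if __name__ == "__main__":
    relevant = [1, 0, 1, 0, 0, 1, 0, 0]  # hypothetical ground truth
    mask, err = bspsa_sketch(lambda m: toy_loss(m, relevant), len(relevant))
    print(mask, err)
```

Each iteration costs two loss evaluations however many features there are, which is why this family of methods can examine only a small fraction of the solution space. In a real wrapper setting the loss would be a classifier's cross-validated error, and the gain sequences would normally decay over iterations rather than stay constant.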
Providing Diversity in K-Nearest Neighbor Query Results
Given a point query Q in multi-dimensional space, K-Nearest Neighbor (KNN)
queries return the K closest answers according to a given distance metric in the
database with respect to Q. In this scenario, it is possible that a majority of
the answers may be very similar to one another, especially when the data has
clusters. For a variety of applications, such homogeneous result sets may not
add value to the user. In this paper, we consider the problem of providing
diversity in the results of KNN queries, that is, to produce the closest result
set such that each answer is sufficiently different from the rest. We first
propose a user-tunable definition of diversity, and then present an algorithm,
called MOTLEY, for producing a diverse result set as per this definition.
Through a detailed experimental evaluation on real and synthetic data, we show
that MOTLEY can produce diverse result sets by reading only a small fraction of
the tuples in the database. Further, it imposes no additional overhead on the
evaluation of traditional KNN queries, thereby providing a seamless interface
between diversity and distance.
Comment: 20 pages, 11 figure
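To make the problem statement in this abstract concrete, the sketch below uses a simple greedy filter: candidates are visited in order of distance to the query, and a candidate is kept only if it is far enough from every answer already kept. This is not the MOTLEY algorithm, whose diversity definition and index traversal are not given in the abstract. The function name `diverse_knn`, the Euclidean metric, and the `min_sep` threshold are hypothetical choices standing in for a user-tunable notion of diversity.

```python
import math


def diverse_knn(points, query, k, min_sep):
    """Greedy diverse nearest-neighbour selection (illustrative only).

    points  : iterable of coordinate tuples
    query   : coordinate tuple for the query point Q
    k       : number of answers wanted
    min_sep : hypothetical minimum pairwise distance between answers;
              min_sep = 0 reduces to an ordinary KNN query
    """
    # Visit candidates from nearest to farthest from the query.
    ranked = sorted(points, key=lambda p: math.dist(p, query))
    chosen = []
    for p in ranked:
        # Keep p only if it differs enough from every answer kept so far.
        if all(math.dist(p, c) >= min_sep for c in chosen):
            chosen.append(p)
            if len(chosen) == k:
                break
    return chosen


if __name__ == "__main__":
    pts = [(1, 0), (1.1, 0), (1.2, 0), (0, 3), (5, 5)]
    print(diverse_knn(pts, (0, 0), k=3, min_sep=0.0))  # plain KNN
    print(diverse_knn(pts, (0, 0), k=3, min_sep=1.0))  # cluster collapsed
```

With the clustered sample points, the plain query returns three near-duplicates from the same cluster, while the separated query keeps one representative of the cluster and then moves on to more distant points. Unlike the approach described in the abstract, this sketch sorts the whole dataset and so reads every tuple.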