157 research outputs found

    A traffic classification method using machine learning algorithm

    Applying concepts of attack investigation in the IT industry, this work designs a traffic classification method that combines data mining techniques with a machine learning algorithm to classify normal and malicious traffic. This classification will help to learn about the unknown attacks faced by the IT industry. Traffic classification is not a new concept; plenty of work has been done to classify network traffic for heterogeneous applications. Existing techniques (payload-based, port-based, and statistical-based) have their own pros and cons, which are discussed later in this work, but classification using machine learning techniques is still an open field to explore and has provided very promising results so far.

    Active Nearest-Neighbor Learning in Metric Spaces

    We propose a pool-based non-parametric active learning algorithm for general metric spaces, called MArgin Regularized Metric Active Nearest Neighbor (MARMANN), which outputs a nearest-neighbor classifier. We give prediction error guarantees that depend on the noisy-margin properties of the input sample, and are competitive with those obtained by previously proposed passive learners. We prove that the label complexity of MARMANN is significantly lower than that of any passive learner with similar error guarantees. MARMANN is based on a generalized sample compression scheme, and a new label-efficient active model-selection procedure.

    The R Package MitISEM: Mixture of Student-t Distributions using Importance Sampling Weighted Expectation Maximization for Efficient and Robust Simulation

    This paper presents the R package MitISEM, which provides an automatic and flexible method to approximate a non-elliptical target density using adaptive mixtures of Student-t densities, where only a kernel of the target density is required. The approximation can be used as a candidate density in Importance Sampling or Metropolis-Hastings methods for Bayesian inference on model parameters and probabilities. The package also provides an extended MitISEM algorithm, ‘sequential MitISEM’, which substantially decreases the computational time when the target density has to be approximated for increasing data samples. This occurs when the posterior distribution is updated with new observations and/or when one computes model probabilities using predictive likelihoods. We illustrate the MitISEM algorithm using three canonical statistical and econometric models that are characterized by several types of non-elliptical posterior shapes and that describe well-known data patterns in econometrics and finance. We show that the candidate distribution obtained by MitISEM outperforms those obtained by ‘naive’ approximations in terms of numerical efficiency. Further, the MitISEM approach can be used for Bayesian model comparison, using the predictive likelihoods.
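
    The role of a Student-t candidate density in importance sampling, when only a kernel of the target is available, can be sketched as follows. This is a minimal illustration in Python, not the MitISEM package or its API: it uses a single fixed Student-t candidate rather than an adaptively fitted mixture, and the function names, the bimodal target kernel, and all parameter values are assumptions chosen for the example.

```python
import math
import numpy as np


def student_t_logpdf(x, df, loc=0.0, scale=1.0):
    """Log density of a location-scale Student-t distribution."""
    z = (x - loc) / scale
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi) - math.log(scale))
    return log_norm - (df + 1) / 2 * np.log1p(z * z / df)


def bimodal_log_kernel(x):
    """Hypothetical unnormalized target: N(-2, 1) and N(2, 1) with weights 1:2."""
    return np.logaddexp(-0.5 * (x + 2.0) ** 2,
                        math.log(2.0) - 0.5 * (x - 2.0) ** 2)


def importance_sampling_mean(log_kernel, df=5.0, loc=0.0, scale=3.0,
                             n=100_000, seed=0):
    """Self-normalized importance sampling estimate of the target mean.

    Only the log kernel of the target is needed: the unknown normalizing
    constant cancels when the weights are normalized to sum to one.
    """
    rng = np.random.default_rng(seed)
    draws = loc + scale * rng.standard_t(df, size=n)
    log_w = log_kernel(draws) - student_t_logpdf(draws, df, loc, scale)
    w = np.exp(log_w - log_w.max())   # subtract the max for numerical stability
    w /= w.sum()
    estimate = float(np.sum(w * draws))
    ess = float(1.0 / np.sum(w ** 2))  # effective sample size of the weights
    return estimate, ess


if __name__ == "__main__":
    mean, ess = importance_sampling_mean(bimodal_log_kernel)
    print(f"estimated mean: {mean:.3f}, effective sample size: {ess:.0f}")
```

    The effective sample size gives a rough measure of how well the candidate matches the target; a candidate that approximates the target shape more closely would be expected to yield a larger value for the same number of draws.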

    Robust detection of real-time power quality disturbances under noisy condition using FTDD features

    To improve power quality (PQ), the particular type of disturbance must be detected before it can be mitigated, so monitoring is needed to detect PQ disturbances that occur over a short duration. This work investigates the classification of real-time PQ disturbances in a noisy environment, using a signal processing tool called fusion of time domain descriptors (FTDD) at the feature extraction stage. FTDD is a method of extracting power spectrum characteristics of various PQ disturbances. Advantages such as algorithmic simplicity and local time-based unique features put the FTDD algorithm ahead of other techniques. PQ events such as voltage sag, voltage swell, interruption, healthy, transient, and harmonics, mixed with different noise conditions, are analysed. Multiclass support vector machine (SVM) and Naïve Bayes (NB) classifiers are applied to analyse the performance of the proposed method. The NB classifier performs better on noiseless signals, with 99.66% accuracy, whereas on noise-added signals both NB and SVM show better accuracy at different signal-to-noise ratios. Finally, an Arduino controller-based hardware tool used to acquire real-time signals shows how the proposed system can be applied in industry to make detection simple.

    The Kernel Density Integral Transformation

    Feature preprocessing continues to play a critical role when applying machine learning and statistical methods to tabular data. In this paper, we propose the use of the kernel density integral transformation as a feature preprocessing step. Our approach subsumes the two leading feature preprocessing methods as limiting cases: linear min-max scaling and quantile transformation. We demonstrate that, without hyperparameter tuning, the kernel density integral transformation can be used as a simple drop-in replacement for either method, offering protection from the weaknesses of each. Alternatively, with tuning of a single continuous hyperparameter, we frequently outperform both of these methods. Finally, we show that the kernel density transformation can be profitably applied to statistical data analysis, particularly in correlation analysis and univariate clustering. Comment: Published in Transactions on Machine Learning Research (10/2023).
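
    The two limiting cases named in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes a Gaussian kernel and a bandwidth equal to a factor `alpha` times the sample standard deviation, and the function name `kdi_transform` is hypothetical. Under these assumptions, the feature is mapped to the integral of its kernel density estimate, rescaled so that the training minimum maps to 0 and the training maximum to 1.

```python
import math
import numpy as np

# Elementwise error function built from the standard library.
_erf = np.vectorize(math.erf)


def _gaussian_cdf(z):
    """Standard normal CDF applied elementwise."""
    return 0.5 * (1.0 + _erf(z / math.sqrt(2.0)))


def kdi_transform(train, x, alpha=1.0):
    """Map x through the integrated Gaussian KDE fitted on `train`.

    Assumption: bandwidth h = alpha * std(train). The training data must
    not be constant, otherwise h is zero.
    """
    train = np.asarray(train, dtype=float)
    x = np.asarray(x, dtype=float)
    h = alpha * train.std()

    def kde_cdf(q):
        # CDF of the KDE: the average of one Gaussian CDF per training point.
        q = np.atleast_1d(q)
        return _gaussian_cdf((q[:, None] - train[None, :]) / h).mean(axis=1)

    lo = kde_cdf(train.min())[0]
    hi = kde_cdf(train.max())[0]
    return (kde_cdf(x) - lo) / (hi - lo)


if __name__ == "__main__":
    data = np.array([0.0, 1.0, 3.0, 10.0])
    print(kdi_transform(data, data, alpha=1e3))   # close to min-max scaling
    print(kdi_transform(data, data, alpha=1e-3))  # close to a rank/quantile map
    print(kdi_transform(data, data, alpha=1.0))   # in between
```

    With a very large bandwidth each Gaussian CDF is nearly linear over the data range, so the output approaches min-max scaling; with a very small bandwidth the KDE's CDF approaches the empirical CDF, so the output approaches a quantile transformation.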

    Bayesian classification and survival analysis with curve predictors

    We propose classification models for binary and multicategory data where the predictor is a random function. The functional predictor could be irregularly and sparsely sampled or characterized by high dimension and sharp localized changes. In the former case, we employ Bayesian modeling with a flexible spline basis, which is widely used for functional regression. In the latter case, we use Bayesian modeling with wavelet basis functions, which have nice approximation properties over a large class of functional spaces and can accommodate the varieties of functional forms observed in real-life applications. We develop a unified hierarchical model which accommodates both the adaptive spline- or wavelet-based function estimation model and the logistic classification model. These two models are coupled together to borrow strength from each other in this unified hierarchical framework. The use of Gibbs sampling with conjugate priors for posterior inference makes the method computationally feasible. We compare the performance of the proposed models with naive models as well as existing alternatives by analyzing simulated and real data. We also propose a Bayesian unified hierarchical model based on a proportional hazards model and a generalized linear model for survival analysis with irregular longitudinal covariates. This relatively simple joint model has two advantages. One is that using a spline basis simplifies the parameterization while capturing a flexible non-linear pattern of the function. The other is that the joint modeling framework allows information to be shared between the regression of functional predictors and the proportional hazards modeling of survival data, improving the efficiency of estimation. The novel method can be used not only for the case of one functional predictor, but also for the case of multiple functional predictors. Our methods are applied to analyze real data sets and compared with a parameterized regression method.