5,148 research outputs found
A traffic classification method using machine learning algorithm
Applying concepts of attack investigation in IT industry, this idea has been developed to design
a Traffic Classification Method using Data Mining techniques at the intersection of Machine
Learning Algorithm, Which will classify the normal and malicious traffic. This classification will
help to learn about the unknown attacks faced by IT industry. The notion of traffic classification
is not a new concept; plenty of work has been done to classify the network traffic for
heterogeneous application nowadays. Existing techniques such as (payload based, port based
and statistical based) have their own pros and cons which will be discussed in this
literature later, but classification using Machine Learning techniques is still an open field to explore and has provided very promising results up till now
Surveying human habit modeling and mining techniques in smart spaces
A smart space is an environment, mainly equipped with Internet-of-Things (IoT) technologies, able to provide services to humans, helping them to perform daily tasks by monitoring the space and autonomously executing actions, giving suggestions and sending alarms. Approaches suggested in the literature may differ in terms of required facilities, possible applications, amount of human intervention required, ability to support multiple users at the same time adapting to changing needs. In this paper, we propose a Systematic Literature Review (SLR) that classifies most influential approaches in the area of smart spaces according to a set of dimensions identified by answering a set of research questions. These dimensions allow to choose a specific method or approach according to available sensors, amount of labeled data, need for visual analysis, requirements in terms of enactment and decision-making on the environment. Additionally, the paper identifies a set of challenges to be addressed by future research in the field
Automated data pre-processing via meta-learning
The final publication is available at link.springer.comA data mining algorithm may perform differently on datasets with different characteristics, e.g., it might perform better on a dataset with continuous attributes rather than with categorical attributes, or the other way around.
As a matter of fact, a dataset usually needs to be pre-processed. Taking into account all the possible pre-processing operators, there exists a staggeringly large number of alternatives and nonexperienced users become overwhelmed.
We show that this problem can be addressed by an automated approach, leveraging ideas from metalearning.
Specifically, we consider a wide range of data pre-processing techniques and a set of data mining algorithms. For each data mining algorithm and selected dataset, we are able to predict the transformations that improve the result
of the algorithm on the respective dataset. Our approach will help non-expert users to more effectively identify the transformations appropriate to their applications, and hence to achieve improved results.Peer ReviewedPostprint (published version
Improvement of the Accuracy of Prediction Using Unsupervised Discretization Method: Educational Data Set Case Study
This paper presents a comparison of the efficacy of unsupervised and supervised discretization methods for educational data from blended learning environment. NaĂŻve Bayes classifier was trained for each discretized data set and comparative analysis of prediction models was conducted. The research goal was to transform numeric features into maximum independent discrete values with minimum loss of information and reduction of classification error. Proposed unsupervised discretization method was based on the histogram distribution and implementation of oversampling technique. The main contribution of this research is improvement of accuracy prediction using the unsupervised discretization method which reduces the effect of ignoring class feature for educational data set
- …