
    DDoS Attacks Detection Method Using Feature Importance and Support Vector Machine

    This study examines whether combining feature importance with a support vector machine is effective for detecting distributed denial-of-service (DDoS) attacks. A DDoS attack is a very dangerous type of attack because it causes enormous losses to the victim server. The study begins by determining network traffic features and then collecting datasets. The author uses 1000 randomly selected network traffic datasets for feature selection and modeling. In the next stage, feature importance is used to select the relevant features, which serve as inputs to a model based on the support vector machine algorithm. The model was evaluated using a confusion matrix, giving a recall of 93 percent, a precision of 95 percent, and an accuracy of 92 percent. The author also compares the proposed method with several other methods; the comparison shows that the proposed method performs fairly well in detecting DDoS attacks. Because this result was influenced by many factors, further studies are needed.
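
As a rough illustration of how recall, precision, and accuracy follow from a binary confusion matrix, here is a minimal sketch. The function name `confusion_metrics` and the counts are hypothetical and chosen only for illustration; they are not the study's data or code.

```python
def confusion_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Return precision, recall and accuracy from binary confusion-matrix counts."""
    total = tp + fp + fn + tn
    return {
        # Share of predicted attacks that really were attacks.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        # Share of real attacks that the model caught.
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
        # Share of all samples classified correctly.
        "accuracy": (tp + tn) / total if total else 0.0,
    }


# Hypothetical counts for illustration only; not taken from the study.
metrics = confusion_metrics(tp=90, fp=10, fn=10, tn=90)
print(metrics)
```

Guarding each denominator avoids a division by zero when a class is never predicted or never present.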

    A hybrid deep learning approach for texture analysis

    Texture classification is a problem with various applications, such as remote sensing and forest species recognition. Solutions tend to be custom fit to the dataset used but fail to generalize. The combination of a Convolutional Neural Network (CNN) and a Support Vector Machine (SVM) forms a robust pairing of a powerful invariant feature extractor and an accurate classifier. The fusion of classifiers shows stable classification across different datasets and a slight improvement compared to state-of-the-art methods. The classifiers are trained independently on the same training set, fused using a confusion matrix, and then tested. Statistical information about each classifier is fed to a confusion matrix that generates two confidence measures, which are used to build two binary classifiers. Each binary classifier can activate or deactivate a classifier at test time based on a confidence measure obtained from the confusion matrix. The method obtained results approaching the state of the art, with a difference of less than 1% in classification success rates. Moreover, the method maintained this success rate across different datasets, while other methods failed to obtain similar stability. Two datasets were used in this research, Brodatz and Kylberg, on which the method achieved 98.17% and 99.70%, respectively, compared with 98.9% and 99.64% for conventional methods in the literature.

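    Perbandingan Akurasi Metode Principal Component Analysis (PCA) dan Correlation-Based Feature Selection (CFS) Pada Klasifikasi Perpanjangan Kontrak Karyawan Menggunakan Metode Naïve Bayes [Accuracy Comparison of Principal Component Analysis (PCA) and Correlation-Based Feature Selection (CFS) in Classifying Employee Contract Extensions Using the Naïve Bayes Method]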

    PT Oasis Waters International Palembang conducts regular staff performance reviews, the findings of which are used to make recommendations for employee contract extension. The Human Resource Department (HRD) has assigned a numerical value to 25 criteria. Classification is the process of assigning a label or class to examples whose attribute values are known. The Naïve Bayes technique is a basic classification approach that makes use of probability estimates. Based on the observations, it was discovered that one of the 25 criteria was deemed the most relevant in determining the recommendation for an employee contract renewal. As a result, this study compares Principal Component Analysis (PCA) and Correlation-based Feature Selection (CFS) as pre-processing steps for classifying employee contract extensions at PT Oasis Waters International Palembang. According to the data, the CFS approach has a positive influence on classification performance, while PCA does not. This is demonstrated by a 30% increase in accuracy when using the CFS approach. Meanwhile, both strategies have a positive influence on the model's dependability. This is demonstrated by a reduction in Root Mean Square Error (RMSE) from 0.6325 to 0.1845 when using the CFS approach, whereas using the PCA method results in 0.5123.
    Keywords: Naïve Bayes, Principal Component Analysis, Correlation-based Feature Selection, Confusion Matrix, Root Mean Square Error

    Effectiveness of Feature Selection and Machine Learning Techniques for Software Effort Estimation

    Estimation of desired effort is one of the most important activities in software project management. This work presents an estimation approach based on various feature selection and machine learning techniques for non-quantitative data, carried out in two phases. The first phase concentrates on selecting an optimal feature set from high-dimensional data related to projects undertaken in the past. A quantitative analysis using Rough Set Theory and Information Gain is performed for feature selection. The second phase estimates the effort based on the optimal feature set obtained from the first phase. The estimation is carried out by applying various Artificial Neural Networks and classification techniques separately. The feature selection process in the first phase uses public domain data (USP05). The effectiveness of the proposed approach is evaluated using Mean Magnitude of Relative Error (MMRE), Root Mean Square Error (RMSE), Mean Absolute Error (MAE), and the confusion matrix. Machine learning methods, such as the Feed Forward neural network, Radial Basis Function network, Functional Link neural network, Levenberg-Marquardt neural network, Naive Bayes classifier, Classification and Regression Tree, and Support Vector classification, are compared in combination with various feature selection techniques in order to find an optimal pair. It is observed that the Functional Link neural network achieves better results than the other neural networks and that the Naive Bayes classifier performs better for estimation than the other classification techniques.
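
The three error measures named above have standard definitions, sketched below under the usual conventions (MMRE divides each absolute error by the actual value, so actual values are assumed to be nonzero). The function names and the effort figures are hypothetical and are not taken from the USP05 data.

```python
import math


def mae(actual, predicted):
    """Mean Absolute Error: average of |actual - predicted|."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root Mean Square Error: square root of the mean squared error."""
    return math.sqrt(
        sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    )


def mmre(actual, predicted):
    """Mean Magnitude of Relative Error; assumes every actual value is nonzero."""
    return sum(abs(a - p) / a for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical effort values (e.g. person-hours) for illustration only.
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 180.0, 400.0]

print("MAE :", mae(actual, predicted))
print("RMSE:", rmse(actual, predicted))
print("MMRE:", mmre(actual, predicted))
```

MMRE is scale-free, which is why it is often reported alongside RMSE and MAE when projects differ widely in size.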

    Sentiment Analysis using an ensemble of Feature Selection Algorithms

    Sentiment analysis, an active research area in text mining, is commonly used to determine the opinions of people who use a service or buy a product. It is the process of using computation to identify and categorize opinions expressed in a piece of text. Individuals post their opinions via reviews, tweets, comments, or discussions, which constitute unstructured information. Sentiment analysis gives an overall summary of reviews, which helps customers, individuals, or organizations make decisions. The main aim of this paper is to apply an ensemble approach to feature reduction methods used in natural language processing and to analyze the results. An ensemble approach is a process of combining two or more methodologies. The feature reduction methods used are Principal Component Analysis (PCA) for feature extraction and the Pearson chi-squared statistical test for feature selection. The main contribution of this paper is to test whether combining careful feature selection with existing classification methodologies can yield better accuracy.
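
To make the chi-squared feature selection step more concrete, here is a minimal sketch of the Pearson chi-squared statistic for a single binary feature (term present or absent) against a binary sentiment label. It uses the standard 2x2 formula; the function name `chi2_score`, the terms, and the toy data are hypothetical and are not the paper's implementation.

```python
def chi2_score(presence, labels):
    """Pearson chi-squared statistic for a binary feature vs. a binary label.

    presence[i] is 1 if the term occurs in document i, else 0.
    labels[i] is 1 for a positive document, else 0.
    """
    a = sum(1 for x, y in zip(presence, labels) if x == 1 and y == 1)
    b = sum(1 for x, y in zip(presence, labels) if x == 1 and y == 0)
    c = sum(1 for x, y in zip(presence, labels) if x == 0 and y == 1)
    d = sum(1 for x, y in zip(presence, labels) if x == 0 and y == 0)
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        # A constant feature or a single-class label carries no information.
        return 0.0
    return n * (a * d - b * c) ** 2 / denom


# Toy data for illustration only: four positive and four negative documents.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
features = {
    "good": [1, 1, 1, 0, 1, 0, 0, 0],  # mostly in positive documents
    "the": [1, 1, 0, 0, 1, 1, 0, 0],   # evenly spread across classes
}

# Rank terms by score; higher scores suggest stronger association with the label.
ranked = sorted(features, key=lambda t: chi2_score(features[t], labels), reverse=True)
print(ranked)
```

Terms with the highest scores would be kept as features; a term spread evenly across both classes scores zero and would be dropped first.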

    Detection of postural transitions using machine learning

    The purpose of this project is to study human activity recognition and to prepare a dataset from volunteers performing various activities. The dataset can be used to construct the parts of a machine learning model that identifies each volunteer's posture transitions accurately. This report presents the problem definition, the equipment used, previous work in human activity recognition, and the resolution of the problem along with results. The report also describes the steps taken: building the dataset, pre-processing the data by applying filters and various windowing length techniques, splitting the data into training and testing sets, performing feature selection and feature extraction, and finally selecting the model for training and testing that provides maximum accuracy and the lowest misclassification rates. The tools used for this project include a laptop equipped with MATLAB, Excel, and Media Player Classic, which were used for data processing, model training, feature selection, and labelling. The data was collected using an Inertial Measurement Unit containing 3 tri-axial accelerometers, 1 gyroscope, 1 magnetometer, and 1 pressure sensor. For this project, only the accelerometers, gyroscope, and pressure sensor are used. The sensor was made by members of the lab named ‘The Technical Research Centre for Dependency Care and Autonomous Living (CETpD)’ at the UPC-ETSEIB campus. The results obtained have been satisfactory, and the objectives set have been fulfilled.
    There is room for improvement by expanding the scope of the project, for example to detect chronic disorders or to provide posture-based statistics to the end user, or simply by achieving higher sensitivity to posture transitions through better features and a larger dataset with more volunteers.

    Texture descriptors applied to digital mammography

    Breast cancer is the second leading cause of cancer death among women. Computer Aided Detection has been demonstrated to be a useful tool for early diagnosis, a crucial aspect for a high survival rate. In this context, several research works have incorporated texture features in mammographic image segmentation and description, such as Gray-Level co-occurrence matrices, Local Binary Patterns, and many others. This paper presents an approach for breast density classification based on segmentation and texture feature extraction techniques in order to classify digital mammograms according to their internal tissue. The aim of this work is to compare different texture descriptors on the same framework (same algorithms for segmentation and classification, as well as same images). Extensive results prove the feasibility of the proposed approach.

    The role of circumstance monitoring on the diagnostic interpretation of condition monitoring data

    Circumstance monitoring, a recently coined term, defines the collection of data reflecting the real network working environment of in-service equipment. This ideally complete data set should reflect the electrical, mechanical, thermal, chemical, and environmental stress factors present on the network. This must be distinguished from condition monitoring, which is the collection of data reflecting the status of in-service equipment. This contribution investigates the significance of considering circumstance monitoring in the diagnostic interpretation of condition monitoring data. Electrical treeing partial discharge activity under various harmonic-polluted waveforms has been recorded and subjected to a series of machine learning techniques. The outcome provides a platform for improved interpretation of harmonic-influenced partial discharge patterns. The main conclusion of this exercise suggests that any diagnostic interpretation depends on the immunity of condition monitoring measurements to the stress factors influencing the operational conditions. This enables the asset manager to have an improved holistic view of an asset's health.