
    Web news classification using neural networks based on PCA

    In this paper, we propose a news web page classification method (WPCM). The WPCM uses a neural network whose inputs are obtained from both the principal components and class profile-based features (CPBF). A fixed number of regular words from each class is combined with the features reduced by PCA to form the feature vectors, which are then used as the input to the neural network for classification. The experimental evaluation demonstrates that the WPCM provides acceptable classification accuracy on the sports news datasets.
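
    The abstract does not give the full WPCM configuration, so the following is only a minimal sketch of a PCA-plus-neural-network text classifier in that spirit, using scikit-learn. TruncatedSVD stands in for PCA on sparse term features, and the toy corpus, component count, and layer sizes are illustrative assumptions, not the paper's setup.

        # Toy PCA-style + neural-network news classifier (hypothetical setup;
        # the paper's CPBF features are not reproduced here).
        from sklearn.decomposition import TruncatedSVD
        from sklearn.feature_extraction.text import TfidfVectorizer
        from sklearn.neural_network import MLPClassifier
        from sklearn.pipeline import make_pipeline

        docs = [
            "the team won the final match",        # sports
            "midfielder scores in stoppage time",  # sports
            "markets rallied on earnings news",    # business
            "central bank raises interest rates",  # business
        ]
        labels = ["sports", "sports", "business", "business"]

        clf = make_pipeline(
            TfidfVectorizer(),                             # raw term weights
            TruncatedSVD(n_components=3, random_state=0),  # PCA-style reduction
            MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=0),
        )
        clf.fit(docs, labels)
        print(clf.predict(["the striker scored twice in the derby"]))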

    High Accuracy Phishing Detection Based on Convolutional Neural Networks

    The persistent growth in phishing and the rising volume of phishing websites have led to individuals and organizations worldwide becoming increasingly exposed to various cyber-attacks. Consequently, more effective phishing detection is required for improved cyber defence. Hence, in this paper we present a deep learning-based approach to enable high accuracy detection of phishing sites. The proposed approach utilizes convolutional neural networks (CNN) for high accuracy classification to distinguish genuine sites from phishing sites. We evaluate the models using a dataset obtained from 6,157 genuine and 4,898 phishing websites. Based on the results of extensive experiments, our CNN-based models proved to be highly effective in detecting unknown phishing sites. Furthermore, the CNN-based approach performed better than traditional machine learning classifiers evaluated on the same dataset, reaching a 98.2% phishing detection rate with an F1-score of 0.976. The method presented in this paper compares favourably to the state of the art in deep learning-based phishing website detection.
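
    The abstract does not specify the input representation, so the sketch below is only one plausible reading: a character-level 1D CNN over URLs in Keras. The sequence length, vocabulary size, and layer sizes are assumptions, not the paper's architecture.

        # Hypothetical character-level CNN for phishing-site classification.
        import numpy as np
        import tensorflow as tf

        MAX_LEN, VOCAB = 200, 128  # pad/truncate URLs to 200 ASCII code points

        def encode(url):
            """Map a URL to a fixed-length vector of ASCII code points."""
            ids = [min(ord(ch), VOCAB - 1) for ch in url[:MAX_LEN]]
            return np.array(ids + [0] * (MAX_LEN - len(ids)))

        model = tf.keras.Sequential([
            tf.keras.layers.Input(shape=(MAX_LEN,)),
            tf.keras.layers.Embedding(VOCAB, 32),
            tf.keras.layers.Conv1D(64, 5, activation="relu"),
            tf.keras.layers.GlobalMaxPooling1D(),
            tf.keras.layers.Dense(64, activation="relu"),
            tf.keras.layers.Dense(1, activation="sigmoid"),  # P(phishing)
        ])
        model.compile(optimizer="adam", loss="binary_crossentropy",
                      metrics=["accuracy"])
        # model.fit(np.stack([encode(u) for u in urls]), y_labels, epochs=5)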

    Automated Website Fingerprinting through Deep Learning

    Several studies have shown that the network traffic that is generated by a visit to a website over Tor reveals information specific to the website through the timing and sizes of network packets. By capturing traffic traces between users and their Tor entry guard, a network eavesdropper can leverage this meta-data to reveal which website Tor users are visiting. The success of such attacks heavily depends on the particular set of traffic features that are used to construct the fingerprint. Typically, these features are manually engineered and, as such, any change introduced to the Tor network can render these carefully constructed features ineffective. In this paper, we show that an adversary can automate the feature engineering process, and thus automatically deanonymize Tor traffic by applying our novel method based on deep learning. We collect a dataset comprised of more than three million network traces, which is the largest dataset of web traffic ever used for website fingerprinting, and find that the performance achieved by our deep learning approaches is comparable to known methods, which include various research efforts spanning multiple years. The obtained success rate exceeds 96% for a closed world of 100 websites and 94% for our biggest closed world of 900 classes. In our open world evaluation, the most performant deep learning model is 2% more accurate than the state-of-the-art attack. Furthermore, we show that the implicit features automatically learned by our approach are far more resilient to dynamic changes of web content over time. We conclude that the ability to automatically construct the most relevant traffic features and perform accurate traffic recognition makes our deep learning based approach an efficient, flexible and robust technique for website fingerprinting. Comment: To appear in the 25th Symposium on Network and Distributed System Security (NDSS 2018).
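
    The key point is that the classifier consumes raw traffic metadata rather than hand-crafted features. A minimal sketch of that idea in Keras follows, representing each trace as a padded sequence of packet directions (+1 outgoing, -1 incoming); the trace length, closed-world size, and layers are illustrative assumptions, not the paper's models.

        # Hypothetical 1D CNN over packet-direction sequences for a
        # closed-world website fingerprinting classifier.
        import tensorflow as tf

        TRACE_LEN, N_SITES = 5000, 100  # padded trace length, closed-world size

        model = tf.keras.Sequential([
            tf.keras.layers.Input(shape=(TRACE_LEN, 1)),  # +1/-1 directions
            tf.keras.layers.Conv1D(32, 8, activation="relu"),
            tf.keras.layers.MaxPooling1D(4),
            tf.keras.layers.Conv1D(64, 8, activation="relu"),
            tf.keras.layers.GlobalAveragePooling1D(),
            tf.keras.layers.Dense(128, activation="relu"),
            tf.keras.layers.Dense(N_SITES, activation="softmax"),  # one class per site
        ])
        model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
                      metrics=["accuracy"])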

    Undermining User Privacy on Mobile Devices Using AI

    Over the past years, literature has shown that attacks exploiting the microarchitecture of modern processors pose a serious threat to the privacy of mobile phone users. This is because applications leave distinct footprints in the processor, which can be used by malware to infer user activities. In this work, we show that these inference attacks are considerably more practical when combined with advanced AI techniques. In particular, we focus on profiling the activity in the last-level cache (LLC) of ARM processors. We employ a simple Prime+Probe based monitoring technique to obtain cache traces, which we classify with Deep Learning methods including Convolutional Neural Networks. We demonstrate our approach on an off-the-shelf Android phone by launching a successful attack from an unprivileged, zero-permission app in well under a minute. The app thereby detects running applications with an accuracy of 98% and reveals opened websites and streaming videos by monitoring the LLC for at most 6 seconds. This is possible because Deep Learning compensates for measurement disturbances stemming from the inherently noisy LLC monitoring and unfavourable cache characteristics such as random line replacement policies. In summary, our results show that thanks to advanced AI techniques, inference attacks are becoming alarmingly easy to implement and execute in practice. This once more calls for countermeasures that confine microarchitectural leakage and protect mobile phone applications, especially those valuing the privacy of their users.
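
    As a minimal sketch of the classification step, one can arrange each Prime+Probe measurement as a 2D grid of probe latencies (cache sets by time slots) and train a small CNN to label the running application; the grid dimensions and network shape below are assumptions for illustration.

        # Hypothetical CNN over Prime+Probe latency "heat maps".
        import tensorflow as tf

        N_SETS, N_SLOTS, N_APPS = 256, 128, 20  # assumed trace geometry

        model = tf.keras.Sequential([
            tf.keras.layers.Input(shape=(N_SETS, N_SLOTS, 1)),  # latency grid
            tf.keras.layers.Conv2D(16, 3, activation="relu"),
            tf.keras.layers.MaxPooling2D(2),
            tf.keras.layers.Conv2D(32, 3, activation="relu"),
            tf.keras.layers.GlobalAveragePooling2D(),
            tf.keras.layers.Dense(N_APPS, activation="softmax"),  # app label
        ])
        model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
                      metrics=["accuracy"])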

    HTMLPhish: Enabling Phishing Web Page Detection by Applying Deep Learning Techniques on HTML Analysis

    Recently, the development and implementation of phishing attacks require little technical skill and cost. This uprising has led to an ever-growing number of phishing attacks on the World Wide Web. Consequently, proactive techniques to fight phishing attacks have become extremely necessary. In this paper, we propose HTMLPhish, a deep learning-based, data-driven, end-to-end automatic phishing web page classification approach. Specifically, HTMLPhish receives the content of the HTML document of a web page and employs Convolutional Neural Networks (CNNs) to learn the semantic dependencies in the textual contents of the HTML. The CNNs learn appropriate feature representations from the HTML document embeddings without extensive manual feature engineering. Furthermore, our proposed concatenation of word and character embeddings allows the model to manage new features and ensures easy extrapolation to test data. We conduct comprehensive experiments on a dataset of more than 50,000 HTML documents whose proportion of phishing to benign web pages reflects the distribution obtainable in the real world, yielding over 93% accuracy and true positive rate. Also, HTMLPhish is a completely language-independent and client-side strategy that can therefore conduct web page phishing detection regardless of the textual language.
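
    The word-plus-character concatenation can be sketched as two embedding branches whose pooled outputs are merged before the classifier; the sequence lengths, vocabulary sizes, and filter widths below are assumptions, not HTMLPhish's published configuration.

        # Hypothetical dual-embedding CNN over an HTML document.
        import tensorflow as tf

        W_LEN, C_LEN = 300, 1000       # word / character sequence lengths
        W_VOCAB, C_VOCAB = 20000, 128  # word / character vocabulary sizes

        w_in = tf.keras.layers.Input(shape=(W_LEN,), name="word_ids")
        c_in = tf.keras.layers.Input(shape=(C_LEN,), name="char_ids")

        w = tf.keras.layers.Embedding(W_VOCAB, 64)(w_in)
        w = tf.keras.layers.Conv1D(64, 5, activation="relu")(w)
        w = tf.keras.layers.GlobalMaxPooling1D()(w)

        c = tf.keras.layers.Embedding(C_VOCAB, 32)(c_in)
        c = tf.keras.layers.Conv1D(64, 7, activation="relu")(c)
        c = tf.keras.layers.GlobalMaxPooling1D()(c)

        merged = tf.keras.layers.Concatenate()([w, c])  # word + char views
        out = tf.keras.layers.Dense(1, activation="sigmoid")(merged)  # P(phishing)

        model = tf.keras.Model([w_in, c_in], out)
        model.compile(optimizer="adam", loss="binary_crossentropy",
                      metrics=["accuracy"])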

    ServeNet: A Deep Neural Network for Web Services Classification

    Automated service classification plays a crucial role in service discovery, selection, and composition. Machine learning has been widely used for service classification in recent years. However, the performance of conventional machine learning methods highly depends on the quality of manual feature engineering. In this paper, we present a novel deep neural network that automatically abstracts low-level representations of both the service name and the service description into high-level merged features, without feature engineering or length limitations, and then predicts the service classification over 50 service categories. To demonstrate the effectiveness of our approach, we conduct a comprehensive experimental study comparing 10 machine learning methods on 10,000 real-world web services. The results show that the proposed deep neural network achieves higher classification accuracy and is more robust than other machine learning methods. Comment: Accepted by ICWS'2
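
    The two-input design, merging a short service name with a longer description, might look like the sketch below; the encoders, vocabulary, and sequence lengths are assumptions, since the abstract does not spell out the architecture.

        # Hypothetical two-branch network merging name and description features.
        import tensorflow as tf

        NAME_LEN, DESC_LEN, VOCAB, N_CATS = 10, 200, 30000, 50

        name_in = tf.keras.layers.Input(shape=(NAME_LEN,))
        desc_in = tf.keras.layers.Input(shape=(DESC_LEN,))

        n = tf.keras.layers.Embedding(VOCAB, 64)(name_in)
        n = tf.keras.layers.GlobalAveragePooling1D()(n)  # name summary vector

        d = tf.keras.layers.Embedding(VOCAB, 64)(desc_in)
        d = tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(64))(d)  # description encoder

        merged = tf.keras.layers.Concatenate()([n, d])   # high-level merged features
        out = tf.keras.layers.Dense(N_CATS, activation="softmax")(merged)

        model = tf.keras.Model([name_in, desc_in], out)
        model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
                      metrics=["accuracy"])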

    NeuroSVM: A Graphical User Interface for Identification of Liver Patients

    Diagnosis of liver infection at a preliminary stage is important for better treatment. In today's scenario, devices like sensors are used for the detection of infections. Accurate classification techniques are required for the automatic identification of disease samples. In this context, this study utilizes data mining approaches for the classification of liver patients from healthy individuals. Four algorithms (Naive Bayes, Bagging, Random Forest and SVM) were implemented for classification using the R platform. Further, to improve the classification accuracy, a hybrid NeuroSVM model was developed using an SVM and a feed-forward artificial neural network (ANN). The hybrid model was tested for its performance using statistical parameters such as root mean square error (RMSE) and mean absolute percentage error (MAPE). The model resulted in a prediction accuracy of 98.83%. The results suggest that the development of the hybrid model improved the prediction accuracy. To serve the medical community in predicting liver disease among patients, a graphical user interface (GUI) has been developed using R. The GUI is deployed as a package in the local repository of the R platform for users to perform prediction. Comment: 9 pages, 6 figures
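
    The paper builds its model in R; as a rough Python analogue of such an SVM-plus-ANN hybrid, the sketch below stacks a feed-forward network on top of an SVM's decision values and reports accuracy and RMSE. The synthetic data and the stacking design are assumptions of this illustration, not the paper's implementation.

        # Hypothetical SVM -> feed-forward ANN stacking on placeholder data.
        import numpy as np
        from sklearn.svm import SVC
        from sklearn.neural_network import MLPClassifier
        from sklearn.model_selection import train_test_split
        from sklearn.metrics import mean_squared_error

        rng = np.random.default_rng(0)
        X = rng.normal(size=(500, 10))           # placeholder clinical features
        y = (X[:, 0] + X[:, 1] > 0).astype(int)  # placeholder liver/healthy label
        X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

        svm = SVC(kernel="rbf").fit(X_tr, y_tr)
        # Feed the SVM's decision values to the ANN alongside the raw features.
        ann = MLPClassifier(hidden_layer_sizes=(32,), max_iter=1000, random_state=0)
        ann.fit(np.c_[X_tr, svm.decision_function(X_tr)], y_tr)

        pred = ann.predict(np.c_[X_te, svm.decision_function(X_te)])
        rmse = mean_squared_error(y_te, pred) ** 0.5  # RMSE on 0/1 labels
        print(f"accuracy={np.mean(pred == y_te):.3f}  RMSE={rmse:.3f}")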

    Feature selection for an SVM based webpage classifier

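    No abstract is shown for this entry; as a minimal sketch of what the title suggests, the snippet below scores terms with a chi-squared test and keeps only the top-ranked ones before training a linear SVM. The vectorizer, the value of k, and the toy pages are assumptions.

        # Hypothetical chi-squared feature selection for an SVM webpage classifier.
        from sklearn.feature_extraction.text import TfidfVectorizer
        from sklearn.feature_selection import SelectKBest, chi2
        from sklearn.pipeline import make_pipeline
        from sklearn.svm import LinearSVC

        pages = [
            "buy cheap tickets now",
            "conference program and proceedings",
            "discount sale shop today",
            "lecture notes for the course",
        ]
        labels = ["commercial", "academic", "commercial", "academic"]

        clf = make_pipeline(
            TfidfVectorizer(),
            SelectKBest(chi2, k=8),  # keep the 8 highest-scoring terms
            LinearSVC(),
        )
        clf.fit(pages, labels)
        print(clf.predict(["shop the big sale"]))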