46,864 research outputs found
Web news classification using neural networks based on PCA
In this paper, we propose a news web page classification method (WPCM). The WPCM uses a neural network with inputs obtained by both the principal components and class profile-based features (CPBF). The fixed number of regular words from each class will be used as a feature vectors with the reduced features from the PCA. These feature vectors are then used as the input to the neural networks for classification. The experimental evaluation demonstrates that the WPCM provides acceptable classification accuracy with the sports news datasets
High Accuracy Phishing Detection Based on Convolutional Neural Networks
The persistent growth in phishing and the rising volume of phishing websites has led to individuals and organizations worldwide becoming increasingly exposed to various cyber-attacks. Consequently, more effective phishing detection is required for improved cyber defence. Hence, in this paper we present a deep learning-based approach to enable high accuracy detection of phishing sites. The proposed approach utilizes convolutional neural networks (CNN) for high accuracy classification to distinguish genuine sites from phishing sites. We evaluate the models using a dataset obtained from 6,157 genuine and 4,898 phishing websites. Based on the results of extensive experiments, our CNN based models proved to be highly effective in detecting unknown phishing sites. Furthermore, the CNN based approach performed better than traditional machine learning classifiers evaluated on the same dataset, reaching 98.2% phishing detection rate with an F1-score of 0.976. The method presented in this pa-per compares favourably to the state-of-the art in deep learning based phishing website detection
Automated Website Fingerprinting through Deep Learning
Several studies have shown that the network traffic that is generated by a
visit to a website over Tor reveals information specific to the website through
the timing and sizes of network packets. By capturing traffic traces between
users and their Tor entry guard, a network eavesdropper can leverage this
meta-data to reveal which website Tor users are visiting. The success of such
attacks heavily depends on the particular set of traffic features that are used
to construct the fingerprint. Typically, these features are manually engineered
and, as such, any change introduced to the Tor network can render these
carefully constructed features ineffective. In this paper, we show that an
adversary can automate the feature engineering process, and thus automatically
deanonymize Tor traffic by applying our novel method based on deep learning. We
collect a dataset comprised of more than three million network traces, which is
the largest dataset of web traffic ever used for website fingerprinting, and
find that the performance achieved by our deep learning approaches is
comparable to known methods which include various research efforts spanning
over multiple years. The obtained success rate exceeds 96% for a closed world
of 100 websites and 94% for our biggest closed world of 900 classes. In our
open world evaluation, the most performant deep learning model is 2% more
accurate than the state-of-the-art attack. Furthermore, we show that the
implicit features automatically learned by our approach are far more resilient
to dynamic changes of web content over time. We conclude that the ability to
automatically construct the most relevant traffic features and perform accurate
traffic recognition makes our deep learning based approach an efficient,
flexible and robust technique for website fingerprinting.Comment: To appear in the 25th Symposium on Network and Distributed System
Security (NDSS 2018
Recommended from our members
An Overview of the Use of Neural Networks for Data Mining Tasks
In the recent years the area of data mining has experienced a considerable demand for technologies that extract knowledge from large and complex data sources. There is a substantial commercial interest as well as research investigations in the area that aim to develop new and improved approaches for extracting information, relationships, and patterns from datasets. Artificial Neural Networks (NN) are popular biologically inspired intelligent methodologies, whose classification, prediction and pattern recognition capabilities have been utilised successfully in many areas, including science, engineering, medicine, business, banking, telecommunication, and many other fields. This paper highlights from a data mining perspective the implementation of NN, using supervised and unsupervised learning, for pattern recognition, classification, prediction and cluster analysis, and focuses the discussion on their usage in bioinformatics and financial data analysis tasks
Undermining User Privacy on Mobile Devices Using AI
Over the past years, literature has shown that attacks exploiting the
microarchitecture of modern processors pose a serious threat to the privacy of
mobile phone users. This is because applications leave distinct footprints in
the processor, which can be used by malware to infer user activities. In this
work, we show that these inference attacks are considerably more practical when
combined with advanced AI techniques. In particular, we focus on profiling the
activity in the last-level cache (LLC) of ARM processors. We employ a simple
Prime+Probe based monitoring technique to obtain cache traces, which we
classify with Deep Learning methods including Convolutional Neural Networks. We
demonstrate our approach on an off-the-shelf Android phone by launching a
successful attack from an unprivileged, zeropermission App in well under a
minute. The App thereby detects running applications with an accuracy of 98%
and reveals opened websites and streaming videos by monitoring the LLC for at
most 6 seconds. This is possible, since Deep Learning compensates measurement
disturbances stemming from the inherently noisy LLC monitoring and unfavorable
cache characteristics such as random line replacement policies. In summary, our
results show that thanks to advanced AI techniques, inference attacks are
becoming alarmingly easy to implement and execute in practice. This once more
calls for countermeasures that confine microarchitectural leakage and protect
mobile phone applications, especially those valuing the privacy of their users
HTMLPhish: Enabling Phishing Web Page Detection by Applying Deep Learning Techniques on HTML Analysis
Recently, the development and implementation of phishing attacks require little technical skills and costs. This uprising has led to an ever-growing number of phishing attacks on the World Wide Web. Consequently, proactive techniques to fight phishing attacks have become extremely necessary. In this paper, we propose HTMLPhish, a deep learning based datadriven end-to-end automatic phishing web page classification approach. Specifically, HTMLPhish receives the content of the HTML document of a web page and employs Convolutional Neural Networks (CNNs) to learn the semantic dependencies in the textual contents of the HTML. The CNNs learn appropriate feature representations from the HTML document embeddings without extensive manual feature engineering. Furthermore, our proposed approach of the concatenation of the word and character embeddings allows our model to manage new features and ensure easy extrapolation to test data. We conduct comprehensive experiments on a dataset of more than 50,000 HTML documents that provides a distribution of phishing to benign web pages obtainable in the real-world that yields over 93% Accuracy and True Positive Rate. Also, HTMLPhish is a completely language-independent and client-side strategy which can, therefore, conduct web page phishing detection regardless of the textual language
ServeNet: A Deep Neural Network for Web Services Classification
Automated service classification plays a crucial role in service discovery,
selection, and composition. Machine learning has been widely used for service
classification in recent years. However, the performance of conventional
machine learning methods highly depends on the quality of manual feature
engineering. In this paper, we present a novel deep neural network to
automatically abstract low-level representation of both service name and
service description to high-level merged features without feature engineering
and the length limitation, and then predict service classification on 50
service categories. To demonstrate the effectiveness of our approach, we
conduct a comprehensive experimental study by comparing 10 machine learning
methods on 10,000 real-world web services. The result shows that the proposed
deep neural network can achieve higher accuracy in classification and more
robust than other machine learning methods.Comment: Accepted by ICWS'2
NeuroSVM: A Graphical User Interface for Identification of Liver Patients
Diagnosis of liver infection at preliminary stage is important for better
treatment. In todays scenario devices like sensors are used for detection of
infections. Accurate classification techniques are required for automatic
identification of disease samples. In this context, this study utilizes data
mining approaches for classification of liver patients from healthy
individuals. Four algorithms (Naive Bayes, Bagging, Random forest and SVM) were
implemented for classification using R platform. Further to improve the
accuracy of classification a hybrid NeuroSVM model was developed using SVM and
feed-forward artificial neural network (ANN). The hybrid model was tested for
its performance using statistical parameters like root mean square error (RMSE)
and mean absolute percentage error (MAPE). The model resulted in a prediction
accuracy of 98.83%. The results suggested that development of hybrid model
improved the accuracy of prediction. To serve the medicinal community for
prediction of liver disease among patients, a graphical user interface (GUI)
has been developed using R. The GUI is deployed as a package in local
repository of R platform for users to perform prediction.Comment: 9 pages, 6 figure
- …