Search CORE

10,557 research outputs found

Reviewer Integration and Performance Measurement for Malware Detection

Author: Afroz Sadia
Bachwani Rekha
Faizullabhoy Riyaz
Huang Ling
Joseph Anthony D.
Kantchelian Alex
Miller Brad
Shankar Vaishaal
Tschantz Michael Carl
Tygar J. D.
Wu Tony
Yiu George
Publication venue
Publication date: 26/05/2016
Field of study

We present and evaluate a large-scale malware detection system integrating machine learning with expert reviewers, treating reviewers as a limited labeling resource. We demonstrate that even in small numbers, reviewers can vastly improve the system's ability to keep pace with evolving threats. We conduct our evaluation on a sample of VirusTotal submissions spanning 2.5 years and containing 1.1 million binaries with 778GB of raw feature data. Without reviewer assistance, we achieve 72% detection at a 0.5% false positive rate, performing comparable to the best vendors on VirusTotal. Given a budget of 80 accurate reviews daily, we improve detection to 89% and are able to detect 42% of malicious binaries undetected upon initial submission to VirusTotal. Additionally, we identify a previously unnoticed temporal inconsistency in the labeling of training datasets. We compare the impact of training labels obtained at the same time training data is first seen with training labels obtained months later. We find that using training labels obtained well after samples appear, and thus unavailable in practice for current training data, inflates measured detection by almost 20 percentage points. We release our cluster-based implementation, as well as a list of all hashes in our evaluation and 3% of our entire dataset.Comment: 20 papers, 11 figures, accepted at the 13th Conference on Detection of Intrusions and Malware & Vulnerability Assessment (DIMVA 2016

arXiv.org e-Print Archive

Crossref

Evolutionary intelligent agents for e-commerce: Generic preference detection with feature analysis

Author: Alba
Fangming Zhu
Guan
Guan
Guan
Guan
Guan
Guan
Ingber
Linden
Michalewicz
Nwana
Shahabi
Sheng-Uei Guan
Sung
Tai Kheng Chan
Tamma
Publication venue: 'Elsevier BV'
Publication date: 01/01/2005
Field of study

Product recommendation and preference tracking systems have been adopted extensively in e-commerce businesses. However, the heterogeneity of product attributes results in undesired impediment for an efficient yet personalized e-commerce product brokering. Amid the assortment of product attributes, there are some intrinsic generic attributes having significant relation to a customer’s generic preference. This paper proposes a novel approach in the detection of generic product attributes through feature analysis. The objective is to provide an insight to the understanding of customers’ generic preference. Furthermore, a genetic algorithm is used to find the suitable feature weight set, hence reducing the rate of misclassification. A prototype has been implemented and the experimental results are promising

Crossref

Brunel University Research Archive

ScholarBank@NUS

Customizing kernel functions for SVM-based hyperspectral image classification

Author: Baofeng Guo
James D. B. Nelson
R. I. Damper
Senior Member
Steve R. Gunn
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2008
Field of study

Previous research applying kernel methods such as support vector machines (SVMs) to hyperspectral image classification has achieved performance competitive with the best available algorithms. However, few efforts have been made to extend SVMs to cover the specific requirements of hyperspectral image classification, for example, by building tailor-made kernels. Observation of real-life spectral imagery from the AVIRIS hyperspectral sensor shows that the useful information for classification is not equally distributed across bands, which provides potential to enhance the SVM's performance through exploring different kernel functions. Spectrally weighted kernels are, therefore, proposed, and a set of particular weights is chosen by either optimizing an estimate of generalization error or evaluating each band's utility level. To assess the effectiveness of the proposed method, experiments are carried out on the publicly available 92AV3C dataset collected from the 220-dimensional AVIRIS hyperspectral sensor. Results indicate that the method is generally effective in improving performance: spectral weighting based on learning weights by gradient descent is found to be slightly better than an alternative method based on estimating ";relevance"; between band information and ground trut

CiteSeerX

Southampton (e-Prints Soton)

Crossref

Network Traffic Analysis Framework For Cyber Threat Detection

Author: Cherie Meshesha K.
Publication venue: Beadle Scholar
Publication date: 01/03/2020
Field of study

The growing sophistication of attacks and newly emerging cyber threats requires advanced cyber threat detection systems. Although there are several cyber threat detection tools in use, cyber threats and data breaches continue to rise. This research is intended to improve the cyber threat detection approach by developing a cyber threat detection framework using two complementary technologies, search engine and machine learning, combining artificial intelligence and classical technologies. In this design science research, several artifacts such as a custom search engine library, a machine learning-based engine and different algorithms have been developed to build a new cyber threat detection framework based on self-learning search and machine learning engines. Apache Lucene.Net search engine library was customized in order to function as a cyber threat detector, and Microsoft ML.NET was used to work with and train the customized search engine. This research proves that a custom search engine can function as a cyber threat detection system. Using both search and machine learning engines in the newly developed framework provides improved cyber threat detection capabilities such as self-learning and predicting attack details. When the two engines run together, the search engine is continuously trained by the machine learning engine and grow smarter to predict yet unknown threats with greater accuracy. While customizing the search engine to function as a cyber threat detector, this research also identified and proved the best algorithms for the search engine based cyber threat detection model. For example, the best scoring algorithm was found to be the Manhattan distance. The validation case study also shows that not every network traffic feature makes an equal contribution to determine the status of the traffic, and thus the variable-dimension Vector Space Model (VSM) achieves better detection accuracy than n-dimensional VSM. Although the use of different technologies and approaches improved detection results, this research is primarily focused on developing techniques rather than building a complete threat detection system. Additional components such as those that can track and investigate the impact of network traffic on the destination devices make the newly developed framework robust enough to build a comprehensive cyber threat detection appliance

Beadle Scholar at Dakota State University