992 research outputs found

    Machine Learning Aided Static Malware Analysis: A Survey and Tutorial

    Full text link
    Malware analysis and detection techniques have been evolving during the last decade as a reflection to development of different malware techniques to evade network-based and host-based security protections. The fast growth in variety and number of malware species made it very difficult for forensics investigators to provide an on time response. Therefore, Machine Learning (ML) aided malware analysis became a necessity to automate different aspects of static and dynamic malware investigation. We believe that machine learning aided static analysis can be used as a methodological approach in technical Cyber Threats Intelligence (CTI) rather than resource-consuming dynamic malware analysis that has been thoroughly studied before. In this paper, we address this research gap by conducting an in-depth survey of different machine learning methods for classification of static characteristics of 32-bit malicious Portable Executable (PE32) Windows files and develop taxonomy for better understanding of these techniques. Afterwards, we offer a tutorial on how different machine learning techniques can be utilized in extraction and analysis of a variety of static characteristic of PE binaries and evaluate accuracy and practical generalization of these techniques. Finally, the results of experimental study of all the method using common data was given to demonstrate the accuracy and complexity. This paper may serve as a stepping stone for future researchers in cross-disciplinary field of machine learning aided malware forensics.Comment: 37 Page

    PPS-ADS: A Framework for Privacy-Preserved and Secured Distributed System Architecture for Handling Big Data

    Get PDF
    The exponential expansion of Big Data in 7V’s (velocity, variety, veracity, value, variability and visualization) brings forth new challenges to security, reliability, availability and privacy of these data sets. Traditional security techniques and algorithms fail to complement this gigantic big data. This paper aims to improve the recently proposed Atrain Distributed System (ADS) by incorporating new features which will cater to the end-to-end availability and security aspects of the big data in the distributed system. The paper also integrates the concept of Software Defined Networking (SDN) in ADS to effectively control and manage the routing of the data item in the ADS. The storage of data items in the ADS is done on the basis of the type of data (structured or unstructured), the capacity of the distributed system (or coach) and the distance of coach from the pilot computer (PC). In order to maintain the consistency of data and to eradicate the possible loss of data, the concept of “forward positive” and “backward positive” acknowledgment is proposed. Furthermore, we have incorporated “Twofish” cryptographic technique to encrypt the big data in the ADS. Issues like “data ownership”, “data security, “data privacy” and data reliability” are pivotal while handling the big data. The current paper presents a framework for a privacy-preserved architecture for handling the big data in an effective manner

    Framework and Algorithms for Operator-Managed Content Caching

    Get PDF
    We propose a complete framework targeting operator-driven content caching that can be equally applied to both ISP-operated Content Delivery Networks (CDNs) and future Information-Centric Networks (ICNs). In contrast to previous proposals in this area, our solution leverages operators’ control on cache placement and content routing, managing to considerably reduce network operating costs by minimizing the amount of transit traffic and balancing load among available network resources. In addition, our solution provides two key advantages over previous proposals. First, it allows for a simple computation of the optimal cache placement. Second, it provides knobs for operators to fine-tune performance. We validate our design through both analytical modeling and trace-driven simulations and show that our proposed solution achieves on average twice as many cache hits in comparison to previously proposed techniques, without increasing delivery latency. In addition, we show that the proposed framework achieves 19-33% better load balancing across links and caching nodes, being also robust to traffic spikes

    Hadoop neural network for parallel and distributed feature selection

    Get PDF
    In this paper, we introduce a theoretical basis for a Hadoop-based neural network for parallel and distributed feature selection in Big Data sets. It is underpinned by an associative memory (binary) neural network which is highly amenable to parallel and distributed processing and fits with the Hadoop paradigm. There are many feature selectors described in the literature which all have various strengths and weaknesses. We present the implementation details of five feature selection algorithms constructed using our artificial neural network framework embedded in Hadoop YARN. Hadoop allows parallel and distributed processing. Each feature selector can be divided into subtasks and the subtasks can then be processed in parallel. Multiple feature selectors can also be processed simultaneously (in parallel) allowing multiple feature selectors to be compared. We identify commonalities among the five features selectors. All can be processed in the framework using a single representation and the overall processing can also be greatly reduced by only processing the common aspects of the feature selectors once and propagating these aspects across all five feature selectors as necessary. This allows the best feature selector and the actual features to select to be identified for large and high dimensional data sets through exploiting the efficiency and flexibility of embedding the binary associative-memory neural network in Hadoop

    Container Handling Algorithms and Outbound Heavy Truck Movement Modeling for Seaport Container Transshipment Terminals

    Get PDF
    This research is divided into four main parts. The first part considers the basic block relocation problem (BRP) in which a set of shipping containers is retrieved using the minimum number of moves by a single gantry crane that handles cargo in the storage area in a container terminal. For this purpose a new algorithm called the look ahead algorithm has been created and tested. The look ahead algorithm is applicable under limited and unlimited stacking height conditions. The look ahead algorithm is compared to the existing algorithms in the literature. The experimental results show that the look ahead algorithm is more efficient than any other algorithm in the literature. The second part of this research considers an extension of the BRP called the block relocation problem with weights (BRP-W). The main goal is to minimize the total fuel consumption of the crane to retrieve all the containers in a bay and to minimize the movements of the heavy containers. The trolleying, hoisting, and lowering movements of the containers are explicitly considered in this part. The twelve parameters to quantify various preferences when moving individual containers are defined. Near-optimal values of the twelve parameters for different bay configurations are found using a genetic algorithm. The third part introduces a shipping cost model that can estimate the cost of shipping specific commodity groups using one freight transportation mode-trucking- from any origin to any destination inside the United States. The model can also be used to estimate general shipping costs for different economic sectors, with significant ramifications for public policy. The last part mimics heavy truck movements for shipping different kinds of containerized commodities between a container terminal and different facilities. The highly detailed cost model from part three is used to evaluate the effect of public policies on truckers\u27 route choices. In particular, the influence of time, distance, and tolls on truckers\u27 route selection is investigated.

    Object detection, recognition and re-identification in video footage

    Get PDF
    There has been a significant number of security concerns in recent times; as a result, security cameras have been installed to monitor activities and to prevent crimes in most public places. These analysis are done either through video analytic or forensic analysis operations on human observations. To this end, within the research context of this thesis, a proactive machine vision based military recognition system has been developed to help monitor activities in the military environment. The proposed object detection, recognition and re-identification systems have been presented in this thesis. A novel technique for military personnel recognition is presented in this thesis. Initially the detected camouflaged personnel are segmented using a grabcut segmentation algorithm. Since in general a camouflaged personnel's uniform appears to be similar both at the top and the bottom of the body, an image patch is initially extracted from the segmented foreground image and used as the region of interest. Subsequently the colour and texture features are extracted from each patch and used for classification. A second approach for personnel recognition is proposed through the recognition of the badge on the cap of a military person. A feature matching metric based on the extracted Speed Up Robust Features (SURF) from the badge on a personnel's cap enabled the recognition of the personnel's arm of service. A state-of-the-art technique for recognising vehicle types irrespective of their view angle is also presented in this thesis. Vehicles are initially detected and segmented using a Gaussian Mixture Model (GMM) based foreground/background segmentation algorithm. A Canny Edge Detection (CED) stage, followed by morphological operations are used as pre-processing stage to help enhance foreground vehicular object detection and segmentation. Subsequently, Region, Histogram Oriented Gradient (HOG) and Local Binary Pattern (LBP) features are extracted from the refined foreground vehicle object and used as features for vehicle type recognition. Two different datasets with variant views of front/rear and angle are used and combined for testing the proposed technique. For night-time video analytics and forensics, the thesis presents a novel approach to pedestrian detection and vehicle type recognition. A novel feature acquisition technique named, CENTROG, is proposed for pedestrian detection and vehicle type recognition in this thesis. Thermal images containing pedestrians and vehicular objects are used to analyse the performance of the proposed algorithms. The video is initially segmented using a GMM based foreground object segmentation algorithm. A CED based pre-processing step is used to enhance segmentation accuracy prior using Census Transforms for initial feature extraction. HOG features are then extracted from the Census transformed images and used for detection and recognition respectively of human and vehicular objects in thermal images. Finally, a novel technique for people re-identification is proposed in this thesis based on using low-level colour features and mid-level attributes. The low-level colour histogram bin values were normalised to 0 and 1. A publicly available dataset (VIPeR) and a self constructed dataset have been used in the experiments conducted with 7 clothing attributes and low-level colour histogram features. These 7 attributes are detected using features extracted from 5 different regions of a detected human object using an SVM classifier. The low-level colour features were extracted from the regions of a detected human object. These 5 regions are obtained by human object segmentation and subsequent body part sub-division. People are re-identified by computing the Euclidean distance between a probe and the gallery image sets. The experiments conducted using SVM classifier and Euclidean distance has proven that the proposed techniques attained all of the aforementioned goals. The colour and texture features proposed for camouflage military personnel recognition surpasses the state-of-the-art methods. Similarly, experiments prove that combining features performed best when recognising vehicles in different views subsequent to initial training based on multi-views. In the same vein, the proposed CENTROG technique performed better than the state-of-the-art CENTRIST technique for both pedestrian detection and vehicle type recognition at night-time using thermal images. Finally, we show that the proposed 7 mid-level attributes and the low-level features results in improved performance accuracy for people re-identification

    Modeling Suspicious Email Detection using Enhanced Feature Selection

    Full text link
    The paper presents a suspicious email detection model which incorporates enhanced feature selection. In the paper we proposed the use of feature selection strategies along with classification technique for terrorists email detection. The presented model focuses on the evaluation of machine learning algorithms such as decision tree (ID3), logistic regression, Na\"ive Bayes (NB), and Support Vector Machine (SVM) for detecting emails containing suspicious content. In the literature, various algorithms achieved good accuracy for the desired task. However, the results achieved by those algorithms can be further improved by using appropriate feature selection mechanisms. We have identified the use of a specific feature selection scheme that improves the performance of the existing algorithms
    • …
    corecore