71 research outputs found

    Malware Detection Approaches based on Operational Codes (OpCodes) of Executable Programs: A Review

    Get PDF
    A malicious software, or Malware for a short, poses a threat to computer systems, which need to be analyzed, detected, and eliminated. Generally, malware is analyzed in two ways: dynamic malware analysis and static malware analysis. The former collects features dataset during running of the malware, and involves malware APIs, registry activities, file activities, process activities, and network activities based features. The latter collects features dataset prior and without running the malware, and involves Operational Codes (OpCodes) and text based (Bytecodes) features. However, several previous researchers addressed and reviewed malware detection approaches based on various aspects, but none of them addressed and reviewed the approaches merely based on malware OpCodes. Therefore, this paper aims to review Malware Detection Approaches based on OpCodes. The review explores, demonstrates, and compares the existing approaches for detecting malware according to their OpCodes only, and finally presents a comprehensive comparable envisage about them

    Advanced Security Analysis for Emergent Software Platforms

    Get PDF
    Emergent software ecosystems, boomed by the advent of smartphones and the Internet of Things (IoT) platforms, are perpetually sophisticated, deployed into highly dynamic environments, and facilitating interactions across heterogeneous domains. Accordingly, assessing the security thereof is a pressing need, yet requires high levels of scalability and reliability to handle the dynamism involved in such volatile ecosystems. This dissertation seeks to enhance conventional security detection methods to cope with the emergent features of contemporary software ecosystems. In particular, it analyzes the security of Android and IoT ecosystems by developing rigorous vulnerability detection methods. A critical aspect of this work is the focus on detecting vulnerable and unsafe interactions between applications that share common components and devices. Contributions of this work include novel insights and methods for: (1) detecting vulnerable interactions between Android applications that leverage dynamic loading features for concealing the interactions; (2) identifying unsafe interactions between smart home applications by considering physical and cyber channels; (3) detecting malicious IoT applications that are developed to target numerous IoT devices; (4) detecting insecure patterns of emergent security APIs that are reused from open-source software. In all of the four research thrusts, we present thorough security analysis and extensive evaluations based on real-world applications. Our results demonstrate that the proposed detection mechanisms can efficiently and effectively detect vulnerabilities in contemporary software platforms. Advisers: Hamid Bagheri and Qiben Ya

    Challenges and Outlook in Machine Learning-based Malware Detection for Android

    Get PDF
    Just like in traditional desktop computing, one of the major security issues in mobile computing lies in malicious software. Several recent studies have shown that Android, as today’s most widespread Operating System, is the target of most of the new families of malware. Manually analysing an Android application to determine whether it is malicious or not is a time- consuming process. Furthermore, because of the complexity of analysing an application, this task can only be conducted by highly-skilled—hence hard to come by—professionals. Researchers naturally sought to transfer this process from humans to computers to lower the cost of detecting malware. Machine-Learning techniques, looking at patterns amongst known malware and inferring models of what discriminates malware from goodware, have long been summoned to build malware detectors. The vast quantity of data involved in malware detection, added to the fact that we do not know a priori how to express in technical terms the difference between malware and goodware, indeed makes the malware detection question a seemingly textbook example of a possible Machine- Learning application. Despite the vast amount of literature published on the topic of detecting malware with machine- learning, malware detection is not a solved problem. In this Thesis, we investigate issues that affect performance evaluation and that thus may render current machine learning-based mal- ware detectors for Android hardly usable in practical settings, and we propose an approach to overcome those issues. While the experiments presented in this thesis all rely on feature-sets obtained through lightweight static analysis, several of our findings could apply equally to all Machine Learning-based malware detection approaches. In the first part of this thesis, background information on machine-learning and on malware detection is provided, and the related work is described. A snapshot of the malware landscape in Android application markets is then presented. The second part discusses three pitfalls hindering the evaluation of malware detectors. We show with extensive experiments how validation methodology, History-unaware dataset construction and the choice of a ground truth can heavily interfere with the performance results of malware detectors. In a third part, we present an practical approach to detect Android Malware in real-world settings. We then propose several research paths to get closer to our long term goal of building practical, dependable and predictable Android Malware detectors

    Toward Building an Intelligent and Secure Network: An Internet Traffic Forecasting Perspective

    Get PDF
    Internet traffic forecast is a crucial component for the proactive management of self-organizing networks (SON) to ensure better Quality of Service (QoS) and Quality of Experience (QoE). Given the volatile and random nature of traffic data, this forecasting influences strategic development and investment decisions in the Internet Service Provider (ISP) industry. Modern machine learning algorithms have shown potential in dealing with complex Internet traffic prediction tasks, yet challenges persist. This thesis systematically explores these issues over five empirical studies conducted in the past three years, focusing on four key research questions: How do outlier data samples impact prediction accuracy for both short-term and long-term forecasting? How can a denoising mechanism enhance prediction accuracy? How can robust machine learning models be built with limited data? How can out-of-distribution traffic data be used to improve the generalizability of prediction models? Based on extensive experiments, we propose a novel traffic forecast/prediction framework and associated models that integrate outlier management and noise reduction strategies, outperforming traditional machine learning models. Additionally, we suggest a transfer learning-based framework combined with a data augmentation technique to provide robust solutions with smaller datasets. Lastly, we propose a hybrid model with signal decomposition techniques to enhance model generalization for out-of-distribution data samples. We also brought the issue of cyber threats as part of our forecast research, acknowledging their substantial influence on traffic unpredictability and forecasting challenges. Our thesis presents a detailed exploration of cyber-attack detection, employing methods that have been validated using multiple benchmark datasets. Initially, we incorporated ensemble feature selection with ensemble classification to improve DDoS (Distributed Denial-of-Service) attack detection accuracy with minimal false alarms. Our research further introduces a stacking ensemble framework for classifying diverse forms of cyber-attacks. Proceeding further, we proposed a weighted voting mechanism for Android malware detection to secure Mobile Cyber-Physical Systems, which integrates the mobility of various smart devices to exchange information between physical and cyber systems. Lastly, we employed Generative Adversarial Networks for generating flow-based DDoS attacks in Internet of Things environments. By considering the impact of cyber-attacks on traffic volume and their challenges to traffic prediction, our research attempts to bridge the gap between traffic forecasting and cyber security, enhancing proactive management of networks and contributing to resilient and secure internet infrastructure

    Introductory Computer Forensics

    Get PDF
    INTERPOL (International Police) built cybercrime programs to keep up with emerging cyber threats, and aims to coordinate and assist international operations for ?ghting crimes involving computers. Although signi?cant international efforts are being made in dealing with cybercrime and cyber-terrorism, ?nding effective, cooperative, and collaborative ways to deal with complicated cases that span multiple jurisdictions has proven dif?cult in practic

    Behavioral analysis in cybersecurity using machine learning: a study based on graph representation, class imbalance and temporal dissection

    Get PDF
    The main goal of this thesis is to improve behavioral cybersecurity analysis using machine learning, exploiting graph structures, temporal dissection, and addressing imbalance problems.This main objective is divided into four specific goals: OBJ1: To study the influence of the temporal resolution on highlighting micro-dynamics in the entity behavior classification problem. In real use cases, time-series information could be not enough for describing the entity behavior classification. For this reason, we plan to exploit graph structures for integrating both structured and unstructured data in a representation of entities and their relationships. In this way, it will be possible to appreciate not only the single temporal communication but the whole behavior of these entities. Nevertheless, entity behaviors evolve over time and therefore, a static graph may not be enoughto describe all these changes. For this reason, we propose to use a temporal dissection for creating temporal subgraphs and therefore, analyze the influence of the temporal resolution on the graph creation and the entity behaviors within. Furthermore, we propose to study how the temporal granularity should be used for highlighting network micro-dynamics and short-term behavioral changes which can be a hint of suspicious activities. OBJ2: To develop novel sampling methods that work with disconnected graphs for addressing imbalanced problems avoiding component topology changes. Graph imbalance problem is a very common and challenging task and traditional graph sampling techniques that work directly on these structures cannot be used without modifying the graph’s intrinsic information or introducing bias. Furthermore, existing techniques have shown to be limited when disconnected graphs are used. For this reason, novel resampling methods for balancing the number of nodes that can be directly applied over disconnected graphs, without altering component topologies, need to be introduced. In particular, we propose to take advantage of the existence of disconnected graphs to detect and replicate the most relevant graph components without changing their topology, while considering traditional data-level strategies for handling the entity behaviors within. OBJ3: To study the usefulness of the generative adversarial networks for addressing the class imbalance problem in cybersecurity applications. Although traditional data-level pre-processing techniques have shown to be effective for addressing class imbalance problems, they have also shown downside effects when highly variable datasets are used, as it happens in cybersecurity. For this reason, new techniques that can exploit the overall data distribution for learning highly variable behaviors should be investigated. In this sense, GANs have shown promising results in the image and video domain, however, their extension to tabular data is not trivial. For this reason, we propose to adapt GANs for working with cybersecurity data and exploit their ability in learning and reproducing the input distribution for addressing the class imbalance problem (as an oversampling technique). Furthermore, since it is not possible to find a unique GAN solution that works for every scenario, we propose to study several GAN architectures with several training configurations to detect which is the best option for a cybersecurity application. OBJ4: To analyze temporal data trends and performance drift for enhancing cyber threat analysis. Temporal dynamics and incoming new data can affect the quality of the predictions compromising the model reliability. This phenomenon makes models get outdated without noticing. In this sense, it is very important to be able to extract more insightful information from the application domain analyzing data trends, learning processes, and performance drifts over time. For this reason, we propose to develop a systematic approach for analyzing how the data quality and their amount affect the learning process. Moreover, in the contextof CTI, we propose to study the relations between temporal performance drifts and the input data distribution for detecting possible model limitations, enhancing cyber threat analysis.Programa de Doctorado en Ciencias y Tecnologías Industriales (RD 99/2011) Industria Zientzietako eta Teknologietako Doktoretza Programa (ED 99/2011

    Detection and Classification of Malicious Processes Using System Call Analysis

    Get PDF
    Despite efforts to mitigate the malware threat, the proliferation of malware continues, with record-setting numbers of malware samples being discovered each quarter. Malware are any intentionally malicious software, including software designed for extortion, sabotage, and espionage. Traditional malware defenses are primarily signature-based and heuristic-based, and include firewalls, intrusion detection systems, and antivirus software. Such defenses are reactive, performing well against known threats but struggling against new malware variants and zero-day threats. Together, the reactive nature of traditional defenses and the continuing spread of malware motivate the development of new techniques to detect such threats. One promising set of techniques uses features extracted from system call traces to infer malicious behaviors. This thesis studies the problem of detecting and classifying malicious processes using system call trace analysis. The goal of this study is to identify techniques that are `lightweight' enough and exhibit a low enough false positive rate to be deployed in production environments. The major contributions of this work are (1) a study of the effects of feature extraction strategy on malware detection performance; (2) the comparison of signature-based and statistical analysis techniques for malware detection and classification; (3) the use of sequential detection techniques to identify malicious behaviors as quickly as possible; (4) a study of malware detection performance at very low false positive rates; and (5) an extensive empirical evaluation, wherein the performance of the malware detection and classification systems are evaluated against data collected from production hosts and from the execution of recently discovered malware samples. The outcome of this study is a proof-of-concept system that detects the execution of malicious processes in production environments and classifies them according to their similarity to known malware.Ph.D., Electrical Engineering -- Drexel University, 201

    A Survey on Modality Characteristics, Performance Evaluation Metrics, and Security for Traditional and Wearable Biometric Systems

    Get PDF
    Biometric research is directed increasingly towards Wearable Biometric Systems (WBS) for user authentication and identification. However, prior to engaging in WBS research, how their operational dynamics and design considerations differ from those of Traditional Biometric Systems (TBS) must be understood. While the current literature is cognizant of those differences, there is no effective work that summarizes the factors where TBS and WBS differ, namely, their modality characteristics, performance, security and privacy. To bridge the gap, this paper accordingly reviews and compares the key characteristics of modalities, contrasts the metrics used to evaluate system performance, and highlights the divergence in critical vulnerabilities, attacks and defenses for TBS and WBS. It further discusses how these factors affect the design considerations for WBS, the open challenges and future directions of research in these areas. In doing so, the paper provides a big-picture overview of the important avenues of challenges and potential solutions that researchers entering the field should be aware of. Hence, this survey aims to be a starting point for researchers in comprehending the fundamental differences between TBS and WBS before understanding the core challenges associated with WBS and its design
    corecore