143 research outputs found

    Classifying malicious windows executables using anomaly based detection

    A malicious executable is broadly defined as any program or piece of code designed to cause damage to a system or the information it contains, or to prevent the system from being used in a normal manner. A generic term used to describe any kind of malicious software is Malware, which includes Viruses, Worms, Trojans, Backdoors, Root-kits, Spyware and Exploits. Anomaly detection is a technique which builds a statistical profile of the normal and malicious data and classifies unseen data based on these two profiles. A detection system is presented here which is anomaly based and focuses on the Windows® platform. Several file infection techniques were studied to understand which particular features in the executable binary are more susceptible to being used for malicious code propagation. A framework is presented for collecting data for both static (non-execution based) and dynamic (execution based) analysis of malicious executables. Two specific features are extracted using static analysis and explained in detail: the Windows API (from the Import Address Table of the Portable Executable header) and the hex byte frequency count (collected using the Hexdump utility). The dynamic analysis features that were extracted are briefly mentioned, and the major challenges faced in using this data are explained. Classification results using Support Vector Machines for anomaly detection are shown for the two static analysis features. Experiments yielded up to 94% classification accuracy for new, previously unseen executables.
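
To make the hex byte frequency feature concrete, here is a minimal sketch of how such a count could be turned into a fixed-length feature vector. It is an illustration only: the abstract says the counts were collected with the Hexdump utility, whereas this sketch reads raw bytes directly, and the function name `byte_frequencies`, the normalization by file length, and the sample bytes are all assumptions, not the authors' implementation.

```python
from collections import Counter


def byte_frequencies(data: bytes) -> list[float]:
    """Return a 256-element vector of relative byte frequencies.

    Hypothetical helper: normalizing by length is an assumption so that
    files of different sizes produce comparable vectors.
    """
    total = len(data)
    if total == 0:
        return [0.0] * 256
    counts = Counter(data)  # iterating over bytes yields ints in 0..255
    return [counts.get(value, 0) / total for value in range(256)]


# Illustrative input only: six bytes, not a real executable.
sample = bytes([0x4D, 0x5A, 0x90, 0x00, 0x00, 0x00])
features = byte_frequencies(sample)
```

A vector like `features` could then be passed to a classifier; the abstract reports using Support Vector Machines but does not give the exact feature encoding.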

    Automatic Generation of Input Grammars Using Symbolic Execution

    Invalid input often leads to unexpected behavior in a program and is behind a plethora of known and unknown vulnerabilities. To prevent improper input from being processed, the input needs to be validated before the rest of the program executes. Formal language theory facilitates the definition and recognition of proper inputs. We focus on the problem of defining valid input after the program has already been written. We construct a parser that infers the structure of inputs that avoid vulnerabilities, whereas existing work focuses on inferring the structure of the input the program anticipates. We present a tool that constructs an input language, given the program as input, using symbolic execution on symbolic arguments. This differs from existing work, which tracks the execution of concrete inputs to infer a grammar. We test our tool on programs with known vulnerabilities, including programs in the GNU Coreutils library, and we demonstrate how the parser catches known invalid inputs. We conclude that the synthesis of the complete parser cannot be entirely automated due to limitations of symbolic execution tools and issues of computability. A more comprehensive parser must additionally be informed by examples and counterexamples of the input language.

    PolyFS Visualizer

    File systems, one of the most important operating system topics, control how we store and access data and form a key point in a computer scientist's understanding of the underlying mechanisms of a computer. However, file systems, with their abstract concepts and lack of concrete learning aids, are a confusing subject for students. Historically at Cal Poly, CPE 453 Introduction to Operating Systems has been one of the most failed classes in the computing majors, leading to the need for better teaching and learning tools. Tools that give students concrete examples of abstract concepts could be used to better prepare students for industry. The PolyFS Visualizer is a block-level file system visualization service built for the PolyFS and TinyFS file system design specifications currently used by some of the professors teaching CPE 453. The service allows students to easily view the blocks of their file system and see metadata, the blocks' binary content and the interlinked structure. Students can either compile their file system code with a provided block emulation library to build their disk on a remote server and make use of a visualization website, or place the file mounted as their file system directly into the visualization service to view it locally. This allows students to easily view, debug and explore their implementation of a file system to understand how different design decisions affect its operation. The implementation includes three main components: a disk emulation library in C for compilation with students' code, a Node.js back-end to handle students' file systems and block operations, and a read-only visualization service. We have conducted two surveys of students in order to determine the usefulness of the PolyFS Visualizer. Students responded that the use of the PolyFS Visualizer helps with the PolyFS file system design project, and they have several ideas for future features and expansions.

    Malware Target Recognition via Static Heuristics

    Organizations increasingly rely on the confidentiality, integrity and availability of their information and communications technologies to conduct effective business operations while maintaining their competitive edge. Exploitation of these networks via the introduction of undetected malware ultimately degrades their competitive edge, while taking advantage of limited network visibility and the high cost of analyzing massive numbers of programs. This article introduces the novel Malware Target Recognition (MaTR) system, which combines the decision tree machine learning algorithm with static heuristic features for malware detection. By focusing on contextually important static heuristic features, this research demonstrates superior detection results. Experimental results on large sample datasets demonstrate near-ideal malware detection performance (99.9+% accuracy) with low false positive (8.73e-4) and false negative (8.03e-4) rates at the same point on the performance curve. Test results against a set of publicly unknown malware, including potential advanced competitor tools, show MaTR’s superior detection rate (99%) versus the union of detections from three commercial antivirus products (60%). The resulting model is a fine-granularity sensor with potential to dramatically augment cyberspace situation awareness.

    Logging and Analysis of Internet of Things (IoT) Device Network Traffic and Power Consumption

    An increasing number of devices, from coffee makers to electric kettles, are becoming connected to the Internet. These are all a part of the Internet of Things, or IoT. Each device generates unique network traffic and power consumption patterns. Until now, there has not been a comprehensive set of data that captures these traffic and power patterns. This thesis documents how we collected 10 to 15 weeks of network traffic and power consumption data from 15 different IoT devices and provides an analysis of a subset of 6 devices. Devices including an Amazon Echo Dot, Google Home Mini, and Google Chromecast were used on a regular basis, and all of their network traffic and power consumption were logged to a MySQL database. The database currently contains 64 million packets and 71 gigabytes of data and is still growing in size as more data is collected 24/7 from each device. We show that it is possible to see when users are asking their smart speaker a question or whether the lights in their home are on or off based on power consumption and network traffic from the devices. These trends can be seen even if the data being sent is encrypted.