5 research outputs found

    How Phishing Pages Look Like?

    Get PDF
    Recent phishing campaigns are increasingly targeted to specific, small population of users and last for increasingly shorter life spans. There is thus an urgent need for developing defense mechanisms that do not rely on any forms of blacklisting or reputation: there is simply no time for detecting novel phishing campaigns and notify all interested organizations quickly enough. Such mechanisms should be close to browsers and based solely on the visual appearance of the rendered page. One of the major impediments to research in this area is the lack of systematic knowledge about how phishing pages actually look like. In this work we describe the technical challenges in collecting a large and diverse collection of screenshots of phishing pages and propose practical solutions. We also analyze systematically the visual similarity between phishing pages and pages of targeted organizations, from the point of view of a similarity metric that has been proposed as a foundation for visual phishing detection and from the point of view of a human operator

    Visual analysis of research paper collections using normalized relative compression

    Get PDF
    The analysis of research paper collections is an interesting topic that can give insights on whether a research area is stalled in the same problems, or there is a great amount of novelty every year. Previous research has addressed similar tasks by the analysis of keywords or reference lists, with different degrees of human intervention. In this paper, we demonstrate how, with the use of Normalized Relative Compression, together with a set of automated data-processing tasks, we can successfully visually compare research articles and document collections. We also achieve very similar results with Normalized Conditional Compression that can be applied with a regular compressor. With our approach, we can group papers of different disciplines, analyze how a conference evolves throughout the different editions, or how the profile of a researcher changes through the time. We provide a set of tests that validate our technique, and show that it behaves better for these tasks than other techniques previously proposed.Peer ReviewedPostprint (published version

    File Carving and Malware Identification Algorithms Applied to Firmware Reverse Engineering

    Get PDF
    Modern society depends on critical infrastructure (CI) managed by Programmable Logic Controllers (PLCs). PLCs depend on firmware, though firmware security vulnerabilities and contents remain largely unexplored. Attackers are acquiring the knowledge required to construct and install malicious firmware on CI. To the defender, firmware reverse engineering is a critical, but tedious, process. This thesis applies machine learning algorithms, from the le carving and malware identification fields, to firmware reverse engineering. It characterizes the algorithms\u27 performance. This research describes and characterizes a process to speed and simplify PLC firmware analysis. The system partitions binary firmwares into segments, labels each segment with a le type, determines the target architecture of code segments, then disassembles and performs rudimentary analysis on the code segments. The research discusses the system\u27s accuracy on a set of pseudo-firmwares. Of the algorithms this research considers, a combination of a byte-value frequency file carving algorithm and a support vector machine (SVM) algorithm using information gain (IG) for feature selection achieve the best performance. That combination correctly identifies the file types of 57.4% of non-code bytes, and the architectures of 85.3% of code bytes. This research applies the Firmware Disassembly System to a real-world firmware and discusses the contents
    corecore