576 research outputs found

    Image-based malware classification hybrid framework based on space-filling curves

    There is a never-ending "arms race" between malware analysts and adversarial malicious code developers as malevolent programs evolve and countermeasures are developed to detect and eradicate them. Malware has become more complex in its intent and capabilities over time, which has prompted the need for constant improvement in detection and defence methods. Of particular concern are anti-analysis obfuscation techniques, such as packing and encryption, that malware developers employ to evade detection and thwart the analysis process. In such cases, malware is generally impervious to basic analysis methods, so analysts must use more invasive techniques to extract signatures for classification, and these techniques do not scale because of their complexity. In this article, we present a hybrid framework for malware classification designed to overcome the challenges incurred by current approaches. The framework incorporates novel static and dynamic malware analysis methods, in which static malware executables and dynamic process memory dumps are converted to images mapped through space-filling curves, from which visual features are extracted for classification. The framework is less invasive than traditional analysis methods in that no reverse engineering is required, nor does it suffer from the obfuscation limitations of static analysis. On a dataset of 13,599 obfuscated and non-obfuscated malware samples from 23 families, the framework outperformed both static and dynamic standalone methods, with precision, recall and accuracy scores of 97.6% each.
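    The byte-to-image step described in this abstract can be illustrated with a minimal sketch. The abstract does not name the space-filling curve or the image size, so the Hilbert curve, the grid order, and the names `hilbert_d2xy` and `bytes_to_hilbert_image` are assumptions chosen for illustration, not the authors' implementation.

```python
def hilbert_d2xy(n, d):
    """Map index d along a Hilbert curve to (x, y) on an n x n grid.

    n must be a power of two and 0 <= d < n * n.
    """
    x = y = 0
    t = d
    s = 1
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        # Rotate or flip the quadrant so sub-curves join end to end.
        if ry == 0:
            if rx == 1:
                x = s - 1 - x
                y = s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


def bytes_to_hilbert_image(data, order):
    """Place bytes on a 2**order square grid along a Hilbert curve.

    Assumption: input is truncated or zero-padded to fit the grid.
    Returns a list of rows; each cell is a grey level in 0..255.
    """
    n = 2 ** order
    grid = [[0] * n for _ in range(n)]
    for d, value in enumerate(data[: n * n]):
        x, y = hilbert_d2xy(n, d)
        grid[y][x] = value
    return grid


if __name__ == "__main__":
    # Hypothetical input: any byte string stands in for an executable.
    sample = bytes(range(16))
    for row in bytes_to_hilbert_image(sample, 2):
        print(row)
```

    Bytes that are adjacent in the file land in adjacent pixels, which is the locality property that makes such curves attractive for visual feature extraction. A simple row-by-row layout breaks that locality at every line wrap.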

    Text–to–Video: Image Semantics and NLP

    When automatically translating an arbitrary text into a visual story, the main challenge is to find a semantically close visual representation whose displayed meaning remains the same as that of the given text. In addition, the appearance of an image strongly influences how its meaning is conveyed to an observer. This thesis demonstrates that investigating both image semantics and the semantic relatedness between visual and textual sources enables us to tackle the challenging semantic gap and to find a semantically close translation from natural language to a corresponding visual representation. In recent years, social networking has attracted great interest, leading to an enormous and still increasing amount of data available online. Photo sharing sites like Flickr allow users to associate textual information with their uploaded imagery. This thesis therefore exploits this huge knowledge source of user-generated data, which provides initial links between images and words as well as other meaningful data. To approach visual semantics, this work presents various methods to analyze the visual structure as well as the appearance of images in terms of meaningful similarities, aesthetic appeal, and emotional effect on an observer. In detail, our GPU-based approach efficiently finds visual similarities between images in large datasets across visual domains and identifies various meanings for ambiguous words by exploring similarity in online search results. Further, we investigate the highly subjective aesthetic appeal of images and use deep learning to learn aesthetic rankings directly from a broad diversity of user reactions in social online behavior. To gain even deeper insights into the influence of visual appearance on an observer, we explore how simple image processing can change emotional perception and derive a simple but effective image filter.
To identify meaningful connections between written text and visual representations, we employ methods from Natural Language Processing (NLP). Extensive textual processing allows us to create semantically relevant illustrations for simple text elements as well as complete storylines. More precisely, we present an approach that resolves dependencies in textual descriptions to arrange 3D models correctly. Further, we develop a method that finds semantically relevant illustrations for texts of different types, based on a novel hierarchical querying algorithm. Finally, we present an optimization-based framework that is capable of generating picture stories in different styles that are not only semantically relevant but also visually coherent.

    14th SC@RUG 2017 proceedings 2016-2017


    Development of an autonomous distributed multiagent monitoring system for the automatic classification of end users

    The purpose of this study is to investigate the feasibility of constructing a software Multi-Agent based monitoring and classification system and using it to provide automated and accurate classification of end users developing applications in the spreadsheet domain. The result is the Multi-Agent Classification System (MACS). Microsoft .NET Windows Service based agents were used to develop the Monitoring Agents of MACS. These agents function autonomously to provide continuous and periodic monitoring of spreadsheet workbooks by content. .NET Windows Communication Foundation (WCF) Services technology was used together with the Service Oriented Architecture (SOA) approach to distribute the agents over the World Wide Web, in order to support monitoring and classification of multiple developers. The Prometheus agent-oriented design methodology and its accompanying Prometheus Design Tool (PDT) were employed for specifying and designing the agents of MACS, and Visual Studio .NET 2008 was used to create the agency in the Visual C# programming language. MACS was evaluated against classification criteria from the literature using real-time data collected from a target group of Excel spreadsheet developers over a network. The Monitoring Agents were configured to execute automatically, without any user intervention, as Windows service processes in the .NET web server application of the system. These distributed agents listen to and read the contents of Excel spreadsheet development activities in terms of file and author properties, functions and formulas used, and Visual Basic for Applications (VBA) macro code constructs. Data gathered by the Monitoring Agents from various resources over a period of time was collected and filtered by a Database Updater Agent residing in the .NET client application of the system.
This agent then transfers the data to an Oracle server database and stores it there via Oracle stored procedures for further processing that leads to the classification of the end-user developers. Oracle data mining classification algorithms (Naive Bayes, Adaptive Naive Bayes, Decision Trees, and Support Vector Machine) were used to analyse the results of the data gathering process in order to automate the classification of Excel spreadsheet developers. The accuracy of the predictions achieved by the models was compared. The comparison showed that the Naive Bayes classifier achieved the best results, with an accuracy of 0.978. MACS can therefore be used to provide a Multi-Agent based automated classification solution for spreadsheet developers with a high degree of accuracy.
