
    Software for malicious macro detection

    This work presents a detailed study of the development of a software tool for detecting the Emotet virus in Microsoft Office files. Emotet has been wreaking havoc mainly in the business environment, from its beginnings as a banking Trojan to the present day. This polymorphic family has caused evident, incalculable, and global disruption to business activity, affecting companies regardless of size or sector, and has reached government agencies and ordinary citizens as well. Two main obstacles are intrinsic to detecting it: the obfuscation of its macros and its polymorphism. Our tool addresses precisely these two obstacles by analyzing the features of the macros and building a neural network that uses machine learning to recognize detection patterns and decide whether a macro is malicious. Drawing on an in-depth analysis of Emotet, our goal is to extract a set of features from the malicious macros and build a machine learning model for their detection. Following the feasibility study, design, and implementation of this project, the results support the aim of detecting Emotet using only static analysis and machine learning techniques. In tests on the final model, detection reached an accuracy of 84% with only 3% false positives. Grado en Ingeniería Informátic
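
The abstract does not list which macro features the tool extracts, so the following is only a minimal sketch of what static feature extraction from macro source text might look like. The feature names, the `macro_features` helper, and the choice of signals (character entropy, `Chr` calls, string concatenation, auto-execution entry points) are illustrative assumptions, not the authors' feature set.

```python
import math
import re
from collections import Counter


def shannon_entropy(text: str) -> float:
    """Shannon entropy in bits per character; 0.0 for empty input."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def macro_features(source: str) -> dict:
    """Compute a few hypothetical static features from VBA macro source text."""
    identifiers = re.findall(r"[A-Za-z_][A-Za-z0-9_]*", source)
    mean_ident_len = (
        sum(len(name) for name in identifiers) / len(identifiers)
        if identifiers
        else 0.0
    )
    return {
        "length": len(source),
        "line_count": len(source.splitlines()),
        # Obfuscated code often has an unusual character distribution.
        "entropy": shannon_entropy(source),
        # Chr(...) / ChrW(...) / Chr$(...) calls are a common way to hide strings.
        "chr_calls": len(
            re.findall(r"\bChrW?\$?\s*\(", source, flags=re.IGNORECASE)
        ),
        # '&' is the VBA string concatenation operator.
        "concat_ops": source.count("&"),
        "mean_identifier_length": mean_ident_len,
        # Entry points that run automatically when a document is opened.
        "has_auto_exec": bool(
            re.search(
                r"\b(AutoOpen|Document_Open|Workbook_Open)\b",
                source,
                flags=re.IGNORECASE,
            )
        ),
    }


sample = 'Sub AutoOpen()\n  x = Chr(104) & Chr(105)\nEnd Sub\n'
features = macro_features(sample)
```

A feature dictionary of this kind could be turned into a numeric vector and passed to any classifier; the neural network described in the abstract is not reproduced here.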

    Analyzing Web Server Access Log Files Using Data Mining Techniques

    Nowadays the web is considered not only a network for acquiring data, buying products, and obtaining services, but also a social environment for interaction and information sharing. As the number of web sites continues to grow, it becomes more difficult for users to find and extract information. As a solution to that problem, over the last decade web mining has been used to evaluate web sites, to personalize the information displayed to a user or set of users, or to adapt the indexing structure of a web site to meet the needs of its users. In this work we describe a methodology for web usage mining that enables the discovery of user access patterns. In particular, we are interested in whether the topology of the web site matches the desires of its users. The data collections used for analysis and interpretation of user viewing patterns are taken from the web server log files. Data mining techniques such as classification, clustering, and association rules are applied to the preprocessed data. The intent of this research is to propose techniques for improving user perception of and interaction with a web site.
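
The preprocessing step mentioned above can be illustrated with a short sketch. It assumes the server writes logs in the Common Log Format, identifies a user by host address, and splits sessions after 30 minutes of inactivity; none of these choices is stated in the abstract, and the function names are hypothetical.

```python
import re
from collections import Counter, defaultdict
from datetime import datetime, timedelta

# Common Log Format: host ident user [time] "method path protocol" status size
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+'
)


def parse_line(line):
    """Return (host, timestamp, path, status), or None if the line is malformed."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    timestamp = datetime.strptime(match.group("time"), "%d/%b/%Y:%H:%M:%S %z")
    return (
        match.group("host"),
        timestamp,
        match.group("path"),
        int(match.group("status")),
    )


def sessionize(lines, timeout=timedelta(minutes=30)):
    """Group successful requests per host into sessions split by inactivity."""
    by_host = defaultdict(list)
    for line in lines:
        record = parse_line(line)
        if record is not None and record[3] < 400:  # drop failed requests
            host, timestamp, path, _ = record
            by_host[host].append((timestamp, path))

    sessions = []
    for visits in by_host.values():
        visits.sort()
        current = [visits[0][1]]
        for (prev_time, _), (time, path) in zip(visits, visits[1:]):
            if time - prev_time > timeout:
                sessions.append(current)
                current = []
            current.append(path)
        sessions.append(current)
    return sessions


def transition_counts(sessions):
    """Count page-to-page transitions, a simple basis for access patterns."""
    counts = Counter()
    for pages in sessions:
        counts.update(zip(pages, pages[1:]))
    return counts


log_lines = [
    '10.0.0.1 - - [10/Oct/2020:13:00:00 +0000] "GET /index.html HTTP/1.1" 200 512',
    '10.0.0.1 - - [10/Oct/2020:13:05:00 +0000] "GET /products.html HTTP/1.1" 200 1024',
    '10.0.0.1 - - [10/Oct/2020:15:00:00 +0000] "GET /index.html HTTP/1.1" 200 512',
    '10.0.0.2 - - [10/Oct/2020:13:01:00 +0000] "GET /missing HTTP/1.1" 404 0',
]
sessions = sessionize(log_lines)
transitions = transition_counts(sessions)
```

Sessions and transition counts produced this way could then feed the clustering or association-rule steps; comparing frequent transitions with the site's link structure is one way to ask whether the topology matches how users actually navigate.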

    Workflow Provenance: from Modeling to Reporting

    Workflow provenance is a crucial part of a workflow system as it enables data lineage analysis, error tracking, workflow monitoring, usage pattern discovery, and so on. Integrating provenance into a workflow system, or modifying a workflow system to capture or analyze different provenance information, is burdensome and requires extensive development because provenance mechanisms rely heavily on the modelling, architecture, and design of the workflow system. Various tools and technologies exist for logging events in a software system. Unfortunately, logging tools and technologies are not designed for capturing and analyzing provenance information. Workflow provenance is not only about logging, but also about retrieving workflow-related information from logs. In this work, we propose a taxonomy of provenance questions and, guided by these questions, we created a workflow programming model, 'ProvMod', with a supporting run-time library to provide automated provenance and log analysis for any workflow system. The design and provenance mechanism of ProvMod is based on recommendations from prominent research and is easy to integrate into any workflow system. ProvMod offers Neo4j graph database support to manage semi-structured heterogeneous JSON logs. The log structure is adaptable to any NoSQL technology. For each provenance question in our taxonomy, ProvMod provides the answer with data visualization using Neo4j and the ELK Stack. Besides analyzing performance from various angles, we demonstrate the ease of integration by integrating ProvMod with Apache Taverna and evaluate ProvMod usability by engaging users. Finally, we present two Software Engineering research cases (clone detection and architecture extraction) where our proposed model ProvMod and provenance questions taxonomy can be applied to discover meaningful insights.