
    PDF-Malware Detection: A Survey and Taxonomy of Current Techniques

    Portable Document Format, more commonly known as PDF, has become, in the last 20 years, a standard for document exchange and dissemination due to its portable nature and widespread adoption. The flexibility and power of this format are leveraged not only by benign users but also by hackers, who have been working to exploit various types of vulnerabilities, overcome security restrictions, and thereby turn the PDF format into one of the leading vectors for spreading malicious code. Analyzing the content of malicious PDF files to extract the main features that characterize the malware's identity and behavior is a fundamental task for modern threat intelligence platforms that need to learn how to automatically identify new attacks. This paper surveys the existing state of the art in systems for detecting malicious PDF files and organizes them in a taxonomy that separately considers the approaches used and the data analyzed to detect the presence of malicious code. © Springer International Publishing AG, part of Springer Nature 2018

    Detecting and Refactoring Operational Smells within the Domain Name System

    The Domain Name System (DNS) is one of the most important components of the Internet infrastructure. DNS relies on a delegation-based architecture, where resolution of names to their IP addresses requires resolving the names of the servers responsible for those names. The recursive structures of the interdependencies that exist between the name servers associated with each zone are called dependency graphs. System administrators' operational decisions have far-reaching effects on the DNS's qualities. They need to be soundly made to balance the availability, security and resilience of the system. We utilize dependency graphs to identify, detect and catalogue operational bad smells. Our method deals with smells at a high level of abstraction using a consistent taxonomy and reusable vocabulary, defined by a DNS Operational Model. The method will be used to build a diagnostic advisory tool that will detect configuration changes that might decrease the robustness or security posture of domain names before they go into production. Comment: In Proceedings GaM 2015, arXiv:1504.0244

    The cultural, ethnic and linguistic classification of populations and neighbourhoods using personal names

    There are growing needs to understand the nature and detailed composition of ethnic groups in today's increasingly multicultural societies. Ethnicity classifications are often hotly contested, but still greater problems arise from the quality and availability of classifications, with knock on consequences for our ability meaningfully to subdivide populations. Name analysis and classification has been proposed as one efficient method of achieving such subdivisions in the absence of ethnicity data, and may be especially pertinent to public health and demographic applications. However, previous approaches to name analysis have been designed to identify one or a small number of ethnic minorities, and not complete populations. This working paper presents a new methodology to classify the UK population and neighbourhoods into groups of common origin using surnames and forenames. It proposes a new ontology of ethnicity that combines some of its multidimensional facets; language, religion, geographical region, and culture. It uses data collected at very fine temporal and spatial scales, and made available, subject to safeguards, at the level of the individual. Such individuals are classified into 185 independently assigned categories of Cultural Ethnic and Linguistic (CEL) groups, based on the probable origins of names. We include a justification for the need of classifying ethnicity, a proposed CEL taxonomy, a description of how the CEL classification was built and applied, a preliminary external validation, and some examples of current and potential applications

    Evolutionary history of Leishmania killicki (synonymous Leishmania tropica) and taxonomic implications

    Background: Leishmania (L.) killicki is responsible for chronic cutaneous leishmaniasis. The taxonomic status of this parasite is still not well defined. On the one hand, it has been suggested that this taxon be included within the L. tropica complex; on the other, that it be considered a distinct phylogenetic complex. The present work represents a more detailed study of the evolutionary history of L. killicki relative to L. tropica and of the taxonomic implications. Methods: Thirty-five L. killicki and 25 L. tropica strains isolated from humans in several countries were characterized using the MultiLocus Enzyme Electrophoresis (MLEE) and the MultiLocus Sequence Typing (MLST) approaches. Results: The genetic and phylogenetic analyses strongly support that L. killicki belongs to the L. tropica complex. The study suggests the emergence of L. killicki by a founder effect followed by an independent evolution from L. tropica, but does not validate the species status of this taxon. In this context, we suggest calling this taxon L. killicki (synonymous L. tropica) until further epidemiological and phylogenetic studies justify the L. killicki denomination. Conclusions: These findings provided taxonomic and phylogenetic information on L. killicki and helped to better understand the evolutionary history of this taxon

    Dissection of a Bug Dataset: Anatomy of 395 Patches from Defects4J

    Well-designed and publicly available datasets of bugs are an invaluable asset to advance research fields such as fault localization and program repair, as they allow direct and fair comparison between competing techniques and also the replication of experiments. These datasets need to be deeply understood by researchers: the answer to questions like "which bugs can my technique handle?" and "for which bugs is my technique effective?" depends on the comprehension of properties related to bugs and their patches. However, such properties are usually not included in the datasets, and there is still no widely adopted methodology for characterizing bugs and patches. In this work, we deeply study 395 patches of the Defects4J dataset. Quantitative properties (patch size and spreading) were automatically extracted, whereas qualitative ones (repair actions and patterns) were manually extracted using a thematic analysis-based approach. We found that 1) the median size of Defects4J patches is four lines, and almost 30% of the patches contain only addition of lines; 2) 92% of the patches change only one file, and 38% have no spreading at all; 3) the top-3 most applied repair actions are addition of method calls, conditionals, and assignments, occurring in 77% of the patches; and 4) nine repair patterns were found for 95% of the patches, where the most prevalent, appearing in 43% of the patches, is on conditional blocks. These results are useful for researchers to perform advanced analysis on their techniques' results based on Defects4J. Moreover, our set of properties can be used to characterize and compare different bug datasets. Comment: Accepted for SANER'18 (25th edition of IEEE International Conference on Software Analysis, Evolution and Reengineering), Campobasso, Ital

    Survey of Machine Learning Techniques for Malware Analysis

    Coping with malware is getting more and more challenging, given its relentless growth in complexity and volume. One of the most common approaches in the literature is using machine learning techniques to automatically learn models and patterns behind such complexity, and to develop technologies for keeping pace with the speed of development of novel malware. This survey aims at providing an overview of the way machine learning has been used so far in the context of malware analysis. We systematize surveyed papers according to their objectives (i.e., the expected output, what the analysis aims to achieve), what information about malware they specifically use (i.e., the features), and what machine learning techniques they employ (i.e., what algorithm is used to process the input and produce the output). We also outline a number of problems concerning the datasets used in the works considered, and finally introduce the novel concept of malware analysis economics, regarding the study of existing tradeoffs among key metrics, such as analysis accuracy and economic costs