17 research outputs found

    Youtube for software security?: Youtube Videos Provide Pointers for Microservice Security

    Get PDF
    Microservice applications are defined as software applications, which include services that interact with one another but failure of one service does not impact the execution of another. Microservice oriented design has become a popular software application design paradigm among software companies, such as Uber, Netflix, and Amazon as well as small startup companies due to delivery speed, reliability and greater flexibility. However, any insecure coding pattern in the code while developing microservice applications can make the entire system vulnerable to hackers. The goal of the abstract is to help software developers in building secure microservice applications. We have conducted a qualitative analysis of 6 youtube videos on microservice design antipatterns and an empirical study on open source microservice repositories. We have observed insecure coding patterns in those microservice repositories. We have defined 9 categories each with an associated pattern namely HTTP without TLS, authentication vs authorization, hard coded secret, weak encryption algorithm, use of default ports, violation of least privilege principle, insufficient logging, poor orchestration layer configuration, API service sharing and distributed deadlock. We advocate for future research that will create a taxonomy of insecure coding patterns so that developers can find and resolve insecure coding patterns during code review

    Towards Semantic Detection of Smells in Cloud Infrastructure Code

    Full text link
    Automated deployment and management of Cloud applications relies on descriptions of their deployment topologies, often referred to as Infrastructure Code. As the complexity of applications and their deployment models increases, developers inadvertently introduce software smells to such code specifications, for instance, violations of good coding practices, modular structure, and more. This paper presents a knowledge-driven approach enabling developers to identify the aforementioned smells in deployment descriptions. We detect smells with SPARQL-based rules over pattern-based OWL 2 knowledge graphs capturing deployment models. We show the feasibility of our approach with a prototype and three case studies.Comment: 5 pages, 6 figures. The 10 th International Conference on Web Intelligence, Mining and Semantics (WIMS 2020

    Detecting and Characterizing Propagation of Security Weaknesses in Puppet-based Infrastructure Management

    Full text link
    Despite being beneficial for managing computing infrastructure automatically, Puppet manifests are susceptible to security weaknesses, e.g., hard-coded secrets and use of weak cryptography algorithms. Adequate mitigation of security weaknesses in Puppet manifests is thus necessary to secure computing infrastructure that are managed with Puppet manifests. A characterization of how security weaknesses propagate and affect Puppet-based infrastructure management, can inform practitioners on the relevance of the detected security weaknesses, as well as help them take necessary actions for mitigation. To that end, we conduct an empirical study with 17,629 Puppet manifests mined from 336 open source repositories. We construct Taint Tracker for Puppet Manifests (TaintPup), for which we observe 2.4 times more precision compared to that of a state-of-the-art security static analysis tool. TaintPup leverages Puppet-specific information flow analysis using which we characterize propagation of security weaknesses. From our empirical study, we observe security weaknesses to propagate into 4,457 resources, i.e, Puppet-specific code elements used to manage infrastructure. A single instance of a security weakness can propagate into as many as 35 distinct resources. We observe security weaknesses to propagate into 7 categories of resources, which include resources used to manage continuous integration servers and network controllers. According to our survey with 24 practitioners, propagation of security weaknesses into data storage-related resources is rated to have the most severe impact for Puppet-based infrastructure management.Comment: 14 pages, currently under revie

    Detecting Missing Dependencies and Notifiers in Puppet Programs

    Full text link
    Puppet is a popular computer system configuration management tool. It provides abstractions that enable administrators to setup their computer systems declaratively. Its use suffers from two potential pitfalls. First, if ordering constraints are not specified whenever an abstraction depends on another, the non-deterministic application of abstractions can lead to race conditions. Second, if a service is not tied to its resources through notification constructs, the system may operate in a stale state whenever a resource gets modified. Such faults can degrade a computing infrastructure's availability and functionality. We have developed an approach that identifies these issues through the analysis of a Puppet program and its system call trace. Specifically, we present a formal model for traces, which allows us to capture the interactions of Puppet abstractions with the file system. By analyzing these interactions we identify (1) abstractions that are related to each other (e.g., operate on the same file), and (2) abstractions that should act as notifiers so that changes are correctly propagated. We then check the relationships from the trace's analysis against the program's dependency graph: a representation containing all the ordering constraints and notifications declared in the program. If a mismatch is detected, our system reports a potential fault. We have evaluated our method on a large set of Puppet modules, and discovered 57 previously unknown issues in 30 of them. Benchmarking further shows that our approach can analyze in minutes real-world configurations with a magnitude measured in thousands of lines and millions of system calls

    A Dataset for GitHub Repository Deduplication: Extended Description

    Full text link
    GitHub projects can be easily replicated through the site's fork process or through a Git clone-push sequence. This is a problem for empirical software engineering, because it can lead to skewed results or mistrained machine learning models. We provide a dataset of 10.6 million GitHub projects that are copies of others, and link each record with the project's ultimate parent. The ultimate parents were derived from a ranking along six metrics. The related projects were calculated as the connected components of an 18.2 million node and 12 million edge denoised graph created by directing edges to ultimate parents. The graph was created by filtering out more than 30 hand-picked and 2.3 million pattern-matched clumping projects. Projects that introduced unwanted clumping were identified by repeatedly visualizing shortest path distances between unrelated important projects. Our dataset identified 30 thousand duplicate projects in an existing popular reference dataset of 1.8 million projects. An evaluation of our dataset against another created independently with different methods found a significant overlap, but also differences attributed to the operational definition of what projects are considered as related.Comment: 33 pages, 33 figures, 17 listing
    corecore