
    Efficient and Flexible Discovery of PHP Application Vulnerabilities

    The Web today is a growing universe of pages and applications teeming with interactive content. The security of such applications is of the utmost importance, as exploits can have a devastating impact on personal and economic levels. The number one programming language in Web applications is PHP, powering more than 80% of the top ten million websites. Yet it was not designed with security in mind and today bears a patchwork of fixes and inconsistently designed functions with often unexpected and hardly predictable behavior that typically yields a large attack surface. Consequently, it is prone to different types of vulnerabilities, such as SQL Injection or Cross-Site Scripting. In this paper, we present an interprocedural analysis technique for PHP applications based on code property graphs that scales well to large amounts of code and is highly adaptable in its nature. We implement our prototype using the latest features of PHP 7, leverage an efficient graph database to store code property graphs for PHP, and subsequently identify different types of Web application vulnerabilities by means of programmable graph traversals. We show the efficacy and the scalability of our approach by reporting on an analysis of 1,854 popular open-source projects, comprising almost 80 million lines of code.
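
    To illustrate the general idea of querying a code property graph for taint-style flows, the sketch below builds a toy graph and traverses it for source-to-sink paths. It is in the spirit of the programmable graph traversals described above but is not the authors' implementation or schema: the in-memory graph, the source/sink/sanitizer names, and the PHP snippet it encodes are assumptions for illustration, whereas the paper targets a dedicated graph database.

    # Minimal sketch (not the paper's implementation or graph schema): a toy
    # "code property graph" for one PHP data flow, stored as a directed graph,
    # plus a traversal that flags flows from user-controlled sources to
    # sensitive sinks that never pass through a sanitizer.
    import networkx as nx

    SOURCES = {"$_GET", "$_POST"}                 # attacker-controlled inputs
    SINKS = {"mysql_query", "echo"}               # SQLi / XSS sinks
    SANITIZERS = {"mysqli_real_escape_string", "htmlspecialchars"}

    def build_example_cpg() -> nx.DiGraph:
        """Data-flow edges for: $id = $_GET['id']; mysql_query("... $id ...");"""
        g = nx.DiGraph()
        g.add_edge("$_GET", "$id", kind="DATA_FLOW")
        g.add_edge("$id", "mysql_query", kind="DATA_FLOW")
        return g

    def find_taint_flows(g: nx.DiGraph):
        """Report source-to-sink paths that contain no sanitizing call."""
        flows = []
        for src in SOURCES & set(g.nodes):
            for snk in SINKS & set(g.nodes):
                for path in nx.all_simple_paths(g, src, snk):
                    if not any(node in SANITIZERS for node in path):
                        flows.append(path)
        return flows

    if __name__ == "__main__":
        for path in find_taint_flows(build_example_cpg()):
            print("potential injection:", " -> ".join(path))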

    A Hybrid Graph Neural Network Approach for Detecting PHP Vulnerabilities

    This paper presents DeepTective, a deep learning approach to detect vulnerabilities in PHP source code. Our approach implements a novel hybrid technique that combines Gated Recurrent Units and Graph Convolutional Networks to detect SQLi, XSS and OSCI vulnerabilities, leveraging both syntactic and semantic information. We evaluate DeepTective and compare it to the state of the art on an established synthetic dataset and on a novel real-world dataset collected from GitHub. Experimental results show that DeepTective achieves near perfect classification on the synthetic dataset, and an F1 score of 88.12% on the realistic dataset, outperforming related approaches. We validate DeepTective in the wild by discovering 4 novel vulnerabilities in established WordPress plugins. Comment: A poster version of this paper appeared as https://doi.org/10.1145/3412841.344213
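
    As a rough illustration of the hybrid sequence-plus-graph idea (a recurrent unit over token sequences combined with graph convolutions over a code graph), here is a minimal PyTorch sketch. The layer sizes, the single hand-rolled graph-convolution layer, and the four output classes are assumptions; this is not DeepTective's architecture or code.

    # Rough PyTorch sketch of a hybrid sequence + graph classifier: a GRU over
    # token sequences combined with one hand-rolled graph-convolution layer over
    # an AST/CFG-style adjacency matrix. Layer sizes, pooling, and the four
    # output classes are illustrative; this is not DeepTective's actual code.
    import torch
    import torch.nn as nn

    class HybridVulnClassifier(nn.Module):
        def __init__(self, vocab_size=5000, embed_dim=64, hidden=128, n_classes=4):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, embed_dim)
            self.gru = nn.GRU(embed_dim, hidden, batch_first=True)
            self.gcn = nn.Linear(embed_dim, hidden)      # one GCN layer: A_hat X W
            self.classifier = nn.Linear(2 * hidden, n_classes)

        def forward(self, tokens, adj):
            # tokens: (batch, seq_len) token ids; adj: (batch, seq_len, seq_len)
            x = self.embed(tokens)
            _, h = self.gru(x)                           # h: (1, batch, hidden)
            seq_repr = h.squeeze(0)
            deg = adj.sum(-1, keepdim=True).clamp(min=1) # row-normalize adjacency
            graph_repr = torch.relu(self.gcn((adj / deg) @ x)).mean(dim=1)
            return self.classifier(torch.cat([seq_repr, graph_repr], dim=-1))

    if __name__ == "__main__":
        model = HybridVulnClassifier()
        tokens = torch.randint(0, 5000, (2, 30))         # 2 samples, 30 tokens each
        adj = torch.eye(30).expand(2, -1, -1)            # self-loops only, for demo
        print(model(tokens, adj).shape)                  # torch.Size([2, 4])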

    The approaches to quantify web application security scanners quality: A review

    A web application security scanner is a computer program that assesses web application security using penetration testing techniques. The benefit of automated web application penetration testing is considerable: a scanner not only reduces the time, cost, and resources required for penetration testing, but also removes the reliance on the test engineer's human knowledge. Nevertheless, web application security scanners suffer from low test coverage and generate inaccurate test results. Consequently, experiments are frequently conducted to quantify a scanner's quality and to investigate its strengths and limitations. However, neither a standard methodology nor standard criteria are available for quantifying scanner quality. Hence, in this paper a systematic review is conducted to analyse the methodologies and criteria used for quantifying the quality of web application security scanners. The experimental methodologies and criteria that have been used are classified and reviewed following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) protocol. The objectives are to give practitioners an understanding of the methodologies and criteria available for measuring a scanner's test coverage, attack coverage, and vulnerability detection rate, and to provide hints for the development of the next testing framework, model, methodology, or criteria for measuring web application security scanner quality.
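
    For illustration, the kind of criteria such experiments report (vulnerability detection rate, false positive rate, precision) can be computed from a ground-truth set of seeded vulnerabilities and a scanner's findings. The sketch below uses fabricated identifiers and numbers and is not tied to any particular benchmark or methodology from the review.

    # Illustrative only: given a ground-truth set of seeded vulnerabilities and a
    # scanner's findings, derive detection rate, false positive rate, and
    # precision. The identifiers and numbers are fabricated for the example.
    def scanner_quality(ground_truth: set, reported: set, benign_probed: int):
        true_pos = len(ground_truth & reported)
        false_pos = len(reported - ground_truth)
        return {
            "detection_rate": true_pos / len(ground_truth) if ground_truth else 0.0,
            "false_positive_rate": false_pos / benign_probed if benign_probed else 0.0,
            "precision": true_pos / len(reported) if reported else 0.0,
        }

    if __name__ == "__main__":
        seeded = {"sqli-login", "xss-search", "xss-comment"}
        found = {"sqli-login", "xss-search", "csrf-profile"}
        print(scanner_quality(seeded, found, benign_probed=50))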

    Lom: discovering logic flaws within MongoDB-based web applications

    Logic flaws within web applications allow malicious operations to be triggered against the back-end database. Existing approaches to identifying logic flaws in database accesses are strongly tied to structured query language (SQL) statement construction and cannot be applied to the new generation of web applications that use not-only-SQL (NoSQL) databases as the storage tier. In this paper, we present Lom, a black-box approach for discovering many categories of logic flaws within MongoDB-based web applications. Our approach introduces a MongoDB operation model to support new features of MongoDB and models the application logic as a Mealy finite state machine. During the testing phase, test inputs that emulate state violation attacks are constructed to identify logic flaws at each application state. We apply Lom to several MongoDB-based web applications and demonstrate its effectiveness.
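
    To illustrate the state-machine view of application logic, the sketch below encodes a tiny Mealy-style transition table and flags request sequences that violate it, loosely in the spirit of a state violation attack. The states, requests, and MongoDB-style operations are invented for the example and are not Lom's actual model.

    # Minimal sketch: application logic as a Mealy-style transition table mapping
    # (state, client request) to (next state, expected database operation), and a
    # replay check that flags sequences the model forbids (a state violation).
    # States, requests, and MongoDB-style operations are invented for the example.
    TRANSITIONS = {
        ("anonymous", "login"):      ("authenticated", "users.find"),
        ("authenticated", "add"):    ("cart_filled", "cart.insert"),
        ("cart_filled", "checkout"): ("ordered", "orders.insert"),
    }

    def check_sequence(requests, start="anonymous"):
        """Replay a request sequence; report the first forbidden transition."""
        state = start
        for req in requests:
            nxt = TRANSITIONS.get((state, req))
            if nxt is None:
                return f"state violation: '{req}' not allowed in state '{state}'"
            state, _expected_db_op = nxt
        return f"sequence accepted, final state '{state}'"

    if __name__ == "__main__":
        print(check_sequence(["login", "add", "checkout"]))  # legitimate flow
        print(check_sequence(["login", "checkout"]))         # skips adding items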

    Detecting Web Vulnerabilities in an Intermediate Language by Resorting to Machine Learning Techniques

    Master's thesis, Data Science, Universidade de Lisboa, Faculdade de Ciências, 2020. The number of vulnerabilities has grown exponentially over the last years, with SQL Injection being especially troublesome for web applications. In parallel, novel research has shown the potential of Machine Learning to find vulnerabilities, which can aid experts to reduce the search space or even classify programs on its own. Previous work, however, rarely includes SQL Injection or considers popular server-side languages for web application development like PHP. In our work, we construct a Deep Learning model capable of classifying PHP excerpts as vulnerable (or not) to SQL Injection. We use an intermediate language to represent the excerpts and interpret them as text, resorting to well-studied Natural Language Processing techniques. This work can help back-end programmers discover SQL Injection at an early stage of the project, avoiding attacks whose damage would eventually be costly to repair. We also investigate which information should be fed to the model. Hence, we built four datasets (the Opcode Dataset, the Opcode+Operand Dataset, the Slice Dataset, and the Simplified Slice Dataset) from the bytecode dataset, each representing the PHP excerpts differently. This approach is a simpler alternative to the complex data structures previously used to represent code's control flow. For each of those datasets, we performed several experiments to evaluate alternative configurations for the model. For all datasets, we managed to find a setting that leads to a score, on average, above 60% for accuracy, precision, and recall.
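
    As a deliberately simple stand-in for the thesis's deep learning model, the sketch below shows the underlying framing only: treat each excerpt's opcode sequence as a text document and train a classifier on it. The toy opcode strings, the labels, and the TF-IDF plus logistic regression pipeline are illustrative assumptions, not the thesis's model or data.

    # A deliberately simple stand-in for the thesis's neural model, showing the
    # framing only: treat an excerpt's opcode sequence as a text document and
    # train a classifier on it. The opcode strings, labels, and the TF-IDF +
    # logistic regression pipeline are illustrative assumptions.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline

    samples = [
        "FETCH_R CONCAT INIT_FCALL SEND_VAL DO_FCALL",  # query built from raw input
        "INIT_FCALL SEND_VAL DO_FCALL ASSIGN RETURN",   # constant / sanitized query
    ]
    labels = [1, 0]  # 1 = vulnerable to SQL injection, 0 = not vulnerable

    model = make_pipeline(
        TfidfVectorizer(token_pattern=r"\S+"),  # opcodes are whitespace-separated
        LogisticRegression(),
    )
    model.fit(samples, labels)
    print(model.predict(["FETCH_R CONCAT DO_FCALL"]))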

    M-STAR: A Modular, Evidence-based Software Trustworthiness Framework

    Despite years of intensive research in the field of software vulnerability discovery, exploits are becoming ever more common. Consequently, it is more necessary than ever to choose software configurations that minimize systems' exposure surface to these threats. In order to support users in assessing the security risks induced by their software configurations and in making informed decisions, we introduce M-STAR, a Modular Software Trustworthiness ARchitecture and framework for probabilistically assessing the trustworthiness of software systems, based on evidence such as their vulnerability history and source code properties. Integral to M-STAR is a software trustworthiness model, consistent with the concept of computational trust. Computational trust models are rooted in Bayesian probability and Dempster-Shafer belief theory, offering mathematical soundness and expressiveness to our framework. To evaluate our framework, we instantiate M-STAR for Debian Linux packages and investigate real-world deployment scenarios. In our experiments with real-world data, M-STAR could assess the relative trustworthiness of complete software configurations with an error of less than 10%. Due to its modular design, our proposed framework is agile, as it can incorporate future advances in the field of code analysis and vulnerability prediction. Our results point out that M-STAR can be a valuable tool for system administrators, regular users and developers, helping them assess and manage risks associated with their software configurations. Comment: 18 pages, 13 figures
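
    As a small illustration of the computational-trust style of reasoning mentioned above, the sketch below derives a belief/disbelief/uncertainty opinion and an expected trustworthiness from counts of good and bad observations, in the manner of Beta-reputation or subjective-logic models. The evidence counts and the mapping from vulnerability history to observations are assumptions, not M-STAR's actual model.

    # Minimal sketch of a Beta-reputation / subjective-logic style opinion of the
    # kind computational-trust models build on: positive and negative observations
    # (e.g. months without vs. with reported vulnerabilities) yield belief,
    # disbelief, uncertainty, and an expected trustworthiness. The counts and the
    # mapping from vulnerability history to evidence are assumptions, not M-STAR's
    # actual model.
    def trust_opinion(positive: int, negative: int, prior_weight: float = 2.0):
        total = positive + negative + prior_weight
        belief = positive / total
        disbelief = negative / total
        uncertainty = prior_weight / total
        expected = belief + 0.5 * uncertainty  # uniform base rate of 0.5
        return {"belief": belief, "disbelief": disbelief,
                "uncertainty": uncertainty, "expected_trust": expected}

    if __name__ == "__main__":
        print(trust_opinion(positive=32, negative=4))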