
    Essays on software vulnerability coordination

    Software vulnerabilities are software bugs with security implications. Exposure to a security bug makes a software system behave in unexpected ways when the bug is exploited. Because software vulnerabilities are thus a classical way to compromise a software system, their handling has long been coordinated in the global software industry in order to lessen the risks. This dissertation claims that the coordination occurs in a complex and open socio-technical system composed of decentralized software units and heterogeneous software agents, including not only software engineers but also other actors, from security specialists and software testers to attackers with malicious motives. Vulnerability disclosure is a classical example of the associated coordination: a security bug is made known to a software vendor by the discoverer of the bug, a third-party coordinator, or public media. The disclosure is then used to patch the bug. In addition to patching, the bug is typically archived in databases, cataloged and quantified for additional information, and communicated to users with a security advisory. Although commercial solutions have become increasingly important, the underlying coordination system is still governed by multiple stakeholders with vested interests. This governance has continued to result in various inefficiencies. Thus, this dissertation examines four themes: (i) disclosure of software vulnerabilities; (ii) coordination of these; (iii) their evolution across time; and (iv) automation potential. The philosophical position is rooted in scientific realism and positivism, while regression analysis forms the kernel of the methodology. Based on these themes, the results indicate that (a) when vulnerability disclosure has worked, it has been relatively efficient; the obstacles have been social rather than technical in nature, originating from the diverging interests of stakeholders who have different incentives. Furthermore, (b) this efficiency also applies to the coordination of different identifiers and classifications for the vulnerabilities disclosed. Longitudinally, (c) the evolution of software vulnerabilities across time also reflects distinct software and vulnerability life cycle models and the incentives underneath. Finally, (d) there is potential to improve coordination efficiency through software automation.
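    As a hedged illustration of the regression-based methodology described above, and not of the dissertation's actual models or data, a minimal count model of yearly disclosure volumes might look as follows; the CSV file, column names, and model choice are assumptions.

        # Hedged illustration only: the CSV file, column names, and model choice
        # are assumptions, not the dissertation's setup. A Poisson model checks
        # the longitudinal trend in disclosed vulnerability counts.
        import pandas as pd
        import statsmodels.api as sm
        import statsmodels.formula.api as smf

        yearly = pd.read_csv("disclosures_per_year.csv")  # assumed columns: year, count
        trend = smf.glm("count ~ year", data=yearly, family=sm.families.Poisson()).fit()
        print(trend.summary())  # the year coefficient summarizes the trend across time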

    Backup To The Rescue: Automated Forensic Techniques For Advanced Website-Targeting Cyber Attacks

    The last decade has seen a significant rise in non-technical users gaining a web presence, often via the easy-to-use functionality of Content Management Systems (CMS). In fact, over 60% of the world’s websites run on CMSs. Unfortunately, this huge user population has made CMS-based websites a high-profile target for hackers. Worse still, the vast majority of the website hosting industry has shifted to a “backup and restore” model of security, which relies on error-prone AV scanners to prompt non-technical users to roll back to a pre-infection nightly snapshot. My cyber forensics research directly addresses this emergent problem by developing next-generation techniques for the investigation of advanced cyber crimes. Driven by economic incentives, attackers abuse the trust in this economy: selling malware on legitimate marketplaces, pirating popular website plugins, and infecting websites post-deployment. Furthermore, attackers exploit these websites at scale by carelessly dropping thousands of obfuscated and packed malicious files on the webserver. This is counter-intuitive, since attackers are assumed to be stealthy. Despite the rise in web attacks, efficiently locating and accurately analyzing the malware dropped on compromised webservers has remained an open research challenge. This dissertation posits that the already collected nightly webserver backup snapshots contain all the information required to enable automated and scalable detection of website compromises. It presents a web attack forensics framework that leverages program analysis to automatically understand the webserver’s nightly backup snapshots, enabling the recovery of the temporal phases of a webserver compromise and of its origin within the website supply chain. Ph.D.
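    A minimal sketch of the snapshot-comparison starting point, assuming hypothetical backup directory paths and not reproducing the dissertation's framework, could look like this:

        # Minimal sketch (not the dissertation's framework): diff two nightly
        # backup snapshots of a webserver's document root to flag files added or
        # modified between them, a first step toward locating dropped malware.
        import hashlib
        from pathlib import Path

        def snapshot_hashes(root: Path) -> dict[str, str]:
            """Map each file's path, relative to root, to its SHA-256 digest."""
            return {
                str(p.relative_to(root)): hashlib.sha256(p.read_bytes()).hexdigest()
                for p in root.rglob("*") if p.is_file()
            }

        old = snapshot_hashes(Path("backups/2024-01-01/htdocs"))  # assumed snapshot paths
        new = snapshot_hashes(Path("backups/2024-01-02/htdocs"))

        added = sorted(set(new) - set(old))
        modified = sorted(f for f in new.keys() & old.keys() if new[f] != old[f])
        print("added:", added)
        print("modified:", modified)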

    A study of EU data protection regulation and appropriate security for digital services and platforms

    A law often has more than one purpose, more than one intention, and more than one interpretation. A meticulously formulated and context-agnostic law text will still, when faced with a field propelled by intense innovation, eventually become obsolete. The European Data Protection Directive is a good example of such legislation. It may be argued that the technological modifications brought on by the EU General Data Protection Regulation (GDPR) are nominal in comparison to the previous Directive, but from a business perspective the changes are significant and important. The Directive’s lack of a direct economic incentive for companies to protect personal data has changed with the Regulation, as companies may now have to pay severe fines for violating the legislation. The objective of the thesis is to establish the notion of trust as a key design goal for information systems handling personal data. This includes interpreting the EU legislation on data protection and using the interpretation as a foundation for further investigation. This interpretation is connected to the areas of analytics, security, and privacy concerns for intelligent service development. Finally, the centralised platform business model and its challenges are examined, and three main resolution themes for regulating platform privacy are proposed. The aims of the proposed resolutions are to create a more trustful relationship between providers and data subjects, while also improving the conditions for competition and thus providing data subjects with service alternatives. The thesis contributes new insights into the evolving privacy practices of the digital society at an important time of transition from service-driven business models to platform business models. Firstly, privacy-related regulation and state-of-the-art analytics development are examined to understand their implications for intelligent services that are based on automated processing and profiling. The ability to choose between providers of intelligent services is identified as the core challenge. Secondly, the thesis examines what is meant by appropriate security for systems that handle personal data, something the GDPR requires organisations to use without, however, specifying what can be considered appropriate. We propose a method for active network security in web software that is developed through the use of analytics for detection and by inserting data generators into a software installation. The active network security method is proposed as a framework for achieving compliance with the GDPR requirement that services and platforms use appropriate security. Thirdly, the platform business model is considered from the privacy point of view, along with the implications of “processing silos” for intelligent services. The centralised platform model is considered problematic from both the data subject and the competition standpoint. A resolution is offered for enabling user-initiated open data flow to counter the centralised “processing silos” and thereby to facilitate the introduction of decentralised platforms. The thesis provides an interdisciplinary analysis, considering both the law as it stands (lex lata) and, through argumentativist legal dogmatics, a proposed resolution (de lege ferenda) for how the legal framework ought to be adapted to fit the described environment. User-friendly Legal Science is applied as a theory framework to provide a holistic approach to answering the research questions. The User-friendly Legal Science theory has its roots in design science and offers a way towards achieving interdisciplinary research in the fields of information systems and legal science.
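    The "data generator" idea can be illustrated with a hedged sketch, not the thesis's implementation: seed synthetic canary records into an installation and scan outbound logs for them, so that any hit flags data propagation that no legitimate flow should produce. The file names below are assumptions.

        # Illustrative sketch of the "data generator" idea (not the thesis's
        # implementation): seed synthetic canary records and later scan an
        # outbound log for them; a hit indicates unauthorised data propagation.
        import json
        import secrets

        def generate_canaries(n: int, domain: str = "example.org") -> list[str]:
            """Create n unique fake email addresses used only as tripwires."""
            return [f"canary-{secrets.token_hex(8)}@{domain}" for _ in range(n)]

        def seed(path: str, canaries: list[str]) -> None:
            """Persist the canaries; in practice they would also be inserted into
            the monitored installation (database rows, config entries, and so on)."""
            with open(path, "w", encoding="utf-8") as f:
                json.dump(canaries, f)

        def scan_log(log_path: str, canary_path: str) -> list[str]:
            """Return log lines that mention any seeded canary value."""
            with open(canary_path, encoding="utf-8") as f:
                canaries = json.load(f)
            hits = []
            with open(log_path, encoding="utf-8", errors="replace") as log:
                for line in log:
                    if any(c in line for c in canaries):
                        hits.append(line.rstrip())
            return hits

        seed("canaries.json", generate_canaries(10))      # done once at deployment
        print(scan_log("outbound.log", "canaries.json"))  # run periodically for detection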

    Empirical Notes on the Interaction Between Continuous Kernel Fuzzing and Development

    Fuzzing has been studied and applied ever since the 1990s. Automated and continuous fuzzing has recently also been applied to open source software projects, including the Linux and BSD kernels. This paper concentrates on the practical aspects of continuous kernel fuzzing in four open source kernels. According to the results, there are over 800 unresolved crashes reported for the four kernels by the syzkaller/syzbot framework, many of which were reported a relatively long time ago. Interestingly, fuzzing-induced bugs have been resolved more rapidly in the BSD kernels. Furthermore, assertions and debug checks, use-after-frees, and general protection faults account for the majority of bug types in the Linux kernel. About 23% of the fixed bugs in the Linux kernel have gone through either code review or additional testing. Finally, only code churn provides a weak statistical signal for explaining the associated bug fixing times in the Linux kernel.
    Comment: The 4th IEEE International Workshop on Reliability and Security Data Analysis (RSDA), 2019 IEEE International Symposium on Software Reliability Engineering Workshops (ISSREW), Berlin, IEEE.
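    A hedged sketch of the reported statistical check, with an assumed data file and column names rather than the study's data, might regress bug fixing times on code churn as follows:

        # Hedged sketch of the paper's kind of check: does code churn explain the
        # time taken to fix fuzzing-induced kernel bugs? The data file and column
        # names (fix_time_days, churn_loc) are assumptions, not the study's data.
        import numpy as np
        import pandas as pd
        import statsmodels.formula.api as smf

        bugs = pd.read_csv("syzbot_fixed_bugs.csv")
        bugs["log_fix_time"] = np.log1p(bugs["fix_time_days"])  # days from report to fix
        bugs["log_churn"] = np.log1p(bugs["churn_loc"])         # lines changed by the fix

        fit = smf.ols("log_fix_time ~ log_churn", data=bugs).fit()
        print(fit.summary())  # a weak signal shows as a small, marginally significant slope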

    Practical and Efficient Runtime Taint Tracking

    Runtime taint tracking is a technique for controlling data propagation in applications. It is typically used to prevent disclosure of confidential information or to avoid application vulnerabilities. Taint tracking systems intercept application operations at runtime, associate metadata with the data being processed, and inspect the metadata to detect unauthorised data propagation. To keep the metadata up-to-date, every attempt by the application to access and process data is intercepted. To ensure that all data propagation is monitored, different categories of data (e.g. confidential and public data) are kept isolated. In practice, the interception of application operations and the isolation of different categories of data are hard to achieve. Existing applications, language interpreters, and operating systems need to be re-engineered, while keeping metadata up-to-date incurs significant overhead at runtime. In this thesis we show that runtime taint tracking can be implemented with minimal changes to existing infrastructure and with reduced overhead compared to previous approaches. In other words, we suggest methods to achieve both practical and efficient runtime taint tracking. Our key observation is that applications in specific domains are typically implemented in high-level languages and use a subset of the available language features. This facilitates the implementation of a taint tracking system because it needs to support only parts of a programming language and it may leverage features of the execution platform. This thesis explores three different application domains. We start with event processing applications in Java, for which we introduce a novel solution to achieve isolation and a practical method to declare restrictions about data propagation. We then focus on securing PHP web applications. We show that if taint tracking is restricted to a small part of an application, the runtime overhead is significantly reduced without sacrificing effectiveness. Finally, we target accidental data disclosure in Ruby web applications. Ruby emerges as an ideal choice for a practical taint tracking system because it supports meta-programming facilities that simplify interception and isolation.
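    A minimal, illustrative sketch of the core idea in Python, not of the thesis's Java, PHP, or Ruby systems: values carry a taint flag, string operations propagate it, and a sink refuses tainted data.

        # Minimal illustration of runtime taint tracking (not the thesis's systems):
        # values carry a taint flag, concatenation propagates it, and a sink
        # refuses to emit tainted data.
        class TaintedStr(str):
            """A string that remembers its content came from an untrusted source."""
            tainted = True

            def __add__(self, other):
                return TaintedStr(str.__add__(self, other))

            def __radd__(self, other):
                # Keeps the taint when a plain string is concatenated on the left.
                return TaintedStr(str.__add__(other, self))

        def is_tainted(value) -> bool:
            return getattr(value, "tainted", False)

        def sink(value: str) -> None:
            """Stand-in for an output channel that must not leak unchecked input."""
            if is_tainted(value):
                raise ValueError("refusing to output tainted data")
            print(value)

        user_input = TaintedStr("alice' OR '1'='1")             # untrusted source
        query = user_input + " -- attacker-controlled suffix"   # taint propagates
        sink("hello")                                           # untainted: allowed
        try:
            sink(query)                                         # tainted: blocked
        except ValueError as err:
            print("blocked:", err)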

    A Forensic Web Log Analysis Tool: Techniques and Implementation

    Methodologies presently in use to perform forensic analysis of web applications are decidedly lacking. Although the number of log analysis tools available is exceedingly large, most only employ simple statistical analysis or rudimentary search capabilities; more precisely, these tools were not designed to be forensically capable. The threat of online assault, the ever-growing reliance on the performance of necessary services conducted online, and the lack of efficient forensic methods in this area provide a background outlining the need for such a tool. This thesis not only presents a forensic log analysis framework but also outlines an innovative methodology for analyzing log files based on regular expressions, together with a variety of solutions to problems associated with existing tools. The implementation is designed to detect critical web application security flaws gleaned from event data contained within the access log files of the underlying Apache Web Service (AWS). Of utmost importance to a forensic investigator or incident responder is the generation of an event timeline preceding the incident under investigation. Regular expressions power the search capability of our framework by enabling the detection of a variety of injection-based attacks that represent significant timeline interactions. Knowledge of the underlying event structure of each access log entry is essential to efficiently parse log files and determine timeline interactions. Another feature of our tool is the ability to modify, remove, or add regular expressions; this addresses the need for investigators to adapt the environment with investigation-specific queries alongside the suggested default signatures. The regular expressions are signature definitions used to detect attacks against both applications whose functionality requires a web service and the service itself. The tool provides a variety of default vulnerability signatures to scan for and outputs the resulting detections.
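    A hedged sketch of the signature approach, with illustrative regular expressions and log-format assumptions rather than the tool's actual defaults, is shown below:

        # Illustrative sketch of the signature idea (not the tool itself): scan an
        # Apache access log with a small, editable set of regular-expression
        # signatures and report matching entries for timeline reconstruction.
        import re

        # Default signatures; an investigator can add, modify, or remove entries.
        SIGNATURES = {
            "sql_injection": re.compile(r"(?i)(union(\s|%20)+select|or(\s|%20)+1=1|sleep\(\d+\))"),
            "xss": re.compile(r"(?i)(<script|%3cscript)"),
            "path_traversal": re.compile(r"(\.\./|%2e%2e%2f)"),
        }

        # Common Log Format prefix: host ident user [timestamp] "request" ...
        LOG_LINE = re.compile(r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] "(?P<request>[^"]*)"')

        def scan(log_path: str):
            """Yield (timestamp, host, signature name, request) for each detection."""
            with open(log_path, encoding="utf-8", errors="replace") as log:
                for line in log:
                    entry = LOG_LINE.match(line)
                    if not entry:
                        continue
                    for name, pattern in SIGNATURES.items():
                        if pattern.search(entry["request"]):
                            yield entry["time"], entry["host"], name, entry["request"]

        for hit in scan("access.log"):  # assumed log file location
            print(hit)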

    Towards Redesigning Web Browsers with Security Principles

    Ph.D., Doctor of Philosophy

    A Common Digital Twin Platform for Education, Training and Collaboration

    The world is in a transition driven by digitalization: industrial companies and educational institutions are adopting Industry 4.0 and Education 4.0 technologies that digitalization enables. Furthermore, digitalization and the availability of smart devices and virtual environments have produced a generation of digital natives. These digital natives, whose smart devices have surrounded them since birth, have developed a new way to process information; instead of reading literature and writing essays, the digital native generation uses search engines, discussion forums, and online video content to study and learn. The evolved learning process of the digital native generation challenges the educational and industrial sectors to create natural training, learning, and collaboration environments for digital natives. Digitalization provides the tools to overcome this challenge; extended reality and digital twins enable high-level user interfaces that are natural for digital natives and their interaction with physical devices. Simulated training and education environments enable risk-free training in safety aspects, programming, and robot control. To create a more realistic training environment, digital twins make it possible to interface virtual and physical robots, so that training and learning take place on real devices through the virtual environment. This thesis proposes a common digital twin platform for education, training, and collaboration. The proposed solution enables the teleoperation of physical robots from distant locations, allowing location- and time-independent training and collaboration in robotics. In addition to teleoperation, the proposed platform supports social communication, video streaming, and resource sharing for efficient collaboration and education. The proposed solution enables research collaboration in robotics by allowing collaborators to utilize each other’s equipment regardless of the distance between the physical locations; sharing resources saves time and travel costs, and social communication provides the possibility to exchange ideas and discuss research. Students and trainees can utilize the platform to learn new skills in robot programming, control, and safety. Cybersecurity is considered from the planning phase to the implementation phase, and only cybersecure methods, protocols, services, and components are used to implement the presented platform. Securing the low-level communication layer of the digital twins is essential for the safe teleoperation of the robots. Cybersecurity is the key enabler of the proposed platform, and after implementation, periodic vulnerability scans and updates maintain it. This thesis discusses solutions and methods for cyber-securing an online digital twin platform. In conclusion, the thesis presents a common digital twin platform for education, training, and collaboration. The presented solution is cybersecure and accessible using mobile devices. The proposed platform, digital twin, and extended reality user interfaces contribute to the transition to Education 4.0 and Industry 4.0.
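    As a hedged sketch of securing the low-level communication layer, and not the platform's actual protocol, a teleoperation command could be sent over a TLS-wrapped socket as follows; the host, port, and message fields are assumptions.

        # Hedged sketch (not the platform's actual protocol): send one JSON
        # teleoperation command to a digital-twin endpoint over a TLS-secured
        # socket. Host, port, and message fields are illustrative assumptions.
        import json
        import socket
        import ssl

        HOST, PORT = "twin.example.org", 8883  # assumed endpoint

        command = {"robot": "arm-01", "action": "move_joint", "joint": 3, "angle_deg": 15.0}

        context = ssl.create_default_context()  # verifies the server certificate by default

        with socket.create_connection((HOST, PORT)) as raw_sock:
            with context.wrap_socket(raw_sock, server_hostname=HOST) as tls_sock:
                tls_sock.sendall(json.dumps(command).encode("utf-8") + b"\n")
                print(tls_sock.recv(1024).decode("utf-8", errors="replace"))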

    Blogs as Infrastructure for Scholarly Communication.

    This project systematically analyzes digital humanities blogs as an infrastructure for scholarly communication. This exploratory research maps the discourses of a scholarly community to understand the infrastructural dynamics of blogs and the Open Web. The text contents of 106,804 individual blog posts from a corpus of 396 blogs were analyzed using a mix of computational and qualitative methods. The analysis uses an experimental methodology (trace ethnography) combined with unsupervised machine learning (topic modeling) to perform an interpretive analysis at scale. Methodological findings show that topic modeling can be integrated with qualitative and interpretive analysis. Special attention must be paid to data fitness, or the shape and re-shaping practices involved in preparing data for machine learning algorithms. Quantitative analysis of computationally generated topics indicates that while the community writes about diverse subject matter, individual scholars focus their attention on only a couple of topics. Four categories of informal scholarly communication emerged from the qualitative analysis: quasi-academic, para-academic, meta-academic, and extra-academic. The quasi- and para-academic categories represent discourse with scholarly value within the digital humanities community but without an obvious path into formal publication and preservation. A conceptual model, the (in)visible college, is introduced for situating scholarly communication on blogs and the Open Web. An (in)visible college is a kind of scholarly communication that is informal, yet visible at scale. This combination of factors opens up a new space for the study of scholarly communities and communication. While (in)visible colleges are programmatically observable, care must be taken with any effort to count and measure knowledge work in these spaces. This is the first systematic, data-driven analysis of the digital humanities and lays the groundwork for subsequent social studies of digital humanities.
    Ph.D. Information, University of Michigan, Horace H. Rackham School of Graduate Studies
    http://deepblue.lib.umich.edu/bitstream/2027.42/111592/1/mcburton_1.pd
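    A hedged sketch of the topic-modeling step, using a toy corpus and assumed parameters rather than the study's actual pipeline, could look like this:

        # Hedged sketch of the topic-modeling step (the study's actual corpus,
        # tooling, and parameters are not reproduced here): fit LDA to blog-post
        # texts and print the top words per topic.
        from sklearn.decomposition import LatentDirichletAllocation
        from sklearn.feature_extraction.text import CountVectorizer

        posts = [
            "digital humanities conference notes on text encoding and archives",
            "teaching a seminar on media theory and pedagogy this semester",
            "new python script for topic modeling my blog corpus",
        ]  # toy stand-in for the 106,804 blog posts

        vectorizer = CountVectorizer(stop_words="english")
        counts = vectorizer.fit_transform(posts)

        lda = LatentDirichletAllocation(n_components=2, random_state=0)  # assumed topic count
        lda.fit(counts)

        terms = vectorizer.get_feature_names_out()
        for k, weights in enumerate(lda.components_):
            top = [terms[i] for i in weights.argsort()[::-1][:5]]
            print(f"topic {k}: {', '.join(top)}")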