
    Human Factors in Secure Software Development

    While security research has made significant progress in the development of theoretically secure methods, software, and algorithms, software still ships with many possible exploits, many of which target the human factor. The human factor is often called "the weakest link" in software security. To address this, human factors research in security and privacy focuses on the users of technology and considers their security needs. The research then asks how technology can serve users while minimizing risks and empowering them to retain control over their own data. However, these concepts have to be implemented by developers, whose security errors may proliferate to all of their software's users. For example, software that stores data insecurely, fails to secure network traffic correctly, or otherwise does not adhere to secure programming best practices puts all of its users at risk. It is therefore critical that software developers implement security correctly. However, in addition to security rarely being a primary concern while producing software, developers may also lack awareness, knowledge, training, or experience in secure development. A lack of focus on usability in the libraries, documentation, and tools that they must use for security-critical components may exacerbate the problem by inflating the time and effort needed to "get security right". This dissertation focuses on how to support developers throughout the process of implementing software securely. The research aims to understand developers' use of resources, their mindsets as they develop, and how their background affects code security outcomes. Qualitative, quantitative, and mixed methods were employed online and in the laboratory, and large-scale datasets were analyzed. The research found that the information sources developers use can contribute to code (in)security: copying and pasting code from online forums produces functional code more quickly than official documentation does, but may introduce vulnerable code. We also compared the usability of cryptographic APIs, finding that poor usability, unsafe (possibly obsolete) defaults, and unhelpful documentation likewise lead to insecure code. Conversely, well-thought-out documentation and abstraction levels can improve an API's usability and may contribute to secure API usage. We found that developer experience can contribute to better security outcomes, and that studying students in lieu of professional developers can produce meaningful insights into developers' experiences with secure programming. We found that there is a multitude of online secure development advice, but that these advice sources are incomplete and may be insufficient for developers to find the help they need, which may cause them to turn to unvetted and potentially insecure resources. This dissertation supports the theses that (a) secure development is subject to human factor challenges and (b) security can be improved by addressing these challenges and supporting developers. The work presented in this dissertation has been seminal in establishing human factors in secure development research within the security and privacy community and has advanced the dialogue about the rigorous use of empirical methods in security and privacy research.
    In these research projects, we repeatedly found that usability issues in security and privacy mechanisms, development practices, and operation routines lead to the majority of security and privacy failures that affect millions of end users.
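
    The finding that unhelpful defaults and copied snippets lead to insecure cryptography can be made concrete with a small sketch. The snippet below is not taken from the dissertation; it assumes the PyCryptodome library and contrasts an ECB-mode call of the kind often pasted from forum answers with an authenticated GCM call.

        # Illustrative sketch (assumed library: PyCryptodome), not code from the
        # dissertation: contrasting an insecure cipher-mode choice with an
        # authenticated-encryption alternative.
        from Crypto.Cipher import AES
        from Crypto.Random import get_random_bytes

        key = get_random_bytes(16)
        message = b"credit card: 4111 1111 1111 1111"   # 32 bytes, a multiple of the block size

        # Insecure: ECB mode leaks plaintext structure (equal blocks encrypt equally)
        # and offers no integrity protection.
        ecb = AES.new(key, AES.MODE_ECB)
        insecure_ciphertext = ecb.encrypt(message)

        # Safer: GCM provides confidentiality plus an authentication tag; the nonce
        # must be stored with the ciphertext and never reused under the same key.
        gcm = AES.new(key, AES.MODE_GCM)
        ciphertext, tag = gcm.encrypt_and_digest(message)
        stored = (gcm.nonce, ciphertext, tag)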

    Dos and Don'ts of Machine Learning in Computer Security

    With the growing processing power of computing systems and the increasing availability of massive datasets, machine learning algorithms have led to major breakthroughs in many different areas. This development has influenced computer security, spawning a series of work on learning-based security systems, such as for malware detection, vulnerability discovery, and binary code analysis. Despite great potential, machine learning in security is prone to subtle pitfalls that undermine its performance and render learning-based systems potentially unsuitable for security tasks and practical deployment. In this paper, we take a critical look at this problem. First, we identify common pitfalls in the design, implementation, and evaluation of learning-based security systems. We conduct a study of 30 papers from top-tier security conferences within the past 10 years, confirming that these pitfalls are widespread in the current security literature. In an empirical analysis, we further demonstrate how individual pitfalls can lead to unrealistic performance and interpretations, obstructing the understanding of the security problem at hand. As a remedy, we propose actionable recommendations to support researchers in avoiding or mitigating the pitfalls where possible. Furthermore, we identify open problems when applying machine learning in security and provide directions for further research.
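
    One recurring pitfall in this space, data snooping, can be illustrated with a short sketch. The example below is not drawn from the paper; it assumes scikit-learn and synthetic data, and shows how fitting preprocessing on the full dataset leaks test information into training, while a pipeline fitted only on the training split avoids this.

        # Hedged sketch of the data-snooping pitfall (assumed library: scikit-learn;
        # synthetic data stands in for malware/benign feature vectors).
        from sklearn.datasets import make_classification
        from sklearn.model_selection import train_test_split
        from sklearn.preprocessing import StandardScaler
        from sklearn.pipeline import make_pipeline
        from sklearn.svm import LinearSVC

        X, y = make_classification(n_samples=500, n_features=20, random_state=0)

        # Pitfall: the scaler is fitted on all data, so statistics from future
        # test samples leak into the training process.
        X_scaled = StandardScaler().fit_transform(X)
        X_train, X_test, y_train, y_test = train_test_split(X_scaled, y, random_state=0)

        # Remedy: split first, then fit every preprocessing step inside a pipeline
        # on the training data only.
        X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
        model = make_pipeline(StandardScaler(), LinearSVC()).fit(X_train, y_train)
        print(model.score(X_test, y_test))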

    Challenging Software Developers: Dialectic as a Foundation for Security Assurance Techniques

    Development teams are increasingly expected to deliver secure code, but how can they best achieve this? Traditional security practice, which emphasises 'telling developers what to do' using checklists, processes and errors to avoid, has proved difficult to introduce. From analysis of industry interviews with a dozen experts in app development security, we find that secure development requires dialectic: a challenging dialogue between the developers and a range of counterparties, continued throughout the development cycle. Analysing a further survey of sixteen industry developer security advocates, we identify the six assurance techniques that are most effective at achieving this dialectic in existing development teams, and conclude that the introduction of these techniques is best driven by the developers themselves. Concentrating on these six assurance techniques, and the dialectical interactions they involve, has the potential to increase the security of development activities and thus improve software security for everyone.

    The enchanted house: An analysis of the interaction of intelligent personal home assistants (IPHAs) with the private sphere and its legal protection

    In less than five years, Alexa has become a familiar presence in many households, and even those who do not own one have stumbled into it, be it at a friend's house or in the news. Amazon Alexa and its friend Google Assistant represent an evolution of IoT: they have an advanced 'intelligence' based on cloud computing and machine learning; they collect data and process them to profile and understand users; and they are placed inside our home. I refer to them as intelligent personal and home assistants, or IPHAs. This research applies multidisciplinary resources to explore the phenomenon of IPHAs from two perspectives. From a socio-technical angle, the research reflects upon what happens to the private sphere and the home once IPHAs enter it. To do so, it draws on theories and concepts borrowed from history, behavioural science, STS, philosophy, and behavioural design. All these disciplines help highlight different attributes that individuals and society associate with the private sphere and the home. When the functioning of IPHAs is mapped against these attributes, it becomes possible to identify where Alexa and Assistant might have an impact: there is a potential conflict between the privacy expectations and norms existing in the home (as sanctuary of the private sphere) and the marketing interests introduced into the home by IPHAs' profiling. Because of their voice interaction, IPHAs are also potentially highly persuasive: they can influence and manipulate users and affect their autonomy and control in their daily lives. From the legal perspective, the research explores the application of the GDPR and the proposed e-Privacy Regulation to IPHAs, as legislative tools for the protection of the private sphere in horizontal relationships. The analysis focuses in particular on those provisions whose application to IPHAs is most challenging, based on the technology as well as on the socio-technical analysis above. Special attention is dedicated to users' consent to processing, the general principles of the GDPR, attributing the role of controller or processor to the stakeholders involved, profiling and automated decisions, data protection by design and by default, as well as spam and robocalls. For some of the issues, suggestions are offered on how to interpret and apply the legal framework in order to mitigate undesired effects. This is the case, for instance, of determining whether the owners of IPHAs should be considered controllers vis-à-vis the data of their guests, or of the implications of data protection by design and by default for the design of IPHAs. Some questions, however, require a wider debate at the societal and political level. This is the case of the behavioural design techniques used to entice users and stimulate them to use the vocal assistants, which present high levels of persuasion and can affect the agency and autonomy of individuals. The research brings forward the necessity of determining where the line should be drawn between acceptable practices and unacceptable ones.

    Beyond Traditional Software Development: Studying and Supporting the Role of Reusing Crowdsourced Knowledge in Software Development

    As software development becomes increasingly complex, developers often need to reuse others' code or knowledge made available online to tackle problems encountered during software development and maintenance. This phenomenon of using others' code or knowledge, often found on online forums, is referred to as crowdsourcing. A good example of crowdsourcing is posting a coding question on the Stack Overflow website and having others contribute code that solves it. Recently, the phenomenon of crowdsourcing has attracted much attention from researchers and practitioners, and recent studies show that crowdsourcing improves productivity and reduces time-to-market. However, like any solution, crowdsourcing brings with it challenges such as quality, maintenance, and even legal issues. The research presented in this thesis is the result of a series of large-scale empirical studies involving some of the most popular crowdsourcing platforms, such as Stack Overflow, Node Package Manager (npm), and the Python Package Index (PyPI). The focus of these empirical studies is to investigate the role of reusing crowdsourced knowledge, and more particularly crowd code, in the software development process. We first present two empirical studies on the reuse of knowledge from a crowdsourcing platform, namely Stack Overflow. We found that reusing knowledge from this platform has the potential to assist software development practices, specifically through source code reuse. However, relying on such crowdsourced knowledge might also negatively affect the quality of software projects. Second, we empirically examine the type of development knowledge constructed on crowdsourcing platforms, studying the use of trivial packages on the npm and PyPI platforms. We found that trivial packages are common and that developers tend to use them because they provide well-tested and well-implemented code. However, developers are concerned about the maintenance overhead of these trivial packages due to the extra dependencies they introduce. Finally, we used the gained knowledge to propose a pragmatic solution to improve the efficiency of relying on the crowd in software development: a rule-based technique that automatically detects commits that can skip the continuous integration process. We evaluate the performance of the proposed technique on a dataset of open-source Java projects. Our results show that such a technique can make continuous integration more efficient for projects that reuse code from crowdsourcing platforms. Among the findings of this thesis is that the way software is developed has changed dramatically: developers rely on crowdsourcing to address problems encountered during software development and maintenance. The results presented in this thesis provide new insights into how knowledge from these crowdsourced platforms is reused in software systems and how some of this knowledge can be better integrated into current software development processes and best practices.
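
    The rule-based CI-skip idea can be sketched in a few lines. The rules and file categories below are hypothetical illustrations, not the thesis' actual rules: a commit is flagged as skippable only if every changed file is non-source material such as documentation or repository metadata.

        # Minimal sketch of a rule-based CI-skip check (hypothetical rules, not the
        # technique evaluated in the thesis).
        DOC_SUFFIXES = (".md", ".rst", ".txt")
        META_FILES = {".gitignore", "LICENSE", "AUTHORS"}

        def can_skip_ci(changed_files):
            """Return True if the commit plausibly affects neither build nor tests."""
            if not changed_files:
                return False                      # stay on the safe side for empty commits
            for path in changed_files:
                name = path.rsplit("/", 1)[-1]
                if name in META_FILES or name.endswith(DOC_SUFFIXES):
                    continue
                return False                      # any source or config change keeps CI on
            return True

        # A docs-only commit would be skipped; a Java change would not.
        print(can_skip_ci(["README.md", "docs/usage.rst"]))   # True
        print(can_skip_ci(["src/main/java/App.java"]))        # False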

    The Example Guru: Suggesting Examples to Novice Programmers in an Artifact-Based Context

    Programmers in artifact-based contexts could likely benefit from skills that they do not realize exist. We define artifact-based contexts as contexts where programmers have a goal project, like an application or game, which they must figure out how to accomplish and can change along the way. Artifact-based contexts do not have quantifiable goal states, like the solution to a puzzle or the resolution of a bug in task-based contexts. Currently, programmers in artifact-based contexts have to seek out information, but may be unaware of useful information or choose not to seek out new skills. This is especially problematic for young novice programmers in blocks programming environments. Blocks programming environments often lack even minimal in-context support, such as auto-complete or in-context documentation. Novices programming independently in these blocks-based environments often plateau in the programming skills and API methods they use. This work aims to encourage novices in artifact-based programming contexts to explore new API methods and skills. One way to support novices may be with examples, as examples are effective for learning and highly available. In order to better understand how to use examples to support novice programmers, I first ran two studies exploring novices' use of and focus on example code. I used those results to design a system called the Example Guru. The Example Guru suggests example snippets to novice programmers that contain previously unused API methods or code concepts. Finally, I present an approach for semi-automatically generating content for this type of suggestion system, which reduces the amount of expert effort required to create suggestions. This work makes three contributions: 1) a better understanding of the difficulties novices have using example code, 2) a system that encourages exploration and use of new programming skills, and 3) an approach for generating content for a suggestion system with less expert effort.
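
    The core of the suggestion idea can be sketched as a comparison between the API methods a project already calls and a catalogue of example snippets. The catalogue entries and method names below are hypothetical; they are not the Example Guru's actual data or implementation.

        # Sketch of example suggestion for unused API methods (hypothetical catalogue
        # and method names, not the Example Guru's implementation).
        EXAMPLE_CATALOGUE = {
            "playSound": "example: trigger a sound effect when the sprite is clicked",
            "glideTo": "example: animate a sprite gliding to a target position",
            "broadcast": "example: coordinate two sprites with a broadcast message",
        }

        def suggest_examples(used_methods):
            """Return example descriptions for API methods the project has not used yet."""
            unused = set(EXAMPLE_CATALOGUE) - set(used_methods)
            return [EXAMPLE_CATALOGUE[method] for method in sorted(unused)]

        # A project that only moves sprites would be nudged toward sound and messaging.
        print(suggest_examples(["glideTo"]))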

    Code similarity and clone search in large-scale source code data

    Software development has benefited tremendously from the Internet: online code corpora enable instant sharing of source code, and online developer guides and documentation are readily available. Nowadays, duplicated code (i.e., code clones) exists not only within or across software projects but also between online code repositories and websites. We call these "online code clones." Like classic code clones between software systems, they can lead to license violations, bug propagation, and reuse of outdated code. Unfortunately, they are difficult to locate and fix since the search space in online code corpora is large and no longer confined to a local repository. This thesis presents a combined study of code similarity and online code clones. We empirically show that many code snippets on Stack Overflow are cloned from open source projects. Several of them have become outdated or violate their original license and are potentially harmful to reuse. To develop a solution for finding online code clones, we study various code similarity techniques to gain insights into their strengths and weaknesses. A framework called OCD for evaluating code similarity and clone search tools is introduced and used to compare 34 state-of-the-art techniques on pervasively modified code and boilerplate code. We also found that clone detection techniques can be enhanced by compilation and decompilation. Using the knowledge from the comparison of code similarity analysers, we create and evaluate Siamese, a scalable token-based clone search technique that uses multiple code representations. Our evaluation shows that Siamese scales to large-scale source code data of 365 million lines of code and offers high search precision and recall. Its clone search precision is comparable to that of seven state-of-the-art clone detection tools on the OCD framework. Finally, we demonstrate the usefulness of Siamese by applying the tool to find online code clones, automatically analyse clone licenses, and recommend tests for reuse.
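
    The token-based similarity idea behind clone search can be illustrated with a short sketch. The code below is not Siamese's actual pipeline, which combines several code representations with an inverted index for scale; it only shows how comparing token n-grams under a raw and an identifier-normalised representation can reveal a clone that renaming would otherwise hide.

        # Illustrative sketch of token-based clone similarity (not Siamese itself).
        import re

        KEYWORDS = {"for", "while", "if", "return", "int"}

        def tokens(code, normalise=False):
            toks = re.findall(r"[A-Za-z_]\w*|\S", code)
            if normalise:
                # second representation: collapse identifiers so renamed variables
                # no longer hide a clone
                toks = ["ID" if re.fullmatch(r"[A-Za-z_]\w*", t) and t not in KEYWORDS else t
                        for t in toks]
            return toks

        def jaccard(a, b, n=3, normalise=False):
            grams = lambda ts: {tuple(ts[i:i + n]) for i in range(len(ts) - n + 1)}
            ga, gb = grams(tokens(a, normalise)), grams(tokens(b, normalise))
            return len(ga & gb) / len(ga | gb) if ga | gb else 0.0

        original = "for (int i = 0; i < n; i++) { sum += a[i]; }"
        renamed = "for (int j = 0; j < n; j++) { total += a[j]; }"
        print(jaccard(original, renamed))                   # low: renaming hides the clone
        print(jaccard(original, renamed, normalise=True))   # 1.0: normalised view reveals it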

    Using Graph Databases to Address Network Complexity Problems that can Hinder Security Incident Response

    The network complexity problem in computer security incident response concerns the difficulty of understanding a computer network as it grows in size and scale. The larger the network grows, the more difficult reconnaissance becomes, yet reconnaissance is necessary to execute the correction and prevention measures that address issues arising during security incident response. Graph databases can help solve the problems relational databases face with large, tree-like structures such as computer networks, and they add the flexibility needed to cope with the mutability of those networks. This paper focuses on using graph databases to discover the blast radius of zero-day vulnerabilities on the fly, using the properties of graph databases to intuitively identify infection vectors that may be present during a zero-day vulnerability. Additionally, options for visualizing security data in ways that make the data more actionable are explored.
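
    The blast-radius idea can be sketched with a small in-memory graph. The hosts and edges below are hypothetical, and an in-memory graph library stands in for a graph database such as Neo4j; the point is only that reachability from the compromised host yields the potential blast radius of a zero-day.

        # Minimal sketch of a blast-radius query (hypothetical hosts; networkx used
        # in place of a graph database).
        import networkx as nx

        network = nx.DiGraph()
        network.add_edges_from([
            ("admin-vpn", "web-01"),
            ("web-01", "app-01"),      # web tier can reach the app tier
            ("app-01", "db-01"),       # app tier can reach the database
            ("app-01", "cache-01"),
            ("db-01", "backup-01"),
        ])

        def blast_radius(graph, compromised_host):
            """Hosts an attacker could plausibly reach from the compromised host."""
            return nx.descendants(graph, compromised_host)

        # If the zero-day lands on the web tier, everything downstream is in scope.
        print(blast_radius(network, "web-01"))   # {'app-01', 'db-01', 'cache-01', 'backup-01'}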