
    Analyzing the Management of Issue Reports and Technical Dependencies in Open-Source Software Projects

    Modern software development relies on open-source software to facilitate reuse and reduce redundant work. Software developers use open-source packages in their projects without having insight into how these components are developed and maintained. The aim of this thesis is to develop approaches for analyzing issue and dependency management in software projects. Software projects organize their work with issue trackers, tools for tracking issues such as development tasks, bug reports, and feature requests. By analyzing issue handling in more than 4,000 open-source projects, we found that many issues are left open for long periods of time, which can result in bugs and vulnerabilities not being fixed in a timely manner. This thesis proposes a method for predicting the time it takes to resolve an issue, based on the historical data available in issue trackers. Predicting issue lifetime can help software project managers prioritize issues and allocate resources accordingly. Another problem studied in this thesis is how software dependencies are used. Software developers often include third-party open-source packages in their project code as dependencies, and these dependencies in turn have dependencies of their own, so a complex network of dependency relationships exists among open-source software packages. This thesis analyzes the structure and evolution of the dependency networks of three popular programming languages and proposes an approach to measure their growth and evolution. It further demonstrates that dependency network analysis can quantify the likelihood of acquiring vulnerabilities through software packages and how that likelihood changes over time. The approaches and findings developed here can help bring transparency to open-source projects with respect to how issues are handled and how dependencies are updated.
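    The thesis does not reproduce its tooling here, but the kind of dependency-network analysis it describes can be sketched. The fragment below (the package names and graph structure are hypothetical) builds a small dependency graph and lists the packages that are transitively exposed to a known-vulnerable package through a dependency chain.

```python
# Hypothetical sketch: estimating transitive exposure to a vulnerable
# package in a dependency network. Package names are made up.
from collections import deque

# Directed edges: package -> packages it depends on.
depends_on = {
    "app-a": ["web-kit", "json-util"],
    "app-b": ["json-util"],
    "web-kit": ["http-core"],
    "json-util": [],
    "http-core": [],
}

def transitive_deps(pkg, graph):
    """Return every package reachable from `pkg` via dependency edges."""
    seen, queue = set(), deque(graph.get(pkg, []))
    while queue:
        dep = queue.popleft()
        if dep not in seen:
            seen.add(dep)
            queue.extend(graph.get(dep, []))
    return seen

vulnerable = {"http-core"}  # packages with a known vulnerability

exposed = [p for p in depends_on
           if transitive_deps(p, depends_on) & vulnerable]
print("Exposed via a dependency chain:", exposed)  # ['app-a', 'web-kit']
```

    Repeating this reachability computation over dated snapshots of a real package registry is one way such exposure could be tracked over time, which is the spirit of the evolution analysis the abstract describes.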

    Towards Understanding and Improving Code Review Quality

    Code review is an essential element of any mature software development project and is key to ensuring the long-term quality of the code base. Code review aims at evaluating code contributions submitted by developers before they are committed into the project's version control system, and it is considered one of the most effective QA practices for software projects. In principle, the code review process should improve the quality of committed code changes. In practice, however, the execution of this process can still allow bugs to enter the codebase unnoticed. Moreover, the notion of the quality of the code review process is not limited to the quality of the source code that passed review; the quality of the code review process can also affect how successful a software development project is. For instance, in the world of open source software (OSS), the particular execution of the code review process may encourage or deter contributions from "external" developers, the people who are essential to OSS projects. We claim that by analyzing various software artifacts and assessing developers' daily experience, we can create models that represent the established code review processes and highlight potentially weak points in their execution. With this information, stakeholders can channel the available resources to address the deficiencies in their code review process. To support this claim, we perform the following studies. First, we study the tool-based code review processes of two large OSS projects that use the traditional model of evaluating code contributions. We analyse the software artifacts extracted from the issue tracking systems to understand what can affect code review response time and eventual outcome. We found that code review is affected not only by technical factors (e.g., patch size and priority) but also by non-technical ones (e.g., developers' affiliation and experience). Second, we investigate the quality of contributions that passed the code review process and explore the relationships between the reviewers' code inspections and a set of factors, both personal and social in nature, that might affect the quality of such inspections. By mining the software repository and the issue tracking system of the Mozilla project, and applying the SZZ algorithm to detect bug-inducing changes, we found that 54% of the reviewed changes introduced bugs into the code. Our findings also showed that both personal metrics, such as reviewer workload and experience, and participation metrics, such as the number of involved developers, are associated with the quality of the code review process. Third, we further study code review quality by examining developers' attitudes and perceptions of review quality, as well as the factors they believe to be important. To accomplish this, we surveyed 88 Mozilla core developers and applied grounded theory to analyze their responses. The results provide developer insights into how they define review quality, what factors contribute to how they evaluate submitted code, and what challenges they face when performing review tasks. Finally, we examined the code review processes executed in a completely different environment: an industrial project that uses the pull-based development model. Our case study was the Active Merchant project developed by Shopify Inc. We performed a quantitative analysis of its software repository to understand the effects of a variety of factors on pull request review time and outcome. We then surveyed the developers to understand their perception of the review process and how it differs from developers' perceptions in the traditional development model. The studies presented in this thesis focus on code review processes performed by projects of different natures: OSS vs. industrial, traditional vs. pull-based. Nevertheless, we observed similar patterns in the execution of code review that stakeholders should be aware of to maintain the long-term health of their projects.
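    The SZZ algorithm named above is well documented in the literature; as a rough sketch of its core step (not the authors' implementation, which applies further refinements), the fragment below blames the lines removed by a bug-fix commit to recover the commits that last touched them, the candidate bug-inducing changes.

```python
# Minimal SZZ-style sketch: given a bug-fix commit in a local git repo,
# blame the lines the fix removed to find candidate bug-inducing commits.
import re
import subprocess

def git(*args, cwd="."):
    return subprocess.run(["git", *args], cwd=cwd, check=True,
                          capture_output=True, text=True).stdout

def bug_inducing_candidates(fix_commit, repo="."):
    candidates = set()
    files = git("diff", "--name-only", f"{fix_commit}^", fix_commit,
                cwd=repo).splitlines()
    for path in files:
        # Hunk headers look like "@@ -12,3 +12,4 @@": the '-' side lists
        # the lines the fix removed or replaced in the parent version.
        diff = git("diff", "--unified=0", f"{fix_commit}^", fix_commit,
                   "--", path, cwd=repo)
        for start, count in re.findall(r"@@ -(\d+)(?:,(\d+))? \+", diff):
            n = int(count) if count else 1
            if n == 0:  # pure insertion: nothing to blame
                continue
            end = int(start) + n - 1
            blame = git("blame", "--porcelain", "-L", f"{start},{end}",
                        f"{fix_commit}^", "--", path, cwd=repo)
            # Porcelain output prefixes each line record with the commit hash.
            candidates.update(re.findall(r"^([0-9a-f]{40}) \d+ \d+",
                                         blame, flags=re.M))
    return candidates
```

    Full SZZ additionally filters out cosmetic changes and commits made after the bug was reported; this sketch deliberately stops at the blame step.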

    Pull request latency explained: an empirical overview

    Estimating pull request latency is an essential application of effort estimation in the pull-based development scenario. It can help reviewers sort the pull request queue, remind developers about review processing time, speed up the review process, and accelerate software development. There is a lack of work that systematically organizes the factors affecting pull request latency, and no related work discusses how the importance of these factors differs across scenarios and contexts. In this paper, we collected relevant factors through a literature review. We then assessed their relative importance in five scenarios and six different contexts using a mixed-effects linear regression model. The most important factors differ across scenarios: the length of the description matters most when pull requests are submitted, while the existence of comments matters most when pull requests are closed, when CI tools are used, and when the contributor and the integrator are different people. When comments exist, the latency of the first comment is the most important factor. The influence of factors can also change across contexts. For example, the number of commits in a pull request has a more significant impact on latency at close time than at submission time, due to changes in the contribution brought about by the review process. Both human and bot comments are positively correlated with pull request latency; compared with human comments, the bot's first comment is more strongly correlated with latency, but the number of bot comments is less so. Future research and tool implementations need to consider the impact of different contexts. Researchers can conduct related studies based on our publicly available datasets and replication scripts.
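    As an illustration of the modelling approach named in the abstract, the sketch below fits a mixed-effects linear regression with a random intercept per project using statsmodels. The column names and the synthetic data are assumptions for the example, not the paper's dataset or full factor set.

```python
# Hypothetical sketch: mixed-effects linear model of (log) pull request
# latency with a random intercept per project. Columns are made up.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 400
df = pd.DataFrame({
    "project": rng.choice(["p1", "p2", "p3", "p4"], size=n),
    "desc_length": rng.integers(0, 2000, size=n),   # description length
    "num_commits": rng.integers(1, 30, size=n),     # commits in the PR
    "has_comments": rng.integers(0, 2, size=n),     # any review comments?
    "latency_h": rng.gamma(2.0, 24.0, size=n),      # hours until close
})
df["log_latency"] = np.log1p(df["latency_h"])       # latency is skewed

# Fixed effects for the PR-level factors; random intercept per project
# absorbs project-to-project baseline differences (the "context").
model = smf.mixedlm("log_latency ~ desc_length + num_commits + has_comments",
                    data=df, groups=df["project"])
print(model.fit().summary())
```

    Comparing the fitted coefficients across subsets of the data (e.g., PRs at submission vs. at close) is the kind of scenario-by-scenario contrast the paper reports.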

    Reification as the key to augmenting software development: an object is worth a thousand words

    Software development has become more and more pervasive, influencing almost every human activity. To fit so many different scenarios and constantly implement new features, software developers have adopted methodologies with tight development cycles, sometimes with more than one release per day. With the constant growth of modern software projects and the consequent expansion of development teams, understanding all the components of a system becomes a task too big to handle. In this context, understanding the cause of an error or identifying its source is not easy, and correcting the erroneous behavior can lead to unexpected downtime of vital services. Being able to keep track of software defects, usually referred to as bugs, is crucial to the development of a project and to containing maintenance costs. For this purpose, the correctness and completeness of the available information have a great impact on the time required to understand and solve a problem. In this thesis we present an overview of the techniques currently used to report software defects. We show why we believe the state of the art needs to be improved, and present a set of approaches and tools to collect data from software failures, model it, and turn it into actionable knowledge. Our goal is to show that the data generated from errors can have a great impact on daily software development, and how it can be employed to augment the development environment and assist software engineers in building and maintaining software systems.
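    One way to picture the reification idea in the title: capture a failure as a structured object (frames plus local-variable snapshots) that tools can query, instead of a flat textual stack trace. The sketch below is illustrative only; the field choices are assumptions, not the thesis' actual failure model.

```python
# Illustrative sketch: reify the current exception into a queryable
# object instead of printing a textual stack trace.
import sys
import traceback
from dataclasses import dataclass, field

@dataclass
class FrameRecord:
    function: str
    filename: str
    lineno: int
    locals: dict  # variable name -> repr of its value at failure time

@dataclass
class FailureRecord:
    exc_type: str
    message: str
    frames: list = field(default_factory=list)

def reify_current_exception():
    exc_type, exc, tb = sys.exc_info()
    record = FailureRecord(exc_type.__name__, str(exc))
    for frame, lineno in traceback.walk_tb(tb):
        record.frames.append(FrameRecord(
            function=frame.f_code.co_name,
            filename=frame.f_code.co_filename,
            lineno=lineno,
            locals={k: repr(v) for k, v in frame.f_locals.items()},
        ))
    return record

try:
    1 / 0
except ZeroDivisionError:
    failure = reify_current_exception()
    print(failure.exc_type, failure.frames[-1].function)
```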

    A Framework for anonymous background data delivery and feedback

    The industry's current methods of collecting background data reflecting diagnostic and usage information are often opaque and require users to place a lot of trust in the entity receiving the data. For vendors, a centralized database of potentially sensitive data is a privacy-protection headache and a potential liability should that database be breached. Unfortunately, high-profile privacy failures are not uncommon, so many individuals and companies are understandably skeptical and choose not to contribute any information. This is a shame, since the data could be used to improve reliability, strengthen security, or support valuable academic research into real-world usage patterns. We propose, implement, and evaluate a framework for non-realtime anonymous data collection, aggregation for analysis, and feedback. Departing from the usual “trusted core” approach, we aim to maintain reporters’ anonymity even if the centralized part of the system is compromised. We design a peer-to-peer mix network and its protocol, tuned to the properties of background diagnostic traffic. Our system delivers data to a centralized repository while maintaining (i) source anonymity, (ii) privacy in transit, and (iii) the ability to provide analysis feedback back to the source. By removing the core’s ability to identify the source of data and to track users over time, we drastically reduce its attractiveness as an attack target and allow vendors to make concrete and verifiable privacy and anonymity claims.
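    The mix-network idea can be illustrated with onion-style layered encryption: the reporter wraps a report in one layer per relay, and each relay can peel only its own layer, so no single node links the source to the payload. The toy below uses symmetric Fernet keys for brevity (a real mix network would use the relays' public keys) and omits the batching, delays, and feedback path the framework describes. It requires the `cryptography` package.

```python
# Toy onion layering for a three-relay mix network. Each relay holds one
# key and removes exactly one encryption layer.
from cryptography.fernet import Fernet

relay_keys = [Fernet.generate_key() for _ in range(3)]

def wrap(report: bytes, keys) -> bytes:
    # Encrypt for the last relay first, so the first relay peels first.
    for key in reversed(keys):
        report = Fernet(key).encrypt(report)
    return report

def relay_peel(blob: bytes, key: bytes) -> bytes:
    return Fernet(key).decrypt(blob)

onion = wrap(b'{"crash_id": "example", "module": "demo"}', relay_keys)
for key in relay_keys:  # each relay in turn removes its layer
    onion = relay_peel(onion, key)
print(onion)  # the repository sees the report, not its source
```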

    On Wasted Contributions: Understanding the Dynamics of Contributor-Abandoned Pull Requests

    Pull-based development has enabled numerous volunteers to contribute to open-source projects with fewer barriers. Nevertheless, a considerable number of pull requests (PRs) with valid contributions are abandoned by their contributors, wasting the effort and time put in by both contributors and maintainers. To better understand the underlying dynamics of contributor-abandoned PRs, we conduct a mixed-methods study using both quantitative and qualitative methods. We curate a dataset of 265,325 PRs, including 4,450 abandoned ones, from ten popular and mature GitHub projects, and measure 16 features characterizing PRs, contributors, review processes, and projects. Using statistical and machine learning techniques, we find that complex PRs, novice contributors, and lengthy reviews have a higher probability of abandonment, and that the rate of PR abandonment fluctuates alongside the projects' maturity or workload. To identify why contributors abandon their PRs, we also manually examine a random sample of 354 abandoned PRs. We observe that the most frequent abandonment reasons relate to obstacles faced by contributors, followed by hurdles imposed by maintainers during the review process. Finally, we survey the top core maintainers of the studied projects to understand their perspectives on dealing with PR abandonment and on our findings. (Manuscript accepted for publication in ACM Transactions on Software Engineering and Methodology (TOSEM).)
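    To make the quantitative step concrete, the sketch below fits a simple logistic regression relating PR-level features to abandonment. The feature names echo the kinds of factors the study measured, but the data are synthetic and the model is far simpler than the paper's analysis.

```python
# Hedged sketch: classify PR abandonment from a few PR-level features.
# Data here are randomly generated for illustration only.
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
n = 5000
X = pd.DataFrame({
    "changed_lines": rng.integers(1, 5000, size=n),   # PR complexity
    "contributor_prs": rng.integers(0, 200, size=n),  # prior experience
    "review_rounds": rng.integers(0, 15, size=n),     # review length
})
# Toy label loosely following the reported trend: complex PRs, novice
# contributors, and long reviews are abandoned more often.
logit = (0.0005 * X.changed_lines - 0.01 * X.contributor_prs
         + 0.2 * X.review_rounds - 3.0)
y = rng.random(n) < 1 / (1 + np.exp(-logit))

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
print(dict(zip(X.columns, clf.coef_[0].round(4))))
print("held-out accuracy:", clf.score(X_te, y_te).round(3))
```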

    Harnessing Sources of Innovation, Useful Knowledge and Leadership within a Complex Public Sector Agency Network: A Reflective Practice Perspective

    This Innovation Portfolio Project focuses on the development and implementation of a single workplace innovation, the "Portal2Progress" (P2P), in the context of the Western Australian Department of Fire and Emergency Services (DFES). The P2P endeavour sought to harness emergent grassroots innovation ideas within the complexity of the contemporary public sector environment of the DFES, which I lead. The P2P is the Innovation Project that underpins my Professional Doctorate study, which is essentially insider research on the introduction and embedding of P2P as a workplace innovation. Within my role, I was actively involved in both the research process and the delivery of the innovation project. The organisational goal of this Innovation Portfolio Project was that DFES would benefit practically and culturally from the adoption of the P2P: practically, by cultivating the innovative ideas percolating within DFES so that they make a real difference to the business of the agency; and culturally, by the adoption of those ideas leading the organisation to embrace innovation and learning. The social aim was to add public value to DFES operations through the delivery of improved service to the community and by making a contribution to the field of public sector management. This Innovation Portfolio Project provides a vehicle for sharing the knowledge derived from this endeavour, and a reference for others who might seek to embed an innovation strategy across their organisation. My personal aim in this research was self-improvement as a thinker, a leader, and a scholar. This workplace research project articulates the results of my study from a practical, organisational, academic, and personal perspective. It also presents my reflections on the contextual conditions I see as more broadly necessary for the successful implementation of change in public service organisations and, more specifically, on the leadership, organisational structure, and power relationships that I believe made change possible in the DFES. Through reflection on the findings of this study and their significance, I have explored the potential of P2P within DFES, the challenges ahead, and how these might be managed. I also briefly consider the impacts of P2P for the wider public sector and what might be achieved by its broader adoption in a public sector organisation.

    The Computer Made Me Do It: Is There a Future for False Claims Act Liability Against Electronic Health Record Vendors?

    Since the advent of the movement toward electronic medical records, an axiom in the promotion of electronic health records (EHRs) has been the idea that their use will reduce medical errors. Certainly, there are countless examples of how technology can improve the health care experience and aid providers in reducing medical errors: errors of medication administration and medication management, access to decision support tools, telemedicine, and immediate access to diagnostic tests, other clinical information, and treatment results, to name a few. Even with such improvements, however, EHRs have not entirely eliminated medical errors, and new technology has in fact created its own challenges and issues that might lead to liability in a different way. As the use of EHRs proliferates, so too does the reliance of healthcare workers on the systems themselves, along with the inevitable blame game wherein an individual claims that whatever errors occurred were the result of “the computer” or the “system” that dictated the manner in which the care was provided or the manner in which the services were reimbursed. Ultimately, this “blame game” leads all to ask: whose fault is that? Can one blame the EHR vendor? To the extent that the answer may in fact be “Yes,” and the EHR vendor is at fault, are such claims easy to maintain? Historically, providers and other purchasers of EHRs have had little leverage against EHR vendors. One of the primary challenges arises out of the contract between the provider-purchaser and the EHR vendor. Ultimately, the purchase or licensing of an EHR system is actually just the purchase or licensing of software and, as such, the contracts resemble standard software licensing agreements, replete with disclaimers of implied and express warranties and “hold-harmless” or indemnification clauses that protect the vendor from third-party liability. Recent litigation, including one particular case involving the federal government’s allegations of fraud, has started to erode the disconnect between the potential responsibility of the EHR vendor and the ability to hold the vendor actually liable for its actions related to its software. On May 31, 2017, the United States Department of Justice (DOJ) entered into a $155 million settlement with eClinicalWorks (eCW), one of the nation’s largest electronic health records vendors, to resolve a False Claims Act (FCA) lawsuit in which the DOJ alleged that eCW caused the submission of false claims for federal incentive payments made under the Electronic Health Records (EHR) Incentive Program. This settlement is unique not only because of the rarity of settlements or judgments against an entity alleged to have caused another to submit a false claim (as opposed to an action against an entity that has falsely filed its own claim and received payment directly) but also because it is one of the first of its kind against an EHR vendor. Following this case, many are wondering whether the settlement with eCW stands alone as an example of the government simply snaring one “bad actor,” or whether it is indicative of what might lie ahead for EHR vendors under the FCA. Will the FCA be a new tool under which EHR vendors are held responsible for the role their software might play in the delivery of care or the billing and collection of services rendered?
Indeed, many in the information technology industry took note of this settlement and have speculated that it may not be a singular incident. Farzad Mostashari, former National Coordinator for Health IT, stated, “Let me be plain-spoken. eClinicalWorks is not the only EHR vendor who flouted certification/misled customers[.] Other vendors better clean up.” Is Mr. Mostashari correct that this could be a sign of things to come if EHR vendors are not careful about their actions? This Article will examine the eCW settlement agreement, along with other case law against EHR vendors, to determine whether the settlement is simply an outlier among FCA cases, meant only to punish particularly egregious behavior, or the beginning of a new era of FCA activity akin to that seen in other industries, like the pharmaceutical industry. Part II of this Article will provide a brief history of the FCA and the instances in which the DOJ has utilized provisions of the law against entities or individuals that cause another to submit a false claim or make a material, false record. It will further review the types of cases outside the FCA context that have been filed against EHR vendors since providers began more widespread adoption of EHR systems, especially after the enactment of the Health Information Technology for Economic and Clinical Health Act (HITECH) and the EHR Incentive Program. Part III will then study the eCW case in more detail, examining the actions that led to the settlement and determining whether such actions indicate a new era of FCA cases and EHR vendor liability. Part III will additionally examine existing case law against EHR vendors to determine whether any patterns can be gleaned from the cases that would predict the continued use of the FCA as an enforcement tool against EHR vendors. In Part IV, this Article will argue that, although the eCW case is based on unique facts, it is likely that EHR vendors will face other FCA cases as the healthcare industry places increasing responsibility and reliance on electronic systems. These suits will likely include allegations of fraud arising not only out of the EHR Incentive Program but also out of the submission of claims more generally. Unlike in other FCA cases involving entities that do not contract directly with the federal government, however, it is unlikely that the federal government will be able to realize as much success or generate the same type of monetary rewards against EHR vendors as it has against the pharmaceutical industry, because of the distinctions between these two disparate sectors of the health care industry. Finally, the Article will conclude by providing some thoughts on the impact of the eCW settlement agreement, which puts the EHR industry on notice regarding the potential for future liability.