
    Should I Bug You? Identifying Domain Experts in Software Projects Using Code Complexity Metrics

    In any sufficiently complex software system there are experts who have a deeper understanding of parts of the system than others. However, it is not always clear who these experts are and which particular parts of the system they can provide help with. We propose a framework to elicit the expertise of developers and recommend experts by analyzing complexity measures over time. Furthermore, teams can detect those parts of the software for which currently no, or only few, experts exist and take preventive actions to keep the collective code knowledge and ownership high. We employed the developed approach at a medium-sized company. The results were evaluated with a survey comparing the perceived and the computed expertise of developers. We show that aggregated code metrics can be used to identify experts for different software components. The identified experts were rated as acceptable candidates by developers in over 90% of all cases.
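
    As a rough sketch of how such an analysis might look, the snippet below ranks developers per component by time-decayed complexity contributions. The record fields, the half-life weighting, and the use of absolute complexity deltas are illustrative assumptions, not the paper's exact aggregation.

    from collections import defaultdict
    from datetime import datetime, timezone

    def rank_experts(commits, half_life_days=180, now=None):
        """Rank developers per component by time-decayed complexity contributions.

        Each commit record is assumed (hypothetically) to carry a developer,
        a component, a complexity_delta, and a timezone-aware timestamp.
        """
        now = now or datetime.now(timezone.utc)
        scores = defaultdict(lambda: defaultdict(float))
        for c in commits:
            age_days = (now - c["timestamp"]).days
            weight = 0.5 ** (age_days / half_life_days)  # older work counts less
            scores[c["component"]][c["developer"]] += weight * abs(c["complexity_delta"])
        # For every component, list developers ordered by their aggregated score.
        return {
            comp: sorted(devs.items(), key=lambda kv: kv[1], reverse=True)
            for comp, devs in scores.items()
        }

    Components whose top score is low, or that have only a single scored contributor, would be candidates for the preventive actions mentioned above.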

    Revisiting the License v. Sale Conundrum

    This Article seeks to answer a question that has become increasingly important as commerce moves from the tangible to the intangible—to what extent may a business use a contract to control the use of a fully paid product? The characterization of a transaction as a license or a sale determines what may be done with a product, who controls how the product may be used, and what happens in the event of a dispute. The past generation has seen a seismic shift in the way businesses distribute their products to consumers. Businesses often “license” rather than “sell” their products, and view consumers as licensees, rather than owners, of the products they buy. Customers own their print copies of books, movies, and music but merely license the same content when they purchase it in digital form. The marketplace transition from sale to license has wide-ranging ripple effects, affecting issues from innovation to the environment. The rapid emergence of the Internet of Things adds to the urgency and importance of the question—are goods licensed or sold? The question of whether a digital product is licensed or sold is often conflated with the question of whether a product should be licensed or sold. The problem lies, in large part, with the well-intentioned but misguided turn that contract law has taken away from the intent of the parties and toward a narrow vision of efficiency. When it comes to commercial transactions, the narrow efficiency view prioritizes quantity of completed transactions over quality, ignoring consumer expectations and the way in which distrust creates uncertainty in the marketplace. This Article proposes a methodology for resolving the license v. sale conundrum that promotes a more expansive view of efficiency and brings more predictability and fairness to an increasingly muddled area of the law.

    Mitigating Turnover with Code Review Recommendation: Balancing Expertise, Workload, and Knowledge Distribution

    Developer turnover is inevitable on software projects and leads to knowledge loss, a reduction in productivity, and an increase in defects. Mitigation strategies to deal with turnover tend to disrupt development and increase workloads for developers. In this work, we suggest that code review recommendation can distribute knowledge and mitigate turnover with minimal impact on the development process. We evaluate review recommenders in terms of ensuring expertise during review (Expertise), reducing the review workload of the core team (CoreWorkload), and reducing the Files at Risk to turnover (FaR). We find that prior work that assigns reviewers based on file ownership concentrates knowledge in a small group of core developers, increasing the risk of knowledge loss from turnover by up to 65%. We propose learning- and retention-aware review recommenders that, when combined, reduce the risk of turnover by 29% but unacceptably reduce the overall expertise during reviews by 26%. We therefore develop the Sophia recommender, which suggests experts when none of the files under review are hoarded by developers but distributes knowledge when files are at risk. In this way, we simultaneously increase expertise during review (ΔExpertise of 6%), with a negligible impact on workload (ΔCoreWorkload of 0.09%), and reduce the files at risk (ΔFaR of -28%). Sophia is integrated into GitHub pull requests, allowing developers to select an appropriate expert or “learner” based on the context of the review. We release the Sophia bot as well as the code and data for replication purposes.
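
    As a hedged illustration of the decision rule described above, the sketch below recommends an expert reviewer when no file in the pull request is knowledge-hoarded, and a low-familiarity “learner” when files are at risk. The familiarity scores, the hoarding threshold, and the min/max selection are assumptions for illustration, not Sophia's actual model.

    def recommend_reviewer(pr_files, knowledge, team, author, hoard_threshold=2):
        """Pick a reviewer for a pull request.

        knowledge maps file -> {developer: familiarity score, e.g. counts of
        past changes and reviews}; team is the set of all developers. Both are
        hypothetical inputs used only for this sketch.
        """
        def familiarity(dev):
            return sum(knowledge.get(f, {}).get(dev, 0) for f in pr_files)

        pool = [d for d in team if d != author]
        if not pool:
            return None
        # A file is "at risk" if few developers besides the author know it.
        at_risk = [
            f for f in pr_files
            if sum(1 for d in pool if knowledge.get(f, {}).get(d, 0) > 0) <= hoard_threshold
        ]
        if not at_risk:
            # No hoarded files: maximize expertise during the review.
            return max(pool, key=familiarity)
        # Files at risk of knowledge loss: pick a learner to distribute knowledge.
        return min(pool, key=familiarity)

    A fuller version would also weigh the CoreWorkload objective from the abstract; this sketch captures only the expertise-versus-knowledge-distribution trade-off.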

    Where is the Author: the Copyright Protection for AI-Generated Works

    The emergence of AI-generated works raises two human-or-machine questions for current copyright law: whether AI-generated works are copyrightable and whether they have human authors. These questions reveal that the current authorship requirement fails to provide a clear and operable standard for evaluating a human contributor’s intellectual labor in creative output. This defect of the authorship requirement has to be fixed to respond to the technological change of artificial intelligence and the burgeoning prevalence of AI- or advanced computer-program-generated works. This dissertation’s main goal is to fix the flaw of the authorship requirement by establishing an improved authorship spectrum. The improved authorship spectrum can serve as a guide to evaluate whether a human contributor provides sufficient intellectual labor for creative output, and to locate the human author(s) behind a creative output in this AI era. I argue that by applying the improved spectrum to AI-generated works, such works can be divided into two categories: the authored and copyrightable AI-generated works, and the authorless and uncopyrighted ones. Therefore, my conclusion for the human-or-machine questions is: not every AI-generated work falls outside the scope of copyright protection; some AI-generated works do have human authors and thus are copyrighted works of authorship, but some are authorless works because their human contributors all failed to offer sufficient intellectual labor for the work.

    Defectors: A Large, Diverse Python Dataset for Defect Prediction

    Defect prediction has been a popular research topic where machine learning (ML) and deep learning (DL) have found numerous applications. However, these ML/DL-based defect prediction models are often limited by the quality and size of their datasets. In this paper, we present Defectors, a large dataset for just-in-time and line-level defect prediction. Defectors consists of ≈213K source code files (≈93K defective and ≈120K defect-free) that span 24 popular Python projects. These projects come from 18 different domains, including machine learning, automation, and internet-of-things. Such scale and diversity make Defectors a suitable dataset for training ML/DL models, especially transformer models that require large and diverse datasets. We also foresee several application areas for our dataset, including defect prediction and defect explanation. Dataset link: https://doi.org/10.5281/zenodo.770898