22,258 research outputs found
Should I Bug You? Identifying Domain Experts in Software Projects Using Code Complexity Metrics
In any sufficiently complex software system there are experts who have a
deeper understanding of parts of the system than others. However, it is not
always clear who these experts are and which particular parts of the system
they can provide help with. We propose a framework to elicit the expertise of
developers and recommend experts by analyzing complexity measures over time.
Furthermore, teams can detect those parts of the software for which currently
no, or only few experts exist and take preventive actions to keep the
collective code knowledge and ownership high. We employed the developed
approach at a medium-sized company. The results were evaluated with a survey,
comparing the perceived and the computed expertise of developers. We show that
aggregated code metrics can be used to identify experts for different software
components. The identified experts were rated as acceptable candidates by
developers in over 90% of all cases.
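The abstract describes aggregating complexity measures over time to score developer expertise per component. The exact metrics and weighting are not given, so the following is a minimal sketch under assumed inputs: a hypothetical commit log of (author, file, complexity delta) tuples, with expertise approximated as the sum of absolute complexity changes each author contributed to a file.

```python
from collections import defaultdict

# Hypothetical commit log: (author, file, change in some complexity metric).
# The paper's actual metrics and aggregation are not specified here.
commits = [
    ("alice", "parser.py", 12),
    ("bob", "parser.py", 3),
    ("alice", "api.py", 5),
    ("carol", "db.py", 20),
]

def expertise_scores(commits):
    """Aggregate absolute complexity changes per file and author."""
    scores = defaultdict(lambda: defaultdict(int))
    for author, path, delta in commits:
        scores[path][author] += abs(delta)
    return scores

def recommend_expert(scores, path):
    """Suggest the author with the largest aggregated score for a file."""
    by_author = scores.get(path)
    if not by_author:
        return None  # no history: a candidate at-risk component
    return max(by_author, key=by_author.get)

scores = expertise_scores(commits)
print(recommend_expert(scores, "parser.py"))  # -> alice
```

A `None` result flags components with no recorded contributors, which is one way a team could surface the "no, or only few experts" case the abstract mentions.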
Revisiting the License v. Sale Conundrum
This Article seeks to answer a question that has become increasingly important as commerce moves from the tangible to the intangible: to what extent may a business use a contract to control the use of a fully paid product? The characterization of a transaction as a license or a sale determines what may be done with a product, who controls how the product may be used, and what happens in the event of a dispute. The past generation has seen a seismic shift in the way businesses distribute their products to consumers. Businesses often "license" rather than "sell" their products, and view consumers as licensees, rather than owners, of the products they buy. Customers own their print copies of books, movies, and music but merely license the same content when they purchase it in digital form. The marketplace transition from sale to license has far-reaching ripple effects, touching a range of issues from innovation to the environment. The rapid emergence of the Internet of Things adds to the urgency and importance of the question: are goods licensed or sold?
The question of whether a digital product is licensed or sold is often conflated with the question of whether a product should be licensed or sold. The problem lies, in large part, with the well-intentioned but misguided turn that contract law has taken away from the intent of the parties and toward a narrow vision of efficiency. When it comes to commercial transactions, the narrow efficiency view prioritizes quantity of completed transactions over quality, ignoring consumer expectations and the way in which distrust creates uncertainty in the marketplace. This Article proposes a methodology for resolving the license v. sale conundrum that promotes a more expansive view of efficiency and brings more predictability and fairness to an increasingly muddled area of the law.
Mitigating Turnover with Code Review Recommendation: Balancing Expertise, Workload, and Knowledge Distribution
Developer turnover is inevitable on software projects and leads to knowledge loss, a reduction in productivity, and an increase in defects. Mitigation strategies to deal with turnover tend to disrupt and increase workloads for developers. In this work, we suggest that through code review recommendation we can distribute knowledge and mitigate turnover with minimal impact on the development process. We evaluate review recommenders in the context of ensuring expertise during review (Expertise), reducing the review workload of the core team (CoreWorkload), and reducing the files at risk to turnover (FaR). We find that prior work that assigns reviewers based on file ownership concentrates knowledge on a small group of core developers, increasing the risk of knowledge loss from turnover by up to 65%. We propose learning and retention aware review recommenders that, when combined, are effective at reducing the risk of turnover (ΔFaR of -29%), but they unacceptably reduce the overall expertise during reviews (ΔExpertise of -26%). We develop the Sophia recommender, which suggests experts when none of the files under review are hoarded by developers but distributes knowledge when files are at risk. In this way, we are able to simultaneously increase expertise during review (ΔExpertise of 6%) with a negligible impact on workload (ΔCoreWorkload of 0.09%), and reduce the files at risk (ΔFaR of -28%). Sophia is integrated into GitHub pull requests, allowing developers to select an appropriate expert or "learner" based on the context of the review. We release the Sophia bot as well as the code and data for replication purposes.
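The abstract's decision rule, suggest an expert unless a file under review is hoarded by one developer, can be sketched as a simple policy. The data shapes here (a per-file map of developer change counts), the `ownership` heuristic, and the 0.8 hoarding threshold are all assumptions for illustration, not Sophia's actual implementation.

```python
def ownership(changes_by_dev):
    """Share of a file's changes made by its top contributor."""
    total = sum(changes_by_dev.values())
    return max(changes_by_dev.values()) / total if total else 0.0

def recommend_reviewer(review_files, history, hoard_threshold=0.8):
    """Sketch of a Sophia-style policy: if any file under review is
    'hoarded' (dominated by one developer), recommend a learner to
    distribute knowledge; otherwise recommend the strongest expert.
    `history` maps file -> {developer: change count} (hypothetical)."""
    hoarded = any(
        ownership(history.get(f, {})) >= hoard_threshold for f in review_files
    )
    # Merge change counts across all files under review.
    combined = {}
    for f in review_files:
        for dev, n in history.get(f, {}).items():
            combined[dev] = combined.get(dev, 0) + n
    if not combined:
        return None, "no history"
    if hoarded:
        # Least-experienced known contributor learns the hoarded code.
        return min(combined, key=combined.get), "learner"
    return max(combined, key=combined.get), "expert"
```

For example, a file with a 90/10 ownership split would trigger the learner branch, spreading knowledge at the cost of some review expertise, which is the trade-off the abstract quantifies.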
Where is the Author: the Copyright Protection for AI-Generated Works
With the emergence of AI-generated works, two groups of human-or-machine questions, whether AI-generated works are copyrightable and whether AI-generated works have human authors, are forcing a re-examination of current copyright law. These questions reveal that the current authorship requirement fails to provide a clear and operable standard for evaluating a human contributor's intellectual labor toward creative output. Such a defect of the current authorship requirement has to be fixed to respond to the technological change of artificial intelligence and the burgeoning prevalence of AI- or advanced computer-program-generated works.
This dissertation's main goal is to fix the flaw of the authorship requirement by establishing an improved authorship spectrum. The improved authorship spectrum can serve as a guide to evaluate whether a human contributor provides sufficient intellectual labor for creative output, and to locate the human author(s) behind a creative output in this AI era. I argue that by applying the improved spectrum to AI-generated works, such works can be distinguished into two categories: "the authored and copyrightable AI-generated works" and "the authorless and uncopyrighted ones." Therefore, my intended conclusion for the revisited human-or-machine questions is: not every AI-generated work falls outside the scope of copyright protection; some AI-generated works do have human authors and thus are copyrighted works of authorship, but some are authorless works because their human contributors all failed to offer sufficient intellectual labor for the work.
Defectors: A Large, Diverse Python Dataset for Defect Prediction
Defect prediction has been a popular research topic where machine learning
(ML) and deep learning (DL) have found numerous applications. However, these
ML/DL-based defect prediction models are often limited by the quality and size
of their datasets. In this paper, we present Defectors, a large dataset for
just-in-time and line-level defect prediction. Defectors consists of
213K source code files (93K defective and 120K defect-free)
that span across 24 popular Python projects. These projects come from 18
different domains, including machine learning, automation, and
internet-of-things. Such a scale and diversity make Defectors a suitable
dataset for training ML/DL models, especially transformer models that require
large and diverse datasets. We also foresee several application areas of our
dataset including defect prediction and defect explanation.
Dataset link: https://doi.org/10.5281/zenodo.770898
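The abstract positions Defectors as training data for ML/DL defect predictors. The dataset's actual schema is not reproduced here; the snippet below is a toy file-level sketch with invented code snippets and labels, using a bag-of-tokens baseline with scikit-learn rather than the transformer models the abstract has in mind.

```python
# Minimal file-level defect-prediction sketch. The code snippets and
# labels below are toy stand-ins, NOT drawn from the Defectors dataset.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

files = [
    "def add(a, b): return a + b",
    "def div(a, b): return a / b  # missing zero check",
    "def mul(a, b): return a * b",
    "def head(xs): return xs[0]  # crashes on empty list",
]
labels = [0, 1, 0, 1]  # 1 = defective (illustrative labels only)

# Tokenize source files into a bag-of-words feature matrix.
vectorizer = CountVectorizer(token_pattern=r"\w+")
X = vectorizer.fit_transform(files)
model = LogisticRegression().fit(X, labels)

# Score a previously unseen file.
prediction = model.predict(vectorizer.transform(["def inc(x): return x + 1"]))
```

Line-level and just-in-time prediction, the tasks the dataset targets, would replace whole files with individual lines or commit diffs as the unit of classification, but the pipeline shape stays the same.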
- …