Search CORE

9 research outputs found

Determining systematic differences in human graders for machine learning-based automated hiring

Author: Abhishek Unnam
Marios Kokkodis
Mike H. M. Teodorescu
Nailya Ordabayeva
Varun Aggarwal
Publication venue: 'Brookings Institution Press'
Publication date: 06/06/2022
Field of study

Firms routinely utilize natural language processing combined with other machine learning (ML) tools to assess prospective employees through automated resume classification based on pre-codified skill databases. The rush to automation can however backfire by encoding unintentional bias against groups of candidates. We run two experiments with human evaluators from two different countries to determine how cultural differences may affect hiring decisions. We use hiring materials provided by an international skill testing firm which runs hiring assessments for Fortune 500 companies. The company conducts a video-based interview assessment using machine learning, which grades job applicants automatically based on verbal and visual cues. Our study has three objectives: to compare the automatic assessments of the video interviews to assessments of the same interviews by human graders in order to assess how they differ; to examine which characteristics of human graders may lead to systematic differences in their assessments; and to propose a method to correct human evaluations using automation. We find that systematic differences can exist across human graders and that some of these differences can be accounted for by an ML tool if measured at the time of training

IssueLab

Property-based testing for functional programs

Author: Smallbone Nicholas
Publication venue
Publication date: 01/01/2011
Field of study

This thesis advances the view that property-based testing is a powerful way of testing functional programs, that has advantages not shared by traditional unit testing. It does this by showing two new applications of property-based testing to functional programming as well as a study of the effectiveness of property-based testing.First, we present a tool, QuickSpec, which attempts to infer an equational specification from a functional program with the help of testing. The resulting specifications can be used to improve your understanding of the code or as properties in a test suite. The tool is applicable to quite a wide variety of situations.Second, we describe a system that helps to find race conditions in Erlang programs. It consists of two parts: a randomised scheduler to provoke unusual behaviour in the program under test and allow replayability of test cases, and a module that tests that all of the functions of an API behave atomically with respect to each other.Finally, we present an experiment we carried out to compare property-based testing against test-driven development. The results were inconclusive, but in the process we developed a black-box algorithm for automatically grading student programs by testing, by inferring for each program a set of bugs that the program contains

Chalmers Research

Chalmers Publication Library

Property-based testing for functional programs

Author: Smallbone Nicholas
Publication venue
Publication date: 01/01/2011
Field of study

Chalmers Research

Holistic recommender systems for software engineering

Author: Lanza Michele
Mocci Andrea
Ponzanelli Luca
Publication venue
Publication date: 04/07/2017
Field of study

The knowledge possessed by developers is often not sufficient to overcome a programming problem. Short of talking to teammates, when available, developers often gather additional knowledge from development artifacts (e.g., project documentation), as well as online resources. The web has become an essential component in the modern developer’s daily life, providing a plethora of information from sources like forums, tutorials, Q&A websites, API documentation, and even video tutorials. Recommender Systems for Software Engineering (RSSE) provide developers with assistance to navigate the information space, automatically suggest useful items, and reduce the time required to locate the needed information. Current RSSEs consider development artifacts as containers of homogeneous information in form of pure text. However, text is a means to represent heterogeneous information provided by, for example, natural language, source code, interchange formats (e.g., XML, JSON), and stack traces. Interpreting the information from a pure textual point of view misses the intrinsic heterogeneity of the artifacts, thus leading to a reductionist approach. We propose the concept of Holistic Recommender Systems for Software Engineering (H-RSSE), i.e., RSSEs that go beyond the textual interpretation of the information contained in development artifacts. Our thesis is that modeling and aggregating information in a holistic fashion enables novel and advanced analyses of development artifacts. To validate our thesis we developed a framework to extract, model and analyze information contained in development artifacts in a reusable meta- information model. We show how RSSEs benefit from a meta-information model, since it enables customized and novel analyses built on top of our framework. The information can be thus reinterpreted from an holistic point of view, preserving its multi-dimensionality, and opening the path towards the concept of holistic recommender systems for software engineering

RERO DOC Digital Library

Automatic continuous testing to speed software development

Author: Saff David, 1976-
Publication venue: Massachusetts Institute of Technology
Publication date: 01/01/2004
Field of study

Thesis (S.M.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2004.Includes bibliographical references (p. 147-152).Continuous testing is a new feature for software development environments that uses excess cycles on a developer's workstation to continuously run regression tests in the background, providing rapid feedback about test failures as source code is edited. It is intended to reduce the time and energy required to keep code well-tested, and to prevent regression errors from persisting uncaught for long periods of time. The longer that regression errors are allowed to linger during development, the more time is wasted debugging and fixing them once they are discovered. By monitoring and measuring software projects, we estimate that the wasted time, consisting of this preventable extra fixing cost added to the time spent running tests and waiting for them to complete, accounts for 10-15% of total development time. We present a model of developer behavior that uses data from past projects to infer developer beliefs and predict behavior in new environments -in particular, when changing testing methodologies or tools to reduce wasted time. This model predicts that continuous testing would reduce wasted time by 92-98%, a substantial improvement over other approaches we evaluated, such as automatic test prioritization and changing manual test frequencies. A controlled human experiment indicates that student developers using continuous testing were three times more likely to complete a task before the deadline than those without, with no significant effect on time worked.(cont.) Most participants found continuous testing to be useful and believed that it helped them write better code faster. 90% would recommend the tool to others. We show the first empirical evidence of a benefit from continuous compilation, a popular related feature. Continuous testing has been integrated into Emacs and Eclipse. We detail the functional and technical design of the Eclipse plug-in, which is publicly beta-released.by David Saff.S.M

DSpace@MIT

Code similarity and clone search in large-scale source code data

Author: Ragkhitwetsagul Chaiyong
Publication venue: UCL (University College London)
Publication date: 28/10/2018
Field of study

Software development is tremendously benefited from the Internet by having online code corpora that enable instant sharing of source code and online developer's guides and documentation. Nowadays, duplicated code (i.e., code clones) not only exists within or across software projects but also between online code repositories and websites. We call them "online code clones."' They can lead to license violations, bug propagation, and re-use of outdated code similar to classic code clones between software systems. Unfortunately, they are difficult to locate and fix since the search space in online code corpora is large and no longer confined to a local repository. This thesis presents a combined study of code similarity and online code clones. We empirically show that many code snippets on Stack Overflow are cloned from open source projects. Several of them become outdated or violate their original license and are possibly harmful to reuse. To develop a solution for finding online code clones, we study various code similarity techniques to gain insights into their strengths and weaknesses. A framework, called OCD, for evaluating code similarity and clone search tools is introduced and used to compare 34 state-of-the-art techniques on pervasively modified code and boiler-plate code. We also found that clone detection techniques can be enhanced by compilation and decompilation. Using the knowledge from the comparison of code similarity analysers, we create and evaluate Siamese, a scalable token-based clone search technique via multiple code representations. Our evaluation shows that Siamese scales to large-scale source code data of 365 million lines of code and offers high search precision and recall. Its clone search precision is comparable to seven state-of-the-art clone detection tools on the OCD framework. Finally, we demonstrate the usefulness of Siamese by applying the tool to find online code clones, automatically analyse clone licenses, and recommend tests for reuse

UCL Discovery

Grading Uncompilable Programs

Author
Publication venue: 'Association for the Advancement of Artificial Intelligence (AAAI)'
Publication date
Field of study

Crossref