34,441 research outputs found

    Feedback Generation for Performance Problems in Introductory Programming Assignments

    Full text link
    Providing feedback on programming assignments manually is a tedious, error prone, and time-consuming task. In this paper, we motivate and address the problem of generating feedback on performance aspects in introductory programming assignments. We studied a large number of functionally correct student solutions to introductory programming assignments and observed: (1) There are different algorithmic strategies, with varying levels of efficiency, for solving a given problem. These different strategies merit different feedback. (2) The same algorithmic strategy can be implemented in countless different ways, which are not relevant for reporting feedback on the student program. We propose a light-weight programming language extension that allows a teacher to define an algorithmic strategy by specifying certain key values that should occur during the execution of an implementation. We describe a dynamic analysis based approach to test whether a student's program matches a teacher's specification. Our experimental results illustrate the effectiveness of both our specification language and our dynamic analysis. On one of our benchmarks consisting of 2316 functionally correct implementations to 3 programming problems, we identified 16 strategies that we were able to describe using our specification language (in 95 minutes after inspecting 66, i.e., around 3%, implementations). Our dynamic analysis correctly matched each implementation with its corresponding specification, thereby automatically producing the intended feedback.Comment: Tech report/extended version of FSE 2014 pape

    Fairness-Aware Ranking in Search & Recommendation Systems with Application to LinkedIn Talent Search

    Full text link
    We present a framework for quantifying and mitigating algorithmic bias in mechanisms designed for ranking individuals, typically used as part of web-scale search and recommendation systems. We first propose complementary measures to quantify bias with respect to protected attributes such as gender and age. We then present algorithms for computing fairness-aware re-ranking of results. For a given search or recommendation task, our algorithms seek to achieve a desired distribution of top ranked results with respect to one or more protected attributes. We show that such a framework can be tailored to achieve fairness criteria such as equality of opportunity and demographic parity depending on the choice of the desired distribution. We evaluate the proposed algorithms via extensive simulations over different parameter choices, and study the effect of fairness-aware ranking on both bias and utility measures. We finally present the online A/B testing results from applying our framework towards representative ranking in LinkedIn Talent Search, and discuss the lessons learned in practice. Our approach resulted in tremendous improvement in the fairness metrics (nearly three fold increase in the number of search queries with representative results) without affecting the business metrics, which paved the way for deployment to 100% of LinkedIn Recruiter users worldwide. Ours is the first large-scale deployed framework for ensuring fairness in the hiring domain, with the potential positive impact for more than 630M LinkedIn members.Comment: This paper has been accepted for publication at ACM KDD 201

    One machine, one minute, three billion tetrahedra

    Full text link
    This paper presents a new scalable parallelization scheme to generate the 3D Delaunay triangulation of a given set of points. Our first contribution is an efficient serial implementation of the incremental Delaunay insertion algorithm. A simple dedicated data structure, an efficient sorting of the points and the optimization of the insertion algorithm have permitted to accelerate reference implementations by a factor three. Our second contribution is a multi-threaded version of the Delaunay kernel that is able to concurrently insert vertices. Moore curve coordinates are used to partition the point set, avoiding heavy synchronization overheads. Conflicts are managed by modifying the partitions with a simple rescaling of the space-filling curve. The performances of our implementation have been measured on three different processors, an Intel core-i7, an Intel Xeon Phi and an AMD EPYC, on which we have been able to compute 3 billion tetrahedra in 53 seconds. This corresponds to a generation rate of over 55 million tetrahedra per second. We finally show how this very efficient parallel Delaunay triangulation can be integrated in a Delaunay refinement mesh generator which takes as input the triangulated surface boundary of the volume to mesh

    Extreme-Point-based Heuristics for the Three-Dimensional Bin Packing problem

    Get PDF
    One of the main issues in addressing three-dimensional packing problems is finding an efficient and accurate definition of the points at which to place the items inside the bins, because the performance of exact and heuristic solution methods is actually strongly influenced by the choice of a placement rule. We introduce the extreme point concept and present a new extreme point-based rule for packing items inside a three-dimensional container. The extreme point rule is independent from the particular packing problem addressed and can handle additional constraints, such as fixing the position of the items. The new extreme point rule is also used to derive new constructive heuristics for the three-dimensional bin-packing problem. Extensive computational results show the effectiveness of the new heuristics compared to state-of-the-art results. Moreover, the same heuristics, when applied to the two-dimensional bin-packing problem, outperform those specifically designed for the proble

    Run Generation Revisited: What Goes Up May or May Not Come Down

    Full text link
    In this paper, we revisit the classic problem of run generation. Run generation is the first phase of external-memory sorting, where the objective is to scan through the data, reorder elements using a small buffer of size M , and output runs (contiguously sorted chunks of elements) that are as long as possible. We develop algorithms for minimizing the total number of runs (or equivalently, maximizing the average run length) when the runs are allowed to be sorted or reverse sorted. We study the problem in the online setting, both with and without resource augmentation, and in the offline setting. (1) We analyze alternating-up-down replacement selection (runs alternate between sorted and reverse sorted), which was studied by Knuth as far back as 1963. We show that this simple policy is asymptotically optimal. Specifically, we show that alternating-up-down replacement selection is 2-competitive and no deterministic online algorithm can perform better. (2) We give online algorithms having smaller competitive ratios with resource augmentation. Specifically, we exhibit a deterministic algorithm that, when given a buffer of size 4M , is able to match or beat any optimal algorithm having a buffer of size M . Furthermore, we present a randomized online algorithm which is 7/4-competitive when given a buffer twice that of the optimal. (3) We demonstrate that performance can also be improved with a small amount of foresight. We give an algorithm, which is 3/2-competitive, with foreknowledge of the next 3M elements of the input stream. For the extreme case where all future elements are known, we design a PTAS for computing the optimal strategy a run generation algorithm must follow. (4) Finally, we present algorithms tailored for nearly sorted inputs which are guaranteed to have optimal solutions with sufficiently long runs

    A Graph Rewriting Visual Language for Database Programming

    Get PDF
    Textual database programming languages are computationally complete, but have the disadvantage of giving the user a non-intuitive view of the database information that is being manipulated. Visual languages developed in recent years have allowed naive users access to a direct representation of data, often in a graph form, but have concentrated on user interface rather than complex programming tasks. There is a need for a system which combines the advantages of both these programming methods. We describe an implementation of Spider, an experimental visual database programming language aimed at programmers. It uses a graph rewriting paradigm as a basis for a fully visual, computationally complete language. The graphs it rewrites represent the schema and instances of a database. The unique graph rewriting method used by Spider has syntactic and semantic simplicity. Its form of algorithmic expression allows complex computation to be easily represented in short programs. Furthermore, Spider has greater power than normally provided in textual systems, and we show that queries on the schema and associative queries can be performed easily and without requiring any additions to the language
    • 

    corecore