111,807 research outputs found

    A Cost-based Optimizer for Gradient Descent Optimization

    Full text link
    As the use of machine learning (ML) permeates into diverse application domains, there is an urgent need to support a declarative framework for ML. Ideally, a user will specify an ML task in a high-level and easy-to-use language and the framework will invoke the appropriate algorithms and system configurations to execute it. An important observation towards designing such a framework is that many ML tasks can be expressed as mathematical optimization problems, which take a specific form. Furthermore, these optimization problems can be efficiently solved using variations of the gradient descent (GD) algorithm. Thus, to decouple a user specification of an ML task from its execution, a key component is a GD optimizer. We propose a cost-based GD optimizer that selects the best GD plan for a given ML task. To build our optimizer, we introduce a set of abstract operators for expressing GD algorithms and propose a novel approach to estimate the number of iterations a GD algorithm requires to converge. Extensive experiments on real and synthetic datasets show that our optimizer not only chooses the best GD plan but also allows for optimizations that achieve orders of magnitude performance speed-up.Comment: Accepted at SIGMOD 201

    Applying XP Ideas Formally: The Story Card and Extreme X-Machines

    Get PDF
    By gathering requirements on story cards extreme programming (XP) makes requirements collection easy. However it is less clear how the story cards are translated into a �finished product. We propose that a formal specification method based on X-Machines can be used to direct this transition. Extreme X-Machines �t in to the XP method well, without large overheads in design and maintenance. We also investigate how such machines adapt to change in the story cards and propose how this could be further enhanced

    Teaching telecommunication standards: bridging the gap between theory and practice

    Get PDF
    ©2017 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.Telecommunication standards have become a reliable mechanism to strengthen collaboration between industry and research institutions to accelerate the evolution of communications systems. Standards are needed to enable cooperation while promoting competition. Within the framework of a standard, the companies involved in the standardization process contribute and agree on appropriate technical specifications to ensure diversity and compatibility, and facilitate worldwide commercial deployment and evolution. Those parts of the system that can create competitive advantages are intentionally left open in the specifications. Such specifications are extensive, complex, and minimalistic. This makes telecommunication standards education a difficult endeavor, but it is much demanded by industry and governments to spur economic growth. This article describes a methodology for teaching wireless communications standards. We define our methodology around six learning stages that assimilate the standardization process and identify key learning objectives for each. Enabled by software-defined radio technology, we describe a practical learning environment that facilitates developing many of the needed technical and soft skills without the inherent difficulty and cost associated with radio frequency components and regulation. Using only open source software and commercial of-the-shelf computers, this environment is portable and can easily be recreated at other educational institutions and adapted to their educational needs and constraints. We discuss our and our students' experiences when employing the proposed methodology to 4G LTE standard education at Barcelona Tech.Peer ReviewedPostprint (author's final draft

    Why Do Developers Get Password Storage Wrong? A Qualitative Usability Study

    Full text link
    Passwords are still a mainstay of various security systems, as well as the cause of many usability issues. For end-users, many of these issues have been studied extensively, highlighting problems and informing design decisions for better policies and motivating research into alternatives. However, end-users are not the only ones who have usability problems with passwords! Developers who are tasked with writing the code by which passwords are stored must do so securely. Yet history has shown that this complex task often fails due to human error with catastrophic results. While an end-user who selects a bad password can have dire consequences, the consequences of a developer who forgets to hash and salt a password database can lead to far larger problems. In this paper we present a first qualitative usability study with 20 computer science students to discover how developers deal with password storage and to inform research into aiding developers in the creation of secure password systems

    FixMiner: Mining Relevant Fix Patterns for Automated Program Repair

    Get PDF
    Patching is a common activity in software development. It is generally performed on a source code base to address bugs or add new functionalities. In this context, given the recurrence of bugs across projects, the associated similar patches can be leveraged to extract generic fix actions. While the literature includes various approaches leveraging similarity among patches to guide program repair, these approaches often do not yield fix patterns that are tractable and reusable as actionable input to APR systems. In this paper, we propose a systematic and automated approach to mining relevant and actionable fix patterns based on an iterative clustering strategy applied to atomic changes within patches. The goal of FixMiner is thus to infer separate and reusable fix patterns that can be leveraged in other patch generation systems. Our technique, FixMiner, leverages Rich Edit Script which is a specialized tree structure of the edit scripts that captures the AST-level context of the code changes. FixMiner uses different tree representations of Rich Edit Scripts for each round of clustering to identify similar changes. These are abstract syntax trees, edit actions trees, and code context trees. We have evaluated FixMiner on thousands of software patches collected from open source projects. Preliminary results show that we are able to mine accurate patterns, efficiently exploiting change information in Rich Edit Scripts. We further integrated the mined patterns to an automated program repair prototype, PARFixMiner, with which we are able to correctly fix 26 bugs of the Defects4J benchmark. Beyond this quantitative performance, we show that the mined fix patterns are sufficiently relevant to produce patches with a high probability of correctness: 81% of PARFixMiner's generated plausible patches are correct.Comment: 31 pages, 11 figure

    Characterizing and Subsetting Big Data Workloads

    Full text link
    Big data benchmark suites must include a diversity of data and workloads to be useful in fairly evaluating big data systems and architectures. However, using truly comprehensive benchmarks poses great challenges for the architecture community. First, we need to thoroughly understand the behaviors of a variety of workloads. Second, our usual simulation-based research methods become prohibitively expensive for big data. As big data is an emerging field, more and more software stacks are being proposed to facilitate the development of big data applications, which aggravates hese challenges. In this paper, we first use Principle Component Analysis (PCA) to identify the most important characteristics from 45 metrics to characterize big data workloads from BigDataBench, a comprehensive big data benchmark suite. Second, we apply a clustering technique to the principle components obtained from the PCA to investigate the similarity among big data workloads, and we verify the importance of including different software stacks for big data benchmarking. Third, we select seven representative big data workloads by removing redundant ones and release the BigDataBench simulation version, which is publicly available from http://prof.ict.ac.cn/BigDataBench/simulatorversion/.Comment: 11 pages, 6 figures, 2014 IEEE International Symposium on Workload Characterizatio
    • …
    corecore