A Cost-based Optimizer for Gradient Descent Optimization
As the use of machine learning (ML) permeates into diverse application
domains, there is an urgent need to support a declarative framework for ML.
Ideally, a user will specify an ML task in a high-level and easy-to-use
language and the framework will invoke the appropriate algorithms and system
configurations to execute it. An important observation towards designing such a
framework is that many ML tasks can be expressed as mathematical optimization
problems, which take a specific form. Furthermore, these optimization problems
can be efficiently solved using variations of the gradient descent (GD)
algorithm. Thus, to decouple a user specification of an ML task from its
execution, a key component is a GD optimizer. We propose a cost-based GD
optimizer that selects the best GD plan for a given ML task. To build our
optimizer, we introduce a set of abstract operators for expressing GD
algorithms and propose a novel approach to estimate the number of iterations a
GD algorithm requires to converge. Extensive experiments on real and synthetic
datasets show that our optimizer not only chooses the best GD plan but also
allows for optimizations that achieve orders-of-magnitude performance speed-ups.
Comment: Accepted at SIGMOD 201
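The cost-based selection described in this abstract can be illustrated with a minimal sketch. It assumes that a plan's total cost is its estimated per-iteration cost multiplied by its estimated number of iterations to converge; the `GDPlan` class, the `choose_plan` function, and all numbers are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class GDPlan:
    """Hypothetical description of one gradient descent execution plan."""
    name: str
    cost_per_iteration: float   # assumed estimate, e.g. seconds per iteration
    estimated_iterations: int   # assumed estimate of iterations to converge

    @property
    def total_cost(self) -> float:
        # Assumed cost model: per-iteration cost times iteration count.
        return self.cost_per_iteration * self.estimated_iterations


def choose_plan(plans: Sequence[GDPlan]) -> GDPlan:
    """Return the plan with the lowest estimated total cost."""
    if not plans:
        raise ValueError("at least one plan is required")
    return min(plans, key=lambda plan: plan.total_cost)


# Illustrative numbers only.
candidate_plans = [
    GDPlan("batch", cost_per_iteration=2.0, estimated_iterations=100),
    GDPlan("stochastic", cost_per_iteration=0.01, estimated_iterations=5000),
    GDPlan("mini-batch", cost_per_iteration=0.1, estimated_iterations=800),
]
best_plan = choose_plan(candidate_plans)
print(best_plan.name, best_plan.total_cost)
```

With these made-up estimates, the cheap-but-many-iterations plan wins, which shows why both factors need to be estimated rather than only one of them.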
Applying XP Ideas Formally: The Story Card and Extreme X-Machines
By gathering requirements on story cards, extreme programming (XP) makes requirements collection easy. However, it is less clear how the story cards are translated into a finished product. We propose that a formal specification method based on X-Machines can be used to direct this transition. Extreme X-Machines fit into the XP method well, without large overheads in design and maintenance. We also investigate how such machines adapt to change in the story cards and propose how this could be further enhanced.
Teaching telecommunication standards: bridging the gap between theory and practice
©2017 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. Telecommunication standards have become a reliable mechanism to strengthen collaboration between industry and research institutions to accelerate the evolution of communications systems. Standards are needed to enable cooperation while promoting competition. Within the framework of a standard, the companies involved in the standardization process contribute and agree on appropriate technical specifications to ensure diversity and compatibility, and facilitate worldwide commercial deployment and evolution. Those parts of the system that can create competitive advantages are intentionally left open in the specifications. Such specifications are extensive, complex, and minimalistic. This makes telecommunication standards education a difficult endeavor, but it is much demanded by industry and governments to spur economic growth. This article describes a methodology for teaching wireless communications standards. We define our methodology around six learning stages that assimilate the standardization process and identify key learning objectives for each. Enabled by software-defined radio technology, we describe a practical learning environment that facilitates developing many of the needed technical and soft skills without the inherent difficulty and cost associated with radio frequency components and regulation. Using only open source software and commercial off-the-shelf computers, this environment is portable and can easily be recreated at other educational institutions and adapted to their educational needs and constraints.
We discuss our and our students' experiences when applying the proposed methodology to 4G LTE standard education at Barcelona Tech. Peer Reviewed. Postprint (author's final draft)
Why Do Developers Get Password Storage Wrong? A Qualitative Usability Study
Passwords are still a mainstay of various security systems, as well as the
cause of many usability issues. For end-users, many of these issues have been
studied extensively, highlighting problems and informing design decisions for
better policies and motivating research into alternatives. However, end-users
are not the only ones who have usability problems with passwords! Developers
who are tasked with writing the code by which passwords are stored must do so
securely. Yet history has shown that this complex task often fails due to human
error with catastrophic results. While an end-user who selects a bad password
can have dire consequences, the consequences of a developer who forgets to hash
and salt a password database can lead to far larger problems. In this paper we
present a first qualitative usability study with 20 computer science students
to discover how developers deal with password storage and to inform research
into aiding developers in the creation of secure password systems.
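The "hash and salt" step mentioned in this abstract can be sketched as follows. This is a minimal illustration using PBKDF2 from Python's standard library, not the setup used in the study; the function names and the iteration count are assumptions chosen for the example.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple

# Assumed work factor for illustration; tune it for the deployment hardware.
ITERATIONS = 200_000


def hash_password(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Derive a salted hash; a fresh random salt is generated if none is given."""
    if salt is None:
        salt = os.urandom(16)  # per-password random salt
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute the hash with the stored salt and compare in constant time."""
    _, candidate = hash_password(password, salt)
    return hmac.compare_digest(candidate, expected)


stored_salt, stored_digest = hash_password("correct horse battery staple")
print(verify_password("correct horse battery staple", stored_salt, stored_digest))
```

Only the salt and the derived digest would be stored. Because each password gets its own salt, identical passwords produce different stored values.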
FixMiner: Mining Relevant Fix Patterns for Automated Program Repair
Patching is a common activity in software development. It is generally
performed on a source code base to address bugs or add new functionalities. In
this context, given the recurrence of bugs across projects, the associated
similar patches can be leveraged to extract generic fix actions. While the
literature includes various approaches leveraging similarity among patches to
guide program repair, these approaches often do not yield fix patterns that are
tractable and reusable as actionable input to APR systems. In this paper, we
propose a systematic and automated approach to mining relevant and actionable
fix patterns based on an iterative clustering strategy applied to atomic
changes within patches. The goal of FixMiner is thus to infer separate and
reusable fix patterns that can be leveraged in other patch generation systems.
Our technique, FixMiner, leverages Rich Edit Script which is a specialized tree
structure of the edit scripts that captures the AST-level context of the code
changes. FixMiner uses different tree representations of Rich Edit Scripts for
each round of clustering to identify similar changes. These are abstract syntax
trees, edit actions trees, and code context trees. We have evaluated FixMiner
on thousands of software patches collected from open source projects.
Preliminary results show that we are able to mine accurate patterns,
efficiently exploiting change information in Rich Edit Scripts. We further
integrated the mined patterns into an automated program repair prototype,
PARFixMiner, with which we are able to correctly fix 26 bugs of the Defects4J
benchmark. Beyond this quantitative performance, we show that the mined fix
patterns are sufficiently relevant to produce patches with a high probability
of correctness: 81% of PARFixMiner's generated plausible patches are correct.
Comment: 31 pages, 11 figures
Characterizing and Subsetting Big Data Workloads
Big data benchmark suites must include a diversity of data and workloads to
be useful in fairly evaluating big data systems and architectures. However,
using truly comprehensive benchmarks poses great challenges for the
architecture community. First, we need to thoroughly understand the behaviors
of a variety of workloads. Second, our usual simulation-based research methods
become prohibitively expensive for big data. As big data is an emerging field,
more and more software stacks are being proposed to facilitate the development
of big data applications, which aggravates these challenges. In this paper, we
first use Principal Component Analysis (PCA) to identify the most important
characteristics from 45 metrics to characterize big data workloads from
BigDataBench, a comprehensive big data benchmark suite. Second, we apply a
clustering technique to the principal components obtained from the PCA to
investigate the similarity among big data workloads, and we verify the
importance of including different software stacks for big data benchmarking.
Third, we select seven representative big data workloads by removing redundant
ones and release the BigDataBench simulation version, which is publicly
available from http://prof.ict.ac.cn/BigDataBench/simulatorversion/.
Comment: 11 pages, 6 figures, 2014 IEEE International Symposium on Workload
Characterization
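The two-step idea in this abstract (reduce workload metrics with PCA, then cluster the reduced representation) can be sketched on toy data. The sketch makes several assumptions not taken from the paper: it keeps only the first principal component, finds it with power iteration, and groups workloads with a simple one-dimensional 2-means. The metric values are invented, and the abstract does not name the clustering technique actually used.

```python
import math
from typing import List


def standardize(rows: List[List[float]]) -> List[List[float]]:
    """Scale each metric (column) to zero mean and unit variance."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    stds = [
        math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / n) or 1.0
        for j in range(d)
    ]
    return [[(r[j] - means[j]) / stds[j] for j in range(d)] for r in rows]


def first_principal_component(rows: List[List[float]], iterations: int = 200) -> List[float]:
    """Approximate the top eigenvector of the covariance matrix by power iteration."""
    n, d = len(rows), len(rows[0])
    cov = [[sum(r[i] * r[j] for r in rows) / n for j in range(d)] for i in range(d)]
    v = [1.0 / math.sqrt(d)] * d  # assumed start vector, fine for this toy data
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


def two_means_1d(scores: List[float], rounds: int = 20) -> List[int]:
    """Group scalar scores into two clusters with a basic k-means (k = 2)."""
    centers = [min(scores), max(scores)]
    labels = [0] * len(scores)
    for _ in range(rounds):
        labels = [
            0 if abs(s - centers[0]) <= abs(s - centers[1]) else 1 for s in scores
        ]
        for k in (0, 1):
            members = [s for s, label in zip(scores, labels) if label == k]
            if members:
                centers[k] = sum(members) / len(members)
    return labels


# Invented metrics for six hypothetical workloads (two metrics each).
metrics = [[1, 2], [2, 4.1], [3, 6], [10, 20], [11, 22.1], [12, 24]]
scaled = standardize(metrics)
pc1 = first_principal_component(scaled)
scores = [sum(x * w for x, w in zip(row, pc1)) for row in scaled]
labels = two_means_1d(scores)
print(labels)
```

Picking one workload per resulting cluster is one plausible way to read "removing redundant ones", though the paper's actual selection procedure may differ.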