
    Performance analysis and optimization of automatic speech recognition

    Fast and accurate Automatic Speech Recognition (ASR) is emerging as a key application for mobile devices. Delivering ASR on such devices is challenging due to the compute-intensive nature of the problem and the power constraints of embedded systems. In this paper, we provide a performance and energy characterization of Pocketsphinx, a popular toolset for ASR that targets mobile devices. We identify the computation of the Gaussian Mixture Model (GMM) as the main bottleneck, consuming more than 80 percent of the execution time. The CPI stack analysis shows that branches and main memory accesses are the main performance limiting factors for GMM computation. We propose several software-level optimizations driven by the power/performance analysis. Unlike previous proposals that trade accuracy for performance by reducing the number of Gaussians evaluated, we maintain accuracy and improve performance by effectively using the underlying CPU microarchitecture. First, we use a refactored implementation of the innermost loop of the GMM evaluation code to ameliorate the impact of branches. Second, we exploit the vector unit available on most modern CPUs to boost GMM computation, introducing a novel memory layout for storing the means and variances of the Gaussians in order to maximize the effectiveness of vectorization. Third, we compute the Gaussians for multiple frames in parallel, so means and variances can be fetched once into the on-chip caches and reused across multiple frames, significantly reducing memory bandwidth usage. We evaluate our optimizations using both hardware counters on real CPUs and simulations. Our experimental results show that the proposed optimizations provide a 2.68x speedup over the baseline Pocketsphinx decoder on a high-end Intel Skylake CPU, while achieving 61 percent energy savings. On a modern ARM Cortex-A57 mobile processor our techniques improve performance by 1.85x, while providing 59 percent energy savings without any loss in the accuracy of the ASR system.
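    A minimal illustrative sketch (not the paper's code) of the batching idea described above: evaluating diagonal-covariance GMM log-likelihoods for several frames at once, so the contiguously stored means and variances are loaded once and reused across frames. All names and shapes here are assumptions for illustration.

```python
# Sketch only: batched, vectorized GMM log-likelihood evaluation.
import numpy as np

def gmm_log_likelihoods(frames, means, inv_vars, log_weights):
    """frames: (F, D) feature frames; means, inv_vars: (G, D) per-Gaussian
    parameters stored contiguously; log_weights: (G,) mixture log-weights.
    Returns the (F,) mixture log-likelihood of each frame (up to a constant,
    omitted for brevity)."""
    # (F, 1, D) - (1, G, D) -> (F, G, D): all frame/Gaussian pairs in one pass,
    # with a branch-free inner computation
    diff = frames[:, None, :] - means[None, :, :]
    log_dens = log_weights - 0.5 * np.einsum('fgd,gd,fgd->fg', diff, inv_vars, diff)
    # log-sum-exp over the mixture components
    m = log_dens.max(axis=1, keepdims=True)
    return (m + np.log(np.exp(log_dens - m).sum(axis=1, keepdims=True))).ravel()

# Example usage with random data (F frames, G Gaussians, D dimensions)
F, G, D = 8, 128, 39
rng = np.random.default_rng(0)
scores = gmm_log_likelihoods(rng.standard_normal((F, D)),
                             rng.standard_normal((G, D)),
                             np.abs(rng.standard_normal((G, D))) + 0.1,
                             np.full(G, -np.log(G)))
print(scores.shape)  # (8,)
```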

    MDA-Based Reverse Engineering


    Locating bugs without looking back

    Bug localisation is a core program comprehension task in software maintenance: given the observation of a bug, e.g. via a bug report, where is it located in the source code? Information retrieval (IR) approaches see the bug report as the query, and the source code files as the documents to be retrieved, ranked by relevance. Such approaches have the advantage of not requiring expensive static or dynamic analysis of the code. However, current state-of-the-art IR approaches rely on project history, in particular previously fixed bugs or previous versions of the source code. We present a novel approach that directly scores each current file against the given report, thus not requiring past code and reports. The scoring method is based on heuristics identified through manual inspection of a small sample of bug reports. We compare our approach to eight others, using their own five metrics on their own six open source projects. Out of 30 performance indicators, we improve 27 and equal 2. Over the projects analysed, on average we find one or more affected files in the top 10 ranked files for 76% of the bug reports. These results show the applicability of our approach to software projects without history.
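    A rough sketch of the IR view of bug localisation described above (not the paper's actual heuristics): treat the bug report as a query and rank source files by a simple lexical-overlap score. The file names and tokenisation below are hypothetical placeholders.

```python
# Sketch only: rank source files against a bug report by lexical overlap.
import re
from collections import Counter

def tokens(text):
    # split identifiers and prose into lowercase word tokens
    return [t.lower() for t in re.findall(r'[A-Za-z]+', text)]

def rank_files(bug_report, files):
    """files: dict of {path: source text}; returns paths sorted by score."""
    query = Counter(tokens(bug_report))
    scores = {}
    for path, src in files.items():
        doc = Counter(tokens(src))
        # score = number of shared token occurrences (a crude relevance proxy)
        scores[path] = sum(min(query[w], doc[w]) for w in query)
    return sorted(scores, key=scores.get, reverse=True)

files = {"parser.py": "def parse_header(line): raise ValueError('bad header')",
         "ui.py": "def draw_button(): pass"}
print(rank_files("Crash: ValueError bad header when parsing", files)[:1])  # ['parser.py']
```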

    Naming the Identified Feature Implementation Blocks from Software Source Code

    Identifying the software identifiers that implement a particular feature of a software product is known as feature identification. Feature identification is one of the most critical and common processes performed by software engineers during software maintenance. However, a meaningful name must be assigned to the Identified Feature Implementation Block (IFIB) to complete the feature identification process. The feature naming process remains a challenging task, as the majority of existing approaches assign names to IFIBs manually. In this paper, an approach called FeatureClouds was proposed, which software developers can use to name IFIBs from software code. The FeatureClouds approach incorporates the word-cloud visualization technique to name Feature Blocks (FBs) using the most frequent words across these blocks. FeatureClouds was evaluated by assessing its added benefit over current approaches in the literature, which provide limited tool support for software developers to determine feature names for IFIBs. For validation, FeatureClouds was applied to the draw shapes and ArgoUML software. The findings showed that the proposed approach achieved promising results according to the well-known metrics of Precision and Recall.
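    An illustrative sketch of the word-frequency idea behind the approach described above (an assumed simplification, not the FeatureClouds tool itself): name an identified feature implementation block by the most frequent words split out of its identifiers.

```python
# Sketch only: suggest a name for a feature block from identifier word frequencies.
import re
from collections import Counter

def split_identifier(name):
    # split camelCase and snake_case identifiers into lowercase words
    parts = re.sub(r'([a-z0-9])([A-Z])', r'\1 \2', name).replace('_', ' ').split()
    return [p.lower() for p in parts]

def suggest_name(identifiers, top_n=3):
    counts = Counter(w for ident in identifiers for w in split_identifier(ident))
    return ' '.join(w for w, _ in counts.most_common(top_n))

block = ["savePdfFile", "exportPdfDocument", "pdf_writer", "openFileDialog"]
print(suggest_name(block))  # "pdf file save"
```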

    A model of software component interactions using the call graph technique

    Interaction information that relates to operations between components is important, especially when a program needs to be modified and maintained, because the affected components must be identified and matched against the requirements of the system. This information can be obtained through code review, which requires an analyst to search for specific information in the source code and is a very time-consuming process. This research proposed a model for representing software component interactions in which the interaction information is automatically extracted from the source code, providing an effective display of component interactions. The objective was achieved by applying a design science research methodology consisting of five phases: awareness of the problem, suggestion, development, evaluation, and conclusion. The development phase automatically extracted the components' interaction information using appropriate reverse engineering tools and supporting programs developed in this research. These tools were used to extract software information, extract component-interaction information from programs, and transform it into the proposed model, which takes the form of a call graph. The produced model was evaluated using a visualization tool and by expert review. The visualization tool was used to display the call graph from a text format as a graphical view, and the model evaluation was conducted through an expert review technique. The findings from the model evaluation show that the produced model can be used and manipulated to visualize the component interactions. It provides a process that allows analysts to view the interactions of software components in order to comprehend the component integrations involved. This information can be manipulated to improve program comprehension, especially for other software maintenance purposes.
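    A rough sketch of extracting a call graph from source code (an assumed, language-specific stand-in for the reverse engineering tools used in the study), producing caller-to-callee edges that a visualization tool could then render.

```python
# Sketch only: build caller -> callee edges from Python source with the ast module.
import ast

def call_graph(source):
    tree = ast.parse(source)
    edges = set()
    for func in [n for n in ast.walk(tree) if isinstance(n, ast.FunctionDef)]:
        for node in ast.walk(func):
            # record direct calls to named functions made inside this function
            if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
                edges.add((func.name, node.func.id))
    return sorted(edges)

code = """
def load(path):
    return parse(read(path))

def parse(text):
    return text.split()
"""
print(call_graph(code))  # [('load', 'parse'), ('load', 'read')]
```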

    Highly-cited papers in software engineering: The top 100

    Context: According to the search reported in this paper, as of this writing (May 2015), a very large number of papers (more than 70,000) have been published in the area of Software Engineering (SE) since its inception in 1968. Citations are crucial in any research area to position the work and to build on the work of others. Identification and characterization of highly-cited papers are common and are regularly reported in various disciplines. Objective: The objective of this study is to identify the papers in the area of SE that have influenced others the most, as measured by citation count. Studying highly-cited SE papers helps researchers to see the types of approaches and research methods presented and applied in such papers, so as to learn from them and write higher-quality papers that are more likely to receive high citations. Method: To achieve the above objective, we conducted a study, comprising five research questions, to identify and classify the top-100 highly-cited SE papers in terms of two metrics: total number of citations and average annual number of citations. Results: By total number of citations, the top paper is "A metrics suite for object-oriented design", cited 1,817 times and published in 1994. By average annual number of citations, the top paper is "QoS-aware middleware for Web services composition", cited 154.2 times on average annually and published in 2004. Conclusion: It is concluded that it is important to identify the highly-cited SE papers and also to characterize the overall citation landscape in the SE field. We hope that this paper will encourage further discussion in the SE community towards further analysis and formal characterization of highly-cited SE papers. Vahid Garousi was partially supported by several internal grants provided by Hacettepe University. The authors would like to thank the anonymous reviewers for their insightful comments.
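    A back-of-the-envelope sketch of the two ranking metrics used in the study. The exact citation window and total count in the example are assumptions inferred from the abstract's figure of 154.2 citations per year for a 2004 paper, not values stated in the abstract.

```python
# Sketch only: total citations vs. citations averaged over years since publication.
def avg_annual_citations(total_citations, year_published, year_counted=2015):
    return total_citations / (year_counted - year_published)

# A 2004 paper averaging ~154.2 citations/year over an assumed 11-year window
# (2004-2015) would have accumulated roughly 154.2 * 11 ≈ 1,696 citations.
print(round(avg_annual_citations(1696, 2004), 1))  # 154.2
```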

    Finding strong lenses in CFHTLS using convolutional neural networks

    We train and apply convolutional neural networks, a machine learning technique developed to learn from and classify image data, to Canada-France-Hawaii Telescope Legacy Survey (CFHTLS) imaging for the identification of potential strong lensing systems. An ensemble of four convolutional neural networks was trained on images of simulated galaxy-galaxy lenses. The training sets consisted of a total of 62,406 simulated lenses and 64,673 non-lens negative examples generated with two different methodologies. The networks were able to learn the features of simulated lenses with an accuracy of up to 99.8% and a purity and completeness of 94-100% on a test set of 2000 simulations. An ensemble of trained networks was applied to all 171 square degrees of the CFHTLS wide field image data, identifying 18,861 candidates including 63 known and 139 other potential lens candidates. A second search of 1.4 million early-type galaxies selected from the survey catalog as potential deflectors identified 2,465 candidates including 117 previously known lens candidates, 29 confirmed lenses/high-quality lens candidates, 266 novel probable or potential lenses and 2,097 candidates we classify as false positives. For the catalog-based search we estimate a completeness of 21-28% with respect to detectable lenses and a purity of 15%, with a false-positive rate of 1 in 671 images tested. We predict that a human astronomer reviewing candidates produced by the system would identify ~20 probable lenses and 100 possible lenses per hour in a sample selected by the robot. Convolutional neural networks are therefore a promising tool for use in the search for lenses in current and forthcoming surveys such as the Dark Energy Survey and the Large Synoptic Survey Telescope. Comment: 16 pages, 8 figures. Accepted by MNRAS.
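    A minimal sketch of a convolutional network for binary lens/non-lens image classification, included only to illustrate the kind of classifier described above; it is not the architecture used in the paper, and the input size and layer widths are arbitrary assumptions.

```python
# Sketch only: a small CNN producing a single lens/non-lens logit per image cutout.
import torch
import torch.nn as nn

class LensClassifier(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2))
        self.classifier = nn.Sequential(
            nn.Flatten(), nn.Linear(32 * 11 * 11, 64), nn.ReLU(),
            nn.Linear(64, 1))  # single logit: lens vs. non-lens

    def forward(self, x):
        return self.classifier(self.features(x))

# one grayscale 44x44 postage stamp -> probability of being a lens
model = LensClassifier()
stamp = torch.randn(1, 1, 44, 44)
prob = torch.sigmoid(model(stamp))
print(prob.shape)  # torch.Size([1, 1])
```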