    Visual Search at eBay

    In this paper, we propose a novel end-to-end approach for scalable visual search infrastructure. We discuss the challenges we faced with a massive, volatile inventory like eBay's and present our solution to overcome them. We harness the large image collection of eBay listings and state-of-the-art deep learning techniques to perform visual search at scale. Two supervised approaches are key to scaling up without compromising accuracy and precision: optimized search limited to the top predicted categories, and compact binary signatures. Both use a common deep neural network requiring only a single forward inference. The system architecture is presented with in-depth discussions of its basic components and of optimizations for the trade-off between search relevance and latency. This solution is currently deployed in a distributed cloud infrastructure and fuels visual search in eBay ShopBot and Close5. We show a benchmark on the ImageNet dataset on which our approach is faster and more accurate than several unsupervised baselines. We share our learnings with the hope that visual search becomes a first-class citizen for all large-scale search engines rather than an afterthought. Comment: To appear in 23rd SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), 2017. A demonstration video can be found at https://youtu.be/iYtjs32vh4
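To make the two scaling ideas in the abstract above concrete, here is a minimal sketch of ranking by Hamming distance over binary signatures while scanning only the predicted categories. It is an illustration under stated assumptions, not the system described in the paper: the index layout, the 8-bit signatures, the listing IDs, and the function names `hamming` and `search` are all hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical index layout: category name -> list of (listing_id, signature).
# Signatures are stored as Python ints; real signatures would be much longer.
Index = Dict[str, List[Tuple[str, int]]]


def hamming(a: int, b: int) -> int:
    """Count the bits that differ between two binary signatures."""
    return bin(a ^ b).count("1")


def search(index: Index, query_sig: int, top_categories: List[str],
           k: int = 3) -> List[Tuple[int, str]]:
    """Rank listings by Hamming distance, scanning only the given categories."""
    candidates = []
    for category in top_categories:
        for listing_id, sig in index.get(category, []):
            candidates.append((hamming(query_sig, sig), listing_id))
    # Smallest distance first; ties are broken by listing ID.
    return sorted(candidates)[:k]


# Made-up toy data for illustration only.
index = {
    "shoes": [("s1", 0b10110010), ("s2", 0b10110011)],
    "watches": [("w1", 0b01001100)],
    "bags": [("b1", 0b10110010)],
}

# Assume a classifier predicted "shoes" and "watches" as the top categories,
# so the "bags" partition is never scanned even though b1 matches exactly.
results = search(index, 0b10110010, ["shoes", "watches"], k=2)
print(results)
```

Restricting the scan to predicted categories trades some recall (the exact match in "bags" is missed here) for a smaller candidate set, which is the relevance-versus-latency trade-off the abstract mentions.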

    A Survey of Prediction and Classification Techniques in Multicore Processor Systems

    In multicore processor systems, being able to accurately predict the future provides new optimization opportunities that otherwise could not be exploited. For example, an oracle able to predict the behavior of a certain application running on a smartphone could direct the power manager to switch to appropriate dynamic voltage and frequency scaling modes that would guarantee minimum levels of desired performance while reducing energy consumption and thereby prolonging battery life. Using predictions enables systems to become proactive rather than continue to operate in a reactive manner. This prediction-based proactive approach has become increasingly popular in the design and optimization of integrated circuits and of multicore processor systems. Prediction has evolved from simple forecasting to sophisticated machine-learning-based prediction and classification that learns from existing data, employs data mining, and predicts future behavior. This can be exploited by novel optimization techniques that span all layers of the computing stack. In this survey paper, we present a discussion of the most popular techniques for prediction and classification in the general context of computing systems, with emphasis on multicore processors. The paper is far from comprehensive, but it will help the reader interested in employing prediction in the optimization of multicore processor systems.

    Performance analysis and optimization of automatic speech recognition

    © 2018 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
    Fast and accurate Automatic Speech Recognition (ASR) is emerging as a key application for mobile devices. Delivering ASR on such devices is challenging due to the compute-intensive nature of the problem and the power constraints of embedded systems. In this paper, we provide a performance and energy characterization of Pocketsphinx, a popular toolset for ASR that targets mobile devices. We identify the computation of the Gaussian Mixture Model (GMM) as the main bottleneck, consuming more than 80 percent of the execution time. The CPI stack analysis shows that branches and main memory accesses are the main performance-limiting factors for GMM computation. We propose several software-level optimizations driven by the power/performance analysis. Unlike previous proposals that trade accuracy for performance by reducing the number of Gaussians evaluated, we maintain accuracy and improve performance by effectively using the underlying CPU microarchitecture. First, we use a refactored implementation of the innermost loop of the GMM evaluation code to ameliorate the impact of branches. Second, we exploit the vector unit available on most modern CPUs to boost GMM computation, introducing a novel memory layout for storing the means and variances of the Gaussians in order to maximize the effectiveness of vectorization. Third, we compute the Gaussians for multiple frames in parallel, so means and variances can be fetched once into the on-chip caches and reused across multiple frames, significantly reducing memory bandwidth usage.
We evaluate our optimizations using both hardware counters on real CPUs and simulations. Our experimental results show that the proposed optimizations provide a 2.68x speedup over the baseline Pocketsphinx decoder on a high-end Intel Skylake CPU, while achieving 61 percent energy savings. On a modern ARM Cortex-A57 mobile processor, our techniques improve performance by 1.85x, while providing 59 percent energy savings without any loss in the accuracy of the ASR system. Peer reviewed. Postprint (author's final draft).
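The third optimization in the abstract above (scoring several frames against each Gaussian before moving to the next) can be illustrated with a small loop-order sketch. This is not the Pocketsphinx code or the authors' implementation: it assumes diagonal-covariance Gaussians, and the function names and toy numbers are made up. Pure Python will not show the cache or vectorization gains; the sketch only shows that the reordered loop yields the same scores.

```python
import math
from typing import List, Sequence


def log_gaussian(frame: Sequence[float], mean: Sequence[float],
                 var: Sequence[float]) -> float:
    """Log density of one diagonal-covariance Gaussian at one feature frame."""
    total = 0.0
    for x, m, v in zip(frame, mean, var):
        total += -0.5 * (math.log(2.0 * math.pi * v) + (x - m) ** 2 / v)
    return total


def logsumexp(values: Sequence[float]) -> float:
    """Numerically stable log(sum(exp(v))) over the mixture components."""
    peak = max(values)
    return peak + math.log(sum(math.exp(v - peak) for v in values))


def gmm_scores_per_frame(frames, weights, means, variances) -> List[float]:
    """Frame-major baseline: every frame walks over all Gaussians again."""
    scores = []
    for f in frames:
        terms = [math.log(w) + log_gaussian(f, m, v)
                 for w, m, v in zip(weights, means, variances)]
        scores.append(logsumexp(terms))
    return scores


def gmm_scores_batched(frames, weights, means, variances) -> List[float]:
    """Gaussian-major order: each mean/variance pair is read once per batch
    and applied to every frame before the next Gaussian is touched."""
    per_frame = [[] for _ in frames]
    for w, m, v in zip(weights, means, variances):
        log_w = math.log(w)
        for i, f in enumerate(frames):
            per_frame[i].append(log_w + log_gaussian(f, m, v))
    return [logsumexp(terms) for terms in per_frame]


# Made-up toy mixture: 2 Gaussians over 2-dimensional features, 3 frames.
weights = [0.4, 0.6]
means = [[0.0, 0.0], [1.0, -1.0]]
variances = [[1.0, 1.0], [0.5, 2.0]]
frames = [[0.1, -0.2], [0.9, -1.1], [2.0, 2.0]]

baseline = gmm_scores_per_frame(frames, weights, means, variances)
batched = gmm_scores_batched(frames, weights, means, variances)
```

Because only the loop order changes, both functions evaluate the same set of Gaussians, which mirrors the abstract's point that accuracy is preserved.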

    A physical model suggests that hip-localized balance sense in birds improves state estimation in perching: implications for bipedal robots

    In addition to a vestibular system, birds uniquely have a balance-sensing organ within the pelvis, called the lumbosacral organ (LSO). The LSO is well developed in terrestrial birds, possibly to facilitate balance control in perching and terrestrial locomotion. No previous studies have quantified the functional benefits of the LSO for balance. We suggest two main benefits of hip-localized balance sense: reduced sensorimotor delay and improved estimation of foot-ground acceleration. We used system identification to test the hypothesis that hip-localized balance sense improves estimates of foot acceleration compared to a head-localized sense, due to closer proximity to the feet. We built a physical model of a standing guinea fowl perched on a platform, and used 3D accelerometers at the hip and head to replicate balance sense by the LSO and vestibular systems. The horizontal platform was attached to the end effector of a 6 DOF robotic arm, allowing us to apply perturbations to the platform analogous to motions of a compliant branch. We also compared state estimation between models with low and high neck stiffness. Cross-correlations revealed that foot-to-hip sensing delays were shorter than foot-to-head, as expected. We used multi-variable output error state-space (MOESP) system identification to estimate foot-ground acceleration as a function of hip- and head-localized sensing, individually and combined. Hip-localized sensors alone provided the best state estimates, which were not improved when fused with head-localized sensors. However, estimates from head-localized sensors improved with higher neck stiffness. Our findings support the hypothesis that hip-localized balance sense improves the speed and accuracy of foot state estimation compared to head-localized sense. The findings also suggest a role of neck muscles for active sensing for balance control: increased neck stiffness through muscle co-contraction can improve the utility of vestibular signals. 
Our engineering approach provides, to our knowledge, the first quantitative evidence for functional benefits of the LSO balance sense in birds. The findings support notions of control modularity in birds, with preferential use of vestibular sense for head stability and gaze, and of the LSO for body balance control. The findings also suggest advantages of distributed and active sensing for agile locomotion in compliant bipedal robots.
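The cross-correlation comparison of foot-to-hip and foot-to-head sensing delays described in the abstract above can be sketched as a search for the lag that maximizes the correlation between two signals. This is a simplified illustration, not the authors' analysis pipeline: the function name `estimate_delay` is hypothetical, the correlation is unnormalized, and the signals are synthetic pulses invented for the example, not measured data.

```python
from typing import Sequence


def estimate_delay(reference: Sequence[float], delayed: Sequence[float],
                   max_lag: int) -> int:
    """Return the lag in samples, between 0 and max_lag, at which the
    unnormalized cross-correlation of the two signals is largest."""
    n = min(len(reference), len(delayed))
    best_lag, best_score = 0, float("-inf")
    for lag in range(max_lag + 1):
        # Slide the delayed signal back by `lag` samples and correlate.
        score = sum(reference[i] * delayed[i + lag] for i in range(n - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def shift(signal: Sequence[float], samples: int) -> list:
    """Delay a signal by a whole number of samples, padding with zeros."""
    return [0.0] * samples + list(signal[:len(signal) - samples])


# Synthetic perturbation pulse at the "foot", with made-up sensor delays:
# 1 sample to the "hip" sensor and 3 samples to the "head" sensor.
foot = [0.0, 0.0, 1.0, 3.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
hip = shift(foot, 1)
head = shift(foot, 3)

hip_delay = estimate_delay(foot, hip, max_lag=5)
head_delay = estimate_delay(foot, head, max_lag=5)
print(hip_delay, head_delay)
```

With real accelerometer traces, one would typically remove the mean and normalize the correlation before comparing lags; those steps are omitted here to keep the sketch short.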