2,294 research outputs found

    A Survey of Geometric Optimization for Deep Learning: From Euclidean Space to Riemannian Manifold

    Although Deep Learning (DL) has achieved success in complex Artificial Intelligence (AI) tasks, it suffers from several well-known problems (e.g., feature redundancy and vanishing or exploding gradients), since updating parameters in Euclidean space cannot fully exploit the geometric structure of the solution space. As a promising alternative, Riemannian-based DL uses geometric optimization to update parameters on Riemannian manifolds and can leverage the underlying geometric information. Accordingly, this article presents a comprehensive survey of applying geometric optimization in DL. First, it introduces the basic procedure of geometric optimization, including various geometric optimizers and some concepts of Riemannian manifolds. Subsequently, it investigates the application of geometric optimization in different DL networks and AI tasks, e.g., convolutional neural networks, recurrent neural networks, transfer learning, and optimal transport. Additionally, typical public toolboxes that implement optimization on manifolds are discussed. Finally, the article compares the performance of different deep geometric optimization methods in image recognition scenarios. Comment: 41 pages
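
    The basic procedure of geometric optimization mentioned in this abstract can be illustrated with a minimal sketch. It is not taken from the survey: it assumes the simplest manifold (the unit sphere), a toy objective (the Rayleigh quotient of a small diagonal matrix), and hypothetical names such as `riemannian_gd_sphere`. Each step projects the Euclidean gradient onto the tangent space and then retracts the update back onto the manifold by normalization.

```python
import math

def matvec(A, x):
    """Multiply a dense matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def normalize(x):
    """Map a nonzero vector back onto the unit sphere (the retraction)."""
    norm = math.sqrt(dot(x, x))
    return [xi / norm for xi in x]

def riemannian_gd_sphere(A, x0, lr=0.1, steps=500):
    """Minimize f(x) = x^T A x subject to ||x|| = 1 (illustrative only)."""
    x = normalize(x0)
    for _ in range(steps):
        # Euclidean gradient of x^T A x for symmetric A.
        egrad = [2.0 * v for v in matvec(A, x)]
        # Riemannian gradient: remove the component along x (tangent projection).
        radial = dot(egrad, x)
        rgrad = [g - radial * xi for g, xi in zip(egrad, x)]
        # Step in the tangent direction, then retract onto the sphere.
        x = normalize([xi - lr * g for xi, g in zip(x, rgrad)])
    return x

# Toy problem: the minimizer is the eigenvector of the smallest eigenvalue (1.0).
A = [[3.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 1.0]]
x = riemannian_gd_sphere(A, [1.0, 1.0, 1.0])
rayleigh = dot(x, matvec(A, x))
print(x, rayleigh)
```

    The iterate stays on the sphere at every step, which is the property that distinguishes this update from plain gradient descent followed by a one-off projection at the end.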

    Efficient Hardware Implementation of Deep Learning Networks Based on the Convolutional Neural Network

    Image classification, speech processing, autonomous driving, and medical diagnosis have made the adoption of Deep Neural Networks (DNN) mainstream. Many deep networks such as AlexNet, GoogleNet, ResidualNet, MobileNet, YOLOv3 and Transformers have achieved immense success and popularity. However, implementing these deep and complex networks in hardware is a challenging feat. The growing demand for DNN applications in mobile devices and data centers has led researchers to explore application-specific hardware accelerators for DNNs. There have been numerous hardware- and software-based solutions to improve DNN throughput, latency, performance and accuracy. Any solution for hardware acceleration needs to optimize in a space confined by these metrics. Hardware acceleration of DNNs is a highly effective and viable solution for running them on mobile devices. The power of DNNs is now available at the edge in a compact and power-efficient form factor because of hardware acceleration. In this thesis, we introduce a novel architecture that uses a generalized method called Single Input Partial Product 2-Dimensional Convolution (SIPP2D Convolution), which calculates a 2-D convolution in a fast and expedient manner. We present the exploration designs that have culminated in SIPP2D and emphasize its benefits. The SIPP2D architecture prevents the re-fetching of input weights for the calculation of partial products. It can calculate the output for any input size and kernel size with low memory traffic while maintaining low latency and high throughput compared to other popular techniques. In addition to being compatible with any input and kernel size, the SIPP2D architecture can be modified to support any allowable stride. We describe the data flow and algorithmic modifications to SIPP2D that extend its capabilities to accommodate multi-stride convolutions. 
Supporting multi-stride convolutions is an essential feature addition to the SIPP2D architecture, increasing its versatility and network-agnostic character for convolutional DNNs. Along with architectural explorations, we have also performed research in the area of model optimization. It is widely understood that any change at the algorithmic level of the network pays significant dividends at the hardware level. Compression and optimization techniques such as pruning and quantization help reduce the size of the model while maintaining the accuracy at an acceptable level. Thus, combining techniques such as channel pruning with SIPP2D can only boost its performance. In this thesis, we examine the performance of channel-pruned SIPP2D compared to other compressed models. Traditionally, quantization of weights and inputs is used to reduce memory transfer and power consumption. However, quantizing the outputs of layers can be a challenge since the output of each layer changes with the input. In our research, we use quantization on the output of each layer for AlexNet and VGGNet-16 to analyze the effect it has on accuracy. We use the Signal-to-Quantization-Noise Ratio (SQNR) to empirically determine the integer length (IL) as well as the fractional length (FL) for the fixed-point precision that yields the lowest SQNR and highest accuracy. Based on our observations, we can report that accuracy is sensitive to fractional length as well as integer length. For AlexNet, we observe deterioration in accuracy as the word length (WL) decreases. The Top-5 accuracy goes from 77% for floating-point precision to 56% for a WL of 12 and FL of 8. The results are similar in the case of VGGNet-16. The Top-5 accuracy for VGGNet-16 decreases from 82% for floating point to 30% for a WL of 12 and FL of 8. In addition to the small word length, we observe the accuracy to be highly dependent on the integer length as well as the fractional length. 
We have also analyzed the loss that remains after post-quantization retraining. We use polynomial fitting to relate the fractional length to the drop in accuracy still sustained after retraining a quantized network. In summary, the combination of the enhanced SIPP2D architecture with compression techniques such as channel pruning and quantization is highly advantageous and conducive to widespread adoption. The SIPP2D architecture, with its flexible data flow and algorithmic modifications to support multi-stride convolutions, offers a powerful and versatile framework for deep neural networks.
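
    The fixed-point trade-off described in this abstract can be sketched as follows. This is not the thesis's code: it assumes a signed two's-complement format in which the integer length is IL = WL - FL (sign bit included), uses a synthetic sine wave as stand-in layer outputs, and the names `quantize_fixed_point` and `sqnr_db` are hypothetical. SQNR is computed as the ratio of signal power to quantization-noise power in decibels.

```python
import math

def quantize_fixed_point(values, word_length, frac_length):
    """Round to signed fixed point with saturation (assumed format)."""
    scale = 2 ** frac_length
    code_min = -(2 ** (word_length - 1))
    code_max = 2 ** (word_length - 1) - 1
    quantized = []
    for v in values:
        code = round(v * scale)  # Python's round() sends exact halves to even
        code = max(code_min, min(code_max, code))  # saturate on overflow
        quantized.append(code / scale)
    return quantized

def sqnr_db(signal, quantized):
    """Signal power over quantization-noise power, in dB."""
    signal_power = sum(s * s for s in signal)
    noise_power = sum((s - q) ** 2 for s, q in zip(signal, quantized))
    if noise_power == 0.0:
        return math.inf
    return 10.0 * math.log10(signal_power / noise_power)

# Synthetic stand-in for layer outputs, with amplitude up to 3.0.
activations = [3.0 * math.sin(0.37 * i) for i in range(200)]

WORD_LENGTH = 12
sweep = {
    fl: sqnr_db(activations, quantize_fixed_point(activations, WORD_LENGTH, fl))
    for fl in range(2, 11)
}
for fl, value in sweep.items():
    print(f"WL={WORD_LENGTH} FL={fl} IL={WORD_LENGTH - fl} SQNR={value:.1f} dB")
```

    In this toy sweep a small FL costs resolution (rounding noise), while a large FL leaves too few integer bits and the largest values saturate, which is one way to see why both lengths matter for a fixed word length.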

    Speaking Rate Effects on Normal Aspects of Articulation: Outcomes and Issues

    The articulatory effects of speaking rate have been a point of focus for a substantial literature in speech science. The normal aspects of speaking rate variation have influenced theories and models of speech production and perception in the literature pertaining to both normal and disordered speech. While the body of literature pertaining to the articulatory effects of speaking rate change is reasonably large, few speaker-general outcomes have emerged. The purpose of this paper is to review outcomes of the existing literature and address problems related to the study of speaking rate that may be germane to the recurring theme that speaking rate effects are largely idiosyncratic.

    Differentiable Quantum Architecture Search

    Quantum architecture search (QAS) is the process of automating architecture engineering of quantum circuits. It is desirable to construct a powerful and general QAS platform that can significantly accelerate current efforts to identify quantum advantages of error-prone and depth-limited quantum circuits in the NISQ era. Here, we propose a general framework of differentiable quantum architecture search (DQAS), which enables automated design of quantum circuits in an end-to-end differentiable fashion. We present several examples of circuit design problems to demonstrate the power of DQAS. For instance, unitary operations are decomposed into quantum gates, noisy circuits are re-designed to improve accuracy, and circuit layouts for the quantum approximate optimization algorithm are automatically discovered and upgraded for combinatorial optimization problems. These results not only demonstrate the vast potential of DQAS as an essential tool for NISQ application development, but also present an interesting research topic from the theoretical perspective, as it draws inspiration from the newly emerging interdisciplinary paradigms of differentiable programming, probabilistic programming, and quantum programming. Comment: 9.1 pages + Appendix, 5 figures

    Interference Alignment for Cognitive Radio Communications and Networks: A Survey

    © 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/). Interference alignment (IA) is an innovative wireless transmission strategy that has been shown to be a promising technique for achieving optimal capacity scaling of a multiuser interference channel at asymptotically high signal-to-noise ratio (SNR). Transmitters exploit the availability of multiple signaling dimensions in order to align their mutual interference at the receivers. Most of the research has focused on developing algorithms for determining alignment solutions as well as proving interference alignment's theoretical ability to achieve the maximum degrees of freedom in a wireless network. Cognitive radio, on the other hand, is a technique used to improve the utilization of the radio spectrum by opportunistically sensing and accessing unused licensed frequency spectrum, without causing harmful interference to the licensed users. With the increased deployment of wireless services, the possibility of detecting unused frequency spectrum diminishes. Thus, the concept of introducing interference alignment in cognitive radio has become a very attractive proposition. This paper provides a survey of the implementation of IA in cognitive radio under the main research paradigms, along with a summary and analysis of results under each system model. Peer reviewed

    Machine learning for multi-criteria inventory classification applied to intermittent demand

    Multi-criteria inventory classification groups inventory items into classes, each of which is managed by a specific re-order policy according to its priority. However, the tasks of inventory classification and control are not carried out jointly if the classification criteria and the classification approach are not robustly established from an inventory-cost perspective. Exhaustive simulations at the single-item level of the inventory system would directly solve this issue by searching for the best re-order policy per item, thus achieving the subsequent optimal classification without resorting to any multi-criteria classification method. However, this would be very time-consuming in real settings, where a large number of items need to be managed simultaneously. In this article, a reduction in simulation effort is achieved by extracting from the population of items a sample on which to perform an exhaustive search of the best re-order policies per item; the lowest-cost classification of in-sample items is, therefore, achieved. Then, in line with the increasing need for ICT tools in the production management of Industry 4.0 systems, supervised classifiers from the machine learning research field (i.e. support vector machines with a Gaussian kernel and deep neural networks) are trained on these in-sample items to learn to classify the out-of-sample items solely based on the values they show on the features (i.e. classification criteria). The inventory system adopted here is suitable for intermittent demands, but it may also suit non-intermittent demands, thus providing great flexibility. The experimental analysis of two large datasets showed excellent accuracy, which suggests that machine learning classifiers could be implemented in advanced inventory classification systems.

    Innovation performance and embeddedness in networks: evidence from the Ethiopian footwear cluster

    This study focuses on innovation in a cluster of informal shoemaking firms in Ethiopia - namely the Mercato footwear cluster. It examines how differently those firms are embedded in networks and how heterogeneous they are in absorptive capacity, and how this heterogeneity affects their innovation performance. Business interactions with buyers, suppliers and other producers are the major channels through which knowledge flows into the cluster. These business networks are mainly built on trust and long-term relationships and tend to be selective. The study reveals that despite homogeneity in social background the firms in the cluster behave and perform differently. Based on econometric analysis we document a positive and strong effect of local network position and absorptive capacity on innovation performance. Keywords: industrial clusters, networks, innovation performance, informal sector, Africa, Ethiopia