
    A transistor operations model for deep learning energy consumption scaling law

    Deep Neural Networks (DNNs) have transformed the automation of a wide range of industries and are increasingly ubiquitous in society. The high complexity of DNN models and their widespread adoption have led to global energy consumption doubling every 3-4 months. Current energy consumption measures largely monitor system-wide consumption or make linear assumptions about DNN models. The former approach captures other, unrelated energy consumption anomalies, whilst the latter does not accurately reflect nonlinear computations. In this paper, we are the first to develop a bottom-up Transistor Operations (TOs) approach to expose the role of nonlinear activation functions and neural network structure. As there will be inevitable energy measurement errors at the core level, we statistically model the energy scaling laws as opposed to absolute consumption values. We offer models for both feedforward DNNs and convolutional neural networks (CNNs) on a variety of data sets and hardware configurations, achieving 93.6%-99.5% precision. This outperforms existing FLOPs-based methods, and our TOs method can be further extended to other DNN models. European Union funding: 77830
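    The core idea above — that a nonlinear activation costs far more at the transistor level than a FLOPs count suggests — can be sketched as follows. This is an illustrative sketch only: the paper's actual TOs model is statistical and hardware-aware, and the per-activation costs below are made-up placeholders, not values from the paper.

```python
# Contrast a plain FLOPs count with a "transistor operations"-style
# count that weights nonlinear activation functions more heavily.

def flops_dense(n_in, n_out):
    """Multiply-accumulate FLOPs for one fully connected layer."""
    return 2 * n_in * n_out

def tos_dense(n_in, n_out, activation="relu"):
    """FLOP-equivalent work plus an assumed per-unit activation cost."""
    # Hypothetical relative costs -- a sigmoid/tanh needs many more
    # transistor-level operations than a ReLU's single comparison.
    act_cost = {"relu": 1, "sigmoid": 50, "tanh": 60}[activation]
    return 2 * n_in * n_out + act_cost * n_out

layers = [(784, 256), (256, 64), (64, 10)]
flops = sum(flops_dense(i, o) for i, o in layers)
tos_sigmoid = sum(tos_dense(i, o, "sigmoid") for i, o in layers)
print(flops, tos_sigmoid)  # the two counts diverge as width grows
```

    Under these assumed weights, two networks with identical FLOPs but different activations get different TOs counts, which is the nonlinearity the abstract says linear models miss.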

    Deep learning methods for solving linear inverse problems: Research directions and paradigms

    The linear inverse problem is fundamental to the development of various scientific areas. Innumerable attempts have been made to solve different variants of the linear inverse problem in different applications. Nowadays, the rapid development of deep learning provides a fresh perspective on solving the linear inverse problem, with various well-designed network architectures achieving state-of-the-art performance in many applications. In this paper, we present a comprehensive survey of recent progress in the development of deep learning for solving various linear inverse problems. We review how deep learning methods are used to solve different linear inverse problems, and explore structured neural network architectures that incorporate knowledge used in traditional methods. Furthermore, we identify open challenges and potential future directions along this research line.
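    For concreteness, a minimal instance of the linear inverse problem y = Ax + n, solved with Tikhonov (ridge) regularization — the classical baseline that the deep-learning approaches surveyed here aim to improve upon. All dimensions and the regularization weight are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 30, 50                              # under-determined: fewer measurements than unknowns
A = rng.standard_normal((m, n))            # forward (measurement) operator
x_true = np.zeros(n)
x_true[[3, 17, 41]] = [1.0, -2.0, 0.5]     # a sparse ground-truth signal
y = A @ x_true + 0.01 * rng.standard_normal(m)  # noisy measurements

# Tikhonov-regularized least squares: argmin ||Ax - y||^2 + lam * ||x||^2
lam = 0.1
x_hat = np.linalg.solve(A.T @ A + lam * np.eye(n), A.T @ y)
residual = np.linalg.norm(A @ x_hat - y)
```

    Structured deep networks for this problem often unroll exactly this kind of iteration, replacing the hand-chosen regularizer with learned components.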

    Enhancing Diffusion Models with Text-Encoder Reinforcement Learning

    Text-to-image diffusion models are typically trained to optimize the log-likelihood objective, which presents challenges in meeting specific requirements for downstream tasks, such as image aesthetics and image-text alignment. Recent research addresses this issue by refining the diffusion U-Net using human rewards through reinforcement learning or direct backpropagation. However, much of it overlooks the importance of the text encoder, which is typically pretrained and kept fixed during training. In this paper, we demonstrate that by finetuning the text encoder through reinforcement learning, we can enhance the text-image alignment of the results, thereby improving the visual quality. Our primary motivation comes from the observation that the current text encoder is suboptimal, often requiring careful prompt adjustment. While fine-tuning the U-Net can partially improve performance, it still suffers from the suboptimal text encoder. Therefore, we propose to use reinforcement learning with low-rank adaptation to finetune the text encoder based on task-specific rewards, a method referred to as TexForce. We first show that finetuning the text encoder can improve the performance of diffusion models. Then, we illustrate that TexForce can simply be combined with existing finetuned U-Net models to obtain much better results without additional training. Finally, we showcase the adaptability of our method in diverse applications, including the generation of high-quality face and hand images.
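    The low-rank adaptation mechanism the abstract relies on can be sketched on a single linear map. In TexForce this idea is applied to the text encoder's weights and the low-rank factors are updated from task-specific rewards; the sketch below only shows the adapter structure itself, with arbitrary dimensions.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out, r, alpha = 64, 32, 4, 8

W = rng.standard_normal((d_out, d_in))     # frozen pretrained weight
A = 0.01 * rng.standard_normal((r, d_in))  # trainable low-rank down-projection
B = np.zeros((d_out, r))                   # trainable up-projection, zero-initialized

def adapted_forward(x):
    # frozen path plus a scaled low-rank update: W x + (alpha / r) * B A x
    return W @ x + (alpha / r) * (B @ (A @ x))

x = rng.standard_normal(d_in)
# Zero-initialized B makes the adapter an exact no-op before training,
# so finetuning starts from the pretrained model's behavior.
print(np.allclose(adapted_forward(x), W @ x))
```

    Because only A and B are trained, the adapter adds r * (d_in + d_out) parameters per layer rather than d_in * d_out, which is what makes reward-driven finetuning of a large text encoder tractable.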

    Dynamic hypergraph convolutional network for no-reference point cloud quality assessment

    With the rapid advancement of three-dimensional (3D) sensing technology, the point cloud has emerged as one of the most important approaches for representing 3D data. However, quality degradation inevitably occurs during the acquisition, transmission, and processing of point clouds. Therefore, point cloud quality assessment (PCQA) with automatic visual quality perception is particularly critical. In the literature, graph convolutional networks (GCNs) have achieved a certain level of performance in point cloud-related tasks. However, they cannot fully characterize the nonlinear high-order relationships of such complex data. In this paper, we propose a novel no-reference (NR) PCQA method based on hypergraph learning. Specifically, a dynamic hypergraph convolutional network (DHCN), composed of a projected image encoder, a point group encoder, a dynamic hypergraph generator, and a perceptual quality predictor, is devised. First, the projected image encoder and the point group encoder extract feature representations from projected images and point groups, respectively. Then, using the feature representations obtained by the two encoders, dynamic hypergraphs are generated during each iteration, aiming to constantly update the interactive information between the vertices of the hypergraphs. Finally, we design the perceptual quality predictor to conduct quality reasoning on the generated hypergraphs. By leveraging the interactive information among hypergraph vertices, feature representations are well aggregated, resulting in a notable improvement in the accuracy of quality prediction. Experimental results on several point cloud quality assessment databases demonstrate that our proposed DHCN achieves state-of-the-art performance. The code will be available at: https://github.com/chenwuwq/DHCN
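    The high-order relationships the abstract contrasts with GCNs come from hyperedges that connect more than two vertices at once. A minimal sketch of one standard hypergraph convolution layer (the common incidence-matrix formulation, not necessarily the exact layer used in DHCN, whose hypergraphs are also regenerated dynamically each iteration):

```python
import numpy as np

def hypergraph_conv(H, X, Theta):
    """One hypergraph convolution layer with ReLU.

    H     : (num_vertices, num_edges) binary incidence matrix
    X     : (num_vertices, feat_in) vertex features
    Theta : (feat_in, feat_out) learnable weights
    """
    Dv = H.sum(axis=1)                       # vertex degrees
    De = H.sum(axis=0)                       # hyperedge degrees
    Dv_is = np.diag(1.0 / np.sqrt(Dv))
    De_inv = np.diag(1.0 / De)
    # Normalized propagation: Dv^{-1/2} H De^{-1} H^T Dv^{-1/2}
    L = Dv_is @ H @ De_inv @ H.T @ Dv_is
    return np.maximum(L @ X @ Theta, 0.0)    # ReLU

# Toy example: 4 vertices, 2 hyperedges (each spanning 3 vertices)
H = np.array([[1, 0],
              [1, 1],
              [0, 1],
              [1, 1]], dtype=float)
X = np.eye(4)                                # one-hot vertex features
Theta = np.eye(4)                            # identity weights for clarity
out = hypergraph_conv(H, X, Theta)
```

    Because each hyperedge mixes all of its member vertices in one step, a single layer already aggregates group-level (rather than pairwise) interactions.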

    Scarce data driven deep learning of drones via generalized data distribution space

    Increased drone proliferation in civilian and professional settings has created new threat vectors for airports and national infrastructures. The economic damage for a single major airport from drone incursions is estimated to be millions per day. Due to the lack of balanced representation in drone data, training accurate deep learning drone detection algorithms under scarce data is an open challenge. Existing methods largely rely on collecting diverse and comprehensive experimental drone footage data, artificially induced data augmentation, transfer and meta-learning, as well as physics-informed learning. However, these methods cannot guarantee that diverse drone designs are captured or that the deep feature space of drones is fully understood. Here, we show how understanding the general distribution of the drone data via a generative adversarial network (GAN), and explaining the under-learned data features using topological data analysis (TDA), can allow us to acquire under-represented data to achieve rapid and more accurate learning. We demonstrate our results on a drone image dataset, which contains both real drone images and simulated images from computer-aided design. When compared to random, tag-informed and expert-informed data collections (discriminator accuracy of 94.67%, 94.53% and 91.07%, respectively, after 200 epochs), our proposed GAN-TDA-informed data collection method offers a significant 4% improvement (99.42% after 200 epochs). We believe that this approach of exploiting general data distribution knowledge from neural networks can be applied to a wide range of scarce data open challenges.
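    The selection step at the heart of this pipeline can be sketched very simply. This is a hypothetical stand-in: the scores below are toy numbers, not outputs of the paper's GAN discriminator or its TDA analysis, and the real method uses TDA to explain which features are under-learned rather than a bare score ranking.

```python
import numpy as np

def select_underrepresented(scores, k):
    """Pick the k candidates the discriminator is least confident about.

    A low realism score is used here as a rough proxy for a sample
    lying in a poorly learned region of the data distribution.
    """
    return np.argsort(scores)[:k]

# Toy discriminator scores for 5 candidate images
scores = np.array([0.90, 0.20, 0.70, 0.10, 0.50])
picked = select_underrepresented(scores, 2)
print(picked)  # indices of the two lowest-scoring candidates: [3 1]
```

    Collecting or synthesizing more data at exactly these indices is the "informed collection" being compared against random, tag-informed, and expert-informed baselines.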