20 research outputs found

    Exploiting Symmetric Temporally Sparse BPTT for Efficient RNN Training

    Get PDF
    Recurrent Neural Networks (RNNs) are useful in temporal sequence tasks. However, training RNNs involves dense matrix multiplications, which require hardware that can support a large number of arithmetic operations and memory accesses. Implementing online training of RNNs on the edge calls for algorithms optimized for efficient deployment on hardware. Inspired by the spiking neuron model, the Delta RNN exploits temporal sparsity during inference by skipping updates of hidden states for inactivated neurons, i.e., those whose change in activation across two timesteps falls below a defined threshold. This work describes a training algorithm for Delta RNNs that exploits temporal sparsity in the backward propagation phase to reduce the computational requirements of training on the edge. Because the computation graphs of forward and backward propagation are symmetric during training, the gradient computation of inactivated neurons can be skipped. Results show a reduction of ∼80% in matrix operations for training a 56k-parameter Delta LSTM on the Fluent Speech Commands dataset with negligible accuracy loss. Logic simulations of a hardware accelerator designed for the training algorithm show a 2-10X speedup in matrix computations for an activation sparsity range of 50%-90%. Additionally, we show that the proposed Delta RNN training will be useful for online incremental learning on edge devices with limited computing resources. (Accepted by the 38th Annual AAAI Conference on Artificial Intelligence, AAAI-24.)
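
    The delta-threshold rule described in the abstract can be made concrete. Below is a minimal Python sketch (names, shapes, and the 0.1 threshold are illustrative assumptions, not taken from the paper): only neurons whose activation has moved by at least the threshold since the value they last transmitted produce a nonzero delta, so a dense matrix product only needs the corresponding columns, and by the forward/backward symmetry the same mask skips those columns' gradient work during BPTT.

        import numpy as np

        def delta_gate(h_new, h_last_sent, threshold=0.1):
            # Neurons whose activation moved less than the threshold since the
            # value they last transmitted are "inactivated" and skipped.
            delta = h_new - h_last_sent
            active = np.abs(delta) >= threshold
            sparse_delta = np.where(active, delta, 0.0)
            h_last_sent = np.where(active, h_new, h_last_sent)
            return sparse_delta, active, h_last_sent

        rng = np.random.default_rng(0)
        W = rng.standard_normal((4, 4))
        h_sent = np.zeros(4)
        d, mask, h_sent = delta_gate(rng.standard_normal(4) * 0.2, h_sent)
        # Only columns of W where the mask is True are touched; inactive
        # neurons contribute exact zeros, so the sparse product matches the
        # dense one while performing fewer multiply-accumulates.
        assert np.allclose(W[:, mask] @ d[mask], W @ d)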

    Deep mobile traffic forecast and complementary base station clustering for C-RAN optimization

    Get PDF
    Rapidly growing data traffic has posed great challenges for mobile operators to increase their data processing capacity, which incurs significant energy consumption and deployment costs. With the emergence of the Cloud Radio Access Network (C-RAN) architecture, data processing units can now be centralized in data centers and shared among base stations. By mapping a cluster of base stations with complementary traffic patterns to one data processing unit, the processing unit can be fully utilized in different periods of time, and the required capacity to be deployed is expected to be smaller than the sum of the capacities of the individual base stations. However, since the traffic patterns of base stations are highly dynamic across time and locations, it is challenging to foresee and characterize them in advance to make optimal clustering schemes. In this paper, we address these issues by proposing a deep-learning-based C-RAN optimization framework. First, we exploit a Multivariate Long Short-Term Memory (MuLSTM) model to learn the temporal dependency and spatial correlation among base station traffic patterns and make accurate traffic forecasts for a future period of time. Afterwards, we build a weighted graph to model the complementarity of base stations according to their traffic patterns, and propose a Distance-Constrained Complementarity-Aware (DCCA) algorithm to find optimal base station clustering schemes with the objectives of optimizing capacity utility and deployment cost. We evaluate the performance of our framework using two months of data from real-world mobile networks in Milan and Trentino, Italy. Results show that our method effectively increases the average capacity utility to 83.4% and 76.7%, and reduces the overall deployment cost to 48.4% and 51.7% of that of the traditional RAN architecture on the two datasets, respectively, consistently outperforming the state-of-the-art baseline methods.
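
    The clustering gain described above is easy to illustrate. The abstract does not define the DCCA edge weight, so the score below is an assumed stand-in (a hypothetical peak-pooling measure, sketched in Python): the less the pooled peak load of two base stations exceeds either individual peak, the more capacity a shared processing unit saves.

        import numpy as np

        def complementarity(traffic_a, traffic_b):
            # Fraction of capacity saved by pooling two stations into one
            # processing unit instead of provisioning each for its own peak.
            separate = traffic_a.max() + traffic_b.max()
            pooled = (traffic_a + traffic_b).max()
            return 1.0 - pooled / separate   # 0 = coincident peaks, ~0.5 = disjoint

        # A daytime business-district peak pairs well with an evening
        # residential peak: the shared unit stays busy around the clock.
        t = np.arange(24)
        business = np.exp(-((t - 13) ** 2) / 8.0)
        residential = np.exp(-((t - 21) ** 2) / 8.0)
        print(complementarity(business, residential))   # ~0.5: highly complementary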

    Advancing safe mobility: A global analysis of research trends in safe route planning

    No full text
    Safe route planning has become an increasingly important area of research in recent years due to growing concerns about pedestrian and traffic safety, rising traffic volumes and densities in urban areas, and advances in smart vehicle and transportation technologies. This study conducted a bibliometric analysis of publications on safe route planning retrieved from the Web of Science database between January 2000 and January 2023 to assess the state of the field. A total of 1546 publications authored by 5423 researchers from 84 countries were analyzed. The findings identified the United States, China, India, South Korea, and Spain as the most productive countries, while the University of North Carolina emerged as the most productive organization. Engineering, computer science, transportation, public health, and automation were the dominant initial research areas, although interest from other domains, such as urban planning and the environment, grew over time. Analysis of publications by year showed a steady rise in output starting from 2008. Notable influential publications and highly cited authors in the field were also identified. Several research themes and terms, such as path planning, safety, walking, and route to school, were highlighted through keyword analysis. This study provides novel insights into the evolving international landscape, topics, and influential contributors in safe route planning research over the past two decades. Limitations in database coverage and analytical techniques necessitate future work to enhance understanding in this critical domain.

    Unlicensed taxis detection service based on large-scale vehicles mobility data

    Full text link

    Complementary Base Station Clustering for Cost-Effective and Energy-Efficient Cloud-RAN

    No full text
    The fast-growing mobile network data traffic poses great challenges for operators to increase the data processing capacity of base stations in an efficient manner. With the emergence of the Cloud Radio Access Network (Cloud-RAN), data processing units can now be centralized in a data center and shared among several base stations. By clustering base stations with complementary traffic patterns to the same data center, the deployment cost and energy consumption can be reduced. In this paper, we propose a two-phase framework to find optimal base station clustering schemes in a city-wide Cloud-RAN. First, we design a traffic profile for each base station and propose an entropy-based metric to characterize the complementarity among base stations. Second, we build a graph model that represents complementarity as link weight, and propose a distance-constrained clustering algorithm to find optimal base station clustering schemes. We evaluate the performance of our framework using two months of real-world mobile network traffic data from Milan, Italy. Results show that our framework reduces deployment cost by 12.88% and energy consumption by 9.45% compared with traditional architectures, outperforming the baseline method.
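
    The entropy-based metric is named but not defined in this abstract, so the following Python sketch is an assumed formulation for illustration only: normalize a candidate cluster's summed hourly load into a distribution and compute its Shannon entropy; a flat combined load (entropy near log2(24)) means the stations fill each other's idle hours, so one pooled processing unit stays well utilized all day.

        import numpy as np

        def cluster_entropy(traffic_profiles):
            # traffic_profiles: iterable of per-station 24-hour load vectors.
            combined = np.sum(np.asarray(traffic_profiles), axis=0)
            p = combined / combined.sum()   # distribution of load over hours
            p = p[p > 0]                    # drop empty hours to avoid log(0)
            return float(-np.sum(p * np.log2(p)))

        # Two anti-phased stations yield a flatter combined load, and hence a
        # higher entropy, than two stations peaking at the same hour.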