68 research outputs found

    RNN-Based Radio Resource Management on Multicore RISC-V Accelerator Architectures

    Radio resource management (RRM) is critical in 5G mobile communications due to its ubiquity on every radio device and its low-latency constraints. Rapidly evolving RRM algorithms with low-latency requirements, combined with dense and massive 5G base station deployment, call for an on-the-edge RRM acceleration system that balances flexibility, efficiency, and cost, making application-specific instruction-set processors (ASIPs) an optimal choice. In this work, we start from a baseline, simple RISC-V core and introduce instruction extensions coupled with software optimizations to maximize the throughput of a selected set of recently proposed RRM algorithms based on multilayer perceptrons (MLPs) and recurrent neural networks (RNNs). Furthermore, we scale from a single-ASIP to a multi-ASIP acceleration system to further improve RRM throughput. For the single-ASIP system, we demonstrate an energy efficiency of 218 GMAC/s/W and a throughput of 566 MMAC/s, corresponding to improvements of 10x and 10.6x, respectively, over a single-core system with a baseline RV32IMC core. For the multi-ASIP system, we analyze how the parallel speedup depends on the input and output feature map (FM) sizes for fully connected and LSTM layers, achieving up to a 10.2x speedup with 16 cores over a single extended RI5CY core for single LSTM layers and a 13.8x speedup for a single fully connected layer. On the full RRM benchmark suite, we achieve average overall speedups of 16.4x, 25.2x, 31.9x, and 38.8x on two, four, eight, and 16 cores, respectively, compared to our single-core RV32IMC baseline implementation.
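    The speedup's dependence on feature-map size follows from how MAC counts scale with layer dimensions. A minimal sketch, using the standard per-layer MAC-count formulas (not taken from the paper itself):

    ```python
    # Sketch: per-inference MAC counts for the two layer types the
    # abstract benchmarks. Formulas are the standard textbook counts.

    def fc_macs(n_in: int, n_out: int) -> int:
        """A fully connected layer performs one MAC per weight."""
        return n_in * n_out

    def lstm_macs(n_in: int, n_hidden: int, timesteps: int = 1) -> int:
        """An LSTM cell has four gates, each a dense map from the
        concatenated input and hidden state to the hidden size."""
        return 4 * (n_in + n_hidden) * n_hidden * timesteps

    def latency_us(macs: int, mmac_per_s: float = 566.0) -> float:
        """Rough latency at the reported 566 MMAC/s throughput."""
        return macs / mmac_per_s  # MACs / (MMAC/s) = microseconds
    ```

    Because the LSTM count grows with both input and hidden sizes, small feature maps leave too little parallel work per core, which is consistent with the speedup saturating below the core count.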

    Evaluation of Edge AI Co-Processing Methods for Space Applications

    The spread of SmallSats in recent years offers several new services and opens the door to new technologies that improve existing ones. However, the communication link to Earth is often a bottleneck for data processing, due to the amount of collected data and the limited bandwidth. One way to face this challenge is edge computing, which discards useless data on board and speeds up transmission; research has therefore moved towards COTS architectures suitable for use in space, often organized in co-processing setups. This thesis considers AI as the application use case and two devices in a controller-accelerator configuration. It proposes to investigate the performance of co-processing methods such as simple parallel, horizontal partitioning, and vertical partitioning for a set of different tasks, taking advantage of different pre-trained models. The actual experiments cover only the simple parallel and horizontal partitioning modes, comparing latency and accuracy against single-processing runs on both devices. Evaluating the results task by task, image classification benefits most from horizontal partitioning, with a clear accuracy improvement, as does semantic segmentation, which shows almost stable accuracy and potentially higher throughput with smaller model input sizes. On the other hand, object detection shows a drop in performance, especially accuracy, which could perhaps be improved with models developed specifically for the chosen hardware. The project clearly shows that co-processing methods are worth investigating and can improve system outcomes for some of the analyzed tasks, making future work on them interesting.
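    A toy latency model can make the two evaluated modes concrete. The exact partitioning semantics in the thesis may differ; here "horizontal partitioning" is read as a two-stage pipeline split of the model between the controller and the accelerator, and all device timings are hypothetical:

    ```python
    # Sketch of a latency model for the two co-processing modes.
    # All timing parameters are hypothetical illustration values.

    def simple_parallel(n_frames, t_controller, t_accelerator):
        """Each device processes whole frames independently; split
        the batch in proportion to device speed so both finish
        together. Returns total batch latency."""
        rate_c, rate_a = 1.0 / t_controller, 1.0 / t_accelerator
        return n_frames / (rate_c + rate_a)

    def horizontal_partition(n_frames, t_head, t_tail, t_transfer):
        """The model is split: the controller runs the head layers,
        the accelerator the tail, with a transfer in between.
        Steady-state throughput is set by the slowest stage."""
        bottleneck = max(t_head, t_tail, t_transfer)
        # pipeline fill for the first frame, then one frame per slot
        return t_head + t_transfer + t_tail + (n_frames - 1) * bottleneck
    ```

    The pipelined form only wins when the split is balanced and the transfer is cheap, which matches the observation that results vary strongly by task and model input size.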

    KAVUAKA: a low-power application-specific processor architecture for digital hearing aids

    The power consumption of digital hearing aids is very restricted due to their small physical size, and the hardware resources available for signal processing are limited. However, there is a demand for more processing performance to make future hearing aids more useful and smarter. Future hearing aids should be able to detect, localize, and recognize target speakers in complex acoustic environments to further improve the speech intelligibility of the individual hearing aid user. Computationally intensive algorithms are required for this task. To maintain acceptable battery life, the hearing aid processing architecture must be highly optimized for extremely low power consumption and high processing performance. The integration of application-specific instruction-set processors (ASIPs) into hearing aids enables a wide range of architectural customizations to meet the stringent power consumption and performance requirements. In this thesis, the application-specific hearing aid processor KAVUAKA is presented, which is customized and optimized for state-of-the-art hearing aid algorithms such as speaker localization, noise reduction, beamforming, and speech recognition. Specialized, application-specific instructions are designed and added to the baseline instruction set architecture (ISA). Among the major contributions are a multiply-accumulate (MAC) unit for real- and complex-valued numbers, architectures for power reduction during register accesses, co-processors, and a low-latency audio interface. With the proposed MAC architecture, the KAVUAKA processor requires 16% fewer cycles for the computation of a 128-point fast Fourier transform (FFT) compared to related programmable digital signal processors. The power consumption during register file accesses is decreased by 6% to 17% with isolation and bypass techniques. The hardware-induced audio latency is 34% lower compared to related audio interfaces for a frame size of 64 samples. The final hearing aid system-on-chip (SoC) with four KAVUAKA processor cores and ten co-processors is integrated as an application-specific integrated circuit (ASIC) in a 40 nm low-power technology. The die size is 3.6 mm². Each processor and co-processor contains individual customizations and hardware features, with datapath widths ranging from 24 bits to 64 bits. The core area of the 64-bit processor configuration is 0.134 mm². The processors are organized in two clusters that share memory, an audio interface, co-processors, and serial interfaces. The average power consumption at a clock speed of 10 MHz is 2.4 mW for the SoC and 0.6 mW for the 64-bit processor. Case studies with four reference hearing aid algorithms are used to present and evaluate the proposed hardware architectures and optimizations. The program code for each processor and co-processor is generated and optimized with evolutionary algorithms for operation merging, instruction scheduling, and register allocation. The KAVUAKA processor architecture is compared to related processor architectures in terms of processing performance, average power consumption, and silicon area requirements.
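    The FFT cycle savings come from fusing the complex multiply-accumulate that dominates each butterfly. A pure-Python reference of that operation, as a sketch of what the hardware fuses (not the processor's actual datapath):

    ```python
    # Sketch: the complex multiply-accumulate at the heart of a
    # radix-2 FFT butterfly, the operation a complex MAC unit can
    # issue as a single instruction instead of several real ops.

    def cmac(acc, a, b):
        """acc += a * b on (re, im) pairs: four real multiplies and
        four real additions per call."""
        ar, ai = a
        br, bi = b
        return (acc[0] + ar * br - ai * bi,
                acc[1] + ar * bi + ai * br)

    def butterfly(x0, x1, w):
        """One radix-2 decimation-in-time butterfly:
        t = w * x1; outputs (x0 + t, x0 - t)."""
        t = cmac((0.0, 0.0), w, x1)
        return ((x0[0] + t[0], x0[1] + t[1]),
                (x0[0] - t[0], x0[1] - t[1]))
    ```

    A 128-point radix-2 FFT performs (128/2) * log2(128) = 448 such butterflies, so shaving cycles per complex MAC compounds quickly.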

    Improving Compute & Data Efficiency of Flexible Architectures


    Automatic Field Monitoring and Detection of Plant Diseases Using IoT

    This research presents a GSM-based system for automatic plant disease diagnosis and describes its use in the creation of ACPS. Traditional farming methods were largely ineffective against microbial diseases. In addition, farmers cannot keep up with the ever-changing nature of infections, so a reliable disease forecasting system is essential. To address this, we employ a Convolutional Neural Network (CNN) model trained to examine the crop images recorded by a health maintenance system. The solar sensor node is in charge of taking pictures, continuous sensing, and smart automation. An agricultural robot, sometimes known as an agribot or agbot, is an autonomous robot with agricultural applications. It helps the farmer improve crop productivity while decreasing the need for manual labour. In the future, such agricultural robots could replace human labour in a variety of farming tasks, including tilling, planting, and harvesting. They will manage pests and diseases as well as perform tasks like weeding. To keep an eye on the crops and streamline irrigation, the system is equipped with plant disease prediction technology and intelligent irrigation controls. Combining the disease prediction and irrigation systems in this project reduces the energy that would be required to provide them separately.
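    The core operation of the CNN used for image-based diagnosis is the 2D convolution. A minimal pure-Python sketch of a single valid-mode convolution, for illustration only (the paper's actual network architecture is not specified here):

    ```python
    # Sketch: one valid-mode 2D convolution, the building block of
    # the CNN classifier. Lists of lists stand in for image tensors.

    def conv2d(image, kernel):
        """Slide the kernel over the image; each output pixel is the
        sum of elementwise products over the covered window."""
        ih, iw = len(image), len(image[0])
        kh, kw = len(kernel), len(kernel[0])
        out = []
        for i in range(ih - kh + 1):
            row = []
            for j in range(iw - kw + 1):
                row.append(sum(image[i + di][j + dj] * kernel[di][dj]
                               for di in range(kh) for dj in range(kw)))
            out.append(row)
        return out
    ```

    A trained network stacks many such filters with nonlinearities and pooling, then classifies the resulting features into disease categories.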

    Particle Swarm Optimization for HW/SW Partitioning


    Detecting Spam Email With Machine Learning Optimized With Bio-Inspired Metaheuristic Algorithms

    Electronic mail has eased communication for many organisations as well as individuals, but the medium is exploited for fraudulent gain by spammers sending unsolicited emails. This article presents a method for detecting spam emails with machine learning algorithms that are optimized with bio-inspired methods. A literature review was carried out to explore efficient methods applied to different datasets that achieve good results. Machine learning models using Naïve Bayes, Support Vector Machine, Random Forest, Decision Tree, and Multi-Layer Perceptron were implemented on seven different email datasets, along with feature extraction and pre-processing. Bio-inspired algorithms such as Particle Swarm Optimization and Genetic Algorithm were implemented to optimize the performance of the classifiers. Multinomial Naïve Bayes with Genetic Algorithm performed the best overall. A comparison of our results with other machine learning and bio-inspired models is also discussed to identify the most suitable model.
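    One common way to pair a Genetic Algorithm with a classifier like Multinomial Naïve Bayes is GA-driven feature selection. A minimal stdlib-only sketch of that general shape; the fitness function below is a toy stand-in, not the article's actual evaluation:

    ```python
    # Sketch: GA feature selection. In the spam setting, fitness(mask)
    # would return the validation accuracy of a classifier trained on
    # the feature subset the bit-mask selects.
    import random

    def ga_select(n_features, fitness, pop=20, gens=30, p_mut=0.05, seed=0):
        """Evolve bit-masks over features; keep the top half each
        generation (elitism), refill with crossover + mutation."""
        rng = random.Random(seed)
        population = [[rng.randint(0, 1) for _ in range(n_features)]
                      for _ in range(pop)]
        for _ in range(gens):
            scored = sorted(population, key=fitness, reverse=True)
            parents = scored[:pop // 2]             # truncation selection
            children = []
            while len(children) < pop - len(parents):
                a, b = rng.sample(parents, 2)
                cut = rng.randrange(1, n_features)  # one-point crossover
                child = a[:cut] + b[cut:]
                children.append([1 - g if rng.random() < p_mut else g
                                 for g in child])   # bit-flip mutation
            population = parents + children
        return max(population, key=fitness)

    # Toy fitness: reward masks matching a hidden "useful" feature set.
    useful = [1, 0, 1, 1, 0, 0, 1, 0]
    best = ga_select(8, lambda m: sum(a == b for a, b in zip(m, useful)))
    ```

    Because the top half of each generation survives unchanged, the best mask found never regresses, which is what makes this wrapper approach practical around an expensive train-and-evaluate fitness call.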

    A Roadmap for AI in Latin America

    If we want to ensure that AI is a positive factor in the development of Latin America in the upcoming years, we need to start acting now and stop doing the same thing over and over again. The recent past and the current context in the region clearly indicate that we are unlikely to see any improvement in the resources and support available to AI; instead, the situation will probably be aggravated by the impact of the COVID-19 pandemic. Consequently, it is our role as researchers to revisit this issue and attempt to propose a roadmap towards a solution. The driving motivation for this paper is to plant the seeds of a discussion on how to create bottom-up, inclusive, positive momentum for AI in the region, given the existing conditions, while at the same time reducing the potential negative impacts it might have. We present this in the form of a roadmap or workflow that identifies the main obstacles to be addressed and how they can be overcome by a combination of focusing the work of AI practitioners on particular research topics and that of decision makers and concerned citizens.