1,499 research outputs found

    Deep generative models for network data synthesis and monitoring

    Measurement and monitoring are fundamental tasks in all networks, enabling the downstream management and optimization of the network. Although networks inherently produce abundant monitoring data, accessing and effectively measuring it is another story. The challenges are manifold. First, network monitoring data is inaccessible to external users, and it is hard to provide a high-fidelity dataset without leaking commercially sensitive information. Second, effective data collection covering a large-scale network system can be very expensive, given the growing size of networks, e.g., the number of cells in a radio network and the number of flows in an Internet Service Provider (ISP) network. Third, it is difficult to ensure fidelity and efficiency simultaneously in network monitoring, as the resources available in a network element to support the measurement function are too limited to implement sophisticated mechanisms. Finally, understanding and explaining the behavior of the network becomes challenging due to its size and complex structure. Various emerging optimization-based solutions (e.g., compressive sensing) and data-driven solutions (e.g., deep learning) have been proposed for the aforementioned challenges. However, the fidelity and efficiency of existing methods cannot yet meet current network requirements. The contributions made in this thesis significantly advance the state of the art in the domain of network measurement and monitoring techniques. Overall, we leverage cutting-edge machine learning technology, deep generative modeling, throughout the entire thesis. First, we design and realize APPSHOT, an efficient city-scale network traffic sharing system based on a conditional generative model, which only requires open-source contextual data during inference (e.g., land use information and population distribution). 
Second, we develop GENDT, an efficient drive testing system based on generative models, which combines graph neural networks, conditional generation, and quantified model uncertainty to enhance the efficiency of mobile drive testing. Third, we design and implement DISTILGAN, a high-fidelity, efficient, versatile, and real-time network telemetry system with latent GANs and spectral-temporal networks. Finally, we propose SPOTLIGHT, an accurate, explainable, and efficient anomaly detection system for the Open RAN (Radio Access Network) system. The lessons learned through this research are summarized, and interesting topics are discussed for future work in this domain. All proposed solutions have been evaluated with real-world datasets and applied to support different applications in real systems.

    Faster inference from state space models via GPU computing

    Funding: C.F.-J. is funded via a doctoral scholarship from the University of St Andrews, School of Mathematics and Statistics. Inexpensive Graphics Processing Units (GPUs) offer the potential to greatly speed up computation by employing their massively parallel architecture to perform arithmetic operations more efficiently. Population dynamics models are important tools in ecology and conservation. Modern Bayesian approaches allow biologically realistic models to be constructed and fitted to multiple data sources in an integrated modelling framework based on a class of statistical models called state space models. However, model fitting is often slow, requiring hours to weeks of computation. We demonstrate the benefits of GPU computing using a model for the population dynamics of British grey seals, fitted with a particle Markov chain Monte Carlo algorithm. Speed-ups of two orders of magnitude were obtained for estimations of the log-likelihood, compared to a traditional ‘CPU-only’ implementation, allowing for an accurate method of inference to be used where this was previously too computationally expensive to be viable. GPU computing has enormous potential, but one barrier to further adoption is a steep learning curve, due to GPUs' unique hardware architecture. We provide a detailed description of hardware and software setup, and our case study provides a template for other similar applications. We also provide a detailed tutorial-style description of GPU hardware architectures, and examples of important GPU-specific programming practices.
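As a rough illustration of the log-likelihood estimation step mentioned in this abstract, the sketch below runs a bootstrap particle filter on a toy linear-Gaussian state space model. The model, the function names (`particle_loglik`, `log_mean_exp`, `normal_logpdf`) and all parameters are assumptions made for illustration; this is not the grey seal model or the authors' GPU implementation. It runs on the CPU, and the per-particle propagate and weight steps are the kind of independent work a GPU could plausibly run in parallel.

```python
import math
import random


def log_mean_exp(log_weights):
    """Numerically stable log of the mean of exp(log_weights)."""
    peak = max(log_weights)
    total = sum(math.exp(lw - peak) for lw in log_weights)
    return peak + math.log(total / len(log_weights))


def normal_logpdf(x, mean, sd):
    """Log density of Normal(mean, sd) evaluated at x."""
    z = (x - mean) / sd
    return -0.5 * z * z - math.log(sd) - 0.5 * math.log(2.0 * math.pi)


def particle_loglik(observations, growth, proc_sd, obs_sd,
                    n_particles=500, seed=0):
    """Bootstrap particle filter estimate of the log-likelihood.

    Toy model (hypothetical, chosen only for illustration):
        x_0 ~ Normal(0, 1)
        x_t = growth * x_{t-1} + Normal(0, proc_sd)
        y_t = x_t + Normal(0, obs_sd)
    """
    rng = random.Random(seed)
    particles = [rng.gauss(0.0, 1.0) for _ in range(n_particles)]
    loglik = 0.0
    for y in observations:
        # Propagate every particle through the state equation.
        particles = [growth * x + rng.gauss(0.0, proc_sd) for x in particles]
        # Weight each particle by how well it explains the observation.
        log_w = [normal_logpdf(y, x, obs_sd) for x in particles]
        # The mean weight estimates p(y_t | y_1..y_{t-1}).
        loglik += log_mean_exp(log_w)
        # Multinomial resampling proportional to the weights.
        peak = max(log_w)
        weights = [math.exp(lw - peak) for lw in log_w]
        particles = rng.choices(particles, weights=weights, k=n_particles)
    return loglik
```

In a particle MCMC scheme, an estimate like this would stand in for the exact likelihood inside a Metropolis-Hastings acceptance ratio, which is why its cost tends to dominate the run time.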

    Multidisciplinary perspectives on Artificial Intelligence and the law

    This open access book presents an interdisciplinary, multi-authored, edited collection of chapters on Artificial Intelligence (‘AI’) and the Law. AI technology has come to play a central role in the modern data economy. Through a combination of increased computing power, the growing availability of data and the advancement of algorithms, AI has now become an umbrella term for some of the most transformational technological breakthroughs of this age. The importance of AI stems from both the opportunities that it offers and the challenges that it entails. While AI applications hold the promise of economic growth and efficiency gains, they also create significant risks and uncertainty. The potential and perils of AI have thus come to dominate modern discussions of technology and ethics – and although AI was initially allowed to largely develop without guidelines or rules, few would deny that the law is set to play a fundamental role in shaping the future of AI. As the debate over AI is far from over, the need for rigorous analysis has never been greater. This book thus brings together contributors from different fields and backgrounds to explore how the law might provide answers to some of the most pressing questions raised by AI. An outcome of the Católica Research Centre for the Future of Law and its interdisciplinary working group on Law and Artificial Intelligence, it includes contributions by leading scholars in the fields of technology, ethics and the law.

    Towards Neuromorphic Gradient Descent: Exact Gradients and Low-Variance Online Estimates for Spiking Neural Networks

    Spiking Neural Networks (SNNs) are biologically-plausible models that can run on low-powered non-Von Neumann neuromorphic hardware, positioning them as promising alternatives to conventional Deep Neural Networks (DNNs) for energy-efficient edge computing and robotics. Over the past few years, the Gradient Descent (GD) and Error Backpropagation (BP) algorithms used in DNNs have inspired various training methods for SNNs. However, the non-local and reverse nature of BP, combined with the inherent non-differentiability of spikes, represents a fundamental obstacle to computing gradients with SNNs directly on neuromorphic hardware. Therefore, novel approaches are required to overcome the limitations of GD and BP and enable online gradient computation on neuromorphic hardware. In this thesis, I address the limitations of GD and BP with SNNs by proposing three algorithms. First, I extend a recent method that computes exact gradients with temporally-coded SNNs by relaxing the firing constraint of temporal coding and allowing multiple spikes per neuron. My proposed method generalizes the computation of exact gradients with SNNs and enhances the tradeoffs between performance and various other aspects of spiking neurons. Next, I introduce a novel alternative to BP that computes low-variance gradient estimates in a local and online manner. Compared to other alternatives to BP, the proposed method demonstrates an improved convergence rate and increased performance with DNNs. Finally, I combine these two methods and propose an algorithm that estimates gradients with SNNs in a manner that is compatible with the constraints of neuromorphic hardware. My empirical results demonstrate the effectiveness of the resulting algorithm in training SNNs without performing BP.

    Enabling Deep Neural Network Inferences on Resource-constraint Devices

    Department of Computer Science and Engineering. While deep neural networks (DNNs) are widely used on various devices, including resource-constrained devices such as IoT, AR/VR, and mobile devices, running DNNs on resource-constrained devices remains challenging. There are three approaches to DNN inference on resource-constrained devices: 1) lightweight DNNs for on-device computing, 2) offloading DNN inference to a cloud server, and 3) split computing to utilize computation and network resources efficiently. Designing a lightweight DNN without compromising accuracy is challenging due to the trade-off between latency and accuracy: more computation is required to achieve higher accuracy. One solution to this challenge is pre-processing that extracts and transfers helpful information so that the DNN achieves high accuracy. We design the pre-processing as three processes. The first is finding the best input source. The second is input processing, which extracts and retains the information important for DNN inference from everything gained from the input source. The last is choosing or designing a suitable lightweight DNN for the processed input. As an instance of how to apply the pre-processing, in Sec 2, we present a new transportation mode recognition system for smartphones called DeepVehicleSense, which aims to achieve three performance objectives at once (high accuracy, low latency, and low power consumption) by exploiting sound characteristics captured from the built-in microphone while riding candidate transportation modes. To achieve high accuracy and low latency, DeepVehicleSense makes use of non-linear filters that can best extract the transportation sound samples. For the recognition of five different transportation modes, we design a deep learning-based sound classifier using a novel deep neural network architecture with multiple branches. 
Our staged inference technique can significantly reduce runtime and energy consumption while maintaining high accuracy for the majority of samples. Offloading DNN inference to a server is another solution for resource-constrained devices, but it raises a concern about the latency caused by data transmission. To reduce transmission latency, recent studies have tried to make offloading more efficient by compressing the data to be offloaded. However, conventional compression techniques are designed for human viewers, so they compress data such that the restored data looks like the original to the human eye. As a result, the compressed data contains redundancy beyond the information necessary for DNN inference. In other words, the most fundamental question, how to extract and offload the minimal amount of necessary information without degrading inference accuracy, has remained unanswered. To answer this question, in Sec 3, we call such ideal offloading semantic offloading and propose N-epitomizer, a new offloading framework that enables semantic offloading, thus achieving more reliable and timely inferences in highly fluctuating or even low-bandwidth wireless networks. To realize N-epitomizer, we design an autoencoder-based scalable encoder trained to extract the most informative data and scale its output size to meet the latency and accuracy requirements of inferences over a network. Even though our proposed lightweight DNN and offloading framework with the essential information extractor achieve low latency while preserving DNN performance, they alone cannot realize latency-guaranteed DNN inferences. 
To realize latency-guaranteed DNN inferences, the computational complexity of the lightweight DNN and the compression performance of the offloading encoder should be selected adaptively according to current computation resources and network conditions, exploiting the DNN's trade-off between computational complexity and DNN performance and the encoder's trade-off between compression performance and DNN performance. To this end, we propose a new framework for latency-guaranteed DNN inferences called LG-DI, which predicts DNN performance degradation for a given latency budget in advance and uses whichever of the two methods, the lightweight DNN or offloading with compression, is better. As a result, our proposed framework for DNN inferences can guarantee latency regardless of changes in computation and network resources while maintaining DNN performance as much as possible.

    Neural Architecture Search for Image Segmentation and Classification

    Deep learning (DL) is a class of machine learning algorithms that relies on deep neural networks (DNNs) for computations. Unlike traditional machine learning algorithms, DL can learn from raw data directly and effectively. Hence, DL has been successfully applied to tackle many real-world problems. When applying DL to a given problem, the primary task is designing the optimum DNN. This task relies heavily on human expertise, is time-consuming, and requires many trial-and-error experiments. This thesis aims to automate the laborious task of designing the optimum DNN by exploring the neural architecture search (NAS) approach. Here, we propose two new NAS algorithms for two real-world problems: pedestrian lane detection for assistive navigation and hyperspectral image segmentation for biosecurity scanning. Additionally, we introduce a new dataset-agnostic predictor of neural network performance, which can be used to speed up NAS algorithms that require the evaluation of candidate DNNs.

    Guided rewriting and constraint satisfaction for parallel GPU code generation

    Graphics Processing Units (GPUs) are notoriously hard to optimise for manually due to their scheduling and memory hierarchies. What is needed are good automatic code generators and optimisers for such parallel hardware. Functional approaches such as Accelerate, Futhark and LIFT leverage a high-level algorithmic Intermediate Representation (IR) to expose parallelism and abstract the implementation details away from the user. However, producing efficient code for a given accelerator remains challenging. Existing code generators depend either on user input to choose a subset of hard-coded optimizations or on automated exploration of the implementation search space. The former suffers from a lack of extensibility, while the latter is too costly due to the size of the search space. A hybrid approach is needed, where a space of valid implementations is built automatically and explored with the aid of human expertise. This thesis presents a solution combining user-guided rewriting and automatically generated constraints to produce high-performance code. The first contribution is an automatic tuning technique to find a balance between performance and memory consumption. Leveraging its functional patterns, the LIFT compiler is empowered to infer tuning constraints and limit the search to valid tuning combinations only. Next, the thesis reframes parallelisation as a constraint satisfaction problem. Parallelisation constraints are extracted automatically from the input expression, and a solver is used to identify valid rewritings. The constraints truncate the search space to valid parallel mappings only by capturing the scheduling restrictions of the GPU in the context of a given program. A synchronisation barrier insertion technique is proposed to prevent data races and improve the efficiency of the generated parallel mappings. 
The final contribution of this thesis is the guided rewriting method, where the user encodes a design space of structural transformations using high-level IR nodes called rewrite points. These strongly typed pragmas express macro rewrites and expose design choices as explorable parameters. The thesis proposes a small set of reusable rewrite points to achieve tiling, cache locality, data reuse and memory optimisation. A comparison with the handwritten kernels of the vendor-provided ARM Compute Library and with the TVM code generator demonstrates the effectiveness of this thesis' contributions. With convolution as a use case, LIFT-generated direct and GEMM-based convolution implementations are shown to perform on par with the state-of-the-art solutions on a mobile GPU. Overall, this thesis demonstrates that a functional IR lends itself well to user-guided and automatic rewriting for high-performance code generation.

    Automated Distinct Bone Segmentation from Computed Tomography Images using Deep Learning

    Large-scale CT scans are frequently performed for forensic and diagnostic purposes, to plan and direct surgical procedures, and to track the development of bone-related diseases. This often involves radiologists who have to annotate bones manually or in a semi-automatic way, which is a time-consuming task. Their annotation workload can be reduced by automated segmentation and detection of individual bones. This automation of distinct bone segmentation not only has the potential to accelerate current workflows but also opens up new possibilities for processing and presenting medical data for planning, navigation, and education. In this thesis, we explored the use of deep learning for automating the segmentation of all individual bones within an upper-body CT scan. To do so, we had to find a network architecture that provides a good trade-off between the problem’s high computational demands and the results’ accuracy. After finding a baseline method and having enlarged the dataset, we set out to eliminate the most prevalent types of error. To do so, we introduced a novel method called binary-prediction-enhanced multi-class (BEM) inference, separating the task into two: distinguishing bone from non-bone is conducted separately from identifying the individual bones. Both predictions are then merged, which leads to superior results. Another type of error is tackled by our developed architecture, the Sneaky-Net, which receives additional inputs with larger fields of view but at a smaller resolution. We can thus sneak more extensive areas of the input into the network while keeping the growth of additional pixels in check. Overall, we present a deep-learning-based method that reliably segments most of the over one hundred distinct bones present in upper-body CT scans in an end-to-end trained manner, quickly enough to be used in interactive software. 
Our algorithm has been included in our group's virtual reality medical image visualisation software SpectoVR, with the plan to use it as one of the puzzle pieces in surgical planning and navigation, as well as in the education of future doctors.
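The abstract states only that the binary and multi-class predictions "are then merged" and does not give the rule. The sketch below shows one plausible merge, purely as an assumption: voxels the binary prediction marks as non-bone become background, and voxels it marks as bone take the most probable individual-bone class. The function name `bem_merge` and the flat-list data layout are hypothetical and chosen for illustration.

```python
def bem_merge(binary_mask, class_probs):
    """Merge a bone/non-bone mask with per-voxel class probabilities.

    binary_mask: list of 0/1 flags, one per voxel (1 = bone).
    class_probs: list of per-voxel probability lists; index 0 is
                 background, indices 1 and up are individual bones.
    Returns one integer label per voxel (0 = background).

    The merge rule is an assumption, not the thesis's published rule.
    """
    if len(binary_mask) != len(class_probs):
        raise ValueError("mask and probabilities must cover the same voxels")
    labels = []
    for is_bone, probs in zip(binary_mask, class_probs):
        if not is_bone:
            # The binary prediction overrides any bone label here.
            labels.append(0)
        else:
            # Pick the most probable bone class, ignoring background.
            bone_probs = probs[1:]
            labels.append(1 + bone_probs.index(max(bone_probs)))
    return labels
```

Under this rule, a voxel flagged as bone never ends up labelled background, even when the multi-class output favours background for it, which is one way the two predictions could correct each other.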