172 research outputs found

    Modern computing: Vision and challenges

    Over the past six decades, the computing systems field has experienced significant transformations, profoundly impacting society with transformational developments, such as the Internet and the commodification of computing. Underpinned by technological advancements, computer systems, far from being static, have been continuously evolving and adapting to cover multifaceted societal niches. This has led to new paradigms such as cloud, fog, edge computing, and the Internet of Things (IoT), which offer fresh economic and creative opportunities. Nevertheless, this rapid change poses complex research challenges, especially in maximizing potential and enhancing functionality. As such, to maintain an economical level of performance that meets ever-tighter requirements, one must understand the drivers of new model emergence and expansion, and how contemporary challenges differ from past ones. To that end, this article investigates and assesses the factors influencing the evolution of computing systems, covering established systems and architectures as well as newer developments, such as serverless computing, quantum computing, and on-device AI on edge devices. Trends emerge when one traces technological trajectory, which includes the rapid obsolescence of frameworks due to business and technical constraints, a move towards specialized systems and models, and varying approaches to centralized and decentralized control. This comprehensive review of modern computing systems looks ahead to the future of research in the field, highlighting key challenges and emerging trends, and underscoring their importance in cost-effectively driving technological progress

    Accelerating SNN Training with Stochastic Parallelizable Spiking Neurons

    Spiking neural networks (SNN) are able to learn spatiotemporal features while using less energy, especially on neuromorphic hardware. The most widely used spiking neuron in deep learning is the Leaky Integrate and Fire (LIF) neuron. LIF neurons operate sequentially, however, since the computation of state at time t relies on the state at time t-1 being computed. This limitation is shared with Recurrent Neural Networks (RNN) and results in slow training on Graphics Processing Units (GPU). In this paper, we propose the Stochastic Parallelizable Spiking Neuron (SPSN) to overcome the sequential training limitation of LIF neurons. By separating the linear integration component from the non-linear spiking function, SPSN can be run in parallel over time. The proposed approach results in performance comparable with the state-of-the-art for feedforward neural networks on the Spiking Heidelberg Digits (SHD) dataset, outperforming LIF networks while training 10 times faster and outperforming non-spiking networks with the same network architecture. For longer input sequences of 10000 time-steps, we show that the proposed approach results in 4000 times faster training, thus demonstrating the potential of the proposed approach to accelerate SNN training for very large datasets
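
The separation described in this abstract can be illustrated with a minimal sketch. It assumes a leaky integrator without reset, so the membrane potential is linear in the inputs, and a sigmoid firing probability. Both are illustrative assumptions, and the function names are hypothetical rather than taken from the paper. Once the recurrence is linear, every time step can be written in closed form and evaluated independently of the others.

```python
import math
import random


def leaky_sequential(inputs, beta):
    """Leaky integration step by step: v[t] = beta * v[t-1] + x[t]."""
    v, trace = 0.0, []
    for x in inputs:
        v = beta * v + x
        trace.append(v)
    return trace


def leaky_parallel(inputs, beta):
    """Same integration in closed form: v[t] = sum_k beta**(t-k) * x[k].

    Each v[t] depends only on the inputs, never on v[t-1], so all time
    steps could be evaluated concurrently (for example as a convolution).
    """
    return [
        sum(beta ** (t - k) * inputs[k] for k in range(t + 1))
        for t in range(len(inputs))
    ]


def stochastic_spikes(potentials, threshold, rng):
    """Assumed spiking rule: fire with probability sigmoid(v - threshold)."""
    return [
        1 if rng.random() < 1.0 / (1.0 + math.exp(-(v - threshold))) else 0
        for v in potentials
    ]


inputs = [0.5, 1.0, 0.0, 2.0, 0.25]
beta = 0.9
seq = leaky_sequential(inputs, beta)
par = leaky_parallel(inputs, beta)
spikes = stochastic_spikes(par, threshold=1.0, rng=random.Random(0))
```

The pure-Python closed form is quadratic in sequence length and only demonstrates the independence of time steps. A practical implementation would presumably rely on a parallel convolution or scan on the GPU.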

    Runtime Construction of Large-Scale Spiking Neuronal Network Models on GPU Devices

    Simulation speed matters for neuroscientific research: this includes not only how quickly the simulated model time of a large-scale spiking neuronal network progresses but also how long it takes to instantiate the network model in computer memory. On the hardware side, acceleration via highly parallel GPUs is being increasingly utilized. On the software side, code generation approaches ensure highly optimized code at the expense of repeated code regeneration and recompilation after modifications to the network model. Aiming for a greater flexibility with respect to iterative model changes, here we propose a new method for creating network connections interactively, dynamically, and directly in GPU memory through a set of commonly used high-level connection rules. We validate the simulation performance with both consumer and data center GPUs on two neuroscientifically relevant models: a cortical microcircuit of about 77,000 leaky-integrate-and-fire neuron models and 300 million static synapses, and a two-population network recurrently connected using a variety of connection rules. With our proposed ad hoc network instantiation, both network construction and simulation times are comparable or shorter than those obtained with other state-of-the-art simulation technologies while still meeting the flexibility demands of explorative network modeling
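
To make "high-level connection rules" concrete, here is a small CPU-side sketch of three rules that are common in spiking network simulators: one-to-one, all-to-all, and fixed in-degree. The function names and the list-of-pairs representation are hypothetical. The abstract does not say which rules the method supports or how they are laid out in GPU memory.

```python
import random


def one_to_one(sources, targets):
    """Connect the i-th source to the i-th target."""
    if len(sources) != len(targets):
        raise ValueError("one_to_one needs populations of equal size")
    return list(zip(sources, targets))


def all_to_all(sources, targets):
    """Connect every source to every target."""
    return [(s, t) for t in targets for s in sources]


def fixed_indegree(sources, targets, indegree, rng):
    """Each target draws `indegree` sources at random, with replacement."""
    return [(rng.choice(sources), t) for t in targets for _ in range(indegree)]


excitatory = list(range(0, 8))   # hypothetical neuron ids
inhibitory = list(range(8, 10))
rng = random.Random(42)

pairs = one_to_one(excitatory[:2], inhibitory)
dense = all_to_all(inhibitory, excitatory)
sparse = fixed_indegree(excitatory, inhibitory, indegree=3, rng=rng)
```

Because each rule is a plain function of two populations and a few parameters, connections can be generated on demand at run time instead of being compiled into the model, which is the flexibility the abstract argues for.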

    Close to the metal: Towards a material political economy of the epistemology of computation

    This paper investigates the role of the materiality of computation in two domains: blockchain technologies and artificial intelligence (AI). Although historically designed as parallel computing accelerators for image rendering and videogames, graphics processing units (GPUs) have been instrumental in the explosion of both cryptoasset mining and machine learning models. The political economy associated with video games and Bitcoin and Ethereum mining provided a staggering growth in performance and energy efficiency and this, in turn, fostered a change in the epistemological understanding of AI: from rules-based or symbolic AI towards the matrix multiplications underpinning connectionism, machine learning and neural nets. Combining a material political economy of markets with a material epistemology of science, the article shows that there is no clear-cut division between software and hardware, between instructions and tools, and between frameworks of thought and the material and economic conditions of possibility of thought itself. As the microchip shortage and the growing geopolitical relevance of the hardware and semiconductor supply chain come to the fore, the paper invites social scientists to engage more closely with the materialities and hardware architectures of ‘virtual’ algorithms and software

    FLAME GPU 2: a framework for flexible and performant agent based simulation on GPUs

    Agent-based modelling (ABM) offers a powerful abstraction for scientific study in a broad range of domains. The use of agent-based simulators encourages good software engineering design such as separation of concerns, that is, the uncoupling of the model description from its implementation detail. A major limitation of current approaches to ABM simulation is the trade-off between simulator flexibility and performance. Highly optimised simulations, such as those which target graphics processing unit (GPU) hardware, are commonly implemented as standalone software. This work presents a software framework (FLAME GPU 2) which balances flexibility with performance for general-purpose ABM. Methods for ensuring high computational efficiency are demonstrated: minimising data movement, and ensuring high device utilisation by exploiting opportunities for concurrent code execution within a model and through the use of ensembles of simulations. A novel hierarchical sub-modelling approach is also presented which can be used to model certain types of recursive behaviours. This feature is shown to be essential in providing a mechanism to resolve competition for resources between agents within a parallel environment, which would otherwise introduce race conditions. To understand the performance characteristics of the software, a benchmark model with millions of agents is used to explore the use of simulation ensembles and to parametrically investigate concurrent code execution within a model. Speedups of 3.5 and 10 times, respectively, are demonstrated over a baseline GPU implementation. The hierarchical sub-modelling approach is used to implement a recursive algorithm that resolves competition in agent movement, which arises when agents try to simultaneously occupy discrete areas high in a ‘resource’. The algorithm is used to implement a classical socio-economics model, Sugarscape, with populations of up to 16M agents
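
The kind of competition this abstract describes can be sketched as repeated bidding rounds: every unplaced agent bids for its best remaining cell, each contested cell accepts one bidder, and the losers try again. The sketch below is a sequential stand-in that does not use the FLAME GPU 2 API. The function name, the preference lists, and the lowest-id tie-break are all assumptions made for illustration.

```python
def resolve_moves(preferences):
    """Assign each cell to at most one agent through repeated bidding rounds.

    preferences maps an agent id to its cells in descending order of desire.
    Returns a dict of agent id -> cell; agents that win nothing are omitted.
    """
    owner = {}                                  # cell -> winning agent
    next_choice = {a: 0 for a in preferences}   # index of next cell to try
    pending = set(preferences)

    while pending:
        bids = {}
        for agent in pending:
            cells = preferences[agent]
            i = next_choice[agent]
            # Skip cells that were claimed in an earlier round.
            while i < len(cells) and cells[i] in owner:
                i += 1
            next_choice[agent] = i
            if i < len(cells):
                bids.setdefault(cells[i], []).append(agent)
        if not bids:
            break
        # Each contested cell picks one winner (assumed rule: lowest id).
        for cell, bidders in bids.items():
            owner[cell] = min(bidders)
        winners = set(owner.values())
        pending = {
            a for a in pending
            if a not in winners and next_choice[a] < len(preferences[a])
        }
    return {agent: cell for cell, agent in owner.items()}


moves = resolve_moves({0: ["A", "B"], 1: ["A", "B"], 2: ["A"]})
```

In this example all three agents first bid for cell A. Agent 0 wins it, agent 1 falls back to B in the next round, and agent 2 has no alternative left and stays where it is. Within a round the bids are independent of each other, which is what makes this pattern suitable for parallel execution without races.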

    MOCAST 2021

    The 10th International Conference on Modern Circuit and System Technologies on Electronics and Communications (MOCAST 2021) will take place in Thessaloniki, Greece, from July 5th to July 7th, 2021. The MOCAST technical program includes all aspects of circuit and system technologies, from modeling to design, verification, implementation, and application. This Special Issue presents extended versions of top-ranking papers in the conference. The topics of MOCAST include: Analog/RF and mixed signal circuits; Digital circuits and systems design; Nonlinear circuits and systems; Device and circuit modeling; High-performance embedded systems; Systems and applications; Sensors and systems; Machine learning and AI applications; Communication; Network systems; Power management; Imagers, MEMS, medical, and displays; Radiation front ends (nuclear and space application); Education in circuits, systems, and communications

    Event-Driven Technologies for Reactive Motion Planning: Neuromorphic Stereo Vision and Robot Path Planning and Their Application on Parallel Hardware

    Robotics is increasingly becoming a key factor in technological progress. Despite impressive advances in recent decades, mammalian brains still outperform even the most capable machines in vision and motion planning. Industrial robots are very fast and precise, but their planning algorithms are not powerful enough for highly dynamic environments such as those required for human-robot collaboration (HRC). Without fast and adaptive motion planning, safe HRC cannot be guaranteed. Neuromorphic technologies, including visual sensors and hardware chips, operate asynchronously and thus process spatio-temporal information very efficiently. Event-based visual sensors in particular already outperform conventional, synchronous cameras in many applications. Event-based methods therefore have great potential to enable faster and more energy-efficient algorithms for motion control in HRC. This thesis presents an approach to flexible, reactive motion control of a robot arm. Exteroception is achieved through event-based stereo vision, and path planning is implemented in a neural representation of the configuration space. The multi-view 3D reconstruction is evaluated through a qualitative analysis in simulation and transferred to a stereo system of event-based cameras. A demonstrator with an industrial robot is used to evaluate the reactive, collision-free online planning. It is also used for a comparative study with sample-based planners. This is complemented by a benchmark of parallel hardware solutions, with path planning in robotics chosen as the test scenario. The results show that the proposed neural solutions are an effective way to realise robot control for dynamic scenarios. 
This work lays a foundation for neural solutions in adaptive manufacturing processes, including in collaboration with humans, without sacrificing speed or safety. It thus paves the way for integrating brain-inspired hardware and algorithms into industrial robotics and HRC

    Advances in Artificial Intelligence: Models, Optimization, and Machine Learning

    The present book contains all the articles accepted and published in the Special Issue “Advances in Artificial Intelligence: Models, Optimization, and Machine Learning” of the MDPI Mathematics journal, which covers a wide range of topics connected to the theory and applications of artificial intelligence and its subfields. These topics include, among others, deep learning and classic machine learning algorithms, neural modelling, architectures and learning algorithms, biologically inspired optimization algorithms, algorithms for autonomous driving, probabilistic models and Bayesian reasoning, intelligent agents and multiagent systems. We hope that the scientific results presented in this book will serve as valuable sources of documentation and inspiration for anyone willing to pursue research in artificial intelligence, machine learning and their widespread applications

    Convolutional Neural Network in Pattern Recognition

    Since the convolutional neural network (CNN) was first implemented by Yann LeCun et al. in 1989, CNNs and their variants have been widely applied to numerous topics in pattern recognition and are considered among the most crucial techniques in artificial intelligence and computer vision. This dissertation not only demonstrates the implementation aspects of CNNs but also emphasizes the methodology of neural network (NN) based classifiers. The general pipeline of an NN-based classifier can be divided into three stages: pre-processing, inference by models, and post-processing. To demonstrate the importance of pre-processing techniques, this dissertation presents how to model actual problems in medical pattern recognition and image processing by introducing conceptual abstraction and fuzzification. In particular, a transformer based on the self-attention mechanism, namely the beat-rhythm transformer, greatly benefits from correct R-peak detection results and conceptual fuzzification. The recently proposed self-attention mechanism has proven to be the top performer in computer vision and natural language processing. Despite the accuracy and precision it achieves, self-attention usually consumes huge computational resources. Therefore, a realtime global attention network is proposed to make a better trade-off between efficiency and performance for the task of image segmentation. To illustrate the inference stage further, we also propose models to detect polyps via Faster R-CNN, one of the most popular CNN-based 2D detectors, as well as a CNN-powered 3D object detection pipeline for regressing 3D bounding boxes from LiDAR points and stereo image pairs. The goal of the post-processing stage is to refine artifacts inferred by models. 
For the semantic segmentation task, the dilated continuous random field is proposed as a better fit for CNN-based models than the widely implemented fully-connected continuous random field. The proposed approaches can be further integrated into a reinforcement learning architecture for robotics