Modern computing: Vision and challenges
Over the past six decades, the field of computing systems has experienced profound transformations, reshaping society through developments such as the Internet and the commodification of computing. Underpinned by technological advancements, computer systems, far from being static, have continuously evolved and adapted to fill multifaceted societal niches. This has led to new paradigms such as cloud, fog, and edge computing and the Internet of Things (IoT), which offer fresh economic and creative opportunities. Nevertheless, this rapid change poses complex research challenges, especially in maximizing potential and enhancing functionality. To maintain an economical level of performance that meets ever-tighter requirements, one must understand what drives the emergence and expansion of new models, and how contemporary challenges differ from past ones. To that end, this article investigates and assesses the factors influencing the evolution of computing systems, covering established systems and architectures as well as newer developments such as serverless computing, quantum computing, and on-device AI on edge devices. Tracing this technological trajectory reveals several trends: the rapid obsolescence of frameworks due to business and technical constraints, a move towards specialized systems and models, and varying approaches to centralized and decentralized control. This comprehensive review of modern computing systems looks ahead to the future of research in the field, highlighting key challenges and emerging trends, and underscoring their importance in cost-effectively driving technological progress.
FireFly: A High-Throughput and Reconfigurable Hardware Accelerator for Spiking Neural Networks
Spiking neural networks (SNNs) have been widely used due to their strong biological interpretability and high energy efficiency. With the introduction of the backpropagation algorithm and surrogate gradients, the structure of spiking neural networks has become more complex, and the performance gap with artificial neural networks has gradually narrowed. However, most SNN hardware implementations for field-programmable gate arrays (FPGAs) cannot meet arithmetic or memory efficiency requirements, which significantly restricts the development of SNNs. They either do not delve into the arithmetic operations between the binary spikes and synaptic weights, or they assume unlimited on-chip RAM resources by using overly expensive devices on small tasks. To improve arithmetic efficiency, we analyze the neural dynamics of spiking neurons, generalize the SNN arithmetic operation to the multiplex-accumulate operation, and propose a high-performance implementation of this operation by utilizing the DSP48E2 hard block in Xilinx UltraScale FPGAs. To improve memory efficiency, we design a memory system that enables efficient synaptic weight and membrane voltage memory access with reasonable on-chip RAM consumption. Combining these two improvements, we propose an FPGA accelerator that can process spikes generated by the firing neurons on-the-fly (FireFly). FireFly is implemented on several FPGA edge devices with limited resources but still guarantees a peak performance of 5.53 TSOP/s at 300 MHz. As a lightweight accelerator, FireFly achieves the highest computational density efficiency compared with existing research using large FPGA devices.
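The arithmetic idea is compact enough to sketch: because spikes are binary, the multiply in a multiply-accumulate step reduces to a select, so a weight is accumulated only when its input neuron fires. The minimal Python sketch below illustrates that multiplex-accumulate pattern; it is an illustration of the concept only, not the paper's DSP48E2 mapping, and the sizes and random values are assumed.

```python
import numpy as np

def multiplex_accumulate(spikes, weights):
    # Binary spikes turn weight * spike into a mux: accumulate a weight
    # only where the presynaptic neuron fired.
    return weights[spikes.astype(bool)].sum()

# Toy usage: one output neuron receiving 8 binary input spikes.
rng = np.random.default_rng(0)
spikes = rng.integers(0, 2, size=8)      # incoming binary spikes (0/1)
weights = rng.standard_normal(8)         # synaptic weights
print(multiplex_accumulate(spikes, weights))
```

In hardware, this select-then-add pattern needs no multiplier array, which is presumably what makes a mapping onto wide hard-block adders such as the DSP48E2 attractive.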
Natural and Technological Hazards in Urban Areas
Natural hazard events and technological accidents are separate causes of environmental impacts. Natural hazards are physical phenomena that have been active throughout geological time, whereas technological hazards result from actions or facilities created by humans. In our time, combined natural and man-made hazards have also emerged. Overpopulation and urban development in areas prone to natural hazards increase the impact of natural disasters worldwide. Additionally, urban areas are frequently characterized by intense industrial activity and by rapid, poorly planned growth that threatens the environment and degrades the quality of life. Proper urban planning is therefore crucial to minimize fatalities and reduce the environmental and economic impacts that accompany both natural and technological hazardous events.
Further Improvements in Decoding Performance for 5G LDPC Codes Based on Modified Check-Node Unit
One of the most important units of Low-Density Parity-Check (LDPC) decoders is the Check-Node Unit. Its main task is to find the first two minimum values among the incoming variable-to-check messages and return the check-to-variable messages. This block significantly affects the decoding performance as well as the hardware implementation complexity. In this paper, we first propose a modification to the check-node update rule that introduces two optimal offset factors applied to the check-to-variable messages. We then present the Check-Node Unit hardware architecture that performs the proposed algorithm. The main objective of this work is to further improve the decoding performance of 5th-Generation (5G) LDPC codes. Simulation results show that the proposed algorithm achieves substantial improvements in error-correction performance. More precisely, no error floor appears down to a Bit-Error-Rate (BER) of 10^(-8), while the decoding gain increases by up to 0.21 dB compared with the baseline Normalized Min-Sum decoder as well as several state-of-the-art Min-Sum-based LDPC decoders.
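As background for the modification, a conventional two-minimum min-sum check-node update can be sketched as follows. The `offset1` and `offset2` parameters are hypothetical stand-ins for the paper's two offset factors, since the exact update rule is not reproduced in this abstract.

```python
def check_node_update(v2c, offset1=0.0, offset2=0.0):
    """Two-minimum min-sum check-node update (sketch).

    v2c: incoming variable-to-check messages (floats).
    offset1/offset2: hypothetical offsets applied to the outgoing
    magnitudes, standing in for the paper's two offset factors.
    """
    mags = [abs(m) for m in v2c]
    sign_prod = 1
    for m in v2c:
        sign_prod *= -1 if m < 0 else 1

    # Find the first two minima among the incoming magnitudes.
    min1_idx = min(range(len(mags)), key=mags.__getitem__)
    min1 = mags[min1_idx]
    min2 = min(mags[i] for i in range(len(mags)) if i != min1_idx)

    c2v = []
    for i, m in enumerate(v2c):
        own_sign = -1 if m < 0 else 1
        mag = min2 if i == min1_idx else min1      # exclude own edge
        off = offset2 if i == min1_idx else offset1
        # Offset min-sum: shrink the magnitude, clamping at zero.
        c2v.append(sign_prod * own_sign * max(mag - off, 0.0))
    return c2v

# Toy usage with four incoming messages.
print(check_node_update([0.9, -0.4, 1.3, -2.0], offset1=0.1, offset2=0.15))
```

The hardware cost of such a unit is dominated by the two-minimum search, which is why the check-node update rule and its implementation are typically co-designed.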
SCV-GNN: Sparse Compressed Vector-based Graph Neural Network Aggregation
Graph neural networks (GNNs) have emerged as a powerful tool for processing graph-based data in fields like communication networks, molecular interactions, chemistry, social networks, and neuroscience. GNNs are characterized by the ultra-sparse nature of their adjacency matrices, which necessitates dedicated hardware beyond general-purpose sparse matrix multipliers. While there has been extensive research on designing dedicated hardware accelerators for GNNs, few works have explored in depth the impact of the sparse storage format on the efficiency of GNN accelerators. This paper proposes SCV-GNN, with a novel sparse compressed vectors (SCV) format optimized for the aggregation operation. We use Z-Morton ordering to derive a data-locality-based computation ordering and partitioning scheme. The paper also presents how the proposed SCV-GNN scales on a vector processing system. Experimental results over various datasets show that the proposed method achieves a geometric mean speedup of … and … over compressed sparse column (CSC) and compressed sparse row (CSR) aggregation operations, respectively. The proposed method also reduces the memory traffic by a factor of … and … over CSC and CSR, respectively. Thus, the proposed novel aggregation format reduces the latency and memory access for GNN inference.
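For readers unfamiliar with the baselines, the aggregation step is essentially a sparse-dense product A @ X over the adjacency matrix, and Z-Morton ordering sorts the nonzeros along a space-filling curve for locality. Below is a minimal sketch, assuming SciPy's stock CSR/CSC formats (the SCV layout itself is custom and not reproduced here); the sizes and density are assumed.

```python
import numpy as np
from scipy.sparse import random as sparse_random

def morton_key(row, col, bits=16):
    # Interleave row/column bits into a Z-Morton key; sorting nonzeros by
    # this key yields a Z-order traversal with better data locality.
    key = 0
    for b in range(bits):
        key |= ((row >> b) & 1) << (2 * b + 1)
        key |= ((col >> b) & 1) << (2 * b)
    return key

# GNN aggregation is essentially A @ X over an ultra-sparse adjacency A.
n, d = 1000, 64
A = sparse_random(n, n, density=0.001, format="csr", random_state=0)
X = np.random.default_rng(0).random((n, d))

out_csr = A @ X            # row-major gather over nonzeros
out_csc = A.tocsc() @ X    # column-major scatter over nonzeros
assert np.allclose(out_csr, out_csc)

# A Z-ordered schedule over the same nonzeros:
coo = A.tocoo()
order = np.argsort([morton_key(r, c) for r, c in zip(coo.row, coo.col)])
# Processing nonzeros in `order` improves reuse of X rows and output rows.
```

The point of a custom format like SCV is to bake such a locality-aware ordering and partitioning directly into the storage layout instead of recomputing it at run time.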
Exploring the effects of robotic design on learning and neural control
The ongoing deep learning revolution has allowed computers to outclass humans in various games and to perceive features imperceptible to humans during classification tasks. Current machine learning techniques have clearly distinguished themselves in specialized tasks. However, we have yet to see robots capable of performing multiple tasks at an expert level. Most work in this field focuses on developing more sophisticated learning algorithms for a robot's controller, given a largely static and presupposed robotic design. By focusing on the development of robotic bodies rather than neural controllers, I have discovered that robots can be designed such that they overcome many of the pitfalls currently encountered by neural controllers in multitask settings. Through this discovery, I also present novel metrics that explicitly measure the learning ability of a robotic design and its resistance to common problems such as catastrophic interference.
Traditionally, physical robot design requires human engineers to plan every aspect of the system, which is expensive and often relies on human intuition. In contrast, in the field of evolutionary robotics, evolutionary algorithms are used to automatically create optimized designs; however, such designs are often still limited in their ability to perform in a multitask setting. The metrics created and presented here give a novel path to automated design that allows evolved robots to synergize with their controllers, improving the computational efficiency of their learning while overcoming catastrophic interference.
Overall, this dissertation points toward the ability to automatically design robots that are more general-purpose than current robots and that can perform various tasks while requiring less computation.