Attention-free Spikformer: Mixing Spike Sequences with Simple Linear Transforms
By integrating the self-attention capability with the biological properties of
Spiking Neural Networks (SNNs), Spikformer applies the flourishing Transformer
architecture to SNN design. It introduces a Spiking Self-Attention (SSA)
module to mix sparse visual features using spike-form Query, Key, and Value,
achieving state-of-the-art (SOTA) performance on numerous datasets
compared to previous SNN-like frameworks. In this paper, we demonstrate that
the Spikformer architecture can be accelerated by replacing the SSA with an
unparameterized Linear Transform (LT) such as Fourier and Wavelet transforms.
These transforms are utilized to mix spike sequences, reducing the quadratic
time complexity to log-linear time complexity. They alternate between the
frequency and time domains to extract sparse visual features, showcasing
powerful performance and efficiency. We conduct extensive experiments on image
classification using both neuromorphic and static datasets. The results
indicate that compared to the SOTA Spikformer with SSA, Spikformer with LT
achieves higher Top-1 accuracy on neuromorphic datasets (i.e., CIFAR10-DVS and
DVS128 Gesture) and comparable Top-1 accuracy on static datasets (i.e.,
CIFAR-10 and CIFAR-100). Furthermore, Spikformer with LT achieves approximately
29-51% improvement in training speed, 61-70% improvement in inference speed,
and reduces memory usage by 4-26%, as it requires no learnable parameters.
Comment: Under Review
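To illustrate the core idea, here is a minimal sketch of parameter-free Fourier
token mixing (in the style of FNet) applied to a spike tensor; the function
name, shapes, and the use of a plain 2D FFT are illustrative assumptions, not
the authors' implementation:

import torch

def fourier_mix(spikes: torch.Tensor) -> torch.Tensor:
    """Mix a spike sequence with a parameter-free 2D FFT (FNet-style).

    spikes: (batch, tokens, channels) binary spike tensor.
    Returns the real part of the Fourier transform over the token and
    channel dimensions, mixing information across tokens in O(N log N)
    instead of the O(N^2) cost of self-attention.
    """
    return torch.fft.fft2(spikes.float(), dim=(-2, -1)).real

# Toy usage: 4 samples, 16 tokens, 64 channels of {0, 1} spikes.
x = (torch.rand(4, 16, 64) > 0.8).float()
mixed = fourier_mix(x)
print(mixed.shape)  # torch.Size([4, 16, 64])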
ODE-based Recurrent Model-free Reinforcement Learning for POMDPs
Neural ordinary differential equations (ODEs) are widely recognized as a
standard approach for modeling physical mechanisms, and they help perform
approximate inference in unknown physical or biological environments. In
partially observable (PO) environments, agents must infer unseen information
from raw observations. By using a recurrent policy with a compact
context, context-based reinforcement learning provides a flexible way to
extract unobservable information from historical transitions. To help the agent
extract more dynamics-related information, we present a novel ODE-based
recurrent model combined with a model-free reinforcement learning (RL) framework
to solve partially observable Markov decision processes (POMDPs). We
experimentally demonstrate the efficacy of our methods across various PO
continuous control and meta-RL tasks. Furthermore, our experiments illustrate
that our method is robust against irregular observations, owing to the ability
of ODEs to model irregularly sampled time series.
Comment: Accepted by NeurIPS 202
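As a rough illustration of how an ODE-based recurrent model can consume
irregularly sampled observations, the following sketch evolves a hidden state
with a learned ODE (fixed-step Euler) between observations and applies a GRU
update at each observation; the module names, sizes, and solver choice are
assumptions for illustration, not the paper's architecture:

import torch
import torch.nn as nn

class ODERecurrentCell(nn.Module):
    """Toy ODE-RNN cell: evolve the hidden state with a learned ODE between
    observations, then update it with a GRU cell at each (possibly
    irregularly spaced) observation."""

    def __init__(self, obs_dim: int, hidden_dim: int, euler_steps: int = 4):
        super().__init__()
        self.ode_func = nn.Sequential(
            nn.Linear(hidden_dim, hidden_dim), nn.Tanh(),
            nn.Linear(hidden_dim, hidden_dim),
        )
        self.gru = nn.GRUCell(obs_dim, hidden_dim)
        self.euler_steps = euler_steps

    def forward(self, obs, dts, h=None):
        # obs: (T, batch, obs_dim); dts: (T,) time gaps since the previous step.
        if h is None:
            h = obs.new_zeros(obs.size(1), self.gru.hidden_size)
        hiddens = []
        for o, dt in zip(obs, dts):
            step = dt / self.euler_steps
            for _ in range(self.euler_steps):   # fixed-step Euler solve between observations
                h = h + step * self.ode_func(h)
            h = self.gru(o, h)                  # discrete update at the observation
            hiddens.append(h)
        return torch.stack(hiddens), h

cell = ODERecurrentCell(obs_dim=8, hidden_dim=32)
obs = torch.randn(10, 4, 8)      # 10 observations, batch of 4
dts = torch.rand(10) * 0.5       # irregular time gaps between observations
context, _ = cell(obs, dts)
print(context.shape)             # torch.Size([10, 4, 32])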
Population-coding and Dynamic-neurons improved Spiking Actor Network for Reinforcement Learning
With Deep Neural Networks (DNNs) as powerful function approximators,
Deep Reinforcement Learning (DRL) has demonstrated excellent performance on
robotic control tasks. Compared to DNNs with vanilla artificial neurons, the
biologically plausible Spiking Neural Network (SNN) contains a diverse
population of spiking neurons, making it naturally powerful at representing
states with spatial and temporal information. Based on a hybrid
learning framework, where a spike actor-network infers actions from states and
a deep critic network evaluates the actor, we propose a Population-coding and
Dynamic-neurons improved Spiking Actor Network (PDSAN) for efficient state
representation from two different scales: input coding and neuronal coding. For
input coding, we apply population coding with dynamic receptive fields to
directly encode each input state component. For neuronal coding, we propose
different types of dynamic-neurons (containing 1st-order and 2nd-order neuronal
dynamics) to describe much more complex neuronal dynamics. Finally, the PDSAN
is trained in conjunction with deep critic networks using the Twin Delayed Deep
Deterministic policy gradient algorithm (TD3-PDSAN). Extensive experimental
results show that our TD3-PDSAN model achieves better performance than
state-of-the-art models on four OpenAI gym benchmark tasks. This is an important
step toward improving RL with SNNs to achieve effective computation that
satisfies biological plausibility.
Comment: 27 pages, 11 figures, accepted by Journal of Neural Networks
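The sketch below illustrates generic population coding with Gaussian receptive
fields for a single state component; the number of neurons, field centers, and
width are illustrative defaults, and PDSAN itself additionally learns dynamic
receptive fields and feeds the result into dynamic spiking neurons:

import numpy as np

def population_encode(x, num_neurons=10, x_min=-1.0, x_max=1.0, sigma=0.15):
    """Encode a scalar state component as firing probabilities of a neuron
    population with Gaussian receptive fields tiling [x_min, x_max]."""
    centers = np.linspace(x_min, x_max, num_neurons)
    return np.exp(-0.5 * ((x - centers) / sigma) ** 2)

# A state component of 0.3 excites the neurons with nearby centers most strongly.
rates = population_encode(0.3)
spikes = (np.random.rand(rates.size) < rates).astype(np.float32)  # stochastic spikes
print(np.round(rates, 2), spikes)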
Tuning Synaptic Connections instead of Weights by Genetic Algorithm in Spiking Policy Network
Learning from interaction is the primary way biological agents come to know
about their environment and themselves. Modern deep reinforcement learning (DRL)
explores a computational approach to learning from interaction and has
significantly progressed in solving various tasks. However, powerful as it is,
DRL still falls far short of biological agents in energy efficiency. Although the underlying
mechanisms are not fully understood, we believe that the integration of spiking
communication between neurons and biologically-plausible synaptic plasticity
plays a prominent role. Following this biological intuition, we optimize a
spiking policy network (SPN) by a genetic algorithm as an energy-efficient
alternative to DRL. Our SPN mimics the sensorimotor neuron pathway of insects
and communicates through event-based spikes. Inspired by biological research
showing that the brain forms memories by creating new synaptic connections and
rewiring them based on new experiences, we tune the synaptic connections
instead of the weights in the SPN to solve the given tasks. Experimental results
on several robotic control tasks show that our method can achieve the
performance level of mainstream DRL methods while exhibiting significantly
higher energy efficiency.
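A minimal sketch of the general recipe (evolving binary connection masks of a
small policy network with a genetic algorithm) is given below; the layer sizes,
crossover/mutation scheme, and the user-supplied env_rollout fitness callable
are hypothetical and not the paper's exact setup:

import numpy as np

rng = np.random.default_rng(0)
IN, HID, OUT, POP = 8, 16, 2, 32   # toy layer sizes and population size

def random_genome():
    # A genome is a pair of binary connection masks; the weights stay fixed.
    return [rng.integers(0, 2, (IN, HID)), rng.integers(0, 2, (HID, OUT))]

def evolve(env_rollout, generations=50, elite=4, mutate_p=0.02):
    """env_rollout(genome) is assumed to return the episode return of a
    spiking policy whose layers are gated by the genome's connection masks."""
    pop = [random_genome() for _ in range(POP)]
    for _ in range(generations):
        scores = [env_rollout(g) for g in pop]
        order = np.argsort(scores)[::-1]
        parents = [pop[i] for i in order[:elite]]            # keep the elite
        children = []
        while len(children) < POP - elite:
            a, b = rng.choice(elite, 2, replace=False)
            child = [np.where(rng.random(x.shape) < 0.5, x, y)      # uniform crossover
                     for x, y in zip(parents[a], parents[b])]
            child = [np.abs(m - (rng.random(m.shape) < mutate_p))   # flip-bit mutation
                     for m in child]
            children.append(child)
        pop = parents + children
    scores = [env_rollout(g) for g in pop]
    return pop[int(np.argmax(scores))]                        # best connection masks found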
Recent Advances and New Frontiers in Spiking Neural Networks
In recent years, spiking neural networks (SNNs) have received extensive
attention in brain-inspired intelligence due to their rich spatial-temporal
dynamics, various encoding methods, and event-driven characteristics that
naturally fit neuromorphic hardware. With the development of SNNs,
brain-inspired intelligence, an emerging research field inspired by brain
science achievements and aiming at artificial general intelligence, is
attracting growing attention. This paper reviews recent advances and discusses new frontiers in SNNs
from five major research topics, including essential elements (i.e., spiking
neuron models, encoding methods, and topology structures), neuromorphic
datasets, optimization algorithms, software, and hardware frameworks. We hope
our survey can help researchers understand SNNs better and inspire new works to
advance this field.
Comment: Accepted at IJCAI202
TSAM: A Two-Stream Attention Model for Causal Emotion Entailment
Causal Emotion Entailment (CEE) aims to discover the potential causes behind
an emotion in a conversational utterance. Previous works formalize CEE as
independent utterance pair classification problems, with emotion and speaker
information neglected. From a new perspective, this paper considers CEE in a
joint framework. We classify multiple utterances synchronously to capture the
correlations between utterances in a global view and propose a Two-Stream
Attention Model (TSAM) to effectively model the speaker's emotional influences
in the conversational history. Specifically, the TSAM comprises three modules:
Emotion Attention Network (EAN), Speaker Attention Network (SAN), and
interaction module. The EAN and SAN incorporate emotion and speaker information
in parallel, and the subsequent interaction module effectively interchanges
relevant information between the EAN and SAN via a mutual BiAffine
transformation. Extensive experimental results demonstrate that our model
achieves new state-of-the-art (SOTA) performance and outperforms baselines
remarkably.
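The following sketch shows one plausible form of a mutual BiAffine interaction
between two utterance-level streams; the bilinear attention formulation and
residual updates are assumptions for illustration rather than the exact TSAM
module:

import torch
import torch.nn as nn

class MutualBiAffine(nn.Module):
    """Toy mutual BiAffine interaction: each stream attends to the other
    through a bilinear map, so emotion-aware and speaker-aware utterance
    representations can exchange relevant information."""

    def __init__(self, dim: int):
        super().__init__()
        self.W_es = nn.Parameter(torch.randn(dim, dim) * 0.02)  # emotion -> speaker map
        self.W_se = nn.Parameter(torch.randn(dim, dim) * 0.02)  # speaker -> emotion map

    def forward(self, emo, spk):
        # emo, spk: (batch, num_utterances, dim) from the EAN and SAN streams.
        attn_es = torch.softmax(emo @ self.W_es @ spk.transpose(1, 2), dim=-1)
        attn_se = torch.softmax(spk @ self.W_se @ emo.transpose(1, 2), dim=-1)
        emo_out = emo + attn_es @ spk   # emotion stream enriched with speaker info
        spk_out = spk + attn_se @ emo   # speaker stream enriched with emotion info
        return emo_out, spk_out

emo, spk = torch.randn(2, 12, 64), torch.randn(2, 12, 64)
e, s = MutualBiAffine(64)(emo, spk)
print(e.shape, s.shape)  # torch.Size([2, 12, 64]) torch.Size([2, 12, 64])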
Continual Named Entity Recognition without Catastrophic Forgetting
Continual Named Entity Recognition (CNER) is a burgeoning area that
involves updating an existing model by incorporating new entity types
sequentially. Nevertheless, continual learning approaches are often severely
afflicted by catastrophic forgetting. This issue is intensified in CNER due to
the consolidation of old entity types from previous steps into the non-entity
type at each step, leading to what is known as the semantic shift problem of
the non-entity type. In this paper, we introduce a pooled feature distillation
loss that skillfully navigates the trade-off between retaining knowledge of old
entity types and acquiring new ones, thereby more effectively mitigating the
problem of catastrophic forgetting. Additionally, we develop a confidence-based
pseudo-labeling strategy for the non-entity type, i.e., predicting entity types
using the old model to handle the semantic shift of the non-entity type.
Following the pseudo-labeling process, we suggest an adaptive re-weighting
type-balanced learning strategy to handle the issue of biased type
distribution. We carried out comprehensive experiments on ten CNER settings
using three different datasets. The results illustrate that our method
significantly outperforms prior state-of-the-art approaches, registering
average improvements in both Micro and Macro F1 scores.
Comment: Accepted by EMNLP2023 main conference as a long paper
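As a rough sketch of the two ingredients described above, the snippet below
shows a pooled feature distillation loss (matching pooled token features of the
new and old models) and confidence-based pseudo-labeling of non-entity tokens;
the pooling dimensions, MSE objective, and confidence threshold are illustrative
choices, not the paper's exact formulation:

import torch
import torch.nn.functional as F

def pooled_feature_distillation(new_feats, old_feats):
    """Distill pooled token features from the frozen old model into the new
    model instead of matching every token feature exactly, which softens the
    constraint and leaves room to learn new entity types.

    new_feats, old_feats: (batch, seq_len, hidden) token features.
    """
    loss = 0.0
    for dim in (1, 2):  # pool over the sequence and the hidden dimension
        loss = loss + F.mse_loss(new_feats.mean(dim), old_feats.mean(dim))
    return loss

def confidence_pseudo_labels(old_logits, labels, non_entity=0, tau=0.7):
    """Relabel current non-entity tokens with the old model's prediction when
    it is confident, mitigating the semantic shift of the non-entity type."""
    probs = old_logits.softmax(-1)
    conf, pred = probs.max(-1)
    keep = (labels == non_entity) & (pred != non_entity) & (conf > tau)
    return torch.where(keep, pred, labels)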
Federated Incremental Semantic Segmentation
Federated learning-based semantic segmentation (FSS) has drawn widespread
attention via decentralized training on local clients. However, most FSS models
assume categories are fixed in advance and thus suffer heavy forgetting of
old categories in practical applications where local clients receive new
categories incrementally while having no memory storage to access old classes.
Moreover, new clients collecting novel classes may join in the global training
of FSS, which further exacerbates catastrophic forgetting. To surmount the
above challenges, we propose a Forgetting-Balanced Learning (FBL) model to
address heterogeneous forgetting on old classes from both intra-client and
inter-client aspects. Specifically, under the guidance of pseudo labels
generated via adaptive class-balanced pseudo labeling, we develop a
forgetting-balanced semantic compensation loss and a forgetting-balanced
relation consistency loss to rectify intra-client heterogeneous forgetting of
old categories with background shift. It performs balanced gradient propagation
and relation consistency distillation within local clients. Moreover, to tackle
heterogeneous forgetting from inter-client aspect, we propose a task transition
monitor. It can identify new classes under privacy protection and store the
latest old global model for relation distillation. Qualitative experiments
reveal large improvements of our model over comparison methods. The code is
available at https://github.com/JiahuaDong/FISS.
Comment: Accepted to CVPR202
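The sketch below illustrates one way adaptive class-balanced pseudo labeling
could look on a local client: the old global model's predictions become pseudo
labels, with per-class confidence thresholds lowered for rarer (more easily
forgotten) classes; the threshold formula and ignore index are assumptions for
illustration, not the FBL model's actual rule:

import torch

def class_balanced_pseudo_labels(global_logits, ignore_index=255, base_tau=0.9):
    """Toy adaptive class-balanced pseudo labeling.

    global_logits: (batch, num_classes, H, W) logits of the old global model.
    Returns per-pixel pseudo labels, with low-confidence pixels ignored.
    """
    probs = global_logits.softmax(1)
    conf, pred = probs.max(1)                              # (batch, H, W)
    num_classes = global_logits.size(1)
    freq = torch.bincount(pred.flatten(), minlength=num_classes).float()
    freq = freq / freq.sum().clamp(min=1)
    # Rarer predicted classes get a lower confidence threshold.
    tau = base_tau * (0.5 + 0.5 * freq / freq.max().clamp(min=1e-8))
    pseudo = torch.where(conf > tau[pred], pred, torch.full_like(pred, ignore_index))
    return pseudo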
Task Relation Distillation and Prototypical Pseudo Label for Incremental Named Entity Recognition
Incremental Named Entity Recognition (INER) involves the sequential learning
of new entity types without accessing the training data of previously learned
types. However, INER faces the challenge of catastrophic forgetting specific
for incremental learning, further aggravated by background shift (i.e., old and
future entity types are labeled as the non-entity type in the current task). To
address these challenges, we propose a method called task Relation Distillation
and Prototypical pseudo label (RDP) for INER. Specifically, to tackle
catastrophic forgetting, we introduce a task relation distillation scheme that
serves two purposes: 1) ensuring inter-task semantic consistency across
different incremental learning tasks by minimizing inter-task relation
distillation loss, and 2) enhancing the model's prediction confidence by
minimizing intra-task self-entropy loss. Simultaneously, to mitigate background
shift, we develop a prototypical pseudo label strategy that distinguishes old
entity types from the current non-entity type using the old model. This
strategy generates high-quality pseudo labels by measuring the distances
between token embeddings and type-wise prototypes. We conducted extensive
experiments on ten INER settings of three benchmark datasets (i.e., CoNLL2003,
I2B2, and OntoNotes5). The results demonstrate that our method achieves
significant improvements over the previous state-of-the-art methods, with an
average increase of 6.08% in Micro F1 score and 7.71% in Macro F1 score.
Comment: Accepted by CIKM2023 as a long paper with an oral presentation
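To make the prototypical pseudo label idea concrete, the sketch below builds
one prototype per old entity type from old-model token embeddings and relabels
non-entity tokens by nearest-prototype distance; the distance margin and
mean-pooled prototypes are illustrative assumptions rather than the exact RDP
procedure:

import torch

def prototypical_pseudo_labels(token_embs, old_labels, non_entity=0, margin=1.0):
    """Toy prototypical pseudo labeling for INER.

    token_embs: (num_tokens, hidden) token embeddings from the old model.
    old_labels: (num_tokens,) old-model predictions (non_entity = 'O' type).
    """
    types = [t for t in old_labels.unique().tolist() if t != non_entity]
    if not types:
        return old_labels.clone()
    # One prototype per old entity type: the mean embedding of its tokens.
    protos = torch.stack([token_embs[old_labels == t].mean(0) for t in types])
    dists = torch.cdist(token_embs, protos)        # (num_tokens, num_old_types)
    min_dist, idx = dists.min(-1)
    nearest = torch.tensor(types, device=old_labels.device)[idx]
    labels = old_labels.clone()
    # Relabel non-entity tokens whose nearest prototype is close enough.
    relabel = (old_labels == non_entity) & (min_dist < margin)
    labels[relabel] = nearest[relabel]
    return labels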
Complex Dynamic Neurons Improved Spiking Transformer Network for Efficient Automatic Speech Recognition
The spiking neural network (SNN) using leaky integrate-and-fire (LIF) neurons has been commonly used in automatic speech recognition (ASR) tasks. However, the LIF neuron is still relatively simple compared to neurons in the biological brain. Further research on more types of neurons with different scales of neuronal dynamics is necessary. Here we introduce four types of neuronal dynamics to post-process the sequential patterns generated by the spiking transformer, obtaining the complex dynamic neuron improved spiking transformer neural network (DyTr-SNN). We found that the DyTr-SNN handles the non-toy automatic speech recognition task well, achieving a lower phoneme error rate, lower computational cost, and higher robustness. These results indicate that further cooperation between SNNs and neural dynamics at the neuron and network scales might have much in store for the future, especially for ASR tasks.
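As an example of richer neuronal dynamics than a plain LIF neuron, the sketch
below implements a toy second-order adaptive neuron (leaky membrane potential
plus a slower spike-triggered adaptation variable) applied to a feature
sequence; the time constants and reset rule are illustrative and not one of the
paper's four specific neuron types:

import torch

def adaptive_lif(inputs, tau_v=0.9, tau_a=0.95, beta=0.2, threshold=1.0):
    """Toy second-order neuronal dynamics for post-processing a sequence.

    inputs: (time_steps, num_neurons) input currents.
    Returns the (time_steps, num_neurons) binary spike train.
    """
    v = torch.zeros(inputs.size(1))      # membrane potential (fast state)
    a = torch.zeros(inputs.size(1))      # adaptation current (slow state)
    spikes = []
    for x in inputs:
        v = tau_v * v + x - beta * a     # 1st state: leaky integration of input
        s = (v >= threshold).float()     # fire when the potential crosses threshold
        a = tau_a * a + s                # 2nd state: spike-triggered adaptation
        v = v * (1.0 - s)                # hard reset after a spike
        spikes.append(s)
    return torch.stack(spikes)

out = adaptive_lif(torch.rand(50, 8))
print(out.shape, out.mean().item())      # spike train shape and mean firing rate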