Recurrent Neural Network Training with Dark Knowledge Transfer
Recurrent neural networks (RNNs), particularly long short-term memory (LSTM),
have gained much attention in automatic speech recognition (ASR). Although some
success stories have been reported, training RNNs remains highly
challenging, especially with limited training data. Recent research found that
a well-trained model can be used as a teacher to train other child models, by
using the predictions generated by the teacher model as supervision. This
knowledge transfer learning has been employed to train simple neural nets with
a complex one, so that the final performance can reach a level that is
infeasible to obtain by regular training. In this paper, we employ the
knowledge transfer learning approach to train RNNs (specifically, LSTMs) using a
deep neural network (DNN) model as the teacher. This is different from most of
the existing research on knowledge transfer learning, since the teacher (DNN)
is assumed to be weaker than the child (RNN); however, our experiments on an
ASR task showed that it works fairly well: without applying any tricks on the
learning scheme, this approach can train RNNs successfully even with limited
training data. Comment: ICASSP 201
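The teacher-student supervision described in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the temperature parameter, and the use of a temperature-softened cross-entropy are assumptions borrowed from the general knowledge-distillation literature, shown here only to make the idea of "teacher predictions as supervision" concrete.

```python
import math


def softmax(logits, temperature=1.0):
    """Convert logits to probabilities, softened by a temperature (assumed parameter)."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def soft_target_loss(student_logits, teacher_logits, temperature=2.0):
    """Cross-entropy of the student's distribution against the teacher's soft targets.

    Hypothetical helper: the teacher's softened predictions replace the
    one-hot labels that ordinary supervised training would use.
    """
    teacher_probs = softmax(teacher_logits, temperature)
    student_probs = softmax(student_logits, temperature)
    return -sum(p * math.log(q) for p, q in zip(teacher_probs, student_probs))


# A student that agrees with the teacher incurs a lower loss than one that disagrees.
teacher = [2.0, 1.0, 0.0]
agreeing_loss = soft_target_loss([2.0, 1.0, 0.0], teacher)
disagreeing_loss = soft_target_loss([0.0, 1.0, 2.0], teacher)
```

In the setting of the abstract, the teacher logits would come from the DNN and the student logits from the LSTM; minimising this loss over the training frames pulls the student's output distribution towards the teacher's.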
Modelling trust evolution within small business lending relationships
Trust is a key dimension in the principal-agent relationship and has been studied extensively. However, its dynamics, evolution, intrinsic motivation, and mechanisms have received less attention. This paper investigates the intrinsic motivation of trust and proposes a theoretical model of trust evolution based on the notions of ‘trust response’ and ‘trust spiral’. We then focus specifically on trust within the lending relationship between banks and small businesses, and we run numerical simulations to further illustrate the evolution of the mutual trust involved over time. Our model provides implications for future research in both trust evolution and small business lending relationships.
Efficient and Interpretable Compressive Text Summarisation with Unsupervised Dual-Agent Reinforcement Learning
Compressive text summarisation offers a balance between the
conciseness issue of extractive summarisation and the factual hallucination
issue of abstractive summarisation. However, most existing compressive
summarisation methods are supervised, relying on the expensive effort of
creating a new training dataset with corresponding compressive summaries. In
this paper, we propose an efficient and interpretable compressive summarisation
method that utilises unsupervised dual-agent reinforcement learning to optimise
a summary's semantic coverage and fluency by simulating human judgment on
summarisation quality. Our model consists of an extractor agent and a
compressor agent, and both agents have a multi-head attentional pointer-based
structure. The extractor agent first chooses salient sentences from a document,
and then the compressor agent compresses these extracted sentences by selecting
salient words to form a summary without using reference summaries to compute
the summary reward. To the best of our knowledge, this is the first work on
unsupervised compressive summarisation. Experimental results on three widely
used datasets (Newsroom, CNN/DM, and XSum) show that our model achieves
promising performance and a significant improvement on Newsroom in terms of the
ROUGE metric, as well as interpretability of semantic coverage of summarisation
results. Comment: The 4th Workshop on Simple and Efficient Natural Language Processing
(SustaiNLP 2023), co-located with ACL 202
RGB-D-based Stair Detection using Deep Learning for Autonomous Stair Climbing
Stairs are common building structures in urban environments, and stair
detection is an important part of environment perception for autonomous mobile
robots. Most existing algorithms have difficulty combining the visual
information from binocular sensors effectively and ensuring reliable detection
at night and in the case of extremely fuzzy visual clues. To solve these
problems, we propose a neural network architecture with RGB and depth map
inputs. Specifically, we design a selective module, which can make the network
learn the complementary relationship between the RGB map and the depth map and
effectively combine the information from the RGB map and the depth map in
different scenes. In addition, we design a line clustering algorithm for the
postprocessing of detection results, which can make full use of the detection
results to obtain the geometric stair parameters. Experiments on our dataset
show that our method achieves better accuracy and recall than existing
state-of-the-art deep learning methods, with improvements of 5.64% and 7.97%,
respectively, and that it also has extremely fast detection speed. A
lightweight version can achieve 300+ frames per second at the same
resolution, which meets the needs of most real-time detection scenes.
Stable Score Distillation for High-Quality 3D Generation
Although Score Distillation Sampling (SDS) has exhibited remarkable
performance in conditional 3D content generation, a comprehensive understanding
of its formulation is still lacking, hindering the development of 3D
generation. In this work, we decompose SDS as a combination of three functional
components, namely mode-seeking, mode-disengaging and variance-reducing terms,
analyzing the properties of each. We show that problems such as over-smoothness
and implausibility result from the intrinsic deficiency of the first two terms
and propose a more advanced variance-reducing term than that introduced by SDS.
Based on the analysis, we propose a simple yet effective approach named Stable
Score Distillation (SSD) which strategically orchestrates each term for
high-quality 3D generation and can be readily incorporated into various 3D
generation frameworks and 3D representations. Extensive experiments validate
the efficacy of our approach, demonstrating its ability to generate
high-fidelity 3D content without succumbing to issues such as over-smoothness.
Approaching quantum anomalous Hall effect in proximity-coupled YIG/graphene/h-BN sandwich structure
Quantum anomalous Hall state is expected to emerge in Dirac electron systems
such as graphene under both sufficiently strong exchange and spin-orbit
interactions. In pristine graphene, neither interaction exists; however, both
interactions can be acquired by coupling graphene to a magnetic insulator (MI)
as revealed by the anomalous Hall effect. Here, we show enhanced magnetic
proximity coupling by sandwiching graphene between a ferrimagnetic insulator
yttrium iron garnet (YIG) and hexagonal-boron nitride (h-BN) which also serves
as a top gate dielectric. By sweeping the top-gate voltage, we observe Fermi
level-dependent anomalous Hall conductance. As the Dirac point is approached
from both electron and hole sides, the anomalous Hall conductance reaches 1/4
of the quantum anomalous Hall conductance 2e²/h. The exchange coupling strength
is determined to be as high as 27 meV from the transition temperature of the
induced magnetic phase. YIG/graphene/h-BN is an excellent heterostructure for
demonstrating proximity-induced interactions in two-dimensional electron
systems.
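For a sense of scale, the conductance values quoted in the abstract above can be worked out numerically. The sketch below is illustrative only: the function names are hypothetical, and the constants are the exact SI values of the elementary charge and the Planck constant. It evaluates 2e²/h and the quarter of it that the abstract reports near the Dirac point.

```python
# Exact SI values (2019 redefinition of the SI units).
E_CHARGE = 1.602176634e-19  # elementary charge, in coulombs
PLANCK_H = 6.62607015e-34   # Planck constant, in joule-seconds


def quantum_anomalous_hall_conductance():
    """Return 2e^2/h in siemens, the quantised value named in the abstract."""
    return 2 * E_CHARGE**2 / PLANCK_H


def observed_conductance(fraction=0.25):
    """Return the given fraction of 2e^2/h; 1/4 is the fraction the abstract reports."""
    return fraction * quantum_anomalous_hall_conductance()


full = quantum_anomalous_hall_conductance()  # about 7.75e-5 S
quarter = observed_conductance()             # about 1.94e-5 S
```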