Graph Neural Networks for Contextual ASR with the Tree-Constrained Pointer Generator
The incorporation of biasing words obtained through contextual knowledge is
of paramount importance in automatic speech recognition (ASR) applications.
This paper proposes an innovative method for achieving end-to-end contextual
ASR using graph neural network (GNN) encodings based on the tree-constrained
pointer generator method. GNN node encodings facilitate lookahead for future
word pieces in the process of ASR decoding at each tree node by incorporating
information about all word pieces on the tree branches rooted from it. This
results in a more precise prediction of the generation probability of the
biasing words. The study explores three GNN encoding techniques, namely tree
recursive neural networks, graph convolutional network (GCN), and GraphSAGE,
along with different combinations of the complementary GCN and GraphSAGE
structures. The performance of the systems was evaluated using the Librispeech
and AMI corpora, following the visual-grounded contextual ASR pipeline. The
findings indicate that using GNN encodings achieved consistent and significant
reductions in word error rate (WER), particularly for words that are rare or
have not been seen during the training process. Notably, the most effective
combination of GNN encodings obtained more than 60% WER reduction for rare and
unseen words compared to standard end-to-end systems.
Comment: Submitted to IEEE/ACM Transactions on Audio, Speech, and Language Processing
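The lookahead idea in the abstract above, where each tree node's encoding carries information about all word pieces on the branches rooted at it, can be illustrated with a toy bottom-up recursion over a word-piece prefix tree. This is only a sketch under stated assumptions: `Node`, `build_prefix_tree`, `toy_embedding` and `encode` are hypothetical names, the embedding is a fixed stand-in for learned parameters, and the mean-pool-plus-tanh combination is not the paper's tree-RNN, GCN or GraphSAGE formulation.

```python
import math


class Node:
    """One node of a word-piece prefix tree (hypothetical helper)."""

    def __init__(self, piece):
        self.piece = piece
        self.children = {}
        self.encoding = None


def build_prefix_tree(words):
    """Build a prefix tree from biasing words given as lists of word pieces."""
    root = Node("<root>")
    for pieces in words:
        node = root
        for piece in pieces:
            if piece not in node.children:
                node.children[piece] = Node(piece)
            node = node.children[piece]
    return root


def toy_embedding(piece, dim=4):
    """Deterministic stand-in for a learned word-piece embedding (assumption)."""
    total = sum(ord(ch) for ch in piece)
    return [math.sin(total * (i + 1)) for i in range(dim)]


def encode(node, dim=4):
    """Encode a node from its own piece and the encodings of its children.

    Children are encoded first, so every node's vector depends on all
    word pieces in the subtree rooted at it.
    """
    own = toy_embedding(node.piece, dim)
    child_encodings = [encode(child, dim) for child in node.children.values()]
    if child_encodings:
        pooled = [
            sum(vec[i] for vec in child_encodings) / len(child_encodings)
            for i in range(dim)
        ]
    else:
        pooled = [0.0] * dim
    node.encoding = [math.tanh(o + p) for o, p in zip(own, pooled)]
    return node.encoding
```

With this recursion, the node for a shared prefix piece receives a different encoding depending on which continuations hang below it, which is the property the abstract attributes to GNN node encodings.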
Urban Building Energy and Climate (UrBEC) simulation: Example application and field evaluation in Sai Ying Pun, Hong Kong
The energy performance of a building in a dense city depends to some extent on its surroundings. The impact of the built form, together with anthropogenic heat gains from traffic and building HVAC exhaust, determines external environmental conditions at the Urban Canopy Layer. Existing building energy models are limited in accounting for micro-scale variations of the urban microclimate, which may significantly modify a building's energy performance in dense cities. This paper presents the Urban Building Energy and Climate (UrBEC) model, a coupled urban microclimate model (UMM) and building energy model (HTB2) developed to assess the time-varying energy performance of a cluster of buildings and the combined heat gains to the external space from direct and reflected solar radiation, traffic and the exhaust from HVAC systems in a high-density city. The simulation results were evaluated by comparison with field measurement data collected from the Sai Ying Pun neighbourhood in Hong Kong, on a summer and a winter day. Predicted and measured air and surface temperatures at the four locations were found to be in reasonable agreement. Simulation results indicate an average temperature rise of 1-3 °C in street canyons compared with the ambient air in summer. Street-level air is predicted to be 0.6 °C warmer than air at higher levels (20 m+). Anthropogenic heat from traffic and building HVAC exhaust is the dominant contributor to temperature rise in street canyons in summer, exceeding the contribution from urban surfaces. The predicted building cooling demand is expected to increase by up to 15% in summer due to the warming effect in street canyons. The UrBEC model runs significantly faster than current CFD-based approaches. Therefore, the model has the potential to support early-stage design and planning decisions in a dense city.
No-reference image quality assessment based on the AdaBoost BP neural network in the wavelet domain
Considering the relatively poor robustness of quality scores across different types of distortion and the lack of a mechanism for determining distortion types, a no-reference image quality assessment (NR-IQA) method based on the AdaBoost BP Neural Network in the Wavelet domain (WABNN) is proposed. A 36-dimensional image feature vector is constructed by extracting natural scene statistics (NSS) features and local information entropy features from the wavelet sub-band coefficients of the distorted image at three scales. The ABNN classifier is obtained by learning the relationship between image features and distortion types. The ABNN scorer is obtained by learning the relationship between image features and image quality scores. A series of comparative experiments is carried out on the LIVE and TID2013 databases. Experimental results show high accuracy in distinguishing distortion types, high consistency with subjective scores and high robustness to distorted images. The results also show that the method is independent of the database and has relatively high computational efficiency.
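As a rough illustration of what a local information entropy feature might look like, the sketch below computes the Shannon entropy of histogram-binned coefficients over non-overlapping blocks of a 2-D array. Everything here is an assumption made for illustration: the function names `block_entropy` and `local_entropy_map`, the 16-bin histogram and the block tiling are hypothetical, and the abstract does not specify how the method's entropy features are actually computed.

```python
import math
from collections import Counter


def block_entropy(values, num_bins=16):
    """Shannon entropy (in bits) of values after uniform histogram binning.

    The bin count is an illustrative assumption, not taken from the paper.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        return 0.0  # a constant block carries no information
    width = (hi - lo) / num_bins
    # Clamp the top edge so the maximum value falls in the last bin.
    bins = Counter(min(int((v - lo) / width), num_bins - 1) for v in values)
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in bins.values())


def local_entropy_map(coeffs, block=2, num_bins=16):
    """Entropy of each non-overlapping block x block tile of a 2-D list."""
    rows, cols = len(coeffs), len(coeffs[0])
    entropies = []
    for r in range(0, rows - block + 1, block):
        for c in range(0, cols - block + 1, block):
            tile = [coeffs[r + i][c + j] for i in range(block) for j in range(block)]
            entropies.append(block_entropy(tile, num_bins))
    return entropies
```

In a pipeline of the kind the abstract describes, summary statistics of such per-block entropies for each wavelet sub-band could be concatenated with NSS features to form the feature vector, though the exact construction is not given here.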
Collaborative interactions of heterogenous ribonucleoproteins contribute to transcriptional regulation of sterol metabolism in mice.
Heterogeneous nuclear ribonucleoproteins (hnRNPs) are a group of functionally versatile proteins that play critical roles in the biogenesis, cellular localization and transport of RNA. Here, we outline a role for hnRNPs in gene regulatory circuits controlling sterol homeostasis. Specifically, we find that tissue-selective loss of the conserved hnRNP RALY enriches for metabolic pathways. Liver-specific deletion of RALY alters hepatic lipid content and serum cholesterol level. In vivo interrogation of chromatin architecture and the genome-wide RALY-binding pattern reveals insights into its cooperative interactions and mode of action in regulating cholesterogenesis. Interestingly, we find that RALY binds the promoter region of the master metabolic regulator Srebp2 and show that it directly interacts with the coactivator Nuclear Transcription Factor Y (NFY) to influence cholesterogenic gene expression. Our work offers insights into mechanisms orchestrating selective promoter activation in metabolic control and a model by which hnRNPs can impact health and disease states.
Chaotic exploration and learning of locomotion behaviours
We present a general and fully dynamic neural system, which exploits intrinsic chaotic dynamics, for the real-time goal-directed exploration and learning of the possible locomotion patterns of an articulated robot of an arbitrary morphology in an unknown environment. The controller is modeled as a network of neural oscillators that are initially coupled only through physical embodiment, and goal-directed exploration of coordinated motor patterns is achieved by chaotic search using adaptive bifurcation. The phase space of the indirectly coupled neural-body-environment system contains multiple transient or permanent self-organized dynamics, each of which is a candidate for a locomotion behavior. The adaptive bifurcation enables the system orbit to wander through various phase-coordinated states, using its intrinsic chaotic dynamics as a driving force, and to stabilize onto one of the states matching the given goal criteria. In order to improve the sustainability of useful transient patterns, sensory homeostasis has been introduced, which results in an increased diversity of motor outputs, thus achieving multiscale exploration. A rhythmic pattern discovered by this process is memorized and sustained by changing the wiring between initially disconnected oscillators using an adaptive synchronization method. Our results show that the novel neurorobotic system is able to create and learn multiple locomotion behaviors for a wide range of body configurations and physical environments and can readapt in real time after sustaining damage.
A note on extinction times for the general birth, death and catastrophe process
Abstract: We consider a birth, death and catastrophe process where the transition rates are allowed to depend on the population size. We obtain an explicit expression for the expected time to extinction, which is valid in all cases where extinction occurs with probability 1. Keywords: Population processes; Hitting times; Catastrophes; Zeta distribution. AMS 2000 Subject Classification: Primary 60J27; Secondary 60J35. The model under consideration is a continuous-time Markov chain $(X(t),\ t \ge 0)$ taking values in $S = \{0, 1, \dots\}$, where $X(t)$ represents the number in a population at time $t$. When there are $i$ individuals present the population size changes at rate $f_i$ ($> 0$), and when a change occurs it is a birth with probability $a$ ($> 0$) or a catastrophe of size $k$ (the removal of $k$ individuals) with probability $d_k$ ($k \ge 1$). (Simple death events are catastrophes of size 1.) We assume that $d_k > 0$ for at least one $k \ge 1$ and $a + \sum_{k \ge 1} d_k = 1$. Thus, the process has transition rates $Q$ given by $q_{ij}$ …
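The chain described above (state-dependent event rate f_i, birth probability a, catastrophe-size probabilities d_k) can be explored numerically with a small Monte Carlo sketch. The function names and parameters below are hypothetical, and one modelling choice is an assumption because the excerpt is cut off before the transition rates are stated: a catastrophe larger than the current population is taken to send the process to 0. The sketch estimates the expected extinction time by simulation; it does not reproduce the paper's explicit expression.

```python
import random


def simulate_extinction_time(x0, rate, birth_prob, catastrophe_probs, rng,
                             max_events=1_000_000):
    """Simulate one path from x0 and return the time at which it first hits 0.

    rate(i)            -- event rate f_i in state i (must be > 0 for i >= 1)
    birth_prob         -- probability a that an event is a birth
    catastrophe_probs  -- dict mapping size k to d_k; a + sum(d_k) should be 1
    """
    sizes = list(catastrophe_probs)
    weights = [catastrophe_probs[k] for k in sizes]
    x, t, events = x0, 0.0, 0
    while x > 0:
        if events >= max_events:
            raise RuntimeError("no extinction within max_events")
        events += 1
        t += rng.expovariate(rate(x))  # exponential holding time in state x
        if rng.random() < birth_prob:
            x += 1
        else:
            k = rng.choices(sizes, weights=weights)[0]
            # Assumption: an oversized catastrophe wipes out the population.
            x = max(x - k, 0)
    return t


def estimate_mean_extinction_time(x0, rate, birth_prob, catastrophe_probs,
                                  num_paths=1000, seed=0):
    """Average the extinction time over independent simulated paths."""
    rng = random.Random(seed)
    total = sum(
        simulate_extinction_time(x0, rate, birth_prob, catastrophe_probs, rng)
        for _ in range(num_paths)
    )
    return total / num_paths
```

For example, `estimate_mean_extinction_time(5, lambda i: float(i), 0.3, {1: 0.5, 2: 0.2})` uses a linear rate and a downward mean jump, so simulated paths should reach 0; for parameter choices where extinction is not certain, the `max_events` guard stops a path that never hits 0.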