779 research outputs found

    Graph Neural Networks for Contextual ASR with the Tree-Constrained Pointer Generator

    The incorporation of biasing words obtained through contextual knowledge is of paramount importance in automatic speech recognition (ASR) applications. This paper proposes an innovative method for achieving end-to-end contextual ASR using graph neural network (GNN) encodings based on the tree-constrained pointer generator method. GNN node encodings facilitate lookahead for future word pieces in the process of ASR decoding at each tree node by incorporating information about all word pieces on the tree branches rooted at that node. This results in a more precise prediction of the generation probability of the biasing words. The study explores three GNN encoding techniques, namely tree recursive neural networks, graph convolutional network (GCN), and GraphSAGE, along with different combinations of the complementary GCN and GraphSAGE structures. The performance of the systems was evaluated using the LibriSpeech and AMI corpora, following the visual-grounded contextual ASR pipeline. The findings indicate that using GNN encodings achieved consistent and significant reductions in word error rate (WER), particularly for words that are rare or have not been seen during the training process. Notably, the most effective combination of GNN encodings obtained more than 60% WER reduction for rare and unseen words compared to standard end-to-end systems. Comment: Submitted to IEEE/ACM Transactions on Audio, Speech, and Language Processing
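    The lookahead idea in this abstract can be illustrated with a toy sketch. It is not the paper's model: the helper names `build_prefix_tree` and `gcn_layer` are hypothetical, the example word pieces are made up, and the layer uses simple row-normalised mean aggregation rather than whatever normalisation the authors chose. It only shows how one graph-convolution step lets each prefix-tree node absorb information from its children, so that stacking several steps would expose word pieces further down the branch.

```python
import numpy as np

def build_prefix_tree(words):
    """Build a prefix tree over word-piece sequences.

    Returns (children, labels): children[i] lists the child node ids of
    node i, labels[i] is the word piece stored at node i. Node 0 is the root.
    """
    children, labels, index = [[]], ["<root>"], {}
    for pieces in words:
        node = 0
        for piece in pieces:
            key = (node, piece)
            if key not in index:
                index[key] = len(labels)
                labels.append(piece)
                children.append([])
                children[node].append(index[key])
            node = index[key]
    return children, labels

def gcn_layer(children, features, weight):
    """One simplified graph-convolution step: ReLU(A_hat @ X @ W).

    ASSUMPTION: A_hat has self-loops plus child-to-parent edges and is
    row-normalised, so each node averages itself with its children.
    """
    n = len(children)
    adj = np.eye(n)  # self-loops
    for parent, kids in enumerate(children):
        for kid in kids:
            adj[parent, kid] = 1.0  # the parent reads from its child
    adj /= adj.sum(axis=1, keepdims=True)  # mean aggregation
    return np.maximum(adj @ features @ weight, 0.0)

# Hypothetical biasing list, already split into word pieces.
words = [["tur", "ner"], ["tur", "in", "g"]]
children, labels = build_prefix_tree(words)
one_hot = np.eye(len(labels))
encodings = gcn_layer(children, one_hot, np.eye(len(labels)))
```

    With one-hot inputs and an identity weight, the row for the node "tur" spreads its mass equally over itself and its two children, which makes the one-step lookahead visible.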

    Urban Building Energy and Climate (UrBEC) simulation: Example application and field evaluation in Sai Ying Pun, Hong Kong

    The energy performance of a building in a dense city depends to some extent on its surroundings. The impact of the built form, together with anthropogenic heat gains from traffic and building HVAC exhaust, determines external environmental conditions at the Urban Canopy Layer. Existing building energy models are limited in accounting for micro-scale variations of the urban microclimate, which may significantly modify a building's energy performance in dense cities. This paper presents the Urban Building Energy and Climate (UrBEC) model, a coupled urban microclimate model (UMM) and building energy model (HTB2) developed to assess the time-varying energy performance of a cluster of buildings and the combined heat gains to the external space from direct and reflected solar radiation, traffic and the exhaust from HVAC systems in a high-density city. The simulation results were evaluated by comparison with field measurement data collected from the Sai Ying Pun neighbourhood in Hong Kong, on a summer and a winter day. Predicted and measured air and surface temperatures at the four locations were found to be in reasonable agreement. Simulation results indicate an average temperature rise of 1-3 °C in street canyons compared with the ambient air in summer. Street-level air is predicted to be 0.6 °C warmer than air at higher levels (20 m and above). Anthropogenic heat from traffic and building HVAC exhaust is the dominant contributor to temperature rise in street canyons in summer, exceeding the contribution from urban surfaces. The predicted building cooling demand is expected to increase by up to 15% in summer due to the warming effect in street canyons. The UrBEC model runs significantly faster than current CFD-based approaches. Therefore, the model has the potential to support early-stage design and planning decisions in a dense city.

    No-reference image quality assessment based on the AdaBoost BP neural network in the wavelet domain

    Considering the relatively poor robustness of quality scores across different types of distortion and the lack of a mechanism for determining distortion types, a no-reference image quality assessment (NR-IQA) method based on the AdaBoost BP Neural Network in the Wavelet domain (WABNN) is proposed. A 36-dimensional image feature vector is constructed by extracting natural scene statistics (NSS) features and local information entropy features from the wavelet sub-band coefficients of the distorted image at three scales. The ABNN classifier is obtained by learning the relationship between image features and distortion types. The ABNN scorer is obtained by learning the relationship between image features and image quality scores. A series of comparison experiments is carried out on the LIVE and TID2013 databases. Experimental results show high accuracy in distinguishing distortion types, high consistency with subjective scores and high robustness to distorted images. They also show that the method is database-independent and has relatively high computational efficiency.

    Chaotic exploration and learning of locomotion behaviours

    We present a general and fully dynamic neural system, which exploits intrinsic chaotic dynamics, for the real-time goal-directed exploration and learning of the possible locomotion patterns of an articulated robot of an arbitrary morphology in an unknown environment. The controller is modeled as a network of neural oscillators that are initially coupled only through physical embodiment, and goal-directed exploration of coordinated motor patterns is achieved by chaotic search using adaptive bifurcation. The phase space of the indirectly coupled neural-body-environment system contains multiple transient or permanent self-organized dynamics, each of which is a candidate for a locomotion behavior. The adaptive bifurcation enables the system orbit to wander through various phase-coordinated states, using its intrinsic chaotic dynamics as a driving force, and to stabilize onto one of the states matching the given goal criteria. In order to improve the sustainability of useful transient patterns, sensory homeostasis has been introduced, which results in an increased diversity of motor outputs, thus achieving multiscale exploration. A rhythmic pattern discovered by this process is memorized and sustained by changing the wiring between initially disconnected oscillators using an adaptive synchronization method. Our results show that the novel neurorobotic system is able to create and learn multiple locomotion behaviors for a wide range of body configurations and physical environments and can readapt in real time after sustaining damage.

    A note on extinction times for the general birth, death and catastrophe process

    Abstract: We consider a birth, death and catastrophe process where the transition rates are allowed to depend on the population size. We obtain an explicit expression for the expected time to extinction, which is valid in all cases where extinction occurs with probability 1. Keywords: Population processes; Hitting times; Catastrophes; Zeta distribution. AMS 2000 Subject Classification: Primary 60J27; Secondary 60J35. The model under consideration is a continuous-time Markov chain $(X(t),\ t \ge 0)$ taking values in $S = \{0, 1, \dots\}$, where $X(t)$ represents the number in a population at time $t$. When there are $i$ individuals present the population size changes at rate $f_i$ ($> 0$), and when a change occurs it is a birth with probability $a$ ($> 0$) or a catastrophe of size $k$ (the removal of $k$ individuals) with probability $d_k$ ($k \ge 1$). (Simple death events are catastrophes of size 1.) We assume that $d_k > 0$ for at least one $k \ge 1$ and $a + \sum_{k \ge 1} d_k = 1$. Thus, the process has transition rates $Q$ given by $q_{ij}$
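    The process described in this excerpt can be simulated directly, which gives a rough numerical check on any closed-form extinction time. The sketch below is a hedged illustration, not the paper's derivation. The function name `simulate_extinction_time` and its parameters are hypothetical, and because the excerpt breaks off before the transition rates are stated, two points are assumptions: state 0 is treated as absorbing, and a catastrophe at least as large as the current population is taken to remove everyone.

```python
import random

def simulate_extinction_time(x0, rate, birth_prob, catastrophe_probs, rng,
                             max_events=1_000_000):
    """Simulate one path of the chain and return the time it first hits 0.

    rate(i)            -- total event rate f_i when the population is i
    birth_prob         -- probability a that an event is a birth
    catastrophe_probs  -- dict {k: d_k}; must satisfy a + sum(d_k) == 1
    Returns None if extinction has not occurred within max_events events.
    """
    sizes = list(catastrophe_probs)
    weights = [catastrophe_probs[k] for k in sizes]
    if abs(birth_prob + sum(weights) - 1.0) > 1e-9:
        raise ValueError("birth_prob + sum of d_k must equal 1")

    x, t, events = x0, 0.0, 0
    while x > 0 and events < max_events:
        t += rng.expovariate(rate(x))  # exponential holding time, rate f_x
        if rng.random() < birth_prob:
            x += 1  # birth
        else:
            k = rng.choices(sizes, weights=weights)[0]
            # ASSUMPTION: a catastrophe of size k >= x wipes out the population.
            x = max(x - k, 0)
        events += 1
    return t if x == 0 else None

# Hypothetical parameters: linear rates f_i = i, a = 0.2, d_1 = 0.5, d_2 = 0.3.
rng = random.Random(0)
samples = [simulate_extinction_time(3, float, 0.2, {1: 0.5, 2: 0.3}, rng)
           for _ in range(1000)]
finished = [s for s in samples if s is not None]
mean_time = sum(finished) / len(finished)
```

    Averaging many such paths gives a Monte Carlo estimate of the expected extinction time from the chosen starting state, for comparison with an analytical expression when one is available.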