16 research outputs found

    Weakly Supervised Training of Hierarchical Attention Networks for Speaker Identification

    Identifying multiple speakers without knowing where each speaker's voice occurs in a recording is a challenging task. In this paper, a hierarchical attention network is proposed to solve a weakly labelled speaker identification problem. The hierarchical structure, consisting of a frame-level encoder and a segment-level encoder, aims to learn speaker-related information both locally and globally. Speech streams are segmented into fragments. The frame-level encoder with attention learns features, highlights the target-related frames locally, and outputs a fragment-based embedding. The segment-level encoder works with a second attention layer to emphasize the fragments probably related to target speakers. The global information is finally collected from the segment-level module to predict speakers via a classifier. To evaluate the effectiveness of the proposed approach, artificial datasets based on Switchboard Cellular part1 (SWBC) and Voxceleb1 are constructed in two conditions, where speakers' voices are overlapped and not overlapped. Compared with two baselines, the obtained results show that the proposed approach achieves better performance. Moreover, further experiments are conducted to evaluate the impact of utterance segmentation. The results show that a reasonable segmentation can slightly improve identification performance. Comment: Accepted for presentation at Interspeech202
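The two-level pooling described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the learned encoders and the classifier are omitted, attention is reduced to a softmax over dot products with a fixed query vector, and the names `attention_pool`, `hierarchical_embedding`, `frame_query`, and `segment_query` are hypothetical.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(vectors, query):
    """Weighted mean of `vectors`; weights are a softmax of dot products with `query`.

    Assumption: a fixed query stands in for the learned attention layer.
    """
    scores = [sum(v_k * q_k for v_k, q_k in zip(v, query)) for v in vectors]
    weights = softmax(scores)
    dim = len(vectors[0])
    pooled = [sum(w * v[k] for w, v in zip(weights, vectors)) for k in range(dim)]
    return pooled, weights

def hierarchical_embedding(frames, fragment_len, frame_query, segment_query):
    """Pool frames into fragment embeddings, then fragments into one utterance embedding."""
    # Split the frame sequence into consecutive fragments.
    fragments = [frames[i:i + fragment_len] for i in range(0, len(frames), fragment_len)]
    # Frame-level attention: one embedding per fragment.
    fragment_embeddings = [attention_pool(frag, frame_query)[0] for frag in fragments]
    # Segment-level attention: weight fragments and pool them globally.
    return attention_pool(fragment_embeddings, segment_query)

# Toy input: four 2-dimensional frame features, split into two fragments.
frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
utterance, segment_weights = hierarchical_embedding(
    frames, fragment_len=2, frame_query=[1.0, 0.0], segment_query=[0.0, 1.0]
)
```

In a full model the pooled `utterance` vector would be passed to a classifier, and `segment_weights` would indicate which fragments the model attends to for the target speakers.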

    Differentiable approximation by means of the Radon transformation and its applications to neural networks

    We treat the problem of simultaneously approximating a several-times differentiable function of several variables and its derivatives by a superposition of a function, say g, of one variable. In our theory, the domain of approximation can be either a compact subset or the whole Euclidean space R^d. We prove that if the domain is compact, the function g can be used without scaling, and that even when the domain of approximation is the whole space R^d, g can be used without scaling if it satisfies a certain condition. Moreover, g can be chosen from a wide class of functions. The basic tool is the inverse Radon transform. As a neural network can output a superposition of g, our results extend well-known neural approximation theorems which are useful in neural computation theory.

    Regional flood frequency analysis using an artificial neural network model

    This paper presents the results of a study on the application of an artificial neural network (ANN) model to regional flood frequency analysis (RFFA). The study was conducted using stream flow data from 88 gauging stations across New South Wales (NSW) in Australia. Five different models consisting of three to eight predictor variables (i.e., annual rainfall, drainage area, fraction forested area, potential evapotranspiration, rainfall intensity, river slope, shape factor and stream density) were tested. The results show that an ANN model with a higher number of predictor variables does not always improve the performance of RFFA models. For example, the model with three predictor variables performs considerably better than the models using a higher number of predictor variables, except for the one that contains all eight predictor variables. The model with three predictor variables exhibits smaller median relative error values for the 2- and 20-year return periods than the model containing eight predictor variables. However, for the 5-, 10-, 50- and 100-year return periods, the model with eight predictor variables shows smaller median relative error values. The proposed ANN modelling framework can be adapted to other regions in Australia and abroad.

    An improved state space method for force identification based on function interpolation in the presence of large noise

    The conventional state space method for force identification has the disadvantage of a large discretization error at low sampling frequencies. This paper presents an improved method based on function interpolation of the external force in the time domain. Two types of interpolation function are investigated: linear interpolation and sigmoid curve interpolation. The Gauss integration method is used to compute the integrals. Numerical studies show that the improved methods based on both types of interpolation function are more accurate, especially when the sampling is long and/or the sampling frequency is low. In addition, the proposed method is extended to the case of a high noise level. The key idea is to divide the time step of the measured responses into several smaller time steps to form an overdetermined equation for the inverse force identification. The least-squares algorithm is then adopted, which helps to reduce the effect of the high random noise and improve the accuracy of the identified solution.