354 research outputs found

    Time-domain speaker extraction network

    Speaker extraction aims to extract a target speaker's voice from multi-talker speech. It simulates the human cocktail party effect, or selective listening ability. Prior work mostly performs speaker extraction in the frequency domain and then reconstructs the signal with some phase approximation. The inaccuracy of phase estimation is inherent to frequency-domain processing and affects the quality of signal reconstruction. In this paper, we propose a time-domain speaker extraction network (TseNet) that does not decompose the speech signal into magnitude and phase spectra and therefore does not require phase estimation. TseNet consists of a stack of dilated depthwise separable convolutional networks that capture the long-range dependency of the speech signal with a manageable number of parameters. It is also conditioned on a reference voice from the target speaker, characterized by a speaker i-vector, to perform selective listening to the target speaker. Experiments show that the proposed TseNet achieves 16.3% and 7.0% relative improvements over the baseline in terms of signal-to-distortion ratio (SDR) and perceptual evaluation of speech quality (PESQ), respectively, under the open evaluation condition. Comment: Published in ASRU 2019. arXiv admin note: text overlap with arXiv:2004.0832
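To illustrate why a stack of dilated depthwise separable convolutions can cover long-range context with few parameters, the following is a minimal arithmetic sketch. The kernel size, channel count, and dilation schedule are assumptions chosen for illustration and are not taken from the TseNet paper; biases are ignored.

```python
def receptive_field(kernel_size, dilations):
    """Receptive field, in samples, of stacked stride-1 1-D convolutions."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)


def standard_conv_params(c_in, c_out, kernel_size):
    """Weight count of an ordinary 1-D convolution (no bias)."""
    return c_in * c_out * kernel_size


def separable_conv_params(c_in, c_out, kernel_size):
    """Weight count of a depthwise separable 1-D convolution (no bias)."""
    depthwise = c_in * kernel_size  # one filter per input channel
    pointwise = c_in * c_out        # 1x1 convolution that mixes channels
    return depthwise + pointwise


# Hypothetical settings, not from the paper: kernel size 3, 256 channels,
# and dilations that double at each layer (1, 2, 4, ..., 128).
dilations = [2 ** i for i in range(8)]
print("receptive field:", receptive_field(3, dilations))
print("standard conv weights:", standard_conv_params(256, 256, 3))
print("separable conv weights:", separable_conv_params(256, 256, 3))
```

Under these assumed settings, doubling the dilation at each layer grows the receptive field exponentially with depth, while the separable factorization keeps the per-layer weight count close to that of a single pointwise convolution.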

    Prediction of drilling fluid lost-circulation zone based on deep learning

    Lost circulation has become a crucial technical problem that restricts improvements in the quality and efficiency of drilling operations in deep oil and gas wells. Predicting the lost-circulation zone has long been a popular and difficult research topic in the prevention and control of lost circulation. This study applied machine learning and statistical methods to mine 105 sets of loss data with 29 features from the typical loss block M. After removing 10 sets of noisy data, mean removal, range scaling, and normalization were used to pre-treat the remaining 95 sets of loss data. Multi-factor analysis of variance (ANOVA) and the random forest algorithm were adopted to determine the 13 main factors affecting lost circulation. Three typical deep learning neural network models were improved, their parameters were adjusted, models with different structures were compared according to their PR curves, and the best model structure was built. The 95 sets of pre-treated loss data with 13 features were divided into a training set and a test set at a ratio of 4:1. Model performance was evaluated using F1 score, accuracy, and recall rate. The trained model was successfully applied to the G block, which has severe leakage. The results show that the capsule network model is better than the BP neural network model and the convolutional neural network model. The capsule network stabilizes at 300 training rounds, with a prediction accuracy of 94.73%. The improved model can be applied to lost-circulation control in the field and can provide guidance on leakage prevention and plugging operations.
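The pre-treatment, 4:1 split, and evaluation metrics named in the abstract above can be sketched as follows. This is a hedged illustration only: the function names are hypothetical, the toy labels are made up, and the study's actual normalization details and splitting procedure (for example, whether the data were shuffled) are not stated in the abstract.

```python
def mean_center_and_range_scale(column):
    """Subtract the mean of one feature column, then divide by its range."""
    mean = sum(column) / len(column)
    span = max(column) - min(column)
    if span == 0:
        return [0.0 for _ in column]  # constant feature carries no information
    return [(x - mean) / span for x in column]


def split_4_to_1(samples):
    """Split samples into training and test parts at a 4:1 ratio (no shuffling)."""
    cut = len(samples) * 4 // 5
    return samples[:cut], samples[cut:]


def binary_metrics(y_true, y_pred):
    """Accuracy, recall, and F1 score for binary labels (1 = loss zone)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    accuracy = (tp + tn) / len(pairs)
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return accuracy, recall, f1


# 95 samples, as in the abstract; the sample contents here are placeholders.
train, test = split_4_to_1(list(range(95)))
print(len(train), len(test))

# Made-up labels and predictions, used only to exercise the metric code.
print(binary_metrics([1, 1, 1, 0, 0], [1, 1, 0, 0, 1]))
```

With 95 samples, a 4:1 split leaves 76 for training and 19 for testing.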

    The Influence of Family Cultural Capital on the Subject Selection Behavior of High School Students Under the New College Entrance Examination in Mainland China: The Mediating Role of Learning Efficacy

    The “New College Entrance Examination” reform has become the most difficult part of mainland China’s current education reform. This study investigates the influence of family cultural capital on the subject selection behavior of Chinese high school students, with learning efficacy included as an intermediary variable. Altogether, 1258 high school students in Chongqing were surveyed. We find that (1) high school students showed active participation in selecting subjects, and there were significant differences in their selection behavior in terms of grade, parents’ educational background, parents’ occupational level, and family per capita monthly income. Furthermore, (2) the effect of learning efficacy on family cultural capital was significant. The positive influences on high school students’ subject selection behavior were also reflected in the intermediary role of learning efficacy. We also found that (3) the influence of family cultural capital on the selection behavior of high school students is affected by individual and family background variables. Based on these results, countermeasures and suggestions are put forward to help high school students choose courses reasonably.

    Sentence boundary detection in Chinese broadcast news using conditional random fields and prosodic features

    In this paper, we explore the use of prosodic features in sentence boundary detection in Chinese broadcast news. The prosodic features include speaker turn, music, pause duration, pitch, energy, and speaking rate. Specifically, considering the Chinese tonal effects on the pitch trajectory, we propose to use tone-normalized pitch features. Experiments using decision trees demonstrate that the tone-normalized pitch features show superior performance in sentence boundary detection in Chinese broadcast news. Furthermore, feature combination achieves an apparent performance improvement through intuitive feature-interaction rules formed in the decision tree. Pause duration and a tone-normalized pitch feature account for most of the feature usage in the best-performing decision tree. Index Terms: sentence boundary detection, sentence segmentation, speech prosody, rich transcription