Constructing Long Short-Term Memory based Deep Recurrent Neural Networks for Large Vocabulary Speech Recognition
Long short-term memory (LSTM) based acoustic modeling methods have recently
been shown to give state-of-the-art performance on some speech recognition
tasks. To improve performance further, this research investigates deep
extensions of LSTM, given that a deep hierarchical model has turned out to be
more efficient than a shallow one. Motivated by previous
research on constructing deep recurrent neural networks (RNNs), alternative
deep LSTM architectures are proposed and empirically evaluated on a large
vocabulary conversational telephone speech recognition task. In addition,
the training process for LSTM networks on multi-GPU devices is
introduced and discussed. Experimental results demonstrate that the deep LSTM
networks benefit from the depth and yield state-of-the-art performance on
this task.
Comment: submitted to ICASSP 2015, which does not perform blind review
Room geometry blind inference based on the localization of real sound source and first order reflections
Conventional techniques for blind room geometry inference from acoustic
signals rely on prior knowledge of the environment, such as
the room impulse response (RIR) or the sound source position, which limits
their application in unknown scenarios. To solve this problem, we
propose a room geometry reconstruction method that uses the
geometric relation between the direct signal and first-order reflections.
Apart from information about the compact microphone array itself, this method
does not need any prior knowledge of the environmental parameters. In addition,
learning-based DNN models are designed and used to improve the accuracy and
integrity of the localization results of the direct source and first-order
reflections. The direction of arrival (DOA) and time difference of arrival
(TDOA) information of the direct and reflected signals are first estimated
using the proposed DCNN and TD-CNN models, which have higher sensitivity and
accuracy than the conventional methods. Then the position of the sound source
is inferred by integrating the DOA, TDOA and array height using the proposed
DNN model. After that, the positions of image sources and corresponding
boundaries are derived based on the geometric relation. Experimental results of
both simulations and real measurements verify the effectiveness and accuracy of
the proposed techniques compared with the conventional methods under different
reverberant environments.
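The final geometric step described above can be sketched as follows. Under the first-order image source model, the image source is the mirror image of the real source across the reflecting boundary, so the boundary lies on the perpendicular bisector plane of the segment joining the two. This is a minimal sketch that assumes both positions have already been estimated; the function name `wall_from_image_source` and the coordinates are hypothetical and not taken from the paper.

```python
import math

def wall_from_image_source(source, image):
    """Return (unit_normal, offset) of the plane n . x = offset that mirrors
    `source` onto `image` (first-order image source model)."""
    # Vector from the real source to its image; it is normal to the wall.
    diff = [i - s for s, i in zip(source, image)]
    norm = math.sqrt(sum(c * c for c in diff))
    if norm == 0.0:
        raise ValueError("source and image source coincide")
    normal = [c / norm for c in diff]
    # The wall passes through the midpoint of the source-image segment.
    midpoint = [(s + i) / 2.0 for s, i in zip(source, image)]
    offset = sum(n * m for n, m in zip(normal, midpoint))
    return normal, offset

# Hypothetical positions in metres: source at x = 1, image source at x = 7,
# so the reflecting wall is the plane x = 4.
normal, offset = wall_from_image_source((1.0, 2.0, 1.5), (7.0, 2.0, 1.5))
```

In practice the estimated positions are noisy, so a real pipeline would likely fit each boundary from several source positions rather than from a single pair.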
Equity evaluation of urban park system: a case study of Xiamen, China
Urban parks play a distinctive and important role in satisfying residents’ demands for leisure and recreation, and have thus become a focus of research in urban planning and sustainable development. This paper uses equity as an indicator to combine the supply and demand sides of urban park service. Taking Xiamen as the study case, we analyze the relationship between the spatial distribution of population and park services. The results show that while population density has a significant spatial relationship with urban park service level at the city scale, Xiamen neglects the equity of urban park service between people and regions within the city. The proposed approach links urban park service to urban population in order to evaluate the performance of urban parks. Although the mechanism remains to be discussed, this study provides a useful auxiliary tool for constructing guidelines for urban green space planning, since urban parks are increasingly seen as a restricted public resource and ensuring their equity should be an important task for city managers.
Involvement of the JNK/FOXO3a/Bim Pathway in Neuronal Apoptosis after Hypoxic-Ischemic Brain Damage in Neonatal Rats.
c-Jun N-terminal kinase (JNK) plays a key role in the regulation of neuronal apoptosis. Previous studies have revealed that forkhead transcription factor (FOXO3a) is a critical effector of JNK-mediated tumor suppression. However, it is not clear whether the JNK/FOXO3a pathway is involved in neuronal apoptosis in the developing rat brain after hypoxia-ischemia (HI). In this study, we generated an HI model using postnatal day 7 rats. Fluorescence immunolabeling and Western blot assays were used to detect the distribution and expression of total and phosphorylated JNK and FOXO3a and the pro-apoptotic proteins Bim and CC3. We found that JNK phosphorylation was accompanied by FOXO3a dephosphorylation, which induced FOXO3a translocation into the nucleus, resulting in the upregulation of Bim and CC3 protein levels. Furthermore, we found that JNK inhibition by AS601245, a specific JNK inhibitor, significantly increased FOXO3a phosphorylation, which attenuated FOXO3a translocation into the nucleus after HI. Moreover, JNK inhibition downregulated Bim and CC3 protein levels, attenuated neuronal apoptosis and reduced brain infarct volume in the developing rat brain. Our findings suggest that the JNK/FOXO3a/Bim pathway is involved in neuronal apoptosis in the developing rat brain after HI. Agents targeting JNK may offer promise for rescuing neurons from HI-induced damage.
A DenseNet-based method for decoding auditory spatial attention with EEG
Auditory spatial attention detection (ASAD) aims to decode the attended
spatial location with EEG in a multiple-speaker setting. ASAD methods are
inspired by the brain lateralization of cortical neural responses during the
processing of auditory spatial attention, and show promising performance for
the task of auditory attention decoding (AAD) with neural recordings. In the
previous ASAD methods, the spatial distribution of EEG electrodes is not fully
exploited, which may limit the performance of these methods. In the present
work, the original EEG channels are mapped onto a two-dimensional (2D)
spatial topological map, so that the EEG data takes a three-dimensional
(3D) arrangement containing spatial-temporal information. A 3D deep
convolutional neural network (DenseNet-3D) is then used to extract temporal and
spatial features of the neural representation for the attended locations. The
results show that the proposed method achieves higher decoding accuracy than
the state-of-the-art (SOTA) method (94.4% compared to XANet's 90.6%) with
a 1-second decision window on the widely used KULeuven (KUL) dataset. The
code to implement our work is available on GitHub:
https://github.com/xuxiran/ASAD_DenseNe
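The channel-to-map transformation described in this abstract can be illustrated with a small sketch. It places each channel's sample at a grid cell that approximates the electrode's scalp position, producing a time x rows x columns arrangement that a 3D convolutional network could consume. The function name, the 3 x 3 grid, and the four-channel layout are hypothetical stand-ins and not the paper's actual montage or code.

```python
def to_topographic_frames(samples, channel_names, layout, grid_shape):
    """Map EEG samples (time x channels) onto a 2D electrode grid.

    Returns a nested list shaped time x rows x cols; grid cells without
    an electrode are zero-filled.
    """
    rows, cols = grid_shape
    frames = []
    for sample in samples:
        grid = [[0.0] * cols for _ in range(rows)]
        for name, value in zip(channel_names, sample):
            r, c = layout[name]  # grid cell assigned to this electrode
            grid[r][c] = value
        frames.append(grid)
    return frames

# Hypothetical layout: four electrodes on a 3 x 3 grid.
layout = {"Fz": (0, 1), "C3": (1, 0), "C4": (1, 2), "Pz": (2, 1)}
channels = ["Fz", "C3", "C4", "Pz"]
samples = [[1.0, 2.0, 3.0, 4.0], [5.0, 6.0, 7.0, 8.0]]  # two time points
frames = to_topographic_frames(samples, channels, layout, (3, 3))
```

Zero-filling empty cells is one simple choice assumed here; interpolating between neighbouring electrodes would be another.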
Semantic reconstruction of continuous language from MEG signals
Decoding language from neural signals holds considerable theoretical and
practical importance. Previous research has indicated the feasibility of
decoding text or speech from invasive neural signals. However, when using
non-invasive neural signals, significant challenges are encountered due to
their low quality. In this study, we propose a data-driven approach for
decoding the semantics of language from magnetoencephalography (MEG) signals
recorded while subjects were listening to continuous speech. First, a
multi-subject decoding model was trained using contrastive learning to
reconstruct continuous word embeddings from MEG data. Subsequently, a beam
search algorithm was adopted to generate text sequences based on the
reconstructed word embeddings. Given a candidate sentence in the beam, a
language model was used to predict the subsequent words. The word embeddings of
the subsequent words were correlated with the reconstructed word embedding.
These correlations were then used as a measure of the probability for the next
word. The results showed that the proposed continuous word embedding model can
effectively leverage both subject-specific and subject-shared information.
Additionally, the decoded text exhibited significant similarity to the target
text, with an average BERTScore of 0.816, a score comparable to that in the
previous fMRI study.
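The decoding loop described in this abstract can be sketched as a small beam search. For each reconstructed embedding, every candidate sentence in the beam is extended with proposed next words, and each extension is scored by the Pearson correlation between the word's embedding and the reconstructed one. This is a simplified sketch under stated assumptions: `propose_next_words` is a toy stand-in for the language model, the three-word vocabulary and embeddings are invented for illustration, and correlations are simply summed as a score rather than converted to probabilities.

```python
import math

def pearson(a, b):
    """Pearson correlation of two equal-length vectors (0.0 if either is constant)."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb) if va > 0 and vb > 0 else 0.0

def beam_decode(decoded_embeddings, propose, embed, beam_width=2):
    """Keep the `beam_width` best word sequences, scoring each candidate word
    by its correlation with the embedding reconstructed for that position."""
    beams = [([], 0.0)]
    for target in decoded_embeddings:
        expanded = []
        for words, score in beams:
            for word in propose(words):
                new_score = score + pearson(embed[word], target)
                expanded.append((words + [word], new_score))
        expanded.sort(key=lambda item: item[1], reverse=True)
        beams = expanded[:beam_width]
    return beams[0][0]

# Toy stand-ins (hypothetical): a three-word vocabulary with one-hot embeddings.
embed = {"the": [1.0, 0.0, 0.0], "cat": [0.0, 1.0, 0.0], "sat": [0.0, 0.0, 1.0]}

def propose_next_words(words):
    # A real system would query a language model here; this proposes every word.
    return list(embed)

decoded = [[0.9, 0.1, 0.0], [0.1, 0.8, 0.1], [0.0, 0.2, 0.9]]
text = beam_decode(decoded, propose_next_words, embed)
```

A fuller version would also weight each candidate by the language model's own probability, which this sketch omits.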
Optimal tests for rare variant effects in sequencing association studies
With the development of massively parallel sequencing technologies, there is a substantial need for developing powerful rare variant association tests. Common approaches include burden and non-burden tests. Burden tests assume all rare variants in the target region have effects on the phenotype in the same direction and of similar magnitude. The recently proposed sequence kernel association test (SKAT) (Wu, M. C., and others, 2011. Rare-variant association testing for sequencing data with the SKAT. The American Journal of Human Genetics 89, 82–93), an extension of the C-alpha test (Neale, B. M., and others, 2011. Testing for an unusual distribution of rare variants. PLoS Genetics 7, 161–165), provides a robust test that is particularly powerful in the presence of protective, deleterious and null variants, but is less powerful than burden tests when a large number of variants in a region are causal and in the same direction. As the underlying biological mechanisms are unknown in practice and vary from one gene to another across the genome, it is of substantial practical interest to develop a test that is optimal for both scenarios. In this paper, we propose a class of tests that includes burden tests and SKAT as special cases, and derive an optimal test within this class that maximizes power. We show that this optimal test outperforms burden tests and SKAT in a wide range of scenarios. The results are illustrated using simulation studies and triglyceride data from the Dallas Heart Study. In addition, we have derived sample size/power calculation formulas for SKAT with a new family of kernels to facilitate designing new sequence association studies.
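One common way to write a class of tests that contains both special cases is a weighted combination Q_rho = (1 - rho) * Q_SKAT + rho * Q_burden with 0 <= rho <= 1, so that rho = 0 gives SKAT and rho = 1 gives a burden test. The sketch below assumes this form, which the abstract does not spell out, together with an intercept-only null model for a continuous phenotype. The function name `q_rho` and the toy data are hypothetical, and the p-value calculation and the search for the optimal rho are omitted.

```python
def q_rho(genotypes, phenotype, weights, rho):
    """Combined statistic (1 - rho) * Q_SKAT + rho * Q_burden.

    genotypes: n x m list of minor-allele counts (n subjects, m variants)
    phenotype: length-n list of continuous trait values
    weights:   length-m list of per-variant weights
    Assumes an intercept-only null model (no covariates).
    """
    if not 0.0 <= rho <= 1.0:
        raise ValueError("rho must lie in [0, 1]")
    mean_y = sum(phenotype) / len(phenotype)
    residuals = [y - mean_y for y in phenotype]
    # Per-variant score: genotype column times null-model residuals.
    scores = [
        sum(row[j] * r for row, r in zip(genotypes, residuals))
        for j in range(len(weights))
    ]
    q_burden = sum(w * s for w, s in zip(weights, scores)) ** 2
    q_skat = sum((w * s) ** 2 for w, s in zip(weights, scores))
    return (1.0 - rho) * q_skat + rho * q_burden

# Toy data (hypothetical): two variants with effects in opposite directions.
genotypes = [[1, 0], [0, 1], [0, 0], [0, 0]]
phenotype = [2.0, -1.0, 0.0, -1.0]
weights = [1.0, 1.0]
skat_stat = q_rho(genotypes, phenotype, weights, 0.0)
burden_stat = q_rho(genotypes, phenotype, weights, 1.0)
mixed_stat = q_rho(genotypes, phenotype, weights, 0.5)
```

In the toy data the opposing effects partly cancel inside the burden statistic but not inside the SKAT statistic, which is the behaviour the abstract attributes to each test.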