10,059 research outputs found

A 3D Coarse-to-Fine Framework for Volumetric Medical Image Segmentation

Author: Fishman Elliot K.
Shen Wei
Xia Yingda
Yuille Alan L.
Zhu Zhuotun
Publication date: 01/08/2018

In this paper, we adopt 3D Convolutional Neural Networks to segment volumetric medical images. Although deep neural networks have been proven to be very effective on many 2D vision tasks, it is still challenging to apply them to 3D tasks due to the limited amount of annotated 3D data and limited computational resources. We propose a novel 3D-based coarse-to-fine framework to effectively and efficiently tackle these challenges. The proposed 3D-based framework outperforms the 2D counterpart by a large margin since it can leverage the rich spatial information along all three axes. We conduct experiments on two datasets which include healthy and pathological pancreases respectively, and achieve the current state-of-the-art in terms of Dice-Sørensen Coefficient (DSC). On the NIH pancreas segmentation dataset, we outperform the previous best by an average of over 2%, and the worst case is improved by 7% to reach almost 70%, which indicates the reliability of our framework in clinical applications. Comment: 9 pages, 4 figures, Accepted to 3D

arXiv.org e-Print Archive

Crossref
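The Dice-Sørensen Coefficient (DSC) used as the metric in the segmentation abstract above measures the overlap between a predicted mask and a ground-truth mask. The following is a minimal sketch of the standard definition, DSC = 2|P ∩ T| / (|P| + |T|), and not the authors' evaluation code. The function name `dice_coefficient`, the flat 0/1 list representation of a volume, and the convention of returning 1.0 for two empty masks are assumptions made for illustration.

```python
def dice_coefficient(pred, truth):
    """Dice-Sørensen Coefficient between two binary masks.

    Both masks are flat sequences of 0/1 values (a flattened volume).
    This representation is an assumption made for illustration.
    """
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")

    # Voxels labelled foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    # Total foreground voxels across the two masks.
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)

    if total == 0:
        # Assumed convention: two empty masks agree perfectly.
        return 1.0
    return 2.0 * intersection / total


if __name__ == "__main__":
    predicted = [1, 1, 0, 0]
    reference = [1, 0, 0, 0]
    print(f"DSC = {dice_coefficient(predicted, reference):.4f}")
```

A DSC of 1.0 means the masks coincide and 0.0 means they share no foreground voxels, so the "almost 70%" worst case quoted in the abstract corresponds to a DSC near 0.7.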

Speaker Representation Learning using Global Context Guided Channel and Time-Frequency Transformations

Author: Hansen John H. L.
Xia Wei
Publication date: 09/09/2020

In this study, we propose the global context guided channel and time-frequency transformations to model the long-range, non-local time-frequency dependencies and channel variances in speaker representations. We use the global context information to enhance important channels and recalibrate salient time-frequency locations by computing the similarity between the global context and local features. The proposed modules, together with a popular ResNet based model, are evaluated on the VoxCeleb1 dataset, which is a large scale speaker verification corpus collected in the wild. This lightweight block can be easily incorporated into a CNN model with little additional computational cost and effectively improves the speaker verification performance compared to the baseline ResNet-LDE model and the Squeeze&Excitation block by a large margin. Detailed ablation studies are also performed to analyze various factors that may impact the performance of the proposed modules. We find that by employing the proposed L2-tf-GTFC transformation block, the Equal Error Rate decreases from 4.56% to 3.07%, a relative 32.68% reduction, and a relative 27.28% improvement in terms of the DCF score. The results indicate that our proposed global context guided transformation modules can efficiently improve the learned speaker representations by achieving time-frequency and channel-wise feature recalibration. Comment: Accepted to Interspeech 202

arXiv.org e-Print Archive

Crossref
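The relative reduction quoted in the abstract above follows from the two Equal Error Rates it reports (4.56% and 3.07%). The short sketch below reproduces that arithmetic. The helper name `relative_reduction` is an assumption made for illustration and does not come from the paper.

```python
def relative_reduction(baseline, improved):
    """Fractional reduction of an error metric relative to its baseline."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (baseline - improved) / baseline


if __name__ == "__main__":
    # Equal Error Rates (in percent) taken from the abstract.
    eer_baseline = 4.56
    eer_proposed = 3.07
    reduction = relative_reduction(eer_baseline, eer_proposed)
    print(f"Relative EER reduction: {100 * reduction:.2f}%")
```

The absolute drop is 1.49 percentage points. Dividing by the 4.56% baseline gives the relative figure stated in the abstract.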

Data-driven Attention and Data-independent DCT based Global Context Modeling for Text-independent Speaker Recognition

Author: Hansen John H. L.
Xia Wei
Publication date: 04/08/2022

Learning an effective speaker representation is crucial for achieving reliable performance in speaker verification tasks. Speech signals are high-dimensional, long, and variable-length sequences that entail a complex hierarchical structure. Signals may contain diverse information at each time-frequency (TF) location. For example, it may be more beneficial to focus on high-energy parts for phoneme classes such as fricatives. The standard convolutional layer that operates on neighboring local regions cannot capture the complex TF global context information. In this study, a general global time-frequency context modeling framework is proposed to leverage the context information specifically for speaker representation modeling. First, a data-driven attention-based context model is introduced to capture the long-range and non-local relationship across different time-frequency locations. Second, a data-independent 2D-DCT based context model is proposed to improve model interpretability. A multi-DCT attention mechanism is presented to improve modeling power with alternate DCT base forms. Finally, the global context information is used to recalibrate salient time-frequency locations by computing the similarity between the global context and local features. The proposed lightweight blocks can be easily incorporated into a speaker model with little additional computational cost and effectively improve the speaker verification performance compared to the standard ResNet model and the Squeeze&Excitation block by a large margin. Detailed ablation studies are also performed to analyze various factors that may impact the performance of the proposed individual modules. Results from experiments show that the proposed global context modeling framework can efficiently improve the learned speaker representations by achieving channel-wise and time-frequency feature recalibration.

arXiv.org e-Print Archive

Positive surface charge of GluN1 N-terminus mediates the direct interaction with EphB2 and NMDAR mobility.

Author: Dalva Matthew B.
Mao Yu-Ting
Washburn Halley R.
Xia Nan L.
Zhou Wei
Publication venue: Jefferson Digital Commons
Publication date: 29/01/2020

Localization of the N-methyl-D-aspartate type glutamate receptor (NMDAR) to dendritic spines is essential for excitatory synaptic transmission and plasticity. Rather than remaining trapped at synaptic sites, NMDA receptors undergo constant cycling into and out of the postsynaptic density. Receptor movement is constrained by protein-protein interactions with both the intracellular and extracellular domains of the NMDAR. The role of extracellular interactions in the mobility of the NMDAR is poorly understood. Here we demonstrate that the positive surface charge of the hinge region of the N-terminal domain in the GluN1 subunit of the NMDAR is required to maintain NMDARs at dendritic spine synapses and mediates the direct extracellular interaction with a negatively charged phospho-tyrosine on the receptor tyrosine kinase EphB2. Loss of the EphB-NMDAR interaction by either mutating GluN1 or knocking down endogenous EphB2 increases NMDAR mobility. These findings begin to define a mechanism for extracellular interactions mediated by charged domains.

Jefferson Digital Commons

77Se NMR study of pairing symmetry and spin dynamics in KyFe2-xSe2

Author: Bao Wei
Chen G. F.
He J. B.
Ma L.
Wang D. M.
Xia T. -L.
Yu Weiqiang
Publication venue: American Physical Society (APS)
Publication date: 12/04/2011

We present a 77Se NMR study of the newly discovered iron selenide superconductor KyFe2-xSe2, in which Tc = 32 K. Below Tc, the Knight shift 77K drops sharply with temperature, providing strong evidence for singlet pairing. Above Tc, Korringa-type relaxation indicates Fermi-liquid behavior. Our experimental results set strict constraints on the nature of possible theories for the mechanism of high-Tc superconductivity in this iron selenide system. Comment: Chemical composition of crystals determined. Accepted in Physical Review Letters

arXiv.org e-Print Archive

Crossref

Cross-domain Adaptation with Discrepancy Minimization for Text-independent Forensic Speaker Verification

Author: Hansen John H. L.
Wang Zhenyu
Xia Wei
Publication date: 09/09/2020

Forensic audio analysis for speaker verification offers unique challenges due to location/scenario uncertainty and diversity mismatch between reference and naturalistic field recordings. The lack of real naturalistic forensic audio corpora with ground-truth speaker identity represents a major challenge in this field. It is also difficult to directly employ small-scale domain-specific data to train complex neural network architectures due to domain mismatch and loss in performance. Alternatively, cross-domain speaker verification for multiple acoustic environments is a challenging task which could advance research in audio forensics. In this study, we introduce a CRSS-Forensics audio dataset collected in multiple acoustic environments. We pre-train a CNN-based network using the VoxCeleb data, followed by an approach which fine-tunes part of the high-level network layers with clean speech from CRSS-Forensics. Based on this fine-tuned model, we align domain-specific distributions in the embedding space with the discrepancy loss and maximum mean discrepancy (MMD). This maintains effective performance on the clean set, while simultaneously generalizing the model to other acoustic domains. From the results, we demonstrate that diverse acoustic environments affect the speaker verification performance, and that our proposed approach of cross-domain adaptation can significantly improve the results in this scenario. Comment: To appear in INTERSPEECH 202

arXiv.org e-Print Archive

Crossref
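The forensic speaker verification abstract above aligns embedding distributions across acoustic domains with maximum mean discrepancy (MMD). The sketch below shows one common form of that quantity, the biased squared-MMD estimate between two small sets of embeddings. It is not the authors' implementation: the Gaussian kernel, the `bandwidth` value, the function names, and the toy two-dimensional embeddings are all assumptions chosen for illustration, since the abstract does not specify them.

```python
import math


def gaussian_kernel(u, v, bandwidth=1.0):
    """Gaussian (RBF) kernel between two equal-length vectors.

    The kernel choice and bandwidth are assumptions, not from the abstract.
    """
    sq_dist = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mmd_squared(xs, ys, bandwidth=1.0):
    """Biased estimate of squared MMD between two sets of vectors.

    Computes mean k(x, x') + mean k(y, y') - 2 * mean k(x, y).
    """
    def mean_kernel(a_set, b_set):
        total = sum(
            gaussian_kernel(a, b, bandwidth) for a in a_set for b in b_set
        )
        return total / (len(a_set) * len(b_set))

    return (
        mean_kernel(xs, xs)
        + mean_kernel(ys, ys)
        - 2.0 * mean_kernel(xs, ys)
    )


if __name__ == "__main__":
    # Hypothetical 2-D "embeddings" for a clean and a shifted (noisy) domain.
    clean = [(0.0, 0.0), (0.1, -0.1), (-0.1, 0.1)]
    noisy = [(2.0, 2.0), (2.1, 1.9), (1.9, 2.1)]
    print("MMD^2(clean, clean):", mmd_squared(clean, clean))
    print("MMD^2(clean, noisy):", mmd_squared(clean, noisy))
```

The estimate is zero when the two sets coincide and grows as the domains drift apart, which is why minimizing it during fine-tuning pulls the embedding distributions of different acoustic environments toward each other.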

Pharmacological induction of leukotriene B4 12-hydroxydehydrogenase (LTB4DH) in human neutrophils and its potential in the treatment of myocardial injury

Author: Han Y
Le XC
Rong J
Wei L
Xia Z
Publication date: 01/01/2011

Oral Presentation, Session S29 (Vascular Biology, Basic Research), abstract no. 263. 16th World Congress on Heart Disease, International Academy of Cardiology, Annual Scientific Sessions, Vancouver, BC, Canada, 23-26 July 2011.

HKU Scholars Hub

Thermopower peak in phase transition region of (1-x)La$_{2/3}$Ca$_{1/3}$MnO$_{3}$/xYSZ

Author: Chinping Chen
Fash M.
Jun Zhao
Mandal P.
Nakamae S.
Rubinstein M.
Sheng Liu
Shousheng Yan
Wei Liu
Yuan S. L.
Zhengcai Xia
Publication venue: AIP Publishing
Publication date: 01/01/2003

The thermoelectric power (TEP) and the electrical resistivity of the intergranular magnetoresistance (IGMR) composite, (1-x)La$_{2/3}$Ca$_{1/3}$MnO$_{3}$/xYSZ (LCMO/YSZ) with x = 0, 0.75%, 1.25%, 4.5%, 13%, 15% and 80% of the yttria-stabilized zirconia (YSZ), have been measured from 300 K down to 77 K. A pronounced TEP peak appears during the phase transition for the samples with $x > 0$, while none is observed for x = 0. We suggest that this is due to the magnetic structure variation induced by the lattice strain resulting from the LCMO/YSZ boundary layers. The transition width in temperature derived from $d\chi/dT$, with $\chi$ being the AC magnetic susceptibility, supports this interpretation. Comment: 4 pages, 4 eps figures, Latex, J. Appl. Phys 94, 7206 (2003)

arXiv.org e-Print Archive

Crossref