157 research outputs found
The Royalflush System for VoxCeleb Speaker Recognition Challenge 2022
In this technical report, we describe the Royalflush submissions for the
VoxCeleb Speaker Recognition Challenge 2022 (VoxSRC-22). Our submissions
cover track 1 (supervised speaker verification) and track 3
(semi-supervised speaker verification). For track 1, we develop a
powerful U-Net-based speaker embedding extractor with a symmetric architecture.
The proposed system achieves 2.06% in EER and 0.1293 in MinDCF on the
validation set. Compared with the state-of-the-art ECAPA-TDNN, it obtains a
relative improvement of 20.7% in EER and 22.70% in MinDCF. For track 3, we
employ the joint training of source domain supervision and target domain
self-supervision to get a speaker embedding extractor. The subsequent
clustering process can obtain target domain pseudo-speaker labels. We adapt the
speaker embedding extractor using all source and target domain data in a
supervised manner, so that it fully leverages information from both domains.
Moreover, clustering and supervised domain adaptation can be repeated until the
performance converges on the validation set. Our final submission is a fusion
of 10 models and achieves 7.75% EER and 0.3517 MinDCF on the validation set
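As a quick sanity check on the track 1 figures above, the baseline values implied by the reported relative improvements can be backed out. This is a minimal sketch, assuming "relative improvement" means (baseline - ours) / baseline; the function name `implied_baseline` is hypothetical, and the abstract itself does not state the ECAPA-TDNN numbers.

```python
def implied_baseline(value: float, relative_improvement: float) -> float:
    """Back out a baseline metric from a reported value and relative gain.

    Assumes relative_improvement = (baseline - value) / baseline,
    which is an assumption, not something the abstract spells out.
    """
    return value / (1.0 - relative_improvement)


# Reported track 1 validation results and relative improvements.
eer_baseline = implied_baseline(2.06, 0.207)       # EER in percent
dcf_baseline = implied_baseline(0.1293, 0.2270)    # MinDCF

print(f"Implied baseline: {eer_baseline:.2f}% EER, {dcf_baseline:.4f} MinDCF")
```

Under that assumption, the implied baseline sits at roughly 2.6% EER.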
Multi-frame image restoration method for novel rotating synthetic aperture imaging system
The novel rotating synthetic aperture (RSA) optical imaging system is an important development direction for future high-resolution optical remote sensing satellites in geostationary orbit. However, owing to the rotating rectangular pupil, the point spread function of the RSA system has an asymmetric spatial distribution, and the images obtained using the primary mirror from different rotation angles have nonuniform blur degradation. Moreover, platform vibration and pupil rotation have coupling effects on the RSA imaging, resulting in further radiometric and geometric quality degradation. To address these problems, the image degradation characteristics are first analyzed according to the imaging mechanism. Then, combined with the theory of mutual information, an image registration method is suggested by introducing the orientation gradient information. From this, a multi-frame image restoration model is proposed based on the directional gradient prior of the RSA system image. From the perspective of interpretation and application, when the aspect ratio is less than 3, the proposed inversion restoration method can achieve a satisfactory processing performance. This work can provide engineering application reference for the future space application of RSA imaging technology
LE-SSL-MOS: Self-Supervised Learning MOS Prediction with Listener Enhancement
Recently, researchers have shown an increasing interest in automatically
predicting the subjective evaluation for speech synthesis systems. This
prediction is a challenging task, especially on the out-of-domain test set. In
this paper, we propose a novel fusion model for MOS prediction that combines
supervised and unsupervised approaches. In the supervised aspect, we developed
an SSL-based predictor called LE-SSL-MOS. The LE-SSL-MOS utilizes pre-trained
self-supervised learning models and further improves prediction accuracy by
utilizing the opinion scores of each utterance in the listener enhancement
branch. The unsupervised aspect involves two steps. First, we fine-tuned the
unit language model (ULM) using highly intelligible domain data to improve the
correlation of an unsupervised metric, SpeechLMScore. Second, we
utilized ASR confidence as a new metric with the help of ensemble learning. To
our knowledge, this is the first architecture that fuses supervised and
unsupervised methods for MOS prediction. With these approaches, our
experimental results on the VoiceMOS Challenge 2023 show that LE-SSL-MOS
performs better than the baseline. Our fusion system achieved an absolute
improvement of 13% over LE-SSL-MOS on the noisy and enhanced speech track. Our
system ranked 1st and 2nd, respectively, in the French speech synthesis track
and the challenge's noisy and enhanced speech track. Comment: accepted in IEEE-ASRU202
Strong enhancement of photoresponsivity with shrinking the electrodes spacing in few layer GaSe photodetectors
A critical challenge for the integration of optoelectronics is that
photodetectors have relatively poor sensitivities at the nanometer scale. It is
generally believed that a large electrode spacing in photodetectors is
required to absorb sufficient light to maintain high photoresponsivity and
reduce the dark current. However, this limits the optoelectronic
integration density. Through spatially resolved photocurrent investigation, we
find that the photocurrent in metal-semiconductor-metal (MSM) photodetectors
based on layered GaSe is mainly generated from the photoexcited carriers close
to the metal-GaSe interface and the photocurrent active region is always close
to the Schottky barrier with higher electrical potential. The photoresponsivity
increases monotonically as the spacing distance shrinks, until direct
tunneling sets in, and reaches up to 5,000 A W⁻¹ for the
bottom-contacted device at a bias voltage of 8 V and a wavelength of 410 nm. This is
a more than 1,700-fold improvement over previously reported results. Besides
a systematic experimental investigation of the dependence of the
photoresponsivity on the spacing distance for both the bottom- and top-contacted
MSM photodetectors, a theoretical model has also been developed to explain
the photoresponsivity for these two types of device configurations. Our
findings show that the spacing distance can be shrunk while improving the performance
of 2D-semiconductor-based MSM photodetectors, which could pave
the way for future high-density integration of high-performance 2D semiconductor
optoelectronics. Comment: 25 pages, 4 figure
A Risk Prediction Model Based on Machine Learning for Cognitive Impairment Among Chinese Community-Dwelling Elderly People With Normal Cognition: Development and Validation Study
Background: Identifying cognitive impairment early enough could support timely intervention that may hinder or delay the trajectory of cognitive impairment, thus increasing the chances for successful cognitive aging. Objective: We aimed to build a prediction model based on machine learning for cognitive impairment among Chinese community-dwelling elderly people with normal cognition. Methods: A prospective cohort of 6718 older people from the Chinese Longitudinal Healthy Longevity Survey (CLHLS) register, followed between 2008 and 2011, was used to develop and validate the prediction model. Participants were included if they were aged 60 years or above, were community-dwelling elderly people, and had a cognitive Mini-Mental State Examination (MMSE) score >= 18. They were excluded if they were diagnosed with a severe disease (eg, cancer and dementia) or were living in institutions. Cognitive impairment was identified using the Chinese version of the MMSE. Several machine learning algorithms (random forest, XGBoost, naive Bayes, and logistic regression) were used to assess the 3-year risk of developing cognitive impairment. Optimal cutoffs and adjusted parameters were explored in validation data, and the model was further evaluated in test data. A nomogram was established to present the prediction model visually. Results: The mean age of the participants was 80.4 years (SD 10.3 years), and 50.85% (3416/6718) were female. During a 3-year follow-up, 991 (14.8%) participants were identified with cognitive impairment. Among 45 features, the following four features were finally selected to develop the model: age, instrumental activities of daily living, marital status, and baseline cognitive function. The concordance index of the model constructed by logistic regression was 0.814 (95% CI 0.781-0.846).
Older people with normal cognitive functioning having a nomogram score of less than 170 were considered to have a low 3-year risk of cognitive impairment, and those with a score of 170 or greater were considered to have a high 3-year risk of cognitive impairment. Conclusions: This simple and feasible cognitive impairment prediction model could identify community-dwelling elderly people at the greatest 3-year risk for cognitive impairment, which could help community nurses in the early identification of dementia
Doping High-Mobility Donor : Acceptor Copolymer Semiconductors with an Organic Salt for High-Performance Thermoelectric Materials
Organic semiconductors (OSCs) are attractive for fabrication of thermoelectric devices with low cost, large area, low toxicity, and high flexibility. In order to achieve high-performance organic thermoelectric devices (OTEs), it is essential to develop OSCs with high conductivity (σ), large Seebeck coefficient (S), and low thermal conductivity (κ). It is equally important to explore efficient dopants matching the need of thermoelectric devices. The thermoelectric performance of a high-mobility donor–acceptor (D–A) polymer semiconductor, which is doped by an organic salt, is studied. Both a high p-type electrical conductivity approaching 4 S cm⁻¹ and an excellent power factor (PF) of 7 µW K⁻² m⁻¹ are obtained, which are among the highest reported values for polymer semiconductors. Temperature-dependent conductivity, Seebeck coefficient and power factor of the doped materials are systematically investigated. Detailed analysis on the results of thermoelectric measurements has revealed a hopping transport in the materials, which verifies the empirical relationship: S ∝ σ^(−1/4) and PF ∝ σ^(1/2). The results demonstrate that D–A copolymer semiconductors with proper combination of dopants have great potential for fabricating high-performance thermoelectric devices. © 2020 The Authors. Published by WILEY-VCH Verlag GmbH & Co. KGaA, Weinhei
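The two empirical scalings quoted in this abstract are consistent with each other. A short worked step, assuming the standard definition of the thermoelectric power factor, PF = S²σ (the abstract does not state the definition explicitly):

```latex
\mathrm{PF} = S^{2}\sigma,
\qquad
S \propto \sigma^{-1/4}
\;\Longrightarrow\;
\mathrm{PF} \propto \left(\sigma^{-1/4}\right)^{2}\sigma
= \sigma^{-1/2}\,\sigma
= \sigma^{1/2}.
```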
A chromosome-level genome assembly of the Asian arowana, Scleropages formosus
Asian arowana (Scleropages formosus), an ancient teleost belonging to the Order Osteoglossomorpha, is a valuable ornamental fish with several varieties. However, its biological studies and breeding germplasm have been remarkably limited by the lack of a reference genome. To solve these problems, here we report high-quality genome sequences of three common varieties of Asian arowana (the golden, red and green arowana). We first generated a chromosome-level genome assembly of the golden arowana, on the basis of the genetic linkage map constructed with restriction site-associated DNA sequencing (RAD-seq). In addition, we obtained draft genome assemblies of the red and green varieties. Finally, we annotated 22,016, 21,256 and 21,524 protein-coding genes in the genome assemblies of golden, red and green varieties respectively. Our data were deposited in publicly accessible repositories to promote biological research and molecular breeding of Asian arowana
Beam test of a 180 nm CMOS Pixel Sensor for the CEPC vertex detector
The proposed Circular Electron Positron Collider (CEPC) imposes new
challenges for the vertex detector in terms of pixel size and material budget.
A Monolithic Active Pixel Sensor (MAPS) prototype called TaichuPix, based on a
column drain readout architecture, has been developed to address the need for
high spatial resolution. In order to evaluate the performance of the
TaichuPix-3 chips, a beam test was carried out at DESY II TB21 in December
2022. Meanwhile, the Data Acquisition (DAQ) for a multi-plane configuration was
tested during the beam test. This work presents the characterization of the
TaichuPix-3 chips with two different processes, including cluster size, spatial
resolution, and detection efficiency. The analysis results indicate that the spatial
resolution is better than 5 μm and the detection efficiency exceeds 99.5%
for both TaichuPix-3 chips with the two different processes