7 research outputs found

    AutoCycle-VC: Towards Bottleneck-Independent Zero-Shot Cross-Lingual Voice Conversion

    This paper proposes a simple and robust zero-shot voice conversion system with a cycle structure and mel-spectrogram pre-processing. Previous works suffer from information loss and poor synthesis quality because they rely on a carefully designed bottleneck structure. Moreover, models relying solely on a self-reconstruction loss struggle to reproduce different speakers' voices. To address these issues, we propose a cycle-consistency loss that considers conversion back and forth between target and source speakers. Additionally, stacked random-shuffled mel-spectrograms and a label smoothing method are used during speaker encoder training to extract a time-independent global speaker representation from speech, which is key to zero-shot conversion. Our model outperforms existing state-of-the-art results in both subjective and objective evaluations. Furthermore, it facilitates cross-lingual voice conversion and enhances the quality of synthesized speech.

    Federated learning for thyroid ultrasound image analysis to protect personal information: Validation study in a real health care environment

    Background: Federated learning is a decentralized approach to machine learning; it is a training strategy that overcomes medical data privacy regulations and generalizes deep learning algorithms. Federated learning mitigates many systemic privacy risks by sharing only the model and parameters for training, without the need to export existing medical data sets. In this study, we performed ultrasound image analysis using federated learning to predict whether thyroid nodules were benign or malignant. Objective: The goal of this study was to evaluate whether the performance of federated learning was comparable with that of conventional deep learning. Methods: A total of 8457 (5375 malignant, 3082 benign) ultrasound images were collected from 6 institutions and used for federated learning and conventional deep learning. Five deep learning networks (VGG19, ResNet50, ResNext50, SE-ResNet50, and SE-ResNext50) were used. Using stratified random sampling, we selected 20% (1075 malignant, 616 benign) of the total images for internal validation. For external validation, we used 100 ultrasound images (50 malignant, 50 benign) from another institution. Results: For internal validation, the area under the receiver operating characteristic curve (AUROC) for federated learning was between 78.88% and 87.56%, and the AUROC for conventional deep learning was between 82.61% and 91.57%. For external validation, the AUROC for federated learning was between 75.20% and 86.72%, and the AUROC for conventional deep learning was between 73.04% and 91.04%. Conclusions: We demonstrated that the performance of federated learning using decentralized data was comparable to that of conventional deep learning using pooled data. Federated learning may be useful for analyzing medical images while protecting patients' personal information. © 2021 JMIR Medical Informatics. All rights reserved.
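    The abstract above says only that models and parameters are shared, not which aggregation rule was used. As a rough illustration, the sketch below assumes the common federated averaging scheme, in which a server combines client parameters weighted by each client's sample count. The function name `federated_average` and the toy parameter layout are hypothetical and are not taken from the study.

```python
from typing import Dict, List


def federated_average(
    client_weights: List[Dict[str, List[float]]],
    client_sizes: List[int],
) -> Dict[str, List[float]]:
    """Sample-size-weighted average of client parameters (assumed FedAvg rule)."""
    total = sum(client_sizes)
    averaged: Dict[str, List[float]] = {}
    # Every client is assumed to hold the same parameter names and shapes.
    for name, reference in client_weights[0].items():
        averaged[name] = [
            sum(w[name][i] * n for w, n in zip(client_weights, client_sizes)) / total
            for i in range(len(reference))
        ]
    return averaged


# Two hypothetical institutions; only parameters leave each site, never images.
clients = [{"layer": [1.0, 2.0]}, {"layer": [3.0, 4.0]}]
sizes = [100, 300]
global_weights = federated_average(clients, sizes)
print(global_weights)
```

    In this toy setup the institution with more images pulls the global parameters closer to its own values, which is the usual motivation for weighting by sample count.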

    Boundary-Oriented Binary Building Segmentation Model With Two Scheme Learning for Aerial Images

    Various deep learning-based segmentation models have been developed to segment buildings in aerial images. However, the segmentation maps predicted by conventional convolutional neural network-based methods cannot accurately determine the shapes and boundaries of segmented buildings. In this article, to improve the prediction accuracy for the boundaries and shapes of segmented buildings in aerial images, we propose the boundary-oriented binary building segmentation model (B3SM). To construct the B3SM for boundary-enhanced semantic segmentation, we present two-scheme learning (Schemes I and II), which uses the upsampling interpolation method (USIM) as a new operator and a boundary-oriented loss function (B-Loss). In Scheme I, a raw input image is processed and transformed into a presegmented map. In Scheme II, the presegmented map from Scheme I is transformed into a more fine-grained representation. To connect these two schemes, we use the USIM operator. In addition, the novel B-Loss function is implemented in B3SM to extract the features of the boundaries of buildings effectively. To quantitatively evaluate the shapes and boundaries of segmented buildings generated by B3SM, we develop a new metric called the boundary-oriented intersection over union (B-IoU). After evaluating the effectiveness of two-scheme learning, USIM, and B-Loss for building segmentation, we compare the performance of B3SM with that of other state-of-the-art methods using public and custom datasets. The experimental results demonstrate that the B3SM outperforms other state-of-the-art models, yielding more accurate shapes and boundaries for segmented buildings in aerial images. IEEE

    Local Similarity Siamese Network for Urban Land Change Detection on Remote Sensing Images

    Change detection is an important task in the field of remote sensing. Various change detection methods based on convolutional neural networks (CNNs) have recently been proposed for remote sensing using satellite or aerial images. However, existing methods use the content information in images only partially during change detection because they adopt simple feature-similarity measurements or pixel-level loss functions to construct their network architectures. Therefore, when these methods are applied to complex urban areas, their change detection performance tends to be limited. In this paper, a novel CNN-based change detection approach with a cosine similarity measurement, referred to as a local similarity Siamese network (LSS-Net), is proposed for better urban land change detection in remote sensing images. To use the content information in two sequential images, a new change attention map-based content loss (CAC loss) function was developed in this study. In addition, to enhance the change detection performance of LSS-Net, a suitable feature-similarity measurement method, incorporated into a local similarity attention module, was determined through systematic experiments. To verify the change detection performance of LSS-Net, it was compared with other state-of-the-art methods. Experimental results show that the proposed method outperforms the state-of-the-art methods in terms of F1 score (0.9630, 0.9377, and 0.7751) and kappa (0.9581, 0.9351, and 0.7646) on the three test datasets, suggesting its potential for various remote sensing applications. CCBYTRU
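    The abstract above names cosine similarity as the feature-similarity measurement but does not describe the attention module itself. The sketch below shows only the standard cosine similarity between two feature vectors, such as the features at the same pixel in two acquisition dates. The function names `cosine_similarity` and `change_score`, and the idea of using one minus the similarity as a change score, are illustrative assumptions rather than the LSS-Net design.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float], eps: float = 1e-8) -> float:
    """Standard cosine similarity; eps guards against division by zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / max(norm_a * norm_b, eps)


def change_score(before: Sequence[float], after: Sequence[float]) -> float:
    """Hypothetical change score: low similarity is read as likely change."""
    return 1.0 - cosine_similarity(before, after)


# Toy feature vectors for one pixel location at two dates.
unchanged = change_score([3.0, 4.0], [3.0, 4.0])
changed = change_score([1.0, 0.0], [0.0, 1.0])
print(unchanged, changed)
```

    Because cosine similarity ignores vector magnitude, a score like this responds to changes in feature direction rather than overall brightness, which is one plausible reason to prefer it over a raw pixel difference.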

    Brachygnathia Inferior in Cloned Dogs Is Possibly Correlated with Variants of Wnt Signaling Pathway Initiators

    Abnormalities in animals cloned via somatic cell nuclear transfer (SCNT) have been reported. In this study, to produce bomb-sniffing dogs, we successfully cloned four healthy dogs through SCNT using the same donor genome from the skin of an old male German shepherd. Veterinary diagnosis (X-ray/3D-CT imaging) revealed that two cloned dogs showed normal phenotypes, whereas the others showed abnormal shortening of the mandible (brachygnathia inferior) at 1 month after birth, even though they were cloned under the same conditions except for the oocyte source. Therefore, we aimed to determine the genetic cause of brachygnathia inferior in these cloned dogs. To determine the genetic defects related to brachygnathia inferior, we performed karyotyping and whole-genome sequencing (WGS) to identify small genetic alterations in the genome, such as single-nucleotide variations or frameshifts. No numerical chromosomal abnormalities were found in any of the cloned dogs. However, WGS analysis revealed variants of Wnt signaling pathway initiators (WNT5B, DVL2, DACT1, ARRB2, FZD4/8) and cadherin (CDH11, CDH1like) in cloned dogs with brachygnathia inferior. In conclusion, this study proposes that brachygnathia inferior in cloned dogs may be associated with variants in initiators and/or regulators of the Wnt/cadherin signaling pathway.