Comparison for Improvements of Singing Voice Detection System Based on Vocal Separation
Singing voice detection is the task of identifying which frames of a recording contain the singer's voice. It is one of the main components of music information retrieval (MIR) and is applicable to melody extraction, artist recognition, and music discovery in popular music. Although several methods have been proposed, a more robust and complete system is still needed to improve detection performance. In this paper, our motivation is to provide an extensive comparison of the different stages of singing voice detection. Based on this analysis, a novel method is proposed to build a more efficient singing voice detection system. The proposed system has three main parts. The first is a pre-processing stage of singing voice separation that extracts the vocal from the accompaniment; several singing voice separation methods were compared to select the best one for integration into the detection system. The second is a deep-neural-network-based classifier that labels the given frames; different deep models for classification were also compared. The last is a post-processing stage that filters out anomalous frames in the classifier's predictions; a median filter and a Hidden Markov Model (HMM) based filter were compared as post-processing. Through this step-by-step module extension, the different methods were compared and analyzed. Finally, classification performance on two public datasets indicates that the proposed approach based on the Long-term Recurrent Convolutional Networks (LRCN) model is a promising alternative.
Comment: 15 pages
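The post-processing stage described above can be illustrated with a minimal sketch. This is not the authors' implementation; it assumes binary per-frame vocal/non-vocal predictions and shows only the median-filter variant, with a hypothetical helper name `median_filter_postprocess`:

```python
import numpy as np

def median_filter_postprocess(frame_preds, kernel_size=5):
    """Smooth binary per-frame vocal/non-vocal predictions by replacing
    each label with the median of a sliding window, which removes
    isolated anomaly frames in the classifier's output."""
    preds = np.asarray(frame_preds)
    half = kernel_size // 2
    # Pad with edge values so the window is full at both boundaries.
    padded = np.pad(preds, half, mode="edge")
    smoothed = np.empty_like(preds)
    for i in range(len(preds)):
        window = padded[i:i + kernel_size]
        smoothed[i] = int(np.median(window))
    return smoothed

# A single spurious "vocal" frame inside a non-vocal run is filtered out.
raw = [0, 0, 1, 0, 0, 1, 1, 1, 1, 0]
print(median_filter_postprocess(raw, kernel_size=3))
```

The HMM-based alternative compared in the paper would instead decode the most likely label sequence given transition probabilities, trading the fixed window for learned temporal dynamics.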
Music Artist Classification with WaveNet Classifier for Raw Waveform Audio Data
Models for music artist classification have usually operated in the frequency domain, where the input audio samples are first processed by a spectral transformation. The WaveNet architecture was originally designed for speech and music generation. In this paper, we propose an end-to-end architecture in the time domain for artist classification. A WaveNet classifier is introduced that models features directly from the raw audio waveform: the WaveNet takes the waveform as input, and several subsequent downsampling layers discriminate which artist the input belongs to. In addition, the proposed method is applied to singer identification. The best-performing model obtains an average F1 score of 0.854 on the Artist20 benchmark dataset, a significant improvement over related work. To show the effectiveness of the feature learning in the proposed method, the bottleneck layer of the model is visualized.
Comment: 12 pages
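The core WaveNet building block the abstract refers to is the dilated causal convolution, which lets a stack of a few layers see a long stretch of raw waveform. A minimal sketch, not the paper's model (the function names and the toy dilation schedule are illustrative assumptions):

```python
import numpy as np

def dilated_causal_conv(x, weights, dilation):
    """1-D causal convolution with the given dilation: each output
    sample depends only on current and past input samples, as in
    WaveNet. `weights` holds the kernel taps, newest sample first."""
    k = len(weights)
    pad = (k - 1) * dilation
    xp = np.concatenate([np.zeros(pad), np.asarray(x, dtype=float)])
    return np.array([
        sum(weights[j] * xp[i + pad - j * dilation] for j in range(k))
        for i in range(len(x))
    ])

def receptive_field(kernel_size, dilations):
    """Receptive field (in samples) of a stack of dilated convolutions."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)

# Doubling dilations give exponential receptive-field growth:
# five kernel-2 layers already cover 32 raw-audio samples.
print(receptive_field(2, [1, 2, 4, 8, 16]))
```

In a classifier such as the one described, downsampling layers after this stack would reduce the time axis until a final layer predicts the artist label.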
Analysis of HER2 Gene Amplification and Certain Prognostic Factors in Breast Cancer
Objective: To analyze HER2 gene amplification and certain prognostic factors in breast cancer. Method: HER2 gene amplification and protein expression of the human epidermal growth factor receptor were analyzed in 100 breast cancer tissues collected in the hospital from January 2020 to December 2021, detected by the FISH and IHC methods, and selected prognostic factors for breast cancer were analyzed. Result: HER-2 protein expression was 0 in 8 cases of breast cancer, (1+) in 11 cases, (2+) in 49 cases, and (3+) in 32 cases. The HER2 gene was amplified in 49 cases, of which 23 showed clustered red signals and 26 showed dotted red signals; the HER-2 gene was not amplified in 51 cases. The detection results of the FISH and IHC methods did not differ significantly (P>0.05). ER, PR, and polysomy of chromosome 17 are prognostic factors associated with HER2 gene amplification in certain breast cancers (P<0.05). Conclusion: Analyzing HER2 gene amplification in breast cancer and selecting the FISH and IHC detection methods in a targeted manner can improve the therapeutic effect and prognosis, which deserves clinical attention.
Mechanical deformation mechanism and verification of sections at junctions of light and dark tunnel in a mountain area
Projects involving junctions of light and dark tunnel sections in mountainous areas are complex engineering problems that combine the tunnel structure, the slope rock-soil mass, and protection works. Such junctions are subject to complex and changeable loads, and the stress and deformation of the junction vary under different conditions, which is a major source of inconvenience for construction and monitoring operations. In this paper, according to the load conditions at a junction of light and dark tunnel sections, we divide the junction hole into thrust, compression, and combined thrust-compression types. The three types of structure were simulated by numerical analysis, and we explored the structural deformation and stress of these tunnel types under different conditions. Thus, for any construction process, the mechanical deformation mechanism and the weak points in the structure should be worked out. Based on the weak parts, monitoring points were installed and four field sites were chosen for monitoring. The monitoring results show that the actual deformation, stress, and structural failure locations are basically consistent with the numerical simulation results. The deformation mechanism obtained for light and dark tunnel junctions can provide a basis for selecting treatment measures and controlling structural deformation. Furthermore, the results can also serve as a reference for the design, construction, and site monitoring of similar engineering projects.
YOLObile: Real-Time Object Detection on Mobile Devices via Compression-Compilation Co-Design
The rapid development and wide use of object detection techniques have drawn attention to both the accuracy and the speed of object detectors. However, current state-of-the-art object detection works are either accuracy-oriented, using a large model at the cost of high latency, or speed-oriented, using a lightweight model but sacrificing accuracy. In this work, we propose the YOLObile framework, real-time object detection on mobile devices via compression-compilation co-design. A novel block-punched pruning scheme is proposed that applies to any kernel size. To improve computational efficiency on mobile devices, a GPU-CPU collaborative scheme is adopted along with advanced compiler-assisted optimizations. Experimental results indicate that our pruning scheme achieves a 14x compression rate on YOLOv4 with 49.0 mAP. Under our YOLObile framework, we achieve 17 FPS inference speed using the GPU on a Samsung Galaxy S20. By incorporating our proposed GPU-CPU collaborative scheme, the inference speed increases to 19.1 FPS, outperforming the original YOLOv4 by a 5x speedup. Source code is at:
\url{https://github.com/nightsnack/YOLObile}
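The block-punched pruning idea can be sketched in a few lines. This is a simplified illustration, not the YOLObile implementation: it assumes a 2-D weight matrix, chooses the punched positions by average magnitude across blocks, and the function name `block_punched_prune` is hypothetical:

```python
import numpy as np

def block_punched_prune(weight, block_shape=(4, 4), prune_ratio=0.5):
    """Block-punched pruning sketch: divide the weight matrix into
    blocks and zero the SAME set of intra-block positions in every
    block (chosen by average magnitude), so all blocks share one
    sparsity pattern and stay friendly to parallel mobile kernels."""
    bh, bw = block_shape
    h, w = weight.shape
    assert h % bh == 0 and w % bw == 0
    # Shape: (n_block_rows, n_block_cols, bh, bw)
    blocks = weight.reshape(h // bh, bh, w // bw, bw).transpose(0, 2, 1, 3)
    # Importance of each intra-block position, averaged over all blocks.
    importance = np.abs(blocks).mean(axis=(0, 1))
    n_prune = int(prune_ratio * bh * bw)
    flat_idx = np.argsort(importance, axis=None)[:n_prune]
    mask = np.ones(bh * bw)
    mask[flat_idx] = 0.0
    mask = mask.reshape(bh, bw)
    pruned = blocks * mask  # identical punched positions in every block
    return pruned.transpose(0, 2, 1, 3).reshape(h, w)

w = np.arange(64, dtype=float).reshape(8, 8)
pw = block_punched_prune(w, (4, 4), 0.5)
print(np.count_nonzero(pw))  # half of the 64 weights remain
```

The shared per-block pattern is what distinguishes this from unstructured pruning: the compiler can generate one dense inner loop per block instead of handling irregular sparsity.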
Contrastive Latent Space Reconstruction Learning for Audio-Text Retrieval
Cross-modal retrieval (CMR) has been widely applied in domains such as multimedia search engines and recommendation systems. Most existing CMR methods focus on image-to-text retrieval, whereas audio-to-text retrieval, a less explored domain, poses a great challenge due to the difficulty of uncovering discriminative features from audio clips and texts. Existing studies are restricted in two ways: 1) Most researchers utilize contrastive learning to construct a common subspace where similarities among data can be measured; however, they consider only the cross-modal transformation, neglecting intra-modal separability, and the temperature parameter is not adaptively adjusted with semantic guidance, which degrades performance. 2) These methods do not take latent representation reconstruction into account, which is essential for semantic alignment. This paper introduces a novel audio-text oriented CMR approach, termed Contrastive Latent Space Reconstruction Learning (CLSR). CLSR improves contrastive representation learning by taking intra-modal separability into account and adopting an adaptive temperature control strategy. Moreover, latent representation reconstruction modules are embedded into the CMR framework, which improves modal interaction. Experiments comparing CLSR with several state-of-the-art methods on two audio-text datasets validate its superiority.
Comment: Accepted by the 35th IEEE International Conference on Tools with Artificial Intelligence (ICTAI 2023)
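The contrastive objective that CLSR builds on can be sketched as a symmetric InfoNCE loss over paired audio/text embeddings. This is a baseline illustration only, not CLSR itself (it omits the intra-modal term, the adaptive temperature, and the reconstruction modules the abstract describes; the fixed temperature value is an assumption):

```python
import numpy as np

def info_nce(audio, text, temperature=0.07):
    """Symmetric InfoNCE over paired embeddings: matching audio/text
    pairs sit on the diagonal of the cosine-similarity matrix and are
    pulled together; mismatched (off-diagonal) pairs are pushed apart."""
    a = audio / np.linalg.norm(audio, axis=1, keepdims=True)
    t = text / np.linalg.norm(text, axis=1, keepdims=True)
    logits = a @ t.T / temperature
    n = len(a)

    def ce(m):
        # Cross-entropy with the diagonal as the target class.
        m = m - m.max(axis=1, keepdims=True)
        logp = m - np.log(np.exp(m).sum(axis=1, keepdims=True))
        return -logp[np.arange(n), np.arange(n)].mean()

    # Average the audio-to-text and text-to-audio directions.
    return 0.5 * (ce(logits) + ce(logits.T))

# Correctly aligned pairs should give a lower loss than shuffled pairs.
rng = np.random.default_rng(0)
e = rng.normal(size=(8, 16))
aligned = info_nce(e, e)
shuffled = info_nce(e, e[::-1])
```

CLSR's adaptive temperature would replace the fixed `temperature` with a value adjusted by semantic guidance, and its intra-modal term would add analogous contrastive losses within each modality.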
Multimodal Wearable Intelligence for Dementia Care in Healthcare 4.0: A Survey
As a new revolution in ubiquitous computing and the Internet of Things, multimodal wearable intelligence is rapidly becoming a new research topic in both academic and industrial fields. Owing to the rapid spread of wearable and mobile devices, this technique is evolving healthcare from traditional hub-based systems to more personalised healthcare systems. This trend is well aligned with the recent Healthcare 4.0, a continuous process of transforming the entire healthcare value chain to be preventive, precise, predictive, and personalised, with significant benefits for elder care. However, applying multimodal wearable intelligence to elderly care, such as for people with dementia, is significantly challenging given many issues, such as the shortage of cost-effective wearable sensors, the heterogeneity of connected wearable devices, and the high demand for interoperability. Focusing on these challenges, this paper gives a systematic review of advanced multimodal wearable intelligence technologies for dementia care in Healthcare 4.0. A framework is proposed for reviewing the current research on wearable intelligence, covering key enabling technologies, major applications, and successful case studies in dementia care, and the paper finally points out future research trends and challenges in Healthcare 4.0.