321 research outputs found
Speech processing with deep learning for voice-based respiratory diagnosis : a thesis presented in partial fulfilment of the requirements for the degree of Doctor of Philosophy in Computer Science at Massey University, Albany, New Zealand
Voice-based respiratory diagnosis research aims to automatically screen and diagnose respiratory-related symptoms (e.g., smoking status, COVID-19 infection) from human-generated sounds (e.g., breath, cough, speech). It has the potential to serve as an objective, simple, reliable, and less time-consuming alternative to traditional biomedical diagnosis methods. In this thesis, we conduct a comprehensive literature review and propose three novel deep learning methods to enrich voice-based respiratory diagnosis research and improve its performance.
Firstly, we conduct a comprehensive investigation of the effects of voice features on the detection of smoking status. Secondly, we propose a novel method that combines high-level and low-level acoustic features with deep neural networks for smoking status identification. Thirdly, we investigate various feature extraction/representation methods and propose a SincNet-based CNN method for feature representation to further improve the performance of smoking status identification. To the best of our knowledge, this is the first systematic study to apply speech processing with deep learning to voice-based smoking status identification.
Moreover, we propose a novel transfer learning scheme and a task-driven feature representation method for diagnosing respiratory diseases (e.g., COVID-19) from human-generated sounds. We find that transfer learning methods using VGGish, wav2vec 2.0, and PASE+, as well as our proposed task-driven method Sinc-ResNet, achieve competitive performance compared with other work. The findings of this study provide a new perspective and insights for voice-based respiratory disease diagnosis.
The experimental results demonstrate the effectiveness of our proposed methods and show that they achieve better performance than existing methods.
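As background for the SincNet-based front end mentioned above: a SincNet-style layer learns band-pass FIR filters whose only free parameters are the two cutoff frequencies, realized as a difference of two windowed sinc low-pass filters. A minimal NumPy sketch of such a filter bank (cutoffs, kernel size, and sample rate here are illustrative, not the thesis's configuration):

```python
import numpy as np

def sinc_bandpass(f1, f2, kernel_size=101, fs=16000):
    """Band-pass FIR filter parameterized only by its cutoff frequencies,
    in the style of SincNet front ends (illustrative sketch)."""
    n = np.arange(kernel_size) - (kernel_size - 1) / 2
    # A band-pass response is the difference of two low-pass sinc filters.
    lp = lambda fc: 2 * fc / fs * np.sinc(2 * fc / fs * n)
    h = lp(f2) - lp(f1)
    return h * np.hamming(kernel_size)  # window to reduce spectral ripple

# A small filter bank spanning part of the speech band.
bank = np.stack([sinc_bandpass(f, f + 300) for f in (100, 400, 700)])
```

In a learnable version, `f1` and `f2` would be trainable parameters updated by backpropagation, which is what distinguishes this front end from a fixed filter bank.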
Learning Signed Distance Functions from Noisy 3D Point Clouds via Noise to Noise Mapping
Learning signed distance functions (SDFs) from 3D point clouds is an
important task in 3D computer vision. However, without ground truth signed
distances, point normals or clean point clouds, current methods still struggle
to learn SDFs from noisy point clouds. To overcome this challenge, we
propose to learn SDFs via a noise to noise mapping, which does not require any
clean point cloud or ground truth supervision for training. Our novelty lies in
the noise to noise mapping which can infer a highly accurate SDF of a single
object or scene from its multiple or even single noisy point cloud
observations. Our novel learning scheme is supported by modern LiDAR systems,
which capture multiple noisy observations per second. We achieve this with a
novel loss which enables statistical reasoning on point clouds and maintains
geometric consistency although point clouds are irregular, unordered and have
no point correspondence among noisy observations. Our evaluation under the
widely used benchmarks demonstrates our superiority over the state-of-the-art
methods in surface reconstruction, point cloud denoising and upsampling. Our
code, data, and pre-trained models are available at
https://github.com/mabaorui/Noise2NoiseMapping/
Comment: To appear at ICML 2023.
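The "statistical reasoning on point clouds" described above requires comparing point sets that are irregular, unordered, and without point correspondence. One standard correspondence-free distance used in this setting is the symmetric Chamfer distance; the sketch below illustrates that idea only, and is not the paper's actual loss:

```python
import numpy as np

def chamfer(a, b):
    """Symmetric Chamfer distance between point sets a (N,3) and b (M,3).
    Each point is matched to its nearest neighbor in the other set, so no
    point correspondence between the clouds is needed."""
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)  # (N, M) pairwise distances
    return d.min(axis=1).mean() + d.min(axis=0).mean()
```

Because nearest neighbors are recomputed per query, the distance is invariant to point ordering, which is why such terms suit multiple noisy observations of the same surface.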
Experimental evaluation of mobile phone sensors
Smartphones have become an important part of people's daily lives. Most current smartphones are equipped with a rich set of built-in sensors. Mobile applications such as geo-location-based video annotation and indoor positioning require precise measurements from these sensors. In addition, understanding the sensing performance of a smartphone is helpful for implementing a mobile application that relies on sensor data. This paper presents an experimental evaluation of the key sensors in a state-of-the-art smartphone, the Google Nexus 4. The sensors chosen are the accelerometer, gyroscope, magnetometer, and GPS. Substantial tests have been executed to evaluate each sensor's accuracy, precision, maximum sampling frequency, sampling period jitter, and energy consumption.
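Two of the metrics evaluated above, effective sampling frequency and sampling period jitter, can be computed directly from the timestamps of logged sensor events. A minimal sketch of that computation (the function name and the jitter definition as the standard deviation of the period are our choices, not the paper's):

```python
import numpy as np

def sampling_stats(timestamps_s):
    """Estimate effective sampling frequency (Hz) and period jitter (s)
    from a series of sensor event timestamps in seconds."""
    periods = np.diff(timestamps_s)      # time between consecutive samples
    freq = 1.0 / periods.mean()          # effective sampling frequency
    jitter = periods.std()               # spread of the sampling period
    return freq, jitter
```

Running this on timestamps collected at each requested sensor rate reveals both how close the device gets to its nominal rate and how stable the delivery interval is.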
Institutional mapping and causal analysis of avalanche vulnerable areas based on multi-source data
Avalanches are major natural disasters that seriously threaten national
infrastructure and human life. For a long time, research on avalanche
prediction worldwide has been insufficient: only some basic models and basic
conditions of occurrence exist, and there are no long-time-series,
wide-coverage avalanche prediction products. Based on multi-source remote
sensing data from 7 different bands and of different types, this study combined
existing avalanche occurrence models, field investigation, and statistical data
to analyze the causes of avalanches. A U-net convolutional neural network and
threshold analysis were used to extract the distribution of long-time-series
avalanche-prone areas in two study areas, Heiluogou in Sichuan Province and
along the Zangpo River in Palong, Tibet Autonomous Region. In addition, the
relationship between earthquake magnitude, spatial distribution, and avalanche
occurrence is also analyzed in this study. This study will also continue to
build a prior knowledge base of avalanche occurrence conditions, improve the
prediction accuracy of the two methods, and produce long-time-series
interannual products of avalanche-prone areas in southwest China, including
Sichuan Province, Yunnan Province, and the Tibet Autonomous Region. The
resulting products will provide high-precision avalanche prediction and safety
assurance for engineering construction and mountaineering activities in
southwest China.
Comment: 19 pages, 13 figures
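The threshold analysis mentioned above amounts to flagging pixels whose terrain and snow attributes fall inside empirically chosen ranges. A minimal sketch of that step (the specific variables and threshold values here are illustrative assumptions, not the study's actual criteria):

```python
import numpy as np

def avalanche_prone_mask(slope_deg, snow_index,
                         slope_range=(28.0, 55.0), snow_min=0.4):
    """Flag avalanche-prone pixels: slope within a release-prone range
    and a snow-cover index above a minimum (illustrative thresholds)."""
    lo, hi = slope_range
    return (slope_deg >= lo) & (slope_deg <= hi) & (snow_index >= snow_min)
```

Applied per pixel across a time series of remote sensing scenes, such masks yield the interannual distribution of candidate avalanche-prone areas that a U-net classifier can refine.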
A study of the effects of climate change and human activities on NPP of marsh wetland vegetation in the Yellow River source region between 2000 and 2020
Quantitative assessment of the impacts of climate change and human activities on marsh wetlands is essential for the sustainable development of marsh wetland ecosystems. This study takes the marsh wetland in the Yellow River source region (YRSR) as the research object and applies the method of residual analysis. The potential net primary productivity (NPPp) of marsh wetland vegetation in the YRSR between 2000 and 2020 was simulated with the Zhou Guangsheng model, the actual net primary productivity (NPPa) was obtained from the MOD17A3HGF product, and the difference between them was used to calculate the NPP attributable to human activities. On this basis, the relative contributions of climate change and human activities to the change in NPPa of marsh wetland vegetation were quantitatively evaluated. The results revealed that between 2000 and 2020, NPPa of marsh wetland vegetation increased over 95.76% of the YRSR, with climate-dominated and human-dominated NPP changes occupying 66.29% and 29.47% of the study area, respectively. The Zoige Plateau in the southeast accounted for the majority of the 4.24% of the area where NPPa declined, almost all of which was affected by human activities. The warming and humidifying climate, as well as protective human construction activities, are the important reasons for the increase in NPPa of marsh wetland vegetation in the YRSR. Although climate change remains an important cause of the increase in NPPa, the contribution of human activities to this increase is growing.
Application of the improved dynamical–Statistical–Analog ensemble forecast model for landfalling typhoon precipitation in Fujian province
The forecasting performance of the Dynamical–Statistical–Analog Ensemble Forecast (DSAEF) model for Landfalling Typhoon [or tropical cyclone (TC)] Precipitation (DSAEF_LTP), with new values of two parameters (i.e., similarity region and ensemble method) for landfalling TC precipitation over Fujian Province, is tested in four experiments. Forty-two TCs with precipitation over 100 mm in Fujian Province during 2004–2020 are chosen as experimental samples: thirty are training samples and twelve are independent samples. First, simulation experiments on the training samples are used to determine the best scheme of the DSAEF_LTP model. Then, the forecasting performance of this best scheme is evaluated through forecast experiments. In the forecast experiments, the TSsum (the sum of threat scores for predicting TC accumulated rainfall of ≥250 mm and ≥100 mm) of experiments DSAEF_A, B, C, and D is 0.0974, 0.2615, 0.2496, and 0.4153, respectively. The results show that the DSAEF_LTP model performs best when new values of both the similarity region and the ensemble method are added (DSAEF_D). At the same time, the TSsum of the best-performing numerical weather prediction (NWP) model is only 0.2403, so the improved DSAEF_LTP model shows advantages over the NWP models. Adopting different schemes in different regions is an important way to further improve the forecasting skill of the DSAEF_LTP model.
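The TSsum metric above is built from the threat score (also called the critical success index), a standard forecast-verification measure defined from hits, misses, and false alarms at a rainfall threshold. A minimal sketch (the contingency counts in the usage example are invented for illustration):

```python
def threat_score(hits, misses, false_alarms):
    """Threat score (critical success index) for one rainfall threshold:
    correct events divided by all forecast or observed events."""
    return hits / (hits + misses + false_alarms)

def ts_sum(ts_100, ts_250):
    """TSsum as used for DSAEF_LTP evaluation: the sum of threat scores
    at the >=100 mm and >=250 mm accumulated-rainfall thresholds."""
    return ts_100 + ts_250
```

For example, `threat_score(3, 1, 2)` gives 0.5, and summing the scores at the two thresholds yields the TSsum values compared across the DSAEF_A to DSAEF_D experiments.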
Learning a More Continuous Zero Level Set in Unsigned Distance Fields through Level Set Projection
The latest methods represent shapes with open surfaces using unsigned distance
functions (UDFs). They train neural networks to learn UDFs and reconstruct
surfaces with the gradients around the zero level set of the UDF. However, the
differentiable networks struggle to learn the zero level set, where the UDF
is not differentiable, which leads to large errors on unsigned distances and
gradients around the zero level set, resulting in highly fragmented and
discontinuous surfaces. To resolve this problem, we propose to learn a more
continuous zero level set in UDFs with level set projections. Our insight is to
guide the learning of the zero level set using the remaining non-zero level sets via a
projection procedure. Our idea is inspired by the observation that the
non-zero level sets are much smoother and more continuous than the zero level
set. We pull the non-zero level sets onto the zero level set with gradient
constraints which align gradients over different level sets and correct
unsigned distance errors on the zero level set, leading to a smoother and more
continuous unsigned distance field. We conduct comprehensive experiments in
surface reconstruction for point clouds, real scans or depth maps, and further
explore the performance in unsupervised point cloud upsampling and unsupervised
point normal estimation with the learned UDF, which demonstrate our non-trivial
improvements over the state-of-the-art methods. Code is available at
https://github.com/junshengzhou/LevelSetUDF
Comment: To appear at ICCV 2023.
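The projection of a non-zero level set onto the zero level set relies on the standard pulling operation for distance fields: move a query point against the gradient direction by the predicted distance. A minimal sketch of that operation (generic, not the paper's implementation; `udf` and `grad` stand in for a trained network and its spatial gradient):

```python
import numpy as np

def project_to_zero_level_set(q, udf, grad, step=1.0):
    """Pull a query point q toward the zero level set by moving it along
    the negative normalized gradient by the predicted unsigned distance."""
    g = grad(q)
    g = g / (np.linalg.norm(g) + 1e-12)  # unit direction away from the surface
    return q - step * udf(q) * g
```

For an exact distance field this lands the query on the nearest surface point in one step; for a learned, imperfect UDF the paper's gradient constraints between level sets are what keep such projections consistent.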
Learning Consistency-Aware Unsigned Distance Functions Progressively from Raw Point Clouds
Surface reconstruction for point clouds is an important task in 3D computer
vision. Most of the latest methods resolve this problem by learning signed
distance functions (SDF) from point clouds, which are limited to reconstructing
shapes or scenes with closed surfaces. Some other methods tried to represent
shapes or scenes with open surfaces using unsigned distance functions (UDF)
which are learned from large-scale ground-truth unsigned distances. However,
it is hard for the learned UDF to provide smooth distance fields near the
surface due to the discontinuous character of point clouds. In this paper, we
propose a
novel method to learn consistency-aware unsigned distance functions directly
from raw point clouds. We achieve this by learning to move 3D queries to reach
the surface with a field consistency constraint, which also enables us to
progressively estimate a more accurate surface. Specifically, we train a neural
network to gradually infer the relationship between 3D queries and the
approximated surface by searching for the moving target of queries in a dynamic
way, which results in a consistent field around the surface. Meanwhile, we
introduce a polygonization algorithm to extract surfaces directly from the
gradient field of the learned UDF. The experimental results in surface
reconstruction for synthetic and real scan data show significant improvements
over the state-of-the-art under the widely used benchmarks.Comment: Accepted by NeurIPS 2022. Project
page:https://junshengzhou.github.io/CAP-UDF.
Code:https://github.com/junshengzhou/CAP-UD
LiDAR-Camera Panoptic Segmentation via Geometry-Consistent and Semantic-Aware Alignment
3D panoptic segmentation is a challenging perception task that requires both
semantic segmentation and instance segmentation. In this task, we notice that
images could provide rich texture, color, and discriminative information, which
can complement LiDAR data for evident performance improvement, but their fusion
remains a challenging problem. To this end, we propose LCPS, the first
LiDAR-Camera Panoptic Segmentation network. In our approach, we conduct
LiDAR-Camera fusion in three stages: 1) an Asynchronous Compensation Pixel
Alignment (ACPA) module that calibrates the coordinate misalignment caused by
asynchronous problems between sensors; 2) a Semantic-Aware Region Alignment
(SARA) module that extends the one-to-one point-pixel mapping to one-to-many
semantic relations; 3) a Point-to-Voxel feature Propagation (PVP) module that
integrates both geometric and semantic fusion information for the entire point
cloud. Our fusion strategy improves PQ by about 6.9% over the LiDAR-only
baseline on the NuScenes dataset. Extensive quantitative and qualitative
experiments further demonstrate the effectiveness of our novel framework. The
code will be released at https://github.com/zhangzw12319/lcps.git
Comment: Accepted as an ICCV 2023 paper
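The point-pixel mapping that the SARA module extends starts from the standard geometric step of LiDAR-camera fusion: transform LiDAR points into the camera frame and project them through the intrinsic matrix. A minimal sketch of that projection (matrix names and layout are generic conventions, not LCPS code):

```python
import numpy as np

def project_points_to_image(points_lidar, T_cam_from_lidar, K):
    """Project LiDAR points (N,3) into image pixel coordinates.
    T_cam_from_lidar: 4x4 extrinsic transform; K: 3x3 camera intrinsics."""
    # Rigid transform into the camera frame.
    pts = points_lidar @ T_cam_from_lidar[:3, :3].T + T_cam_from_lidar[:3, 3]
    uvw = pts @ K.T                      # pinhole projection (homogeneous)
    valid = uvw[:, 2] > 0                # keep only points in front of the camera
    uv = uvw[:, :2] / uvw[:, 2:3]        # perspective divide to pixels
    return uv, valid
```

Sensor asynchrony makes `T_cam_from_lidar` slightly wrong at capture time, which is the misalignment the ACPA module is described as compensating before this one-to-one mapping is broadened to one-to-many semantic relations.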