353 research outputs found
Real-Time Audio-to-Score Alignment of Music Performances Containing Errors and Arbitrary Repeats and Skips
This paper discusses real-time alignment of audio signals of music
performance to the corresponding score (a.k.a. score following) which can
handle tempo changes, errors and arbitrary repeats and/or skips (repeats/skips)
in performances. This type of score following is particularly useful in
automatic accompaniment for practices and rehearsals, where errors and
repeats/skips are often made. Simple extensions of the algorithms previously
proposed in the literature are not applicable in these situations for scores of
practical length due to the problem of large computational complexity. To cope
with this problem, we present two hidden Markov models of monophonic
performance with errors and arbitrary repeats/skips, and derive efficient
score-following algorithms with an assumption that the prior probability
distributions of score positions before and after repeats/skips are independent
from each other. We confirmed real-time operation of the algorithms with music
scores of practical length (around 10000 notes) on a modern laptop and their
tracking ability to the input performance within 0.7 s on average after
repeats/skips in clarinet performance data. Further improvements and extension
for polyphonic signals are also discussed.
Comment: 12 pages, 8 figures, version accepted in IEEE/ACM Transactions on
Audio, Speech, and Language Processing
Sampling-Frequency-Independent Universal Sound Separation
This paper proposes a universal sound separation (USS) method capable of
handling untrained sampling frequencies (SFs). The USS aims at separating
arbitrary sources of different types and can be the key technique to realize a
source separator that can be universally used as a preprocessor for any
downstream tasks. To realize a universal source separator, there are two
essential properties: universalities with respect to source types and recording
conditions. The former property has been studied in the USS literature, which
has greatly increased the number of source types that can be handled by a
single neural network. However, the latter property (e.g., SF) has received
less attention despite its necessity. Since the SF varies widely depending on
the downstream tasks, the universal source separator must handle a wide variety
of SFs. In this paper, to encompass the two properties, we propose an
SF-independent (SFI) extension of a computationally efficient USS network,
SuDoRM-RF. The proposed network uses our previously proposed SFI convolutional
layers, which can handle various SFs by generating convolutional kernels in
accordance with an input SF. Experiments show that signal resampling can
degrade the USS performance and the proposed method works more consistently
than signal-resampling-based methods for various SFs.
Comment: Submitted to ICASSP202
Time-Domain Audio Source Separation Based on Wave-U-Net Combined with Discrete Wavelet Transform
We propose a time-domain audio source separation method using down-sampling
(DS) and up-sampling (US) layers based on a discrete wavelet transform (DWT).
The proposed method is based on one of the state-of-the-art deep neural
networks, Wave-U-Net, which successively down-samples and up-samples feature
maps. We find that this architecture resembles that of multiresolution
analysis, and reveal that the DS layers of Wave-U-Net cause aliasing and may
discard information useful for the separation. Although the effects of these
problems may be reduced by training, to achieve a more reliable source
separation method, we should design DS layers capable of overcoming the
problems. With this belief, focusing on the fact that the DWT has an
anti-aliasing filter and the perfect reconstruction property, we design the
proposed layers. Experiments on music source separation show the efficacy of
the proposed method and the importance of simultaneously considering the
anti-aliasing filters and the perfect reconstruction property.
Comment: 5 pages, to appear in IEEE International Conference on Acoustics,
Speech, and Signal Processing 2020 (ICASSP 2020)
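The perfect reconstruction property mentioned in this abstract can be illustrated with a one-level Haar DWT. This is a minimal sketch, not the authors' Wave-U-Net layer: the Haar wavelet, the function names, and the test signal are all assumptions chosen for illustration. It shows that keeping both the approximation and detail subbands halves the time resolution without losing information, whereas plain decimation keeps only every other sample.

```python
import math

def haar_dwt(signal):
    """One level of the Haar DWT: returns (approximation, detail), each half as long."""
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    s = 1.0 / math.sqrt(2.0)
    # Low-pass (average) and high-pass (difference) branches, each down-sampled by 2.
    approx = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal), 2)]
    return approx, detail

def haar_idwt(approx, detail):
    """Inverse of haar_dwt: interleaves the reconstructed even and odd samples."""
    s = 1.0 / math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) * s)
        out.append((a - d) * s)
    return out

# Hypothetical test signal (not from the paper).
x = [1.0, 4.0, -2.0, 3.0, 0.5, 0.5, 7.0, -1.0]
approx, detail = haar_dwt(x)
reconstructed = haar_idwt(approx, detail)
max_error = max(abs(a - b) for a, b in zip(x, reconstructed))

# Plain decimation, by contrast, discards the odd-indexed samples for good.
decimated = x[::2]
```

Here `max_error` stays at floating-point rounding level, while `decimated` alone cannot recover the samples it dropped.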
Continuous negative extrathoracic pressure combined with high-frequency oscillation improves oxygenation with less impact on blood pressure than high-frequency oscillation alone in a rabbit model of surfactant depletion
Background: Negative air pressure ventilation has been used to maintain adequate functional residual capacity in patients with chronic muscular disease and to decrease transpulmonary pressure and improve cardiac output during right heart surgery. High-frequency oscillation (HFO) exerts beneficial effects on gas exchange in neonates with acute respiratory failure. We examined whether continuous negative extrathoracic pressure (CNEP) combined with HFO would be effective for treating acute respiratory failure in an animal model.
Methods: The effects of CNEP combined with HFO on pulmonary gas exchange and circulation were examined in a surfactant-depleted rabbit model. After induction of severe lung injury by repeated saline lung lavage, 18 adult white Japanese rabbits were randomly assigned to 3 groups: Group 1, CNEP (extrathoracic negative pressure, -10 cmH2O) with HFO (mean airway pressure (MAP), 10 cmH2O); Group 2, HFO (MAP, 10 cmH2O); and Group 3, HFO (MAP, 15 cmH2O). Physiological and blood gas data were compared among groups using analysis of variance.
Results: Group 1 showed significantly higher oxygenation than Group 2, and the same oxygenation with significantly higher mean blood pressure compared to Group 3.
Conclusion: Adequate CNEP combined with HFO improves oxygenation with less impact on blood pressure than high-frequency oscillation alone in an animal model of respiratory failure.
Effects of heliox as carrier gas on ventilation and oxygenation in an animal model of piston-type HFOV: a crossover experimental study
Objective: This study aimed to compare gas exchange with heliox and oxygen-enriched air during piston-type high-frequency oscillatory ventilation (HFOV). We hypothesized that helium gas would improve both carbon dioxide elimination and arterial oxygenation during piston-type HFOV.
Method: Five rabbits were prepared and ventilated by piston-type HFOV with carrier 50% helium/oxygen (heliox50) or 50% oxygen/nitrogen (nitrogen50) gas mixture in a crossover study. Changing the gas mixture from nitrogen50 to heliox50 and back was performed five times per animal with constant ventilation parameters. Arterial blood gas, vital function and respiratory test indices were recorded.
Results: Compared with nitrogen50, heliox50 did not change PaCO2 when stroke volume remained constant, but significantly reduced PaCO2 after alignment of amplitude pressure. No significant changes in PaO2 were seen despite significant decreases in mean airway pressure with heliox50 compared with nitrogen50.
Conclusion: This study demonstrated that heliox enhances CO2 elimination and maintains oxygenation at the same amplitude but with lower airway pressure compared to air/O2 mix gas during piston-type HFOV.
Physics-informed convolutional neural network with bicubic spline interpolation for sound field estimation
A sound field estimation method based on a physics-informed convolutional
neural network (PICNN) using spline interpolation is proposed. Most of the
sound field estimation methods are based on wavefunction expansion, making the
estimated function satisfy the Helmholtz equation. However, these methods rely
only on physical properties; thus, they suffer from a significant deterioration
of accuracy when the number of measurements is small. Recent learning-based
methods based on neural networks have advantages in estimating from sparse
measurements when training data are available. However, since physical
properties are not taken into consideration, the estimated function can be a
physically infeasible solution. We propose the application of PICNN to the
sound field estimation problem by using a loss function that penalizes
deviation from the Helmholtz equation. Since the output of CNN is a spatially
discretized pressure distribution, it is difficult to directly evaluate the
Helmholtz-equation loss function. Therefore, we incorporate bicubic spline
interpolation in the PICNN framework. Experimental results indicated that
accurate and physically feasible estimation from sparse measurements can be
achieved with the proposed method.
Comment: Accepted to International Workshop on Acoustic Signal Enhancement
(IWAENC) 202
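A loss that penalizes deviation from the Helmholtz equation can be sketched on a discretized pressure field as follows. This is only an illustration under stated assumptions: it swaps in a five-point finite-difference Laplacian where the paper uses bicubic spline interpolation, and the function name, grid spacing `h`, wavenumber `k`, and both test fields are hypothetical.

```python
import math

def helmholtz_residual_loss(u, h, k):
    """Mean squared residual of (laplacian(u) + k^2 * u) over interior grid points.

    u is a 2D list of real pressure values on a uniform grid with spacing h.
    The Laplacian uses a five-point finite-difference stencil (an assumption;
    the paper evaluates derivatives through bicubic spline interpolation).
    """
    rows, cols = len(u), len(u[0])
    total = 0.0
    count = 0
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            lap = (u[i + 1][j] + u[i - 1][j] + u[i][j + 1] + u[i][j - 1]
                   - 4.0 * u[i][j]) / (h * h)
            residual = lap + k * k * u[i][j]
            total += residual * residual
            count += 1
    return total / count

# Hypothetical settings: 21 x 21 grid, 1 cm spacing, wavenumber 5 rad/m.
n, h, k, theta = 21, 0.01, 5.0, 0.3

# A plane wave with wavenumber k satisfies the Helmholtz equation.
plane_wave = [[math.cos(k * (i * h * math.cos(theta) + j * h * math.sin(theta)))
               for j in range(n)] for i in range(n)]

# A wave with wavenumber 2k does not satisfy the equation for wavenumber k.
mismatched = [[math.cos(2.0 * k * i * h) for j in range(n)] for i in range(n)]

physical_loss = helmholtz_residual_loss(plane_wave, h, k)
unphysical_loss = helmholtz_residual_loss(mismatched, h, k)
```

The plane wave yields a loss near zero (limited by discretization error), while the mismatched field is penalized heavily, which is the behavior such a loss term relies on.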
Head-Related Transfer Function Interpolation from Spatially Sparse Measurements Using Autoencoder with Source Position Conditioning
We propose a method of head-related transfer function (HRTF) interpolation
from sparsely measured HRTFs using an autoencoder with source position
conditioning. The proposed method is drawn from an analogy between an HRTF
interpolation method based on regularized linear regression (RLR) and an
autoencoder. Through this analogy, we found the key feature of the RLR-based
method that HRTFs are decomposed into source-position-dependent and
source-position-independent factors. On the basis of this finding, we design
the encoder and decoder so that their weights and biases are generated from
source positions. Furthermore, we introduce an aggregation module that reduces
the dependence of latent variables on source position for obtaining a
source-position-independent representation of each subject. Numerical
experiments show that the proposed method works well for unseen subjects and,
with only one-eighth of the measurements, achieves interpolation performance
comparable to that of the RLR-based method.
Comment: Accepted to International Workshop on Acoustic Signal Enhancement
(IWAENC) 202
Functional Evaluation of Bubble CPAP for Neonates Using a Leak Model
Article. Shinshu Medical Journal (信州医学雑誌) 61(2):65-73 (2013). Journal article
Algorithms of Sampling-Frequency-Independent Layers for Non-integer Strides
In this paper, we propose algorithms for handling non-integer strides in
sampling-frequency-independent (SFI) convolutional and transposed convolutional
layers. The SFI layers have been developed for handling various sampling
frequencies (SFs) by a single neural network. They are replaceable with their
non-SFI counterparts and can be introduced into various network architectures.
However, they could not handle some specific configurations when combined with
non-SFI layers. For example, an SFI extension of Conv-TasNet, a standard audio
source separation model, cannot handle some pairs of trained and target SFs
because the strides of the SFI layers become non-integers. Simple rounding or
signal resampling does not solve this problem and causes significant
performance degradation. To overcome this problem, we propose algorithms for
handling non-integer strides by using windowed sinc interpolation. The proposed
algorithms realize the continuous-time representations of features using the
interpolation and enable us to sample instants with the desired stride.
Experimental results on music source separation showed that the proposed
algorithms outperformed the rounding- and signal-resampling-based methods at
SFs lower than the trained SF.
Comment: 5 pages, 3 figures, accepted for European Signal Processing
Conference 2023 (EUSIPCO 2023)
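The idea of sampling a feature sequence at a non-integer stride through windowed sinc interpolation can be sketched as below. This is a minimal illustration, not the proposed algorithms: the Hann window, the kernel half-width of 8 samples, the function names, and the test signal are all assumptions.

```python
import math

def windowed_sinc(t, half_width):
    """Hann-windowed sinc kernel; zero for |t| >= half_width."""
    if abs(t) >= half_width:
        return 0.0
    sinc = 1.0 if t == 0.0 else math.sin(math.pi * t) / (math.pi * t)
    window = 0.5 * (1.0 + math.cos(math.pi * t / half_width))
    return sinc * window

def sample_with_stride(x, stride, half_width=8):
    """Evaluate the interpolated sequence at positions 0, stride, 2*stride, ..."""
    n_out = int(math.floor((len(x) - 1) / stride)) + 1
    out = []
    for m in range(n_out):
        pos = m * stride
        lo = max(0, math.ceil(pos - half_width))
        hi = min(len(x) - 1, math.floor(pos + half_width))
        # Continuous-time value at pos as a kernel-weighted sum of nearby samples.
        out.append(sum(x[n] * windowed_sinc(pos - n, half_width)
                       for n in range(lo, hi + 1)))
    return out

# Hypothetical low-frequency test signal (0.05 cycles per sample).
freq = 0.05
x = [math.sin(2.0 * math.pi * freq * n) for n in range(32)]

# A stride of 1.5 samples cannot be realized by ordinary integer striding.
y = sample_with_stride(x, stride=1.5)
true_mid = math.sin(2.0 * math.pi * freq * 13.5)  # y[9] sits at position 13.5
```

Outputs that land on integer positions reproduce the original samples, and outputs between samples approximate the underlying continuous-time signal; accuracy degrades near the sequence edges, where the kernel is truncated.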