    Real-Time Audio-to-Score Alignment of Music Performances Containing Errors and Arbitrary Repeats and Skips

    This paper discusses real-time alignment of audio signals of music performance to the corresponding score (a.k.a. score following) which can handle tempo changes, errors and arbitrary repeats and/or skips (repeats/skips) in performances. This type of score following is particularly useful in automatic accompaniment for practices and rehearsals, where errors and repeats/skips are often made. Simple extensions of the algorithms previously proposed in the literature are not applicable in these situations for scores of practical length because of their large computational complexity. To cope with this problem, we present two hidden Markov models of monophonic performance with errors and arbitrary repeats/skips, and derive efficient score-following algorithms under the assumption that the prior probability distributions of score positions before and after repeats/skips are independent of each other. We confirmed real-time operation of the algorithms with music scores of practical length (around 10000 notes) on a modern laptop and their ability to track the input performance within 0.7 s on average after repeats/skips in clarinet performance data. Further improvements and extension for polyphonic signals are also discussed.
    Comment: 12 pages, 8 figures, version accepted in IEEE/ACM Transactions on Audio, Speech, and Language Processing

    Sampling-Frequency-Independent Universal Sound Separation

    This paper proposes a universal sound separation (USS) method capable of handling untrained sampling frequencies (SFs). USS aims at separating arbitrary sources of different types and can be the key technique for realizing a source separator that can be used universally as a preprocessor for any downstream task. A universal source separator needs two essential properties: universality with respect to source types and universality with respect to recording conditions. The former property has been studied in the USS literature, which has greatly increased the number of source types that can be handled by a single neural network. However, the latter property (e.g., SF) has received less attention despite its necessity. Since the SF varies widely depending on the downstream task, the universal source separator must handle a wide variety of SFs. In this paper, to encompass the two properties, we propose an SF-independent (SFI) extension of a computationally efficient USS network, SuDoRM-RF. The proposed network uses our previously proposed SFI convolutional layers, which can handle various SFs by generating convolutional kernels in accordance with an input SF. Experiments show that signal resampling can degrade the USS performance and that the proposed method works more consistently than signal-resampling-based methods for various SFs.
    Comment: Submitted to ICASSP202

    Time-Domain Audio Source Separation Based on Wave-U-Net Combined with Discrete Wavelet Transform

    We propose a time-domain audio source separation method using down-sampling (DS) and up-sampling (US) layers based on a discrete wavelet transform (DWT). The proposed method is based on one of the state-of-the-art deep neural networks, Wave-U-Net, which successively down-samples and up-samples feature maps. We find that this architecture resembles that of multiresolution analysis, and reveal that the DS layers of Wave-U-Net cause aliasing and may discard information useful for the separation. Although the effects of these problems may be reduced by training, to achieve a more reliable source separation method, we should design DS layers capable of overcoming the problems. With this belief, focusing on the fact that the DWT has an anti-aliasing filter and the perfect reconstruction property, we design the proposed layers. Experiments on music source separation show the efficacy of the proposed method and the importance of simultaneously considering the anti-aliasing filters and the perfect reconstruction property.
    Comment: 5 pages, to appear in IEEE International Conference on Acoustics, Speech, and Signal Processing 2020 (ICASSP 2020)
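
    The two DWT properties named in this abstract can be illustrated with a minimal sketch. It is not the authors' layer: it assumes a one-level Haar wavelet (chosen here only because it is the simplest DWT) applied to a plain 1-D signal rather than to network feature maps, and the function names are hypothetical. It contrasts naive decimation, which aliases a Nyquist-rate tone into a constant, with the DWT, which filters before down-sampling and can be inverted exactly.

```python
import numpy as np

def haar_dwt(x):
    """One level of the orthonormal Haar DWT (x must have even length)."""
    x = np.asarray(x, dtype=float)
    even, odd = x[0::2], x[1::2]
    approx = (even + odd) / np.sqrt(2.0)  # low-pass branch, down-sampled by 2
    detail = (even - odd) / np.sqrt(2.0)  # high-pass branch, down-sampled by 2
    return approx, detail

def haar_idwt(approx, detail):
    """Inverse of haar_dwt: interleave the recovered even and odd samples."""
    approx = np.asarray(approx, dtype=float)
    detail = np.asarray(detail, dtype=float)
    x = np.empty(2 * len(approx))
    x[0::2] = (approx + detail) / np.sqrt(2.0)
    x[1::2] = (approx - detail) / np.sqrt(2.0)
    return x

def decimate(x):
    """Naive down-sampling by 2 with no anti-aliasing filter."""
    return np.asarray(x, dtype=float)[0::2]

# A tone at the Nyquist frequency: +1, -1, +1, -1, ...
x = np.array([1.0, -1.0] * 4)
approx, detail = haar_dwt(x)

print("decimated:", decimate(x))          # the tone aliases to a constant
print("Haar approx:", approx)             # the low-pass branch suppresses it
print("reconstruction error:", np.max(np.abs(haar_idwt(approx, detail) - x)))
```

    The high-frequency content is not lost in the DWT case: it moves to the detail coefficients, which is what makes exact reconstruction possible.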

    Continuous negative extrathoracic pressure combined with high-frequency oscillation improves oxygenation with less impact on blood pressure than high-frequency oscillation alone in a rabbit model of surfactant depletion

    Background: Negative air pressure ventilation has been used to maintain adequate functional residual capacity in patients with chronic muscular disease and to decrease transpulmonary pressure and improve cardiac output during right heart surgery. High-frequency oscillation (HFO) exerts beneficial effects on gas exchange in neonates with acute respiratory failure. We examined whether continuous negative extrathoracic pressure (CNEP) combined with HFO would be effective for treating acute respiratory failure in an animal model. Methods: The effects of CNEP combined with HFO on pulmonary gas exchange and circulation were examined in a surfactant-depleted rabbit model. After induction of severe lung injury by repeated saline lung lavage, 18 adult white Japanese rabbits were randomly assigned to 3 groups: Group 1, CNEP (extrathoracic negative pressure, -10 cmH₂O) with HFO (mean airway pressure (MAP), 10 cmH₂O); Group 2, HFO (MAP, 10 cmH₂O); and Group 3, HFO (MAP, 15 cmH₂O). Physiological and blood gas data were compared among groups using analysis of variance. Results: Group 1 showed significantly higher oxygenation than Group 2, and the same oxygenation as Group 3 with significantly higher mean blood pressure. Conclusion: Adequate CNEP combined with HFO improves oxygenation with less impact on blood pressure than high-frequency oscillation alone in an animal model of respiratory failure.

    Effects of heliox as carrier gas on ventilation and oxygenation in an animal model of piston-type HFOV: a crossover experimental study

    Objective: This study aimed to compare gas exchange with heliox and oxygen-enriched air during piston-type high-frequency oscillatory ventilation (HFOV). We hypothesized that helium gas would improve both carbon dioxide elimination and arterial oxygenation during piston-type HFOV. Method: Five rabbits were prepared and ventilated by piston-type HFOV with carrier 50% helium/oxygen (heliox50) or 50% oxygen/nitrogen (nitrogen50) gas mixture in a crossover study. Changing the gas mixture from nitrogen50 to heliox50 and back was performed five times per animal with constant ventilation parameters. Arterial blood gas, vital function and respiratory test indices were recorded. Results: Compared with nitrogen50, heliox50 did not change PaCO₂ when stroke volume remained constant, but significantly reduced PaCO₂ after alignment of amplitude pressure. No significant changes in PaO₂ were seen despite significant decreases in mean airway pressure with heliox50 compared with nitrogen50. Conclusion: This study demonstrated that heliox enhances CO₂ elimination and maintains oxygenation at the same amplitude but with lower airway pressure compared to air/O₂ mix gas during piston-type HFOV.

    Physics-informed convolutional neural network with bicubic spline interpolation for sound field estimation

    A sound field estimation method based on a physics-informed convolutional neural network (PICNN) using spline interpolation is proposed. Most of the sound field estimation methods are based on wavefunction expansion, making the estimated function satisfy the Helmholtz equation. However, these methods rely only on physical properties; thus, they suffer from a significant deterioration of accuracy when the number of measurements is small. Recent learning-based methods based on neural networks have advantages in estimating from sparse measurements when training data are available. However, since physical properties are not taken into consideration, the estimated function can be a physically infeasible solution. We propose the application of PICNN to the sound field estimation problem by using a loss function that penalizes deviation from the Helmholtz equation. Since the output of CNN is a spatially discretized pressure distribution, it is difficult to directly evaluate the Helmholtz-equation loss function. Therefore, we incorporate bicubic spline interpolation in the PICNN framework. Experimental results indicated that accurate and physically feasible estimation from sparse measurements can be achieved with the proposed method.
    Comment: Accepted to International Workshop on Acoustic Signal Enhancement (IWAENC) 202
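
    As a reading aid, the kind of loss this abstract describes can be written out as follows. This is a generic sketch, not the formulation from the paper: the symbols, the weighting factor \lambda, and the choice of evaluation points are assumptions introduced here. The first equation is the homogeneous Helmholtz equation for a pressure field u at angular frequency \omega and sound speed c; the second adds the mean squared Helmholtz residual of the interpolated estimate \hat{u} to a data-fitting term.

```latex
% Homogeneous Helmholtz equation, wavenumber k = \omega / c
\left( \nabla^2 + k^2 \right) u(\mathbf{r}) = 0

% Assumed form of a physics-informed loss:
% data term at the M measurement positions \mathbf{r}_m with observed pressures s_m,
% plus the Helmholtz residual at N evaluation points \mathbf{r}_n, weighted by \lambda
\mathcal{L}
  = \frac{1}{M} \sum_{m=1}^{M} \left| \hat{u}(\mathbf{r}_m) - s_m \right|^2
  + \lambda \, \frac{1}{N} \sum_{n=1}^{N}
    \left| \nabla^2 \hat{u}(\mathbf{r}_n) + k^2 \hat{u}(\mathbf{r}_n) \right|^2
```

    The second term requires the Laplacian of \hat{u}, which is why a grid-valued CNN output has to be turned into a smooth function first; a bicubic spline through the grid values is twice differentiable, so the residual can be evaluated analytically.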

    Head-Related Transfer Function Interpolation from Spatially Sparse Measurements Using Autoencoder with Source Position Conditioning

    We propose a method of head-related transfer function (HRTF) interpolation from sparsely measured HRTFs using an autoencoder with source position conditioning. The proposed method is drawn from an analogy between an HRTF interpolation method based on regularized linear regression (RLR) and an autoencoder. Through this analogy, we found the key feature of the RLR-based method: HRTFs are decomposed into source-position-dependent and source-position-independent factors. On the basis of this finding, we design the encoder and decoder so that their weights and biases are generated from source positions. Furthermore, we introduce an aggregation module that reduces the dependence of latent variables on source position to obtain a source-position-independent representation of each subject. Numerical experiments show that the proposed method can work well for unseen subjects and, with only one-eighth of the measurements, achieve an interpolation performance comparable to that of the RLR-based method.
    Comment: Accepted to International Workshop on Acoustic Signal Enhancement (IWAENC) 202

    Functional Evaluation of Bubble CPAP for Neonates Using a Leak Model

    Article. 信州医学雑誌 (Shinshu Medical Journal) 61(2):65-73 (2013). Journal article

    Algorithms of Sampling-Frequency-Independent Layers for Non-integer Strides

    In this paper, we propose algorithms for handling non-integer strides in sampling-frequency-independent (SFI) convolutional and transposed convolutional layers. The SFI layers have been developed for handling various sampling frequencies (SFs) with a single neural network. They can replace their non-SFI counterparts and can be introduced into various network architectures. However, they could not handle some specific configurations when combined with non-SFI layers. For example, an SFI extension of Conv-TasNet, a standard audio source separation model, cannot handle some pairs of trained and target SFs because the strides of the SFI layers become non-integers. This problem cannot be solved by simple rounding or signal resampling without significant performance degradation. To overcome this problem, we propose algorithms for handling non-integer strides by using windowed sinc interpolation. The proposed algorithms realize continuous-time representations of features using the interpolation and enable us to sample instants with the desired stride. Experimental results on music source separation showed that the proposed algorithms outperformed the rounding- and signal-resampling-based methods at SFs lower than the trained SF.
    Comment: 5 pages, 3 figures, accepted for European Signal Processing Conference 2023 (EUSIPCO 2023)
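
    The idea of sampling a discrete sequence at a non-integer stride via windowed sinc interpolation can be sketched as below. This is an illustration under stated assumptions, not the algorithm from the paper: it resamples a plain 1-D signal rather than the features inside a convolutional layer, it assumes a Hann window, and the function name and the `half_width` parameter are hypothetical.

```python
import numpy as np

def windowed_sinc_sample(x, stride, half_width=8):
    """Sample x at positions 0, stride, 2*stride, ... (stride may be non-integer).

    Each output is a weighted sum of the input samples, with weights given by
    a sinc kernel tapered by a Hann window spanning +/- half_width samples.
    """
    x = np.asarray(x, dtype=float)
    n = np.arange(len(x))
    # Sampling instants, kept inside the support of the input sequence.
    positions = np.arange(0.0, len(x) - 1 + 1e-9, stride)
    out = np.empty(len(positions))
    for i, t in enumerate(positions):
        d = t - n  # distance from the sampling instant to every input sample
        window = np.where(np.abs(d) < half_width,
                          0.5 * (1.0 + np.cos(np.pi * d / half_width)),
                          0.0)
        out[i] = np.sum(x * np.sinc(d) * window)  # np.sinc is sin(pi d) / (pi d)
    return positions, out

# A slowly varying sinusoid sampled with a stride of 1.5 input samples.
signal = np.sin(2 * np.pi * 0.05 * np.arange(64))
positions, resampled = windowed_sinc_sample(signal, stride=1.5)
print(len(signal), "input samples ->", len(resampled), "output samples")
```

    With an integer stride of 1 the kernel reduces to a unit impulse at each sample, so the input is returned unchanged; near the ends of the sequence the kernel is truncated, so accuracy is lower there than in the interior.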