525 research outputs found

    Techniques for the Synthesis of Reversible Toffoli Networks

    This paper presents novel techniques for the synthesis of reversible networks of Toffoli gates, as well as improvements to previous methods. Gate count and technology-oriented cost metrics are used. Our synthesis techniques are independent of the cost metrics. Two new iterative synthesis procedures employing Reed-Muller spectra are introduced and shown to complement earlier synthesis approaches. The template simplification suggested in earlier work is enhanced through introduction of a faster and more efficient template application algorithm, an updated (shorter) classification of the templates, and presentation of the new templates of sizes 7 and 9. A novel "resynthesis" approach is introduced wherein a sequence of gates is chosen from a network, and the reversible specification it realizes is resynthesized as an independent problem in hopes of reducing the network cost. Empirical results are presented to show that the methods are effective both in terms of the realization of all 3x3 reversible functions and larger reversible benchmark specifications. Comment: 20 pages, 5 figures
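
As a rough illustration of what a reversible Toffoli network computes, the sketch below simulates a cascade of multiple-control Toffoli gates on bit tuples and checks that the resulting truth table is a permutation. The gate list `GATES` and all function names are hypothetical and chosen only for this example; it does not reproduce the synthesis, template, or resynthesis procedures of the paper.

```python
from itertools import product

def apply_toffoli(bits, controls, target):
    """Flip bits[target] when every control bit is 1.

    An empty control set gives a NOT gate, one control gives a CNOT,
    two controls give the classic Toffoli gate.
    """
    out = list(bits)
    if all(out[c] == 1 for c in controls):
        out[target] ^= 1
    return tuple(out)

def run_network(bits, gates):
    """Apply the gates left to right to one input assignment."""
    for controls, target in gates:
        bits = apply_toffoli(bits, controls, target)
    return bits

def truth_table(n, gates):
    """Map every n-bit input to the network output."""
    return {x: run_network(x, gates) for x in product((0, 1), repeat=n)}

def is_reversible(table):
    """A specification is reversible when no two inputs share an output."""
    return len(set(table.values())) == len(table)

def inverse(gates):
    """Each gate is its own inverse, so the inverse network is the reversed cascade."""
    return list(reversed(gates))

# Hypothetical 3-line network: NOT on line 0, CNOT 0->1, Toffoli (0,1)->2.
GATES = [((), 0), ((0,), 1), ((0, 1), 2)]
TABLE = truth_table(3, GATES)
```

Because every gate is self-inverse, running `inverse(GATES)` on any output of `GATES` recovers the original input, which is what makes local resynthesis of a gate subsequence a well-posed independent problem.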

    Content Based Image Retrieval by Convolutional Neural Networks

    Hamreras S., Benítez-Rochel R., Boucheham B., Molina-Cabello M.A., López-Rubio E. (2019) Content Based Image Retrieval by Convolutional Neural Networks. In: Ferrández Vicente J., Álvarez-Sánchez J., de la Paz López F., Toledo Moreo J., Adeli H. (eds) From Bioinspired Systems and Biomedical Applications to Machine Learning. IWINAC 2019. Lecture Notes in Computer Science, vol 11487. Springer. In this paper, we present a Convolutional Neural Network (CNN) for feature extraction in Content Based Image Retrieval (CBIR). The proposed CNN aims at reducing the semantic gap between low-level and high-level features, thus improving retrieval results. Our CNN is the result of a transfer learning technique using the pretrained AlexNet network. It learns how to extract representative features from a learning database and then uses this knowledge in query feature extraction. Experiments performed on the Wang (Corel 1K) database show a significant improvement in terms of precision over state-of-the-art classic approaches. Universidad de Málaga. Campus de Excelencia Internacional Andalucía Tech

    Harmonic/Percussive Separation Using Median Filtering

    In this paper, we present a fast, simple and effective method to separate the harmonic and percussive parts of a monaural audio signal. The technique involves the use of median filtering on a spectrogram of the audio signal, with median filtering performed across successive frames to suppress percussive events and enhance harmonic components, while median filtering is also performed across frequency bins to enhance percussive events and suppress harmonic components. The two resulting median filtered spectrograms are then used to generate masks which are then applied to the original spectrogram to separate the harmonic and percussive parts of the signal. We illustrate the use of the algorithm in the context of remixing audio material from commercial recordings.
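
The two filtering directions described above can be sketched on a toy magnitude spectrogram as follows. This is a minimal sketch under stated assumptions: the window size of 3, the shortened windows at the edges, the binary (rather than soft) masks, and all names such as `hpss_masks` and `SPEC` are choices made for this example and are not taken from the paper.

```python
from statistics import median

def median_filter_1d(values, size=3):
    """Sliding median; near the edges only the part of the window that fits is used."""
    half = size // 2
    return [median(values[max(0, i - half): i + half + 1]) for i in range(len(values))]

def hpss_masks(spec, size=3):
    """spec[f][t] is a magnitude spectrogram (rows = frequency bins, columns = frames)."""
    n_bins, n_frames = len(spec), len(spec[0])
    # Median across successive frames (along each row) suppresses short percussive spikes.
    harm = [median_filter_1d(row, size) for row in spec]
    # Median across frequency bins (along each column) suppresses narrow harmonic ridges.
    cols = [median_filter_1d([spec[f][t] for f in range(n_bins)], size)
            for t in range(n_frames)]
    perc = [[cols[t][f] for t in range(n_frames)] for f in range(n_bins)]
    # Binary masks (an assumption): each bin goes to the larger filtered value, ties to harmonic.
    mask_h = [[1 if harm[f][t] >= perc[f][t] else 0 for t in range(n_frames)]
              for f in range(n_bins)]
    mask_p = [[1 - m for m in row] for row in mask_h]
    return mask_h, mask_p

def apply_mask(spec, mask):
    """Element-wise product of the original spectrogram and a mask."""
    return [[s * m for s, m in zip(srow, mrow)] for srow, mrow in zip(spec, mask)]

# Toy 5x5 spectrogram: a steady tone in bin 2 plus a click in frame 2.
SPEC = [[1.0 if (f == 2 or t == 2) else 0.0 for t in range(5)] for f in range(5)]
MASK_H, MASK_P = hpss_masks(SPEC)
HARMONIC = apply_mask(SPEC, MASK_H)
PERCUSSIVE = apply_mask(SPEC, MASK_P)
```

In this toy case the horizontal ridge ends up in `HARMONIC` and the vertical ridge in `PERCUSSIVE`; the bin where they cross is a tie and is assigned to the harmonic part by the rule chosen here.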

    Resynthesis of Spatial Room Impulse Response tails With anisotropic multi-slope decays

    Spatial room impulse responses (SRIRs) capture room acoustics with directional information. SRIRs measured in coupled rooms and spaces with non-uniform absorption distribution may exhibit anisotropic reverberation decays and multiple decay slopes. However, noisy measurements with low signal-to-noise ratios pose issues in analysis and reproduction in practice. This paper presents a method for resynthesis of the late decay of anisotropic SRIRs, effectively removing noise from SRIR measurements. The method accounts for both multi-slope decays and directional reverberation. A spherical filter bank extracts directionally constrained signals from Ambisonic input, which are then analyzed and parameterized in terms of multiple exponential decays and a noise floor. The noisy late reverberation is then resynthesized from the estimated parameters using modal synthesis, and the restored SRIR is reconstructed as Ambisonic signals. The method is evaluated both numerically and perceptually, which shows that SRIRs can be denoised with minimal error as long as parts of the decay slope are above the noise level, with signal-to-noise ratios as low as 40 dB in the presented experiment. The method can be used to increase the perceived spatial audio quality of noise-impaired SRIRs. Peer reviewed
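
The parameterization mentioned above (several exponential decays plus a noise floor) can be written down as a small energy-envelope model. The sketch below is illustrative only: it assumes each slope is described by an amplitude and a 60 dB decay time, and the function names and example values are hypothetical, not parameters from the paper. It does not cover the spherical filter bank or the modal synthesis step.

```python
import math

def decay_model(t, amplitudes, t60s, noise_floor=0.0):
    """Energy envelope at time t: sum_k A_k * exp(-ln(1e6) * t / T60_k) + noise floor."""
    rate = math.log(1e6)  # energy falls by a factor of 1e6 (60 dB) after T60 seconds
    decays = sum(a * math.exp(-rate * t / t60) for a, t60 in zip(amplitudes, t60s))
    return decays + noise_floor

def to_db(energy):
    """Convert an energy value to decibels."""
    return 10.0 * math.log10(energy)

# Hypothetical two-slope decay with a noise floor 50 dB below the initial level.
AMPLITUDES = [1.0, 0.01]
T60S = [0.5, 2.0]
NOISY_LATE = decay_model(3.0, AMPLITUDES, T60S, noise_floor=1e-5)
CLEAN_LATE = decay_model(3.0, AMPLITUDES, T60S, noise_floor=0.0)
```

Evaluating the model with the noise term set to zero is the idea behind denoising by resynthesis: once the decay parameters are estimated, the tail can be regenerated below the measured noise floor.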

    Singing voice resynthesis using concatenative-based techniques

    Doctoral thesis. Informatics Engineering. Faculdade de Engenharia. Universidade do Porto. 201

    Voice source characterization for prosodic and spectral manipulation

    The objective of this dissertation is to study and develop techniques to decompose the speech signal into its two main components: voice source and vocal tract. Our main efforts are on the glottal pulse analysis and characterization. We want to explore the utility of this model in different areas of speech processing: speech synthesis, voice conversion or emotion detection, among others. Thus, we will study different techniques for prosodic and spectral manipulation. One of our requirements is that the methods should be robust enough to work with the large databases typical of speech synthesis. We use a speech production model in which the glottal flow produced by the vibrating vocal folds goes through the vocal (and nasal) tract cavities and is radiated by the lips. Removing the effect of the vocal tract from the speech signal to obtain the glottal pulse is known as inverse filtering. We use a parametric model of the glottal pulse directly in the source-filter decomposition phase. In order to validate the accuracy of the parametrization algorithm, we designed a synthetic corpus using LF glottal parameters reported in the literature, complemented with our own results from the vowel database. The results show that our method gives satisfactory results in a wide range of glottal configurations and at different levels of SNR. Our method using the whitened residual compared favorably to this reference, achieving high quality ratings (Good-Excellent). Our full parametrized system scored lower than the other two, ranking in third place, but still higher than the acceptance threshold (Fair-Good). Next we proposed two methods for prosody modification, one for each of the residual representations explained above. The first method used our full parametrization system and frame interpolation to perform the desired changes in pitch and duration.
The second method used resampling on the residual waveform and a frame selection technique to generate a new sequence of frames to be synthesized. The results showed that both methods are rated similarly (Fair-Good) and that more work is needed in order to achieve quality levels similar to the reference methods. As part of this dissertation, we have studied the application of our models in three different areas: voice conversion, voice quality analysis and emotion recognition. We have included our speech production model in a reference voice conversion system, to evaluate the impact of our parametrization in this task. The results showed that the evaluators preferred our method over the original one, rating it with a higher score on the MOS scale. To study voice quality, we recorded a small database consisting of isolated, sustained Spanish vowels in four different phonations (modal, rough, creaky and falsetto). Comparing the results with those reported in the literature, we found them to generally agree with previous findings. Some differences existed, but they could be attributed to the difficulties in comparing voice qualities produced by different speakers. At the same time we conducted experiments in the field of voice quality identification, with very good results. We have also evaluated the performance of an automatic emotion classifier based on GMM using glottal measures. For each emotion, we have trained a specific model using different features, comparing our parametrization to a baseline system using spectral and prosodic characteristics. The results of the test were very satisfactory, showing a relative error reduction of more than 20% with respect to the baseline system. The detection accuracy for the different emotions was also high, improving on the results of previously reported works using the same database.
Overall, we can conclude that the glottal source parameters extracted using our algorithm have a positive impact on automatic emotion classification.

    Real-time Sound Source Separation For Music Applications

    Sound source separation refers to the task of extracting individual sound sources from some number of mixtures of those sound sources. In this thesis, a novel sound source separation algorithm for musical applications is presented. It leverages the fact that the vast majority of commercially recorded music since the 1950s has been mixed down for two-channel reproduction, more commonly known as stereo. The algorithm presented in Chapter 3 of this thesis requires no prior knowledge or learning and performs the task of separation based purely on azimuth discrimination within the stereo field. The algorithm exploits the use of the pan pot as a means to achieve image localisation within stereophonic recordings. As such, only an interaural intensity difference exists between left and right channels for a single source. We use gain scaling and phase cancellation techniques to expose frequency dependent nulls across the azimuth domain, from which source separation and resynthesis are carried out. The algorithm is demonstrated not only to be state of the art in the field of sound source separation but also to be a useful pre-processing step for other tasks such as music segmentation and surround sound upmixing.
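
The gain scaling and phase cancellation idea can be illustrated for a single frequency bin. The sketch below assumes a source that is pan-potted toward the right, so the left channel is a scaled copy of the right channel; the gain grid, the function names, and the example values are hypothetical and only show how a null appears at the gain matching the source's intensity ratio. It is not the thesis algorithm itself.

```python
def azimuth_plane(left_bin, right_bin, resolution=10):
    """Return |L - g * R| for g = 0, 1/resolution, ..., 1 for one frequency bin."""
    return [abs(left_bin - (i / resolution) * right_bin) for i in range(resolution + 1)]

def null_index(left_bin, right_bin, resolution=10):
    """Index of the gain step where the scaled right channel best cancels the left."""
    values = azimuth_plane(left_bin, right_bin, resolution)
    return min(range(len(values)), key=values.__getitem__)

# Hypothetical source spectrum value in one bin, panned with gains 0.5 (left) and 1.0 (right).
SOURCE = complex(2.0, 1.0)
LEFT = 0.5 * SOURCE
RIGHT = 1.0 * SOURCE
PLANE = azimuth_plane(LEFT, RIGHT)
NULL_AT = null_index(LEFT, RIGHT)
```

Here the null falls at gain step 5 of 10, that is at g = 0.5, which equals the left-to-right intensity ratio of the source. A source panned toward the left would be handled by the mirrored computation that scales the left channel instead.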

    Systems biology of energy metabolism in skeletal muscle

    The primary function of skeletal muscle tissue is to produce force or cause motion. To perform this task, chemical energy stored in nutrients (glucose and fatty acids) has to be converted into an energy currency that can drive muscle contraction (adenosine triphosphate, ATP). This process is known as the energy metabolism of skeletal muscle and consists of a large number of chemical reactions that are organized in metabolic pathways. Unraveling this complex network is important from a fundamental biological perspective, but also essential to understand how a disturbance of muscle bioenergetics can cause metabolic disorders. 31P magnetic resonance spectroscopy (MRS) has emerged as one of the premier methods to study skeletal muscle energy metabolism in vivo. It, however, remains challenging to relate the observed metabolite dynamics to an understanding of the underlying processes at the level of the metabolic pathways. A possible solution for bridging this gap between macroscopic measurements and mechanistic understanding at pathway level is the application of mechanistic computational modeling. This dissertation describes a series of studies in which a mechanistic model of ATP metabolism was developed and applied in the analysis of skeletal muscle bioenergetics. Skeletal muscle cells contain two primary processes that are responsible for the conversion of glucose and fatty acids into ATP. These processes are known as glycolysis and oxidative phosphorylation in mitochondria. The initial mathematical models of these processes were obtained by integration of known enzyme kinetics and thermodynamics. Testing of these models, however, showed that they failed to reproduce many of the in vivo observed metabolite dynamics, as has been described in chapters 1 and 2. These results indicated that the models might be missing essential regulatory mechanisms or that the model parameterization required changes.
First, the physiological implications of necessary model adaptations were investigated in a series of studies described in chapters 2 – 5. Numerical analysis of the initial glycolysis model revealed that the experimentally observed slow turnover rate of phosphorylated sugars post exercise could only be explained by rapid deactivation of phosphofructokinase (PFK) and pyruvate kinase (PK) in non-contracting muscle. In particular, the deactivation of PFK was crucial for adequate control of pathway flux. Therefore, in a follow-up study, it was tested if the missing regulation at the level of PFK could be explained by calcium – calmodulin mediated activation of this enzyme. To this end, pathway behavior, represented by phosphocreatine (PCr) and pH dynamics, was measured in ischemic skeletal muscle for a wide variety of muscle excitation frequencies (0 – 80 Hz). Next, it was shown that addition of the calcium – calmodulin mediated activation of PFK was necessary to accurately reproduce these data. These results provided important new quantitative support for the hypothesis that this particular mechanism has a key role in the regulation of glycolytic flux in skeletal muscle. The initial model of oxidative phosphorylation was first tested against empirically determined mitochondrial input – output relations, i.e., [ADP] – mitochondrial ATP synthesis flux (Jp) and phosphate potential (ΔGp) – Jp. These empirically determined relations were derived from 31P MRS measurements of metabolite dynamics post-exercise. They capture key features of the regulation of oxidative phosphorylation in vivo and are therefore considered relevant for testing the quality of the mathematical model. Numerical model analysis (i.e., parameter sensitivity analysis) was applied to investigate which components significantly influenced predictions of these input – output relations.
Based on these results it was concluded that the adenine nucleotide transporter (which facilitates the exchange of ATP and ADP across the inner mitochondrial membrane) has a dominant role in controlling the ADP sensitivity of mitochondria. Furthermore, we identified that Pi feedback control of respiratory chain activity was essential to explain measurements of ΔGp at low metabolic rates. These insights were used to improve the predictive power of the model, as described in chapters 4 and 5. In the studies described in chapters 2 – 5 the glycolytic and mitochondrial model components were tested for conditions in which only one of the two processes was active (ischemia and post exercise recovery, respectively). It therefore remained unknown if the control mechanisms included in these models could also explain the contribution of mitochondrial versus glycolytic ATP synthesis for conditions in which both processes are active (aerobic exercise). In an attempt to answer this question, dynamics of ATP metabolism were recorded during a full rest – exercise – recovery protocol under aerobic conditions and subsequently used for testing of the integrated mitochondrial + glycolytic model. The results presented in chapter 8 showed that the integrated model could accurately reproduce the observed metabolite and pH dynamics for varying exercise intensities. The main physiological implication of these results was that substrate feedback control (ADP + Pi) of oxidative phosphorylation, combined with substrate feedback control (ADP + AMP + Pi) and control by parallel activation (calcium – calmodulin mediated activation of PFK) of glycolysis, provides a set of key control mechanisms that can explain the regulation of ATP metabolism in skeletal muscle in vivo for a wide range of physiological conditions.
By application of several cycles of model development it was possible to improve the model's performance to the point where it was consistent with 31P MRS measurements of muscle bioenergetics in both healthy humans and animals. As described in chapters 6 and 7, it was investigated if the model could be applied to analyze the adaptations of muscle physiology that underlie changes in mitochondrial capacity that occur in, for instance, type 2 diabetes patients or with aging. A decrease of mitochondrial capacity in these subjects can be diagnosed accurately by determining the rate of PCr recovery post exercise. However, the changes in muscle physiology responsible for any observed difference in oxidative capacity cannot be deduced from these measurements. Therefore additional muscle biopsy samples are collected and analyzed for in vitro markers of oxidative capacity. State-of-the-art analyses of these data are typically limited to statistical or intuitive approaches. We investigated if the insight obtained from the combined in vivo + in vitro data sets could be increased by application of our mathematical model. To this end, first, the model was extended from a single uniform cell type model to a model with three cell types (type I, IIA, and IIX), capturing the microscopic heterogeneity of muscle tissue. In addition, several key validation tests were conducted, as described in chapter 6. Subsequently, we demonstrated that the model could explain the prolongation of PCr recovery period observed in type 2 diabetes patients by integrating available literature data of in vitro markers of mitochondrial function. Although this result was already very promising, it was also concluded that the approach could be tested more rigorously by obtaining all data (in vivo + in vitro) in a single study. Therefore, the method was further tested in an animal model of decreased mitochondrial function: 8 versus 25 week old Wistar rats.
The first main result of this study was that the mathematical model could accurately reproduce the delayed PCr recovery kinetics in 25 week old animals based on in vitro determined changes in muscle physiology. In addition, model predictions provided quantitative insight into the individual contribution of the different factors responsible for the decreased oxidative capacity. This type of information is considered very relevant for the design of (pharmaceutical) therapies aimed at improving mitochondrial function. For example, model predictions of the physiological changes that contribute the most to the decrease in oxidative capacity provide potentially promising targets for therapy design. Based on these considerations it was concluded that application of the mathematical model provides new promising opportunities for future studies of mitochondrial (dys)function in skeletal muscle. In conclusion, through application of a series of iterative cycles of model development combined with multiple new experimental studies it was possible to develop a detailed mechanistic model of ATP metabolism that was consistent with in vivo observations of skeletal muscle bioenergetics for a wide range of physiological conditions. This process provided new insight into the key control mechanisms embedded in the metabolic pathways that have a dominant role in regulating ATP metabolism in skeletal muscle in vivo. In addition, we successfully demonstrated the feasibility and added value of application of the model for integration of in vivo and in vitro measurements of oxidative capacity in future studies of mitochondrial (dys)function in, for example, type 2 diabetes, aging or mitochondrial myopathy.
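
The abstract above refers to the rate of PCr recovery after exercise as a marker of mitochondrial capacity. A common simplification, which is an assumption here and far simpler than the mechanistic model of the dissertation, is to describe recovery as a single exponential with time constant tau. The sketch below generates a noise-free recovery curve and recovers tau with a log-linear least-squares fit; all names and numerical values are hypothetical.

```python
import math

def pcr_recovery(t, pcr_rest, depletion, tau):
    """Mono-exponential recovery: PCr(t) = PCr_rest - depletion * exp(-t / tau)."""
    return pcr_rest - depletion * math.exp(-t / tau)

def fit_tau(times, pcr, pcr_rest):
    """Estimate tau from the slope of ln(PCr_rest - PCr(t)) against t.

    Requires every sample to lie strictly below pcr_rest.
    """
    ys = [math.log(pcr_rest - p) for p in pcr]
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(ys) / n
    slope = (sum((t - mean_t) * (y - mean_y) for t, y in zip(times, ys))
             / sum((t - mean_t) ** 2 for t in times))
    return -1.0 / slope

# Hypothetical values: resting level 30 (arbitrary units), depletion 12, tau = 30 s.
TIMES = [float(t) for t in range(0, 130, 10)]
CURVE = [pcr_recovery(t, 30.0, 12.0, 30.0) for t in TIMES]
TAU_EST = fit_tau(TIMES, CURVE, 30.0)
```

A larger fitted tau corresponds to slower recovery. Such a fit summarizes the measurement but, as the abstract notes, does not by itself reveal which physiological changes cause the slowing.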

    Evaluation of the Importance of Time-Frequency Contributions to Speech Intelligibility in Noise

    Recent studies on binary masking techniques make the assumption that each time-frequency (T-F) unit contributes an equal amount to the overall intelligibility of speech. The present study demonstrated that the importance of each T-F unit to speech intelligibility varies in accordance with speech content. Specifically, T-F units are categorized into two classes, speech-present T-F units and speech-absent T-F units. Results indicate that the importance of each speech-present T-F unit to speech intelligibility is highly related to the loudness of its target component, while the importance of each speech-absent T-F unit varies according to the loudness of its masker component. Two types of mask errors are also considered, which include miss and false alarm errors. Consistent with previous work, false alarm errors are shown to be more harmful to speech intelligibility than miss errors when the mixture signal-to-noise ratio (SNR) is below 0 dB. However, the relative importance of the two types of error is conditioned on the SNR level of the input speech signal. Based on these observations, a mask-based objective measure, the loudness weighted hit-false, is proposed for predicting speech intelligibility. The proposed objective measure shows significantly higher correlation with intelligibility compared to two existing mask-based objective measures.
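
To make the two mask error types concrete, the sketch below computes the hit rate, the false alarm rate, and their difference for an estimated binary mask against an ideal one. The optional per-unit `weights` argument is a hypothetical stand-in for a loudness weighting: the abstract does not give the formula of the proposed measure, so the weighted variant here is only an assumption about its general shape.

```python
def hit_fa(ideal_mask, estimated_mask, weights=None):
    """Return (HIT, FA, HIT - FA) for two flat lists of 0/1 T-F unit labels.

    HIT: weighted share of speech-present units (ideal = 1) kept by the estimate.
    FA:  weighted share of speech-absent units (ideal = 0) wrongly kept.
    """
    if weights is None:
        weights = [1.0] * len(ideal_mask)  # unweighted: every unit counts equally
    units = list(zip(ideal_mask, estimated_mask, weights))
    hit_den = sum(w for i, _, w in units if i == 1)
    fa_den = sum(w for i, _, w in units if i == 0)
    hit_num = sum(w for i, e, w in units if i == 1 and e == 1)
    fa_num = sum(w for i, e, w in units if i == 0 and e == 1)
    hit = hit_num / hit_den if hit_den else 0.0
    fa = fa_num / fa_den if fa_den else 0.0
    return hit, fa, hit - fa

# Hypothetical masks: one miss (unit 3) and one false alarm (unit 4).
IDEAL = [1, 1, 1, 1, 0, 0, 0, 0]
ESTIMATE = [1, 1, 1, 0, 1, 0, 0, 0]
PLAIN = hit_fa(IDEAL, ESTIMATE)
# Hypothetical weights that make the missed unit count four times as much.
WEIGHTED = hit_fa(IDEAL, ESTIMATE, weights=[2.0, 1.0, 1.0, 4.0, 1.0, 1.0, 1.0, 1.0])
```

With equal weights the single miss costs a quarter of the hit rate; once the missed unit is weighted more heavily, the same error lowers the score further, which is the kind of content-dependent importance the abstract describes.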