147 research outputs found

    Characterization of lubricated bearing surfaces operated under high loads

    The composition and surface profiles of M-50 steel surfaces were measured after operation at high loads in a bearing contact simulator. An ester lubricant (trimethylolpropane triheptanoate) was used with and without various additives. Optical profiles were obtained to ±30 depth resolution with a phase-locked interference microscope in 10 micron diameter areas within and outside the wear tracks. Optical constants and surface film thickness were measured in the same areas with an electronic scanning ellipsometer. Film composition was measured with a scanning Auger electron spectrometer. It is concluded that metal oxide formation is accelerated within the wear tracks

    Evaluation and combination of pitch estimation methods for melody extraction in symphonic classical music

    The extraction of pitch information is arguably one of the most important tasks in automatic music description systems. However, previous research and evaluation datasets dealing with pitch estimation focused on relatively limited kinds of musical data. This work aims to broaden this scope by addressing symphonic western classical music recordings, focusing on pitch estimation for melody extraction. This material is characterised by a high number of overlapping sources, and by the fact that the melody may be played by different instrumental sections, often alternating within an excerpt. We evaluate the performance of eleven state-of-the-art pitch salience functions, multipitch estimation and melody extraction algorithms when determining the sequence of pitches corresponding to the main melody in a varied set of pieces. An important contribution of the present study is the proposed evaluation framework, including the annotation methodology, generated dataset and evaluation metrics. The results show that the assumptions made by certain methods hold better than others when dealing with this type of music signal, leading to better performance. Additionally, we propose a simple method for combining the output of several algorithms, with promising results
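
The abstract does not say how the outputs of the algorithms are combined. As a purely illustrative sketch, and not the authors' method, one generic way to merge several frame-wise pitch tracks is a majority vote on voicing followed by a per-frame median of the voiced estimates. The function name `combine_pitch_tracks`, the convention that `0.0` marks an unvoiced frame, and the example values are all assumptions made here.

```python
from statistics import median


def combine_pitch_tracks(tracks):
    """Combine frame-aligned pitch tracks (Hz); 0.0 marks an unvoiced frame.

    Hypothetical helper: a frame is kept as voiced only if more than half of
    the input tracks report a pitch, and the combined pitch is their median.
    """
    if not tracks:
        raise ValueError("at least one pitch track is required")
    n_frames = len(tracks[0])
    if any(len(track) != n_frames for track in tracks):
        raise ValueError("all pitch tracks must have the same number of frames")

    combined = []
    for frame in zip(*tracks):
        voiced = [f0 for f0 in frame if f0 > 0]
        if 2 * len(voiced) <= len(frame):
            combined.append(0.0)  # no voicing majority: treat as unvoiced
        else:
            combined.append(median(voiced))  # median resists single outliers
    return combined


# Made-up estimates from three hypothetical algorithms over three frames.
example_tracks = [
    [440.0, 0.0, 220.0],
    [442.0, 0.0, 440.0],  # octave error in the last frame
    [438.0, 330.0, 221.0],
]
combined_track = combine_pitch_tracks(example_tracks)
```

The median is chosen in this sketch because a single octave error from one algorithm does not shift the combined estimate, whereas a mean would.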

    The impact of the Lombard effect on audio and visual speech recognition systems

    When producing speech in noisy backgrounds talkers reflexively adapt their speaking style in ways that increase speech-in-noise intelligibility. This adaptation, known as the Lombard effect, is likely to have an adverse effect on the performance of automatic speech recognition systems that have not been designed to anticipate it. However, previous studies of this impact have used very small amounts of data and recognition systems that lack modern adaptation strategies. This paper aims to rectify this by using a new audio-visual Lombard corpus containing speech from 54 different speakers – significantly larger than any previously available – and modern state-of-the-art speech recognition techniques. The paper is organised as three speech-in-noise recognition studies. The first examines the case in which a system is presented with Lombard speech having been exclusively trained on normal speech. It was found that the Lombard mismatch caused a significant decrease in performance even if the level of the Lombard speech was normalised to match the level of normal speech. However, the size of the mismatch was highly speaker-dependent, thus explaining conflicting results presented in previous smaller studies. The second study compares systems trained in matched conditions (i.e., training and testing with the same speaking style). Here the Lombard speech affords a large increase in recognition performance. Part of this is due to the greater energy leading to a reduction in noise masking, but performance improvements persist even after the effect of signal-to-noise level difference is compensated. An analysis across speakers shows that the Lombard speech energy is spectro-temporally distributed in a way that reduces energetic masking, and this reduction in masking is associated with an increase in recognition performance. The final study repeats the first two using a recognition system trained on visual speech. In the visual domain, performance differences are not confounded by differences in noise masking. It was found that in matched conditions Lombard speech supports better recognition performance than normal speech. The benefit was consistently present across all speakers but to a varying degree. Surprisingly, the Lombard benefit was observed to a small degree even when training on mismatched non-Lombard visual speech, i.e., the increased clarity of the Lombard speech outweighed the impact of the mismatch. The paper presents two generally applicable conclusions: i) systems that are designed to operate in noise will benefit from being trained on well-matched Lombard speech data, ii) the results of speech recognition evaluations that employ artificial speech and noise mixing need to be treated with caution: they are overly-optimistic to the extent that they ignore a significant source of mismatch but at the same time overly-pessimistic in that they do not anticipate the potential increased intelligibility of the Lombard speaking style

    An analysis of environment, microphone and data simulation mismatches in robust speech recognition

    Speech enhancement and automatic speech recognition (ASR) are most often evaluated in matched (or multi-condition) settings where the acoustic conditions of the training data match (or cover) those of the test data. Few studies have systematically assessed the impact of acoustic mismatches between training and test data, especially concerning recent speech enhancement and state-of-the-art ASR techniques. In this article, we study this issue in the context of the CHiME-3 dataset, which consists of sentences spoken by talkers situated in challenging noisy environments recorded using a 6-channel tablet-based microphone array. We provide a critical analysis of the results published on this dataset for various signal enhancement, feature extraction, and ASR backend techniques and perform a number of new experiments in order to separately assess the impact of different noise environments, different numbers and positions of microphones, or simulated vs. real data on speech enhancement and ASR performance. We show that, with the exception of minimum variance distortionless response (MVDR) beamforming, most algorithms perform consistently on real and simulated data and can benefit from training on simulated data. We also find that training on different noise environments and different microphones barely affects the ASR performance, especially when several environments are present in the training data: only the number of microphones has a significant impact. Based on these results, we introduce the CHiME-4 Speech Separation and Recognition Challenge, which revisits the CHiME-3 dataset and makes it more challenging by reducing the number of microphones available for testing

    Unsupervised Incremental Online Learning and Prediction of Musical Audio Signals

    Guided by the idea that musical human-computer interaction may become more effective, intuitive, and creative when basing its computer part on cognitively more plausible learning principles, we employ unsupervised incremental online learning (i.e. clustering) to build a system that predicts the next event in a musical sequence, given as audio input. The flow of the system is as follows: 1) segmentation by onset detection, 2) timbre representation of each segment by Mel frequency cepstrum coefficients, 3) discretization by incremental clustering, yielding a tree of different sound classes (e.g. timbre categories/instruments) that can grow or shrink on the fly driven by the instantaneous sound events, resulting in a discrete symbol sequence, 4) extraction of statistical regularities of the symbol sequence, using hierarchical N-grams and the newly introduced conceptual Boltzmann machine that adapt to the dynamically changing clustering tree in 3), and 5) prediction of the next sound event in the sequence, given the last n previous events. The system's robustness is assessed with respect to complexity and noisiness of the signal. Clustering in isolation yields an adjusted Rand index (ARI) of 82.7%/85.7% for data sets of singing voice and drums. Onset detection jointly with clustering achieves an ARI of 81.3%/76.3% and the prediction of the entire system yields an ARI of 27.2%/39.2%
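
The figures above are reported as adjusted Rand index (ARI) values, here expressed as percentages. For readers unfamiliar with the metric, the following is a minimal sketch of the standard pair-counting ARI between a reference labelling and a predicted one. It is not the authors' evaluation code; the function name `adjusted_rand_index` and the toy labels are assumptions made for illustration.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Pair-counting adjusted Rand index between two labelings of the same items."""
    if len(labels_true) != len(labels_pred):
        raise ValueError("labelings must have the same length")
    total_pairs = comb(len(labels_true), 2)
    if total_pairs == 0:
        return 1.0  # fewer than two items: the labelings trivially agree

    # Contingency counts: items sharing each (reference, predicted) label pair.
    pair_counts = Counter(zip(labels_true, labels_pred))
    true_counts = Counter(labels_true)
    pred_counts = Counter(labels_pred)

    # Number of item pairs placed together in both, in the reference, in the prediction.
    together_both = sum(comb(c, 2) for c in pair_counts.values())
    together_true = sum(comb(c, 2) for c in true_counts.values())
    together_pred = sum(comb(c, 2) for c in pred_counts.values())

    expected = together_true * together_pred / total_pairs  # chance agreement
    maximum = (together_true + together_pred) / 2
    if maximum == expected:
        return 1.0  # degenerate case, e.g. both labelings use a single cluster
    return (together_both - expected) / (maximum - expected)


# Toy example: cluster identities are arbitrary, so swapped labels still agree.
perfect = adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0])
partial = adjusted_rand_index([0, 0, 1, 2], [0, 0, 1, 1])
```

An ARI of 1 means identical partitions and values near 0 mean agreement no better than chance, so a reported 82.7% corresponds to 0.827 on this scale.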

    Multiple shutters for a stereoscopic camera

    Focal plane shutter assembly composed of three mechanically separate rotary shutters permits exposure of three separated photographic films simultaneously with exposure time of 0.08 second. Exposure time is repeatable within 2 percent; uniformity of exposure over all three films is within 5 percent

    Recording of Chronic Diseases and Adverse Obstetric Outcomes during Hospitalizations for a Delivery in the National Swiss Hospital Medical Statistics Dataset between 2012 and 2018: An Observational Cross-Sectional Study.

    The prevalence of chronic diseases during pregnancy and adverse maternal obstetric outcomes in Switzerland has been insufficiently studied. Data sources, which reliably capture these events, are scarce. We conducted a nationwide observational cross-sectional study (2012-2018) using data from the Swiss Hospital Medical Statistics (MS) dataset. To quantify the recording of chronic diseases and adverse maternal obstetric outcomes during delivery in hospitals or birthing centers (delivery hospitalization), we identified women who delivered a singleton live-born infant. We quantified the prevalence of 23 maternal chronic diseases (ICD-10-GM) and compared results to a nationwide Danish registry study. We further quantified the prevalence of adverse maternal obstetric outcomes (ICD-10-GM/CHOP) during the delivery hospitalization and compared the results to existing literature from Western Europe. We identified 577,220 delivery hospitalizations, of which 4.99% had a record for ≥1 diagnosis of a chronic disease (versus 15.49% in Denmark). Moreover, 13 of 23 chronic diseases seemed to be substantially under-recorded (8 of those were >10-fold more frequent in the Danish study). The prevalence of three of the chronic diseases was similar in the two studies. The prevalence of adverse maternal obstetric outcomes was comparable to other European countries. Our results suggest that chronic diseases are under-recorded during delivery hospitalizations in the MS dataset, which may be due to specific coding guidelines and aspects regarding whether a disease generates billable effort for a hospital. Adverse maternal obstetric outcomes seemed to be more completely captured
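
To make the headline percentages concrete, the short sketch below converts the reported prevalence into an approximate number of delivery hospitalizations and compares it with the Danish figure. The inputs are taken from the abstract; the helper name `count_from_prevalence` is an assumption, and because the published percentages are rounded, the implied count is only approximate.

```python
def count_from_prevalence(total, percent):
    """Approximate count implied by a prevalence given in percent."""
    return round(total * percent / 100)


deliveries = 577_220   # delivery hospitalizations identified in the study
swiss_pct = 4.99       # share with at least one chronic disease diagnosis recorded
danish_pct = 15.49     # corresponding share in the Danish registry study

implied_count = count_from_prevalence(deliveries, swiss_pct)
ratio = danish_pct / swiss_pct

print(f"about {implied_count} hospitalizations with a recorded chronic disease")
print(f"Danish prevalence is roughly {ratio:.1f} times the Swiss figure")
```

Under these numbers, roughly 28,800 hospitalizations carried a chronic disease record, and the Danish prevalence is about three times higher.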

    Lexical frequency effects in English and Spanish word misperceptions

    When listeners misperceive words in noise, do they report words that are more common? Lexical frequency differences between misperceived and target words in English and Spanish were examined for five masker types. Misperceptions had a higher lexical frequency in the presence of pure energetic maskers, but frequency effects were reduced or absent for informational maskers. The tendency to report more common words increased with the degree of energetic masking, suggesting that uncertainty about segment identity provides a role for lexical frequency. However, acoustic-phonetic information from an informational masker may additionally constrain lexical choice

    An innovative speech-based user interface for smarthomes and IoT solutions to help people with speech and motor disabilities

    A better use of the increasing functional capabilities of home automation systems and Internet of Things (IoT) devices to support the needs of users with disabilities is the subject of a research project currently conducted by Area Ausili (Assistive Technology Area), a department of Polo Tecnologico Regionale Corte Roncati of the Local Health Trust of Bologna (Italy), in collaboration with AIAS Ausilioteca Assistive Technology (AT) Team. The main aim of the project is to develop experimental low-cost systems for environmental control through simplified and accessible user interfaces. Many of the activities are focused on automatic speech recognition and are developed in the framework of the CloudCAST project. In this paper we report on the first technical achievements of the project and discuss possible future developments and applications within and outside CloudCAST

    A new clinopyroxene thermobarometer for mafic to intermediate magmatic systems

    Clinopyroxene-only thermobarometry is one of the most practical tools to reconstruct crystallization pressures and temperatures of clinopyroxenes. Because it does not require any information on coexisting silicate melt or other co-crystallized mineral phases, it has been widely used to elucidate the physicochemical conditions of crystallizing magmas. However, previously calibrated clinopyroxene-only thermobarometers display low accuracy when applied to mafic and intermediate magmatic systems. Hence, in this study, we present new empirical nonlinear barometric and thermometric models, which were formulated to improve the performance of clinopyroxene-only thermobarometry. In particular, a total of 559 experimental runs conducted in the pressure range of 1 bar to 12 kbar have been used for calibration and validation of the new barometric and thermometric formulation. The superiority of our new models with respect to previous ones was confirmed by comparing their performance on 100 replications of calibration and validation, and the standard errors of estimate (SEE) of the new barometer and thermometer are 1.66 kbar and 36.6 °C, respectively. Although our new barometer and thermometer fail to reproduce the entire test dataset, which has not been used for calibration and validation, they still perform well on clinopyroxenes crystallized from subalkaline basic to intermediate magmas (i.e., basaltic, basalt-andesitic, dacitic magma systems). Thus, their applicability should be limited to basaltic, basalt-andesitic and dacitic magma systems. In a last step, we applied our new thermobarometer to several tholeiitic Icelandic eruptions and established magma storage conditions exhibiting a general consistency with phase equilibria experiments. Therefore, we propose that our new thermobarometer represents a powerful tool to reveal the crystallization conditions of clinopyroxene in mafic to intermediate magmas.
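
The abstract summarises model quality with the standard error of estimate (SEE) but does not define it. A common definition is the root of the summed squared residuals divided by the degrees of freedom; the sketch below assumes that form, and the paper may use a different denominator. The function name `standard_error_of_estimate`, the `n_params` argument, and the pressure values are hypothetical.

```python
from math import sqrt


def standard_error_of_estimate(observed, predicted, n_params=0):
    """Root of the summed squared residuals over (n - n_params).

    With n_params=0 this reduces to the root-mean-square error; a positive
    n_params subtracts the number of fitted coefficients (an assumption here).
    """
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    dof = len(observed) - n_params
    if dof <= 0:
        raise ValueError("need more observations than fitted parameters")
    squared_residuals = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return sqrt(squared_residuals / dof)


# Made-up experimental vs. predicted pressures in kbar, for illustration only.
experimental_kbar = [1.0, 2.0, 3.0]
predicted_kbar = [1.0, 2.0, 5.0]
see_rmse = standard_error_of_estimate(experimental_kbar, predicted_kbar)
see_dof = standard_error_of_estimate(experimental_kbar, predicted_kbar, n_params=2)
```

Read this way, an SEE of 1.66 kbar means that predicted pressures typically deviate from the experimental ones by about that amount.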