128,600 research outputs found

    Classifying Rhoticity of /r/ in Speech Sound Disorder using Age-and-Sex Normalized Formants

    Full text link
    Mispronunciation detection tools could increase treatment access for speech sound disorders impacting, e.g., /r/. We show age-and-sex normalized formant estimation outperforms cepstral representation for detection of fully rhotic vs. derhotic /r/ in the PERCEPT-R Corpus. Gated recurrent neural networks trained on this feature set achieve a mean test participant-specific F1-score =.81 ({\sigma}x=.10, med = .83, n = 48), with post hoc modeling showing no significant effect of child age or sex.Comment: To appear in Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH 202

    Recurrent 3D Pose Sequence Machines

    Full text link
    3D human articulated pose recovery from monocular image sequences is very challenging due to the diverse appearances, viewpoints, occlusions, and also the human 3D pose is inherently ambiguous from the monocular imagery. It is thus critical to exploit rich spatial and temporal long-range dependencies among body joints for accurate 3D pose sequence prediction. Existing approaches usually manually design some elaborate prior terms and human body kinematic constraints for capturing structures, which are often insufficient to exploit all intrinsic structures and not scalable for all scenarios. In contrast, this paper presents a Recurrent 3D Pose Sequence Machine(RPSM) to automatically learn the image-dependent structural constraint and sequence-dependent temporal context by using a multi-stage sequential refinement. At each stage, our RPSM is composed of three modules to predict the 3D pose sequences based on the previously learned 2D pose representations and 3D poses: (i) a 2D pose module extracting the image-dependent pose representations, (ii) a 3D pose recurrent module regressing 3D poses and (iii) a feature adaption module serving as a bridge between module (i) and (ii) to enable the representation transformation from 2D to 3D domain. These three modules are then assembled into a sequential prediction framework to refine the predicted poses with multiple recurrent stages. Extensive evaluations on the Human3.6M dataset and HumanEva-I dataset show that our RPSM outperforms all state-of-the-art approaches for 3D pose estimation.Comment: Published in CVPR 201

    Customer-oriented risk assessment in Network Utilities

    Get PDF
    For companies that distribute services such as telecommunications, water, energy, gas, etc., quality perceived by the customers has a strong impact on the fulfillment of financial goals, positively increasing the demand and negatively increasing the risk of customer churn (loss of customers). Failures by these companies may cause customer affection in a massive way, augmenting the intention to leave the company. Therefore, maintenance performance and specifically service reliability has a strong influence on financial goals. This paper proposes a methodology to evaluate the contribution of the maintenance department in economic terms, based on service unreliability by network failures. The developed methodology aims to provide an analysis of failures to facilitate decision making about maintenance (preventive/predictive and corrective) costs versus negative impacts in end-customer invoicing based on the probability of losing customers. Survival analysis of recurrent failures with the General Renewal Process distribution is used for this novel purpose with the intention to be applied as a standard procedure to calculate the expected maintenance financial impact, for a given period of time. Also, geographical areas of coverage are distinguished, enabling the comparison of different technical or management alternatives. Two case studies in a telecommunications services company are presented in order to illustrate the applicability of the methodology

    Dynamic Modeling and Statistical Analysis of Event Times

    Get PDF
    This review article provides an overview of recent work in the modeling and analysis of recurrent events arising in engineering, reliability, public health, biomedicine and other areas. Recurrent event modeling possesses unique facets making it different and more difficult to handle than single event settings. For instance, the impact of an increasing number of event occurrences needs to be taken into account, the effects of covariates should be considered, potential association among the interevent times within a unit cannot be ignored, and the effects of performed interventions after each event occurrence need to be factored in. A recent general class of models for recurrent events which simultaneously accommodates these aspects is described. Statistical inference methods for this class of models are presented and illustrated through applications to real data sets. Some existing open research problems are described.Comment: Published at http://dx.doi.org/10.1214/088342306000000349 in the Statistical Science (http://www.imstat.org/sts/) by the Institute of Mathematical Statistics (http://www.imstat.org

    Recurrent Scene Parsing with Perspective Understanding in the Loop

    Full text link
    Objects may appear at arbitrary scales in perspective images of a scene, posing a challenge for recognition systems that process images at a fixed resolution. We propose a depth-aware gating module that adaptively selects the pooling field size in a convolutional network architecture according to the object scale (inversely proportional to the depth) so that small details are preserved for distant objects while larger receptive fields are used for those nearby. The depth gating signal is provided by stereo disparity or estimated directly from monocular input. We integrate this depth-aware gating into a recurrent convolutional neural network to perform semantic segmentation. Our recurrent module iteratively refines the segmentation results, leveraging the depth and semantic predictions from the previous iterations. Through extensive experiments on four popular large-scale RGB-D datasets, we demonstrate this approach achieves competitive semantic segmentation performance with a model which is substantially more compact. We carry out extensive analysis of this architecture including variants that operate on monocular RGB but use depth as side-information during training, unsupervised gating as a generic attentional mechanism, and multi-resolution gating. We find that gated pooling for joint semantic segmentation and depth yields state-of-the-art results for quantitative monocular depth estimation
    corecore