128,600 research outputs found
Classifying Rhoticity of /r/ in Speech Sound Disorder using Age-and-Sex Normalized Formants
Mispronunciation detection tools could increase treatment access for speech
sound disorders impacting, e.g., /r/. We show age-and-sex normalized formant
estimation outperforms cepstral representation for detection of fully rhotic
vs. derhotic /r/ in the PERCEPT-R Corpus. Gated recurrent neural networks
trained on this feature set achieve a mean test participant-specific F1-score
=.81 ({\sigma}x=.10, med = .83, n = 48), with post hoc modeling showing no
significant effect of child age or sex.Comment: To appear in Proceedings of the Annual Conference of the
International Speech Communication Association, INTERSPEECH 202
Recurrent 3D Pose Sequence Machines
3D human articulated pose recovery from monocular image sequences is very
challenging due to the diverse appearances, viewpoints, occlusions, and also
the human 3D pose is inherently ambiguous from the monocular imagery. It is
thus critical to exploit rich spatial and temporal long-range dependencies
among body joints for accurate 3D pose sequence prediction. Existing approaches
usually manually design some elaborate prior terms and human body kinematic
constraints for capturing structures, which are often insufficient to exploit
all intrinsic structures and not scalable for all scenarios. In contrast, this
paper presents a Recurrent 3D Pose Sequence Machine(RPSM) to automatically
learn the image-dependent structural constraint and sequence-dependent temporal
context by using a multi-stage sequential refinement. At each stage, our RPSM
is composed of three modules to predict the 3D pose sequences based on the
previously learned 2D pose representations and 3D poses: (i) a 2D pose module
extracting the image-dependent pose representations, (ii) a 3D pose recurrent
module regressing 3D poses and (iii) a feature adaption module serving as a
bridge between module (i) and (ii) to enable the representation transformation
from 2D to 3D domain. These three modules are then assembled into a sequential
prediction framework to refine the predicted poses with multiple recurrent
stages. Extensive evaluations on the Human3.6M dataset and HumanEva-I dataset
show that our RPSM outperforms all state-of-the-art approaches for 3D pose
estimation.Comment: Published in CVPR 201
Customer-oriented risk assessment in Network Utilities
For companies that distribute services such as telecommunications, water, energy, gas, etc., quality perceived by the customers has a strong impact on the fulfillment of financial goals, positively increasing the demand and negatively increasing the risk of customer churn (loss of customers). Failures by these companies may cause customer affection in a massive way, augmenting the intention to leave the company. Therefore, maintenance performance and specifically service reliability has a strong influence on financial goals. This paper proposes a methodology to evaluate the contribution of the maintenance department in economic terms, based on service unreliability by network failures. The developed methodology aims to provide an analysis of failures to facilitate decision making about maintenance (preventive/predictive and corrective) costs versus negative impacts in end-customer invoicing based on the probability of losing customers. Survival analysis of recurrent failures with the General Renewal Process distribution is used for this novel purpose with the intention to be applied as a standard procedure to calculate the expected maintenance financial impact, for a given period of time. Also, geographical areas of coverage are distinguished, enabling the comparison of different technical or management alternatives. Two case studies in a telecommunications services company are presented in order to illustrate the applicability of the methodology
Dynamic Modeling and Statistical Analysis of Event Times
This review article provides an overview of recent work in the modeling and
analysis of recurrent events arising in engineering, reliability, public
health, biomedicine and other areas. Recurrent event modeling possesses unique
facets making it different and more difficult to handle than single event
settings. For instance, the impact of an increasing number of event occurrences
needs to be taken into account, the effects of covariates should be considered,
potential association among the interevent times within a unit cannot be
ignored, and the effects of performed interventions after each event occurrence
need to be factored in. A recent general class of models for recurrent events
which simultaneously accommodates these aspects is described. Statistical
inference methods for this class of models are presented and illustrated
through applications to real data sets. Some existing open research problems
are described.Comment: Published at http://dx.doi.org/10.1214/088342306000000349 in the
Statistical Science (http://www.imstat.org/sts/) by the Institute of
Mathematical Statistics (http://www.imstat.org
Recurrent Scene Parsing with Perspective Understanding in the Loop
Objects may appear at arbitrary scales in perspective images of a scene,
posing a challenge for recognition systems that process images at a fixed
resolution. We propose a depth-aware gating module that adaptively selects the
pooling field size in a convolutional network architecture according to the
object scale (inversely proportional to the depth) so that small details are
preserved for distant objects while larger receptive fields are used for those
nearby. The depth gating signal is provided by stereo disparity or estimated
directly from monocular input. We integrate this depth-aware gating into a
recurrent convolutional neural network to perform semantic segmentation. Our
recurrent module iteratively refines the segmentation results, leveraging the
depth and semantic predictions from the previous iterations.
Through extensive experiments on four popular large-scale RGB-D datasets, we
demonstrate this approach achieves competitive semantic segmentation
performance with a model which is substantially more compact. We carry out
extensive analysis of this architecture including variants that operate on
monocular RGB but use depth as side-information during training, unsupervised
gating as a generic attentional mechanism, and multi-resolution gating. We find
that gated pooling for joint semantic segmentation and depth yields
state-of-the-art results for quantitative monocular depth estimation
- …