Hete-CF: Social-Based Collaborative Filtering Recommendation using Heterogeneous Relations
The work described here was funded by the National Natural Science Foundation of China (NSFC) under Grant No. 61373051; the National Science and Technology Pillar Program (Grant No. 2013BAH07F05); the Key Laboratory for Symbolic Computation and Knowledge Engineering, Ministry of Education, China; and the UK Economic & Social Research Council (ESRC), award reference ES/M001628/1. Preprin
The Perception and Production of Arabic Lexical Stress by Learners of Arabic: A Usage-Based Account
Studies in second language (L2) stress perception and production over the past few decades have focused on the role of the native language (L1) of the L2 learner and how it systematically influences their performance in stress perception and production. However, these studies have not adequately explored and incorporated an important factor: input frequency of stress patterns (henceforth frequency), a factor that has been widely explored in other disciplines and has been found to be crucially relevant to language processing and learning. To bridge this gap in the literature, this study examines the effect of frequency, in addition to the role played by the learners’ L1, on the perception and production of primary lexical stress in Arabic by L2 learners of Arabic in an experimental environment. To this end, a stress perception and production experiment was conducted on first- and second-year L1 English and L1 Chinese learners of Arabic as well as L1 Arabic speakers as controls. In the experiment, the participants completed a stress production task, a stress identification task, and a lexical decision task, in which they were asked, respectively, to produce stimuli that were nonsense words with frequency-biased stress patterns, to listen to these frequency-biased stimuli and identify the position of the stressed syllable, and to determine whether the stimuli were real Arabic words or not.
The results indicate a more evident frequency effect in the stress production task, where it had a local effect on learners’ performance on stimuli. Specifically, they were significantly quicker and more accurate in producing the stimuli when the stimuli had a frequent stress pattern, whereas they were slower and less accurate when the stimuli had infrequent stress patterns.
In contrast, the results show a more global effect of participants’ L1 on their perception and production of stress, as typological differences were found in performance on the stress perception and production tasks. L1 speakers of Arabic were consistently slower and less accurate than the L2 learners in the stress identification task. The L1 Chinese participants had systematically more fluent and accurate production than their L1 English counterparts, which is argued to result from the L1 Chinese participants’ better utilization of correlates of stress.
These findings are taken as partial support for the role of frequency in stress perception and production: significant differences were found in contexts with larger frequency contrast but not in ones with moderate-to-small frequency contrast, and the performance of the participants was, to a large extent, conditioned by the preferences for acoustic cues and prosodic characteristics of their L1. However, frequency of input should not be disregarded, since it did capture aspects of learners’ performance that were not conditioned by their L1. Future studies should build upon the method implemented in the present study to further explore the role of frequency among L2 learners of higher proficiency as well.
Pedagogically, the findings from the present study provide several implications for current Arabic curricular development and teaching practices, including raising the awareness of Arabic instructors of lexical stress and the importance of lexical stress for L2 teaching, developing teaching materials, and reflecting on current curricular practices that simultaneously engage multiple varieties exhibiting different stress systems.
PhD, Near Eastern Studies, University of Michigan, Horace H. Rackham School of Graduate Studies. https://deepblue.lib.umich.edu/bitstream/2027.42/143979/1/cwlin_1.pd
A Novel Self-Intersection Penalty Term for Statistical Body Shape Models and Its Applications in 3D Pose Estimation
Statistical body shape models are widely used in 3D pose estimation due to
their low-dimensional parameter representation. However, it is difficult to
avoid self-intersection between body parts accurately. Motivated by this fact,
we proposed a novel self-intersection penalty term for statistical body shape
models applied in 3D pose estimation. To avoid the trouble of computing
self-intersection for complex surfaces like the body meshes, the gradient of
our proposed self-intersection penalty term is manually derived from the
perspective of geometry. First, the self-intersection penalty term is defined
as the volume of the self-intersection region. To calculate the partial
derivatives with respect to the coordinates of the vertices, we employed
detection rays to divide vertices of statistical body shape models into
different groups depending on whether the vertex is in the region of
self-intersection. Second, the partial derivatives could be easily derived by
the normal vectors of neighboring triangles of the vertices. Finally, this
penalty term could be applied in gradient-based optimization algorithms to
remove the self-intersection of triangular meshes without using any
approximation. Qualitative and quantitative evaluations were conducted to
demonstrate the effectiveness and generality of our proposed method compared
with previous approaches. The experimental results show that our proposed
penalty term can avoid self-intersection, excluding unreasonable predictions
and indirectly improving the accuracy of 3D pose estimation. Furthermore, the
proposed method could be employed universally in triangular-mesh-based 3D
reconstruction.
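
The abstract defines the penalty as a volume and says its partial derivatives follow from the normals of the triangles around each vertex. The sketch below is not the authors' penalty term: it does not detect the self-intersection region or use detection rays. It only illustrates the underlying geometric fact for the simpler case of a whole closed, consistently oriented triangle mesh, where the volume is a sum of signed tetrahedra and its gradient at a vertex is built from cross products over the incident triangles (for a closed mesh this sum equals one third of the summed area-weighted normals). The function names and the tetrahedron example are assumptions made for illustration.

```python
def cross(u, v):
    """Cross product of two 3D vectors given as tuples."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def dot(u, v):
    """Dot product of two 3D vectors."""
    return u[0] * v[0] + u[1] * v[1] + u[2] * v[2]


def mesh_volume(vertices, triangles):
    """Signed volume of a closed mesh with outward-oriented triangles.

    Each triangle (a, b, c) contributes the signed volume of the
    tetrahedron spanned with the origin: a . (b x c) / 6.
    """
    return sum(dot(vertices[i], cross(vertices[j], vertices[k]))
               for i, j, k in triangles) / 6.0


def volume_gradient(vertices, triangles):
    """Analytic gradient of mesh_volume with respect to every vertex."""
    grad = [[0.0, 0.0, 0.0] for _ in vertices]
    for i, j, k in triangles:
        a, b, c = vertices[i], vertices[j], vertices[k]
        # a . (b x c) is cyclic, so its derivatives with respect to
        # a, b and c are b x c, c x a and a x b respectively.
        for idx, g in ((i, cross(b, c)), (j, cross(c, a)), (k, cross(a, b))):
            for d in range(3):
                grad[idx][d] += g[d] / 6.0
    return grad


# Hypothetical example: a unit right tetrahedron, faces oriented outward.
vertices = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
triangles = [(0, 2, 1), (0, 1, 3), (0, 3, 2), (1, 2, 3)]

volume = mesh_volume(vertices, triangles)    # 1/6 for this tetrahedron
grad = volume_gradient(vertices, triangles)  # one 3-vector per vertex
```

Because the gradient is available in closed form, a volume-type term like this can be added to a gradient-based optimizer without numerical differentiation, which is the property the abstract emphasizes.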
A Proximity-Aware Hierarchical Clustering of Faces
In this paper, we propose an unsupervised face clustering algorithm called
"Proximity-Aware Hierarchical Clustering" (PAHC) that exploits the local
structure of deep representations. In the proposed method, a similarity measure
between deep features is computed by evaluating linear SVM margins. SVMs are
trained using nearest neighbors of sample data, and thus do not require any
external training data. Clusters are then formed by thresholding the similarity
scores. We evaluate the clustering performance using three challenging
unconstrained face datasets, including Celebrity in Frontal-Profile (CFP),
IARPA JANUS Benchmark A (IJB-A), and JANUS Challenge Set 3 (JANUS CS3)
datasets. Experimental results demonstrate that the proposed approach can
achieve significant improvements over state-of-the-art methods. Moreover, we
also show that the proposed clustering algorithm can be applied to curate a
large-scale, noisy training dataset while maintaining a sufficient number
of images and their variations due to nuisance factors. The face verification
performance on JANUS CS3 improves significantly by finetuning a DCNN model with
the curated MS-Celeb-1M dataset, which contains over three million face images.
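
The final step described above, forming clusters by thresholding pairwise similarity scores, can be sketched as follows. This is a minimal illustration under stated assumptions, not the PAHC algorithm: cosine similarity stands in for the SVM-margin similarity of the abstract, clusters are taken to be the connected components of the thresholded similarity graph, and the function names, toy features, and threshold value are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two feature vectors (stand-in measure)."""
    num = sum(a * b for a, b in zip(u, v))
    den = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return num / den


def threshold_clusters(features, threshold):
    """Link every pair whose similarity reaches the threshold, then
    return one cluster label per sample (connected components)."""
    n = len(features)
    parent = list(range(n))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if cosine(features[i], features[j]) >= threshold:
                ri, rj = find(i), find(j)
                if ri != rj:
                    parent[rj] = ri

    # Relabel roots as consecutive integers in order of first appearance.
    root_to_label = {}
    return [root_to_label.setdefault(find(i), len(root_to_label))
            for i in range(n)]


# Hypothetical 2D "deep features" for four face images.
features = [(1.0, 0.0), (0.9, 0.1), (0.0, 1.0), (0.1, 0.9)]
labels = threshold_clusters(features, threshold=0.8)  # [0, 0, 1, 1]
```

The threshold controls the trade-off between merging distinct identities and splitting one identity across clusters, so in practice it would be tuned on held-out data.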
Personalized Acoustic Modeling by Weakly Supervised Multi-Task Deep Learning using Acoustic Tokens Discovered from Unlabeled Data
It is well known that recognizers personalized to each user are much more
effective than user-independent recognizers. With the popularity of smartphones
today, although it is not difficult to collect a large set of audio data for
each user, it is difficult to transcribe it. However, it is now possible to
automatically discover acoustic tokens from unlabeled personal data in an
unsupervised way. We therefore propose a multi-task deep learning framework
called a phoneme-token deep neural network (PTDNN), jointly trained from
unsupervised acoustic tokens discovered from unlabeled data and very limited
transcribed data for personalized acoustic modeling. We term this scenario
"weakly supervised". The underlying intuition is that the high degree of
similarity between the HMM states of acoustic token models and phoneme models
may help them learn from each other in this multi-task learning framework.
Initial experiments performed over a personalized audio data set recorded from
Facebook posts demonstrated that very good improvements can be achieved in both
frame accuracy and word accuracy over popularly-considered baselines such as
fDLR, speaker code and lightly supervised adaptation. This approach complements
existing speaker adaptation approaches and can be used jointly with such
techniques to yield improved results. Comment: 5 pages, 5 figures, published in IEEE ICASSP 201
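
The multi-task idea in this abstract, one network trained jointly on phoneme targets and on automatically discovered acoustic-token targets, can be illustrated with a forward pass and a joint loss. This is a rough sketch only: the abstract does not give the PTDNN architecture, so the single shared ReLU layer, the two softmax heads, the layer sizes, the `token_weight` parameter, and all names below are assumptions. The phoneme label is optional to reflect that only a small part of the data is transcribed.

```python
import math
import random


def softmax(z):
    """Numerically stable softmax over a list of logits."""
    m = max(z)
    e = [math.exp(x - m) for x in z]
    s = sum(e)
    return [x / s for x in e]


def linear(W, b, x):
    """Affine map W x + b with W stored as a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]


def init(rows, cols, rng):
    """Small random weight matrix."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class TwoHeadNet:
    """Shared hidden layer feeding a phoneme head and a token head."""

    def __init__(self, n_in, n_hidden, n_phonemes, n_tokens, seed=0):
        rng = random.Random(seed)
        self.W_shared, self.b_shared = init(n_hidden, n_in, rng), [0.0] * n_hidden
        self.W_phone, self.b_phone = init(n_phonemes, n_hidden, rng), [0.0] * n_phonemes
        self.W_token, self.b_token = init(n_tokens, n_hidden, rng), [0.0] * n_tokens

    def forward(self, x):
        h = [max(0.0, v) for v in linear(self.W_shared, self.b_shared, x)]
        return (softmax(linear(self.W_phone, self.b_phone, h)),
                softmax(linear(self.W_token, self.b_token, h)))


def joint_loss(net, x, token_label, phoneme_label=None, token_weight=1.0):
    """Cross-entropy on the token head, plus cross-entropy on the
    phoneme head when a transcription-derived label is available."""
    p_phone, p_token = net.forward(x)
    loss = -token_weight * math.log(p_token[token_label])
    if phoneme_label is not None:
        loss += -math.log(p_phone[phoneme_label])
    return loss


# Hypothetical sizes: 4-dim frame, 8 hidden units, 5 phonemes, 6 tokens.
net = TwoHeadNet(n_in=4, n_hidden=8, n_phonemes=5, n_tokens=6)
frame = [0.2, -0.1, 0.4, 0.0]
p_phone, p_token = net.forward(frame)
loss_unlabeled = joint_loss(net, frame, token_label=2)
loss_labeled = joint_loss(net, frame, token_label=2, phoneme_label=1)
```

Because both heads read the same hidden layer, gradients from the plentiful token targets also shape the representation used by the phoneme head, which is the intuition the abstract gives for the two tasks helping each other.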