
    Hete-CF : Social-Based Collaborative Filtering Recommendation using Heterogeneous Relations

    The work described here was funded by the National Natural Science Foundation of China (NSFC) under Grant No. 61373051; the National Science and Technology Pillar Program (Grant No. 2013BAH07F05); the Key Laboratory for Symbolic Computation and Knowledge Engineering, Ministry of Education, China; and the UK Economic & Social Research Council (ESRC), award reference ES/M001628/1. Preprin

    A Novel Self-Intersection Penalty Term for Statistical Body Shape Models and Its Applications in 3D Pose Estimation

    Statistical body shape models are widely used in 3D pose estimation because of their low-dimensional parameter representation. However, it is difficult to avoid self-intersection between body parts accurately. Motivated by this, we propose a novel self-intersection penalty term for statistical body shape models applied in 3D pose estimation. To avoid the difficulty of computing self-intersection for complex surfaces such as body meshes, the gradient of the proposed penalty term is derived manually from a geometric perspective. First, the self-intersection penalty term is defined as the volume of the self-intersection region. To calculate the partial derivatives with respect to the vertex coordinates, we employ detection rays to divide the vertices of the statistical body shape model into groups depending on whether a vertex lies in the region of self-intersection. Second, the partial derivatives are derived from the normal vectors of the triangles neighboring each vertex. Finally, the penalty term can be applied in gradient-based optimization algorithms to remove self-intersection from triangular meshes without any approximation. Qualitative and quantitative evaluations were conducted to demonstrate the effectiveness and generality of the proposed method compared with previous approaches. The experimental results show that the proposed penalty term avoids self-intersection, excludes unreasonable predictions, and indirectly improves the accuracy of 3D pose estimation. Furthermore, the proposed method could be employed universally in triangular-mesh-based 3D reconstruction.
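The link between a volume and the normals of neighboring triangles can be illustrated with a small sketch. This is not the authors' penalty term: it only covers the simpler, well-known case of the signed volume of a whole closed triangle mesh, whose gradient with respect to one vertex is one third of the summed area vectors of the triangles touching that vertex. The function names and the tetrahedron used here are hypothetical and chosen only for illustration; detecting the self-intersection region with rays is not attempted.

```python
def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def signed_volume(vertices, faces):
    """Signed volume of a closed, consistently oriented triangle mesh."""
    return sum(dot(vertices[i], cross(vertices[j], vertices[k]))
               for i, j, k in faces) / 6.0


def volume_gradient(vertices, faces, vertex_index):
    """Gradient of the signed volume with respect to one vertex.

    Each incident triangle contributes one third of its area vector,
    i.e. one sixth of the unnormalized normal (edge cross product).
    """
    grad = [0.0, 0.0, 0.0]
    for i, j, k in faces:
        if vertex_index in (i, j, k):
            normal = cross(sub(vertices[j], vertices[i]),
                           sub(vertices[k], vertices[i]))
            for axis in range(3):
                grad[axis] += normal[axis] / 6.0
    return tuple(grad)


# Hypothetical test mesh: a unit right tetrahedron with outward-facing triangles.
vertices = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
faces = [(0, 2, 1), (0, 1, 3), (0, 3, 2), (1, 2, 3)]

volume = signed_volume(vertices, faces)
apex_gradient = volume_gradient(vertices, faces, 3)
```

Moving the apex straight up increases the volume, while sliding it sideways leaves the volume unchanged, so only the z component of `apex_gradient` is nonzero.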

    A Proximity-Aware Hierarchical Clustering of Faces

    In this paper, we propose an unsupervised face clustering algorithm called "Proximity-Aware Hierarchical Clustering" (PAHC) that exploits the local structure of deep representations. In the proposed method, a similarity measure between deep features is computed by evaluating linear SVM margins. The SVMs are trained using the nearest neighbors of the sample data and thus do not require any external training data. Clusters are then formed by thresholding the similarity scores. We evaluate the clustering performance on three challenging unconstrained face datasets: Celebrity in Frontal-Profile (CFP), IARPA JANUS Benchmark A (IJB-A), and JANUS Challenge Set 3 (JANUS CS3). Experimental results demonstrate that the proposed approach achieves significant improvements over state-of-the-art methods. Moreover, we show that the proposed clustering algorithm can be applied to curate a large-scale, noisy training dataset while maintaining a sufficient number of images and their variations due to nuisance factors. The face verification performance on JANUS CS3 improves significantly when a DCNN model is fine-tuned with the curated MS-Celeb-1M dataset, which contains over three million face images.
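The thresholding step can be sketched as follows. This is a minimal illustration under stated assumptions, not the PAHC algorithm itself: cosine similarity stands in for the SVM-margin similarity described in the abstract, and clusters are taken to be the connected components of the graph that links every pair scoring above the threshold (equivalent to cutting a single-linkage hierarchy at that level). The function names and toy features are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Stand-in similarity; the abstract uses linear SVM margins instead."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def threshold_clusters(features, threshold, similarity=cosine_similarity):
    """Link every pair whose similarity exceeds `threshold` and return
    the connected components as lists of sample indices."""
    parent = list(range(len(features)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(features)):
        for j in range(i + 1, len(features)):
            if similarity(features[i], features[j]) > threshold:
                parent[find(i)] = find(j)

    clusters = {}
    for i in range(len(features)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


# Toy 2-D "deep features": two samples near each axis.
features = [(1.0, 0.0), (0.9, 0.1), (0.0, 1.0), (0.1, 0.95)]
clusters = threshold_clusters(features, threshold=0.9)
```

Raising the threshold splits clusters apart and lowering it merges them, which is the knob the abstract refers to when it says clusters are formed by thresholding the similarity scores.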

    The Perception and Production of Arabic Lexical Stress by Learners of Arabic: A Usage-Based Account

    Studies in second language (L2) stress perception and production over the past few decades have focused on the role of the native language (L1) of the L2 learner and how it systematically influences their performance in stress perception and production. However, these studies have not adequately explored and incorporated an important factor: input frequency of stress patterns (henceforth frequency), a factor that has been widely explored in other disciplines and has been found to be crucially relevant to language processing and learning. To bridge this gap in the literature, this study examines the effect of frequency, in addition to the role played by the learners’ L1, on the perception and production of primary lexical stress in Arabic by L2 learners of Arabic in an experimental environment. To this end, a stress perception and production experiment was conducted with first- and second-year L1 English and L1 Chinese learners of Arabic, as well as L1 Arabic speakers as controls. In the experiment, the participants completed a stress production task, a stress identification task, and a lexical decision task, in which they were asked, respectively, to produce stimuli that were nonsense words with frequency-biased stress patterns, to listen to these frequency-biased stimuli and identify the position of the stressed syllable, and to determine whether the stimuli were real Arabic words. The results indicate a more evident frequency effect in the stress production task, where it had a local effect on learners’ performance on stimuli. Specifically, learners were significantly quicker and more accurate in producing the stimuli when the stimuli had a frequent stress pattern, whereas they were slower and less accurate when the stimuli had infrequent stress patterns.
In contrast, the results show a more global effect of participants’ L1 on their perception and production of stress, as typological differences were found in performance on the stress perception and production tasks. L1 speakers of Arabic were consistently slower and less accurate than the L2 learners in the stress identification task. The L1 Chinese participants had systematically more fluent and accurate production than their L1 English counterparts, which is attributed to the L1 Chinese participants’ better use of correlates of stress. These findings are taken as partial support for the role of frequency in stress perception and production, since significant differences were found in contexts with a larger frequency contrast but not in those with a moderate-to-small frequency contrast, and since the performance of the participants was, to a large extent, conditioned by the preferences for acoustic cues and the prosodic characteristics of their L1. However, frequency of input should not be disregarded, since it did capture aspects of learners’ performance that were not conditioned by their L1. Future studies should build upon the method implemented in the present study to explore the role of frequency in L2 learners at higher proficiency levels as well. Pedagogically, the findings of the present study have several implications for current Arabic curriculum development and teaching practices, including raising Arabic instructors' awareness of lexical stress and its importance for L2 teaching, developing teaching materials, and reflecting on current curricular practices that simultaneously engage multiple varieties exhibiting different stress systems. PhD, Near Eastern Studies, University of Michigan, Horace H. Rackham School of Graduate Studies. https://deepblue.lib.umich.edu/bitstream/2027.42/143979/1/cwlin_1.pd

    Personalized Acoustic Modeling by Weakly Supervised Multi-Task Deep Learning using Acoustic Tokens Discovered from Unlabeled Data

    It is well known that recognizers personalized to each user are much more effective than user-independent recognizers. With the popularity of smartphones today, it is not difficult to collect a large set of audio data for each user, but it is difficult to transcribe it. However, it is now possible to automatically discover acoustic tokens from unlabeled personal data in an unsupervised way. We therefore propose a multi-task deep learning framework called a phoneme-token deep neural network (PTDNN), jointly trained on unsupervised acoustic tokens discovered from unlabeled data and on very limited transcribed data for personalized acoustic modeling. We term this scenario "weakly supervised". The underlying intuition is that the high degree of similarity between the HMM states of acoustic token models and phoneme models may help them learn from each other in this multi-task learning framework. Initial experiments on a personalized audio data set recorded from Facebook posts demonstrated that substantial improvements can be achieved in both frame accuracy and word accuracy over commonly used baselines such as fDLR, speaker code, and lightly supervised adaptation. This approach complements existing speaker adaptation approaches and can be used jointly with such techniques to yield improved results. Comment: 5 pages, 5 figures, published in IEEE ICASSP 201
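One way to picture the joint training objective is a per-frame loss that always includes the acoustic-token target and adds the phoneme target only when a transcription exists. The sketch below is an assumption-laden illustration, not the PTDNN training recipe: the two-head layout, the `token_weight` parameter, and all function names are hypothetical, and the shared network that would produce the logits is omitted.

```python
import math


def softmax(logits):
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target):
    """Negative log-probability of the target class."""
    return -math.log(softmax(logits)[target])


def multitask_frame_loss(phoneme_logits, token_logits,
                         phoneme_target, token_target, token_weight=0.5):
    """Per-frame loss for two output heads on a shared network.

    The token head is supervised on every frame (tokens are discovered
    automatically); the phoneme head contributes only when the frame has
    a transcription, so `phoneme_target` may be None.
    `token_weight` is an assumed balancing factor, not from the paper.
    """
    loss = token_weight * cross_entropy(token_logits, token_target)
    if phoneme_target is not None:
        loss += cross_entropy(phoneme_logits, phoneme_target)
    return loss


# Uniform logits: two phoneme classes, four token classes.
labeled = multitask_frame_loss([0.0, 0.0], [0.0] * 4, 0, 1)
unlabeled = multitask_frame_loss([0.0, 0.0], [0.0] * 4, None, 1)
```

Because both heads would backpropagate into the same shared layers, the plentiful untranscribed frames can still shape the representation used by the phoneme head, which is the intuition the abstract describes.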