314 research outputs found

    Robust correlated and individual component analysis

    © 1979-2012 IEEE. Recovering correlated and individual components of two, possibly temporally misaligned, sets of data is a fundamental task in disciplines such as image, vision, and behavior computing, with application to problems such as multi-modal fusion (via the correlated components), predictive analysis, and clustering (via the individual ones). Here, we study the extraction of correlated and individual components under real-world conditions, namely i) the presence of gross non-Gaussian noise and ii) temporally misaligned data. In this light, we propose a method for the Robust Correlated and Individual Component Analysis (RCICA) of two sets of data in the presence of gross, sparse errors. We furthermore extend RCICA to handle temporal incongruities arising in the data. To this end, two suitable optimization problems are solved. The generality of the proposed methods is demonstrated by applying them to four applications, namely i) heterogeneous face recognition, ii) multi-modal feature fusion for human behavior analysis (i.e., audio-visual prediction of interest and conflict), iii) face clustering, and iv) the temporal alignment of facial expressions. Experimental results on 2 synthetic and 7 real-world datasets indicate the robustness and effectiveness of the proposed methods on these application domains, outperforming other state-of-the-art methods in the field.

    Potential summer heat-stress of sheep at Greek husbandry areas of different landscape

    In recent years, owing to its economic importance, sheep farming has been expanding into flat-land areas of Greece that exhibit less favourable climatic conditions, especially during summer. It is therefore justifiable to assess the potential summer heat-stress of sheep raised in areas of different landscape. Potential heat-stress of sheep during summer was studied at three Greek husbandry areas of different landscape, namely Larissa (flat land), Ioannina (semi-mountainous) and Trikala Korinthias (mountainous). The indices used were the night hours during which ambient temperature was below 21°C, the Temperature Humidity Index (THI), the time percentage (%) within predefined heat-stress categories and the THI-hrs index. Overall, the area of Larissa exhibited the worst heat-stress conditions. Average ambient summer temperatures there were above 21°C during the whole 24 h period, whereas at Ioannina and Trikala Korinthias average temperatures were below 21°C for almost half of the day, including the night. Daily average THI values were 27.2±0.2 for Larissa, 21.8±0.2 for Ioannina and 21.3±0.2 for Trikala Korinthias. During both the hottest and the coolest summer days, the average daily THI values at Larissa were higher than those at Ioannina, which in turn were higher than those at Trikala Korinthias. At Larissa the time percentage (%) within the extreme severe heat-stress category (IV) was significantly (P<0.05) higher, namely 58.3%, compared to Ioannina (34.3%) and Trikala Korinthias (9.2%). Average (2010-2014) THI-hrs under heat-stress were 11491 for Larissa, 5722 for Ioannina (49.8% of Larissa) and 1868 for Trikala Korinthias (16.3% of Larissa). Expansion of sheep husbandry into flat-land areas, and the design criteria within sheep facilities (e.g. breed used, feeding strategy, housing density, floor type, etc.), should be implemented very cautiously.
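
The relative THI-hrs figures quoted in this abstract can be rechecked from the reported totals. The snippet below is only an arithmetic check; the function and variable names are illustrative and do not come from the study.

```python
# Reported average (2010-2014) THI-hrs under heat-stress for each area.
thi_hrs = {"Larissa": 11491, "Ioannina": 5722, "Trikala Korinthias": 1868}


def share_of_reference(values, reference="Larissa"):
    """Express each area's THI-hrs as a percentage of the reference area."""
    ref = values[reference]
    return {area: round(100.0 * v / ref, 1) for area, v in values.items()}


shares = share_of_reference(thi_hrs)
for area, pct in shares.items():
    print(f"{area}: {pct}% of Larissa")
```

Both recomputed percentages agree with the values stated in the abstract (49.8% and 16.3%).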

    Music classification by low-rank semantic mappings

    A challenging open question in music classification is which music representation (i.e., audio features) and which machine learning algorithm are appropriate for a specific music classification task. To address this challenge, given a number of audio feature vectors for each training music recording that capture different aspects of music (e.g., timbre, harmony), the goal is to find a set of linear mappings from several feature spaces to the semantic space spanned by the class indicator vectors. These mappings should reveal the common latent variables that characterize a given set of classes and simultaneously define a multi-class linear classifier that classifies the extracted latent common features. Such a set of mappings is obtained, building on the notion of maximum margin matrix factorization, by minimizing a weighted sum of nuclear norms. Since the nuclear norm imposes rank constraints on the learnt mappings, the proposed method is referred to as low-rank semantic mappings (LRSMs). The performance of the LRSMs in music genre, mood, and multi-label classification is assessed by conducting extensive experiments on seven manually annotated benchmark datasets. The reported experimental results demonstrate the superiority of the LRSMs over the classifiers they are compared with. Furthermore, the best reported classification results are comparable with or slightly superior to those obtained by state-of-the-art task-specific music classification methods.
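
To illustrate why a nuclear-norm penalty favours low-rank mappings, the sketch below computes the nuclear norm (the sum of singular values) and applies singular-value soft-thresholding, which is the standard proximal operator of that norm. This is a generic NumPy illustration on synthetic data, not the LRSM algorithm from the paper; the matrix sizes, noise level and threshold `tau` are assumptions chosen for the demo.

```python
import numpy as np


def nuclear_norm(M):
    """Sum of the singular values of M."""
    return float(np.linalg.svd(M, compute_uv=False).sum())


def singular_value_threshold(M, tau):
    """Proximal operator of tau * nuclear norm: shrink singular values by tau."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    s_shrunk = np.maximum(s - tau, 0.0)  # small singular values become exactly zero
    return (U * s_shrunk) @ Vt


# Synthetic rank-2 matrix plus small dense noise (hypothetical data).
rng = np.random.default_rng(0)
low_rank = rng.standard_normal((20, 2)) @ rng.standard_normal((2, 15))
noisy = low_rank + 0.01 * rng.standard_normal((20, 15))

denoised = singular_value_threshold(noisy, tau=0.5)
rank_after = int(np.linalg.matrix_rank(denoised))
print("rank before:", np.linalg.matrix_rank(noisy), "rank after:", rank_after)
```

Because thresholding sets the small singular values to zero, the result has lower rank than the noisy input, which is the sense in which the penalty acts as a rank constraint.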

    Elastic net subspace clustering applied to pop/rock music structure analysis

    A novel homogeneity-based method for music structure analysis is proposed. The heart of the method is a similarity measure, derived from first principles, that is based on the matrix Elastic Net (EN) regularization and deals efficiently with highly correlated audio feature vectors. In particular, beat-synchronous mel-frequency cepstral coefficients, chroma features, and auditory temporal modulations model the audio signal. The EN-induced similarity measure is employed to construct an affinity matrix, yielding a novel subspace clustering method referred to as Elastic Net subspace clustering (ENSC). The performance of the ENSC in structure analysis is assessed by conducting extensive experiments on the Beatles dataset. The experimental findings demonstrate the descriptive power of the EN-based affinity matrix over the affinity matrices employed in other subspace clustering methods, attaining the state-of-the-art performance reported for the Beatles dataset.

    GAGAN: Geometry-Aware Generative Adversarial Networks

    Deep generative models learned through adversarial training have become increasingly popular for their ability to generate naturalistic image textures. However, aside from their texture, the visual appearance of objects is significantly influenced by their shape geometry, information which is not taken into account by existing generative models. This paper introduces the Geometry-Aware Generative Adversarial Networks (GAGAN) for incorporating geometric information into the image generation process. Specifically, in GAGAN the generator samples latent variables from the probability space of a statistical shape model. By mapping the output of the generator to a canonical coordinate frame through a differentiable geometric transformation, we enforce the geometry of the objects and add an implicit connection from the prior to the generated object. Experimental results on face generation indicate that GAGAN can generate realistic images of faces with arbitrary facial attributes, such as facial expression, pose, and morphology, that are of better quality than those of current GAN-based methods. Our method can be used to augment any existing GAN architecture and improve the quality of the images generated.

    Behavior prediction in-the-wild

    In this paper, the problem of audio-visual behavior prediction in-the-wild is addressed. In this context, both audio-visual descriptors of behavioral cues (features) and continuous-time real-valued characterizations of behavior (annotations) are (possibly) corrupted by non-Gaussian noise of large magnitude. The modeling assumption behind the proposed framework is that naturalistic affect and behavior captured in audio-visual episodes are smoothly-varying dynamic phenomena and thus the hidden temporal dynamics can be modeled as a generative auto-regressive process. Consequently, continuous-time real-valued characterizations of behavior (annotations) are postulated to be outputs of a low-complexity (i.e., low-order) time-invariant Linear Dynamical System (LDS) when descriptors of behavioral cues (features) act as inputs. To learn the parameters of the LDS, a recently proposed spectral method that relies on Hankel-rank minimization is adopted. Experimental evaluation on a challenging database recorded in the wild demonstrates the effectiveness of the proposed approach in behavior prediction.
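
The link between a low-order LDS and a low-rank Hankel matrix can be seen in a small example: the Hankel matrix built from the impulse response of a minimal order-n time-invariant system has rank n. The sketch below demonstrates this standard fact for a hypothetical order-2 system; the matrices `A`, `B`, `C` are made up for the demo, and the code is not the spectral method used in the paper.

```python
import numpy as np


def impulse_response(A, B, C, steps):
    """Markov parameters h[k] = C A^k B of a single-input, single-output LDS."""
    h, x = [], B.copy()
    for _ in range(steps):
        h.append((C @ x).item())
        x = A @ x
    return np.array(h)


def hankel_matrix(seq, rows):
    """Hankel matrix whose (i, j) entry is seq[i + j]."""
    cols = len(seq) - rows + 1
    return np.array([seq[i:i + cols] for i in range(rows)])


# Hypothetical stable, controllable and observable order-2 system.
A = np.array([[0.9, 0.2],
              [0.0, 0.5]])
B = np.array([[1.0],
              [1.0]])
C = np.array([[1.0, 0.0]])

h = impulse_response(A, B, C, steps=12)
H = hankel_matrix(h, rows=6)  # 6 x 7 Hankel matrix
hankel_rank = int(np.linalg.matrix_rank(H, tol=1e-8))
print("Hankel rank:", hankel_rank)
```

Although the Hankel matrix here is 6 x 7, its numerical rank equals the system order, which is why penalizing Hankel rank favours low-order dynamics.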