
    Multi-Label Zero-Shot Human Action Recognition via Joint Latent Ranking Embedding

    Human action recognition refers to automatically recognizing human actions from a video clip. In reality, a video stream often contains multiple human actions. Such a video stream is often weakly annotated with a set of relevant human action labels at a global level, rather than with each label assigned to the specific video episode corresponding to a single action, which leads to a multi-label learning problem. Furthermore, there are many meaningful human actions in reality, but it would be extremely difficult to collect and annotate video clips for all of them, which leads to a zero-shot learning scenario. To the best of our knowledge, no work has addressed all of the above issues together in human action recognition. In this paper, we formulate a real-world human action recognition task as a multi-label zero-shot learning problem and propose a framework to tackle this problem in a holistic way. Our framework addresses the issue of unknown temporal boundaries between different actions for multi-label learning and exploits side information about the semantic relationships between different human actions for knowledge transfer. Consequently, our framework leads to a joint latent ranking embedding for multi-label zero-shot human action recognition. A novel neural architecture of two component models and an alternate learning algorithm are proposed to carry out the joint latent ranking embedding learning. Multi-label zero-shot recognition is then done by measuring relatedness scores of action labels to a test video clip in the joint latent visual and semantic embedding spaces. We evaluate our framework with different settings, including a novel data split scheme designed especially for evaluating multi-label zero-shot learning, on two datasets: Breakfast and Charades. The experimental results demonstrate the effectiveness of our framework. Comment: 27 pages, 10 figures and 7 tables. Technical report submitted to a journal. More experimental results/references were added and typos were corrected.

    Transferable Positive/Negative Speech Emotion Recognition via Class-wise Adversarial Domain Adaptation

    Speech emotion recognition plays an important role in building more intelligent and human-like agents. Because emotional speech data are difficult to collect, an increasingly popular solution is to leverage a related, rich source corpus to help address the target corpus. However, domain shift between the corpora poses a serious challenge, making domain adaptation difficult even for the recognition of positive/negative emotions. In this work, we propose class-wise adversarial domain adaptation to address this challenge by reducing the shift for all classes between different corpora. Experiments on the well-known corpora EMODB and Aibo demonstrate that our method is effective even when only a very limited number of labeled target examples are provided. Comment: 5 pages, 3 figures, accepted to ICASSP 201

    A Total Fractional-Order Variation Model for Image Restoration with Non-homogeneous Boundary Conditions and its Numerical Solution

    To overcome the weakness of a total variation based model for image restoration, various high order (typically second order) regularization models have been proposed and studied recently. In this paper we analyze and test a fractional-order derivative based total α-order variation model, which can outperform the currently popular high order regularization models. Several previous works use total α-order variations for image restoration; however, first, no analysis has yet been done, and second, all tested formulations, which differ from each other, use zero Dirichlet boundary conditions, which are not realistic (while non-zero boundary conditions violate definitions of fractional-order derivatives). This paper first reviews some results on fractional-order derivatives and then rigorously analyzes the theoretical properties of the proposed total α-order variational model. It then develops four algorithms for solving the variational problem, one based on the variational Split-Bregman idea and three based on direct solution of the discretise-optimization problem. Numerical experiments show that, in terms of restoration quality and solution efficiency, the proposed model can produce results for smooth images that are highly competitive with two established high order models: the mean curvature and the total generalized variation. Comment: 26 pages

    X-ray Diffraction Tomographic Imaging and Reconstruction

    Material discrimination based on conventional or dual energy X-ray computed tomography (CT) imaging can be ambiguous. X-ray diffraction imaging (XDI) can be used to construct diffraction profiles of objects, providing new molecular signature information that can be used to characterize the presence of specific materials. Combining X-ray CT and diffraction imaging can lead to enhanced detection and identification of explosives in luggage screening. In this work we investigate techniques for joint reconstruction of CT absorption and X-ray diffraction profile images of objects to achieve improved image quality and enhanced material classification. The initial results have been validated via simulation of X-ray absorption and coherent scattering in two dimensions. U.S. Department of Homeland Security (2008-ST-061-ED0001)

    Exploring Language-Independent Emotional Acoustic Features via Feature Selection

    We propose a novel feature selection strategy to discover language-independent acoustic features that tend to be responsible for emotions regardless of language, linguistics and other factors. Experimental results suggest that the discovered language-independent feature subset yields performance comparable to the full feature set on various emotional speech corpora. Comment: 15 pages, 2 figures, 6 tables