236 research outputs found

    Deep Learning-based Anomaly Detection on X-ray Images of Fuel Cell Electrodes

    Anomaly detection in X-ray images has been an active research area for decades, especially for medical X-ray images. For this work, we created a real-world labeled anomaly dataset consisting of 16-bit X-ray images of fuel cell electrodes coated with a platinum catalyst solution, and we perform anomaly detection on the dataset using a deep learning approach. The dataset contains a diverse set of anomalies, including 11 identified common anomaly types such as scratches, bubbles, and smudges. We experiment with 16-bit to 8-bit image conversion methods in order to use pre-trained Convolutional Neural Networks as feature extractors (transfer learning), and find that the best performance is achieved by maximizing contrast globally across the dataset during the 16-bit to 8-bit conversion through histogram equalization. We group the fuel cell electrodes with anomalies into a single class called abnormal and the normal fuel cell electrodes into a class called normal, thereby reducing the anomaly detection problem to a binary classification problem. We achieve a balanced accuracy of 85.18%. The anomaly detection is used by the company Serenergy to optimize the time spent on quality control of the fuel cell electrodes. Comment: 10 pages, 9 figures, VISAPP202
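The dataset-wide 16-bit to 8-bit conversion and the balanced accuracy metric mentioned in this abstract can be sketched roughly as follows. This is a minimal NumPy illustration under assumptions, not the authors' pipeline: the function names `fit_global_equalization`, `to_8bit`, and `balanced_accuracy` are hypothetical, and the equalization is the textbook CDF-based variant computed over the pooled histogram of all images.

```python
import numpy as np

def fit_global_equalization(images_16bit):
    """Build a 65536-entry lookup table from the pooled histogram of all images.

    Assumption: "global" equalization means one shared mapping for the whole
    dataset rather than one mapping per image.
    """
    hist = np.zeros(65536, dtype=np.int64)
    for img in images_16bit:
        hist += np.bincount(img.ravel(), minlength=65536)
    cdf = np.cumsum(hist).astype(np.float64)
    # CDF value at the darkest intensity that actually occurs in the data.
    cdf_min = cdf[np.nonzero(hist)[0][0]]
    denom = max(cdf[-1] - cdf_min, 1.0)
    lut = np.clip(np.round((cdf - cdf_min) / denom * 255.0), 0, 255)
    return lut.astype(np.uint8)

def to_8bit(image_16bit, lut):
    """Map a uint16 image to uint8 through the shared lookup table."""
    return lut[image_16bit]

def balanced_accuracy(y_true, y_pred):
    """Mean of the per-class recalls (normal and abnormal weigh equally)."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    recalls = [np.mean(y_pred[y_true == c] == c) for c in np.unique(y_true)]
    return float(np.mean(recalls))
```

Because the lookup table is fitted once on the pooled histogram, every image is mapped with the same intensity curve, so a given 16-bit value always lands on the same 8-bit value regardless of which electrode it came from.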

    Cross-View Action Recognition from Temporal Self-Similarities

    This paper concerns recognition of human actions under view changes. We explore self-similarities of action sequences over time and observe the striking stability of such measures across views. Building upon this key observation, we develop an action descriptor that captures the structure of temporal similarities and dissimilarities within an action sequence. Although this descriptor is not strictly view-invariant, we provide intuition and experimental validation demonstrating the high stability of self-similarities under view changes. Self-similarity descriptors are also shown to be stable under action variations within a class as well as discriminative for action recognition. Interestingly, self-similarities computed from different image features possess similar properties and can be used in a complementary fashion. Our method is simple and requires neither structure recovery nor multi-view correspondence estimation. Instead, it relies on weak geometric cues captured by self-similarities and combines them with machine learning for efficient cross-view action recognition. The method is validated on three public datasets. It has similar or superior performance compared to related methods, and it performs well even in extreme conditions, such as recognizing actions from top views while using only side views for training.
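A temporal self-similarity matrix of the kind this abstract builds on can be sketched as below. This is an illustrative NumPy sketch under assumptions: the function name `self_similarity_matrix` is hypothetical, Euclidean distance between per-frame feature vectors is an assumed choice of similarity measure, and the descriptor that the paper computes on top of the matrix is not shown.

```python
import numpy as np

def self_similarity_matrix(features):
    """Pairwise distances between all frames of one sequence.

    features: (T, D) array, one D-dimensional feature vector per frame.
    Returns a (T, T) matrix whose entry (i, j) is the Euclidean distance
    between frame i and frame j (assumed measure, see lead-in).
    """
    f = np.asarray(features, dtype=np.float64)
    diff = f[:, None, :] - f[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))
```

The matrix depends only on distances between frames of the same sequence, so a rigid rotation and translation of 3D input points leaves it unchanged. That is only the easy case: for features measured in 2D image projections, the abstract claims high stability rather than strict invariance.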

    Sparse motion bases selection for human motion denoising

    Human motion denoising is an indispensable data preprocessing step for many applications based on motion data. In this paper, we propose a data-driven human motion denoising method that sparsely selects the most correlated subset of motion bases for clean motion reconstruction. It also takes the statistical properties of two common types of noise, i.e., Gaussian noise and outliers, into account in deriving the objective functions. In particular, our method first divides each human pose into five partitions, termed poselets, to obtain a more fine-grained pose representation. Then, these poselets are reorganized into multiple overlapping poselet groups using a lagged window moving across the entire motion sequence, to preserve the embedded spatial-temporal motion patterns. Afterward, five compact and representative motion dictionaries are constructed in parallel by means of fast K-SVD in the training phase; they are used to remove the noise and outliers from noisy motion sequences in the testing phase by solving ℓ1-minimization problems. Extensive experiments show that our method outperforms its competitors. More importantly, compared with other data-driven methods, our method does not need specifically chosen training data, so it can be more easily applied to real-world applications.
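The ℓ1-minimization step in the testing phase can be illustrated with a generic sparse coding solver. The sketch below uses plain ISTA (iterative soft-thresholding) on the standard lasso objective, which is a swapped-in textbook method and not the paper's own objective: it models only the Gaussian noise term, leaves out the outlier handling, and does not learn the dictionary with K-SVD. The names `soft_threshold`, `ista`, `D`, `y`, and `lam` are hypothetical.

```python
import numpy as np

def soft_threshold(x, tau):
    """Proximal operator of tau * ||x||_1 (shrinks every entry toward zero)."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def ista(D, y, lam=0.1, n_iter=500):
    """Approximately minimize 0.5 * ||y - D a||_2^2 + lam * ||a||_1 over a.

    D: (m, k) dictionary whose columns play the role of motion bases.
    y: (m,) noisy observation, e.g. one flattened poselet group.
    """
    L = np.linalg.norm(D, 2) ** 2  # Lipschitz constant of the smooth term
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ a - y)
        a = soft_threshold(a - grad / L, lam / L)
    return a
```

With coefficients `a = ista(D, y)`, the denoised signal in this simplified setting would be `D @ a`, a combination of the few bases with non-zero coefficients.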

    Exploiting temporal stability and low-rank structure for motion capture data refinement

    Inspired by the development of matrix completion theory and algorithms, a low-rank based motion capture (mocap) data refinement method has been developed, which has achieved encouraging results. However, considering only the low-rank property of the motion data does not guarantee a stable outcome. To solve this problem, we propose to exploit the temporal stability of human motion and convert the mocap data refinement problem into a robust matrix completion problem, where the low-rank structure and temporal stability properties of the mocap data, as well as the noise effect, are all considered. An efficient optimization method derived from the augmented Lagrange multiplier algorithm is presented to solve the proposed model. In addition, a trust data detection method is introduced to improve the degree of automation for processing the entire set of the data and to boost the performance. Extensive experiments and comparisons with other methods demonstrate the effectiveness of our approaches in both predicting missing data and de-noising. © 2014 Elsevier Inc. All rights reserved.

    Biview learning for human posture segmentation from 3D points cloud

    Posture segmentation plays an essential role in human motion analysis. The state-of-the-art method extracts sufficiently high-dimensional features from 3D depth images for each 3D point and learns an efficient body part classifier. However, high-dimensional features are memory-consuming and difficult to handle on large-scale training datasets. In this paper, we propose an efficient two-stage dimension reduction scheme, termed biview learning, to encode two independent views: depth-difference features (DDF) and relative position features (RPF). Biview learning explores the complementary property of DDF and RPF, and uses two stages to learn a compact yet comprehensive low-dimensional feature space for posture segmentation. In the first stage, discriminative locality alignment (DLA) is applied to the high-dimensional DDF to learn a discriminative low-dimensional representation. In the second stage, canonical correlation analysis (CCA) is used to explore the complementary property of RPF and the dimensionality-reduced DDF. Finally, we train a support vector machine (SVM) over the output of CCA. We carefully validate the effectiveness of DLA and CCA in the two-stage scheme on our 3D human points cloud dataset. Experimental results show that the proposed biview learning scheme significantly outperforms the state-of-the-art method for human posture segmentation. © 2014 Qiao et al.

    Monocular Expressive Body Regression through Body-Driven Attention

    To understand how people look, interact, or perform tasks, we need to quickly and accurately capture their 3D body, face, and hands together from an RGB image. Most existing methods focus only on parts of the body. A few recent approaches reconstruct full expressive 3D humans from images using 3D body models that include the face and hands. These methods are optimization-based and thus slow, prone to local optima, and require 2D keypoints as input. We address these limitations by introducing ExPose (EXpressive POse and Shape rEgression), which directly regresses the body, face, and hands, in SMPL-X format, from an RGB image. This is a hard problem due to the high dimensionality of the body and the lack of expressive training data. Additionally, hands and faces are much smaller than the body, occupying very few image pixels. This makes hand and face estimation hard when body images are downscaled for neural networks. We make three main contributions. First, we account for the lack of training data by curating a dataset of SMPL-X fits on in-the-wild images. Second, we observe that body estimation localizes the face and hands reasonably well. We introduce body-driven attention for face and hand regions in the original image to extract higher-resolution crops that are fed to dedicated refinement modules. Third, these modules exploit part-specific knowledge from existing face- and hand-only datasets. ExPose estimates expressive 3D humans more accurately than existing optimization methods at a small fraction of the computational cost. Our data, model and code are available for research at https://expose.is.tue.mpg.de. Comment: Accepted in ECCV'20. Project page: http://expose.is.tue.mpg.d
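The idea behind body-driven attention, locating a face or hand from the low-resolution body estimate and then cropping that region from the original image, can be sketched in a simplified form. This is not the ExPose implementation: the function name `crop_part_from_original`, the `scale` and `margin` parameters, and the axis-aligned bounding box are all assumptions made for illustration.

```python
import numpy as np

def crop_part_from_original(original, part_points_lowres, scale, margin=0.2):
    """Crop a face or hand region from the full-resolution image.

    original: (H, W, C) full-resolution image array.
    part_points_lowres: (N, 2) array of (x, y) points for one part, given in
        the coordinates of the downscaled image fed to the body network.
    scale: original size divided by downscaled size (assumed uniform in x, y).
    margin: fraction of the box size added as padding on every side.
    """
    pts = np.asarray(part_points_lowres, dtype=np.float64) * scale
    x0, y0 = pts.min(axis=0)
    x1, y1 = pts.max(axis=0)
    pad_x = margin * (x1 - x0)
    pad_y = margin * (y1 - y0)
    h, w = original.shape[:2]
    # Clamp the padded box to the image borders.
    left = int(max(np.floor(x0 - pad_x), 0))
    top = int(max(np.floor(y0 - pad_y), 0))
    right = int(min(np.ceil(x1 + pad_x), w))
    bottom = int(min(np.ceil(y1 + pad_y), h))
    return original[top:bottom, left:right]
```

Cropping from the original image rather than from the downscaled input is what gives the refinement stage more pixels for small parts: a hand covering 10 pixels in the network input covers 40 pixels in the crop when `scale` is 4.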

    Automated Home-Cage Behavioural Phenotyping of Mice

    Neurobehavioral analysis of mouse phenotypes requires the monitoring of mouse behavior over long periods of time. Here, we describe a trainable computer vision system enabling the automated analysis of complex mouse behaviors. We provide software and an extensive manually annotated video database used for training and testing the system. Our system performs on par with human scoring, as measured from ground-truth manual annotations of thousands of clips of freely behaving mice. As a validation of the system, we characterized the home-cage behaviors of two standard inbred and two non-standard mouse strains. From these data we were able to predict in a blind test the strain identity of individual animals with high accuracy. Our video-based software will complement existing sensor-based automated approaches and enable an adaptable, comprehensive, high-throughput, fine-grained, automated analysis of mouse behavior. Funding: McGovern Institute for Brain Research; California Institute of Technology. Broad Fellows Program in Brain Circuitry; National Science Council (China) (TMS-094-1-A032