359 research outputs found

    3D Object Reconstruction from Imperfect Depth Data Using Extended YOLOv3 Network

    Get PDF
    State-of-the-art intelligent versatile applications provoke the usage of full 3D, depth-based streams, especially in the scenarios of intelligent remote control and communications, where virtual and augmented reality will soon become outdated and are forecasted to be replaced by point cloud streams providing explorable 3D environments of communication and industrial data. One of the most novel approaches employed in modern object reconstruction methods is to use a priori knowledge of the objects that are being reconstructed. Our approach is different as we strive to reconstruct a 3D object within much more difficult scenarios of limited data availability. Data stream is often limited by insufficient depth camera coverage and, as a result, the objects are occluded and data is lost. Our proposed hybrid artificial neural network modifications have improved the reconstruction results by 8.53 which allows us for much more precise filling of occluded object sides and reduction of noise during the process. Furthermore, the addition of object segmentation masks and the individual object instance classification is a leap forward towards a general-purpose scene reconstruction as opposed to a single object reconstruction task due to the ability to mask out overlapping object instances and using only masked object area in the reconstruction process

    Abnormal Infant Movements Classification With Deep Learning on Pose-Based Features

    Get PDF
    The pursuit of early diagnosis of cerebral palsy has been an active research area with some very promising results using tools such as the General Movements Assessment (GMA). In our previous work, we explored the feasibility of extracting pose-based features from video sequences to automatically classify infant body movement into two categories, normal and abnormal. The classification was based upon the GMA, which was carried out on the video data by an independent expert reviewer. In this paper we extend our previous work by extracting the normalised pose-based feature sets, Histograms of Joint Orientation 2D (HOJO2D) and Histograms of Joint Displacement 2D (HOJD2D), for use in new deep learning architectures. We explore the viability of using these pose-based feature sets for automated classification within a deep learning framework by carrying out extensive experiments on five new deep learning architectures. Experimental results show that the proposed fully connected neural network FCNet performed robustly across different feature sets. Furthermore, the proposed convolutional neural network architectures demonstrated excellent performance in handling features in higher dimensionality. We make the code, extracted features and associated GMA labels publicly available

    Illumination-Based Data Augmentation for Robust Background Subtraction

    Get PDF
    A core challenge in background subtraction (BGS) is handling videos with sudden illumination changes in consecutive frames. In this paper, we tackle the problem from a data point-of-view using data augmentation. Our method performs data augmentation that not only creates endless data on the fly, but also features semantic transformations of illumination which enhance the generalisation of the model. It successfully simulates flashes and shadows by applying the Euclidean distance transform over a binary mask generated randomly. Such data allows us to effectively train an illumination-invariant deep learning model for BGS. Experimental results demonstrate the contribution of the synthetics in the ability of the models to perform BGS even when significant illumination changes take place

    3D Car Shape Reconstruction from a Single Sketch Image

    Get PDF
    Efficient car shape design is a challenging problem in both the automotive industry and the computer animation/games industry. In this paper, we present a system to reconstruct the 3D car shape from a single 2D sketch image. To learn the correlation between 2D sketches and 3D cars, we propose a Variational Autoencoder deep neural network that takes a 2D sketch and generates a set of multiview depth & mask images, which are more effective representation comparing to 3D mesh, and can be combined to form the 3D car shape. To ensure the volume and diversity of the training data, we propose a feature-preserving car mesh augmentation pipeline for data augmentation. Since deep learning has limited capacity to reconstruct fine-detail features, we propose a lazy learning approach that constructs a small subspace based on a few relevant car samples in the database. Due to the small size of such a subspace, fine details can be represented effectively with a small number of parameters. With a low-cost optimization process, a high-quality car with detailed features is created. Experimental results show that the system performs consistently to create highly realistic cars of substantially different shape and topology, with a very low computational cost

    Prior-less 3D Human Shape Reconstruction with an Earth Mover’s Distance Informed CNN

    Get PDF
    We propose a novel end-to-end deep learning framework, capable of 3D human shape reconstruction from a 2D image without the need of a 3D prior parametric model. We employ a “prior-less” representation of the human shape using unordered point clouds. Due to the lack of prior information, comparing the generated and ground truth point clouds to evaluate the reconstruction error is challenging. We solve this problem by proposing an Earth Mover’s Distance (EMD) function to find the optimal mapping between point clouds. Our experimental results show that we are able to obtain a visually accurate estimation of the 3D human shape from a single 2D image, with some inaccuracy for heavily occluded parts

    Interaction-based Human Activity Comparison

    Get PDF
    Traditional methods for motion comparison consider features from individual characters. However, the semantic meaning of many human activities is usually defined by the interaction between them, such as a high-five interaction of two characters. There is little success in adapting interaction-based features in activity comparison, as they either do not have a fixed topology or are in high dimensional. In this paper, we propose a unified framework for activity comparison from the interaction point of view. Our new metric evaluates the similarity of interaction by adapting the Earth Mover’s Distance onto a customized geometric mesh structure that represents spatial-temporal interactions. This allows us to compare different classes of interactions and discover their intrinsic semantic similarity. We created five interaction databases of different natures, covering both two characters (synthetic and real-people) and character-object interactions, which are open for public uses. We demonstrate how the proposed metric aligns well with the semantic meaning of the interaction. We also apply the metric in interaction retrieval and show how it outperforms existing ones. The proposed method can be used for unsupervised activity detection in monitoring systems and activity retrieval in smart animation systems

    Single Sketch Image based 3D Car Shape Reconstruction with Deep Learning and Lazy Learning

    Get PDF
    Efficient car shape design is a challenging problem in both the automotive industry and the computer animation/games industry. In this paper, we present a system to reconstruct the 3D car shape from a single 2D sketchimage. To learn the correlation between 2D sketches and 3D cars, we propose a Variational Autoencoder deepneural network that takes a 2D sketch and generates a set of multi-view depth and mask images, which forma more effective representation comparing to 3D meshes, and can be effectively fused to generate a 3D carshape. Since global models like deep learning have limited capacity to reconstruct fine-detail features, wepropose a local lazy learning approach that constructs a small subspace based on a few relevant car samples inthe database. Due to the small size of such a subspace, fine details can be represented effectively with a smallnumber of parameters. With a low-cost optimization process, a high-quality car shape with detailed featuresis created. Experimental results show that the system performs consistently to create highly realistic cars ofsubstantially different shape and topology

    Establishing Pose Based Features Using Histograms for the Detection of Abnormal Infant Movements

    Get PDF
    The pursuit of early diagnosis of cerebral palsy has been an active research area with some very promising results using tools such as the General Movements Assessment (GMA). In this paper, we conducted a pilot study on extracting important information from video sequences to classify the body movement into two categories, normal and abnormal, and compared the results provided by an independent expert reviewer based on GMA. We present two new pose-based features, Histograms of Joint Orientation 2D (HOJO2D) and Histograms of Joint Displacement 2D (HOJD2D), for the pose-based analysis and classification of infant body movement from video footage. We extract the 2D skeletal joint locations from 2D RGB images using Cao et al.’s method 1. Using the MINI-RGBD dataset 2, we further segment the body into local regions to extract part specific features. As a result, the pose and the degree of displacement are represented by histograms of normalised data. To demonstrate the effectiveness of the proposed features, we trained several classifiers using combinations of HOJO2D and HOJD2D features and conducted a series of experiments to classify the body movement into categories. The classification algorithms used included k-Nearest Neighbour (kNN, k=1 and k=3), Linear Discriminant Analysis (LDA) and the Ensemble classifier. Encouraging results were attained, with high accuracy (91.67{\%}) obtained using the Ensemble classifier

    An interactive motion analysis framework for diagnosing and rectifying potential injuries caused through resistance training

    Get PDF
    With the rapid increase in individuals participating in resistance training activities, the number of injuries pertaining to these activities has also grown just as aggressively. Diagnosing the causes of injuries and discomfort requires a large amount of resources from highly experienced physiotherapists. In this paper, we propose a new framework to analyse and visualize movement patterns during performance of four major compound lifts. The analysis generated will be used to efficiently determine whether the exercises are being performed correctly, ensuring anatomy remains within its functional range of motion, in order to prevent strain or discomfort that may lead to injury

    Unifying Person and Vehicle Re-Identification

    Get PDF
    Person and vehicle re-identification (re-ID) are important challenges for the analysis of the burgeoning collection of urban surveillance videos. To efficiently evaluate such videos, which are populated with both vehicles and pedestrians, it would be preferable to have one unified framework with effective performance across both domains. Unfortunately, due to the contrasting composition of humans and vehicles, no architecture has yet been established that can adequately perform both tasks. We release a Person and Vehicle Unified Data Set (PVUD) comprising of both pedestrians and vehicles from popular existing re-ID data sets, in order to better model the data that we would expect to find in the real world. We exploit the generalisation ability of metric learning to propose a re-ID framework that can learn to re-identify humans and vehicles simultaneously. We design our network, MidTriNet, to harness the power of mid-level features to develop better representations for the re-ID tasks. We help the system to handle mixed data by appending unification terms with additional hard negative and hard positive mining to MidTriNet. We attain comparable accuracy training on PVUD to training on the comprising data sets separately, supporting the system's generalisation power. To further demonstrate the effectiveness of our framework, we also obtain results better than, or competitive with, the state-of-the-art on each of the Market-1501, CUHK03, VehicleID and VeRi data sets
    corecore