
    Fast character modeling with sketch-based PDE surfaces

    © 2020, The Author(s). Virtual characters are 3D geometric models of characters and have many applications in multimedia. In this paper, we propose a new physics-based deformation method and an efficient character modeling framework for creating detailed 3D virtual character models. Our proposed physics-based deformation method uses PDE surfaces. Here PDE is the abbreviation of Partial Differential Equation, and PDE surfaces are defined as sculpting force-driven shape representations of interpolation surfaces. Interpolation surfaces are obtained by interpolating key cross-section profile curves, and the sculpting force-driven shape representation uses an analytical solution to a vector-valued partial differential equation involving sculpting forces to quickly obtain deformed shapes. Our proposed character modeling framework consists of global modeling and local modeling. Global modeling, also called model building, is the process of quickly creating a whole character model with sketch-guided and template-based modeling techniques. Local modeling efficiently produces local details with four shape manipulation techniques to improve the realism of the created character model. The sketch-guided global modeling generates a character model from three different levels of sketched profile curves, called primary, secondary and key cross-section curves, in three orthographic views. The template-based global modeling obtains a new character model by deforming a template model to match the three different levels of profile curves. Four shape manipulation techniques for local modeling are investigated and integrated into the new modeling framework: partial differential equation-based shape manipulation, generalized elliptic curve-driven shape manipulation, sketch-assisted shape manipulation, and template-based shape manipulation. These new local modeling techniques have both global and local shape control functions and are efficient in local shape manipulation. The final character models are represented with a collection of surfaces, which are modeled with two types of geometric entities: generalized elliptic curves (GECs) and partial differential equation-based surfaces. Our experiments indicate that the proposed modeling approach can build detailed and realistic character models easily and quickly.

    Voxel-Based 3D Object Reconstruction from Single 2D Image Using Variational Autoencoders

    In recent years, learning-based approaches for 3D reconstruction have gained much popularity due to their encouraging results. However, unlike 2D images, 3D data has no canonical representation that is both computationally lean and memory-efficient. Moreover, generating a 3D model directly from a single 2D image is even more challenging because the image provides only limited detail for 3D reconstruction. Existing learning-based techniques still lack the resolution, efficiency, and smoothness of 3D models required for many practical applications. In this paper, we propose two models for voxel-based 3D object reconstruction (V3DOR) from a single 2D image for better accuracy, one using autoencoders (AE) and another using variational autoencoders (VAE). The encoder part of both models learns a suitable compressed latent representation from a single 2D image, and a decoder generates the corresponding 3D model. Our contribution is twofold. First, to the best of the authors’ knowledge, it is the first time that variational autoencoders (VAE) have been employed for the 3D reconstruction problem. Second, the proposed models extract a discriminative set of features and generate a smoother, high-resolution 3D model. To evaluate the efficacy of the proposed method, experiments have been conducted on the benchmark ShapeNet dataset. The results confirm that the proposed method outperforms state-of-the-art methods.
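The abstract does not give the V3DOR architecture, so the following is only a minimal sketch of the two ingredients that distinguish a generic VAE from a plain autoencoder: sampling the latent code with the reparameterization trick, and the KL term that regularizes the latent distribution toward a standard normal. The function names and the per-dimension list representation are assumptions made for illustration, not the authors' code.

```python
import math
import random


def reparameterize(mu, log_var, rng=random):
    """Sample z = mu + sigma * eps with eps ~ N(0, 1), one value per latent dimension."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0) for m, lv in zip(mu, log_var)]


def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, sigma^2) || N(0, 1) ), summed over latent dimensions."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv) for m, lv in zip(mu, log_var))
```

In a full model, an image encoder would produce `mu` and `log_var`, and a decoder would map the sampled `z` to a voxel grid; both networks are omitted here.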

    DSTnet: Deformable Spatio-Temporal Convolutional Residual Network for Video Super-Resolution

    Video super-resolution (VSR) aims at generating high-resolution (HR) video frames with plausible and temporally consistent details from their low-resolution (LR) counterparts and neighboring frames. The key challenge for VSR lies in the effective exploitation of the intra-frame spatial relation and the temporal dependency between consecutive frames. Many existing techniques utilize spatial and temporal information separately and compensate motion via alignment. These methods cannot fully exploit the spatio-temporal information that significantly affects the quality of the resultant HR videos. In this work, a novel deformable spatio-temporal convolutional residual network (DSTnet) is proposed to overcome the issues of separate motion estimation and compensation methods for VSR. The proposed framework consists of 3D convolutional residual blocks decomposed into spatial and temporal (2+1)D streams. This decomposition can simultaneously utilize the input video’s spatial and temporal features without a separate motion estimation and compensation module. Furthermore, deformable convolution layers are used in the proposed model to enhance its motion-awareness capability. Our contribution is twofold. Firstly, the proposed approach can overcome the challenges in modeling complex motions by efficiently using spatio-temporal information. Secondly, the proposed model has fewer parameters to learn than state-of-the-art methods, making it a computationally lean and efficient framework for VSR. Experiments are conducted on the benchmark Vid4 dataset to evaluate the efficacy of the proposed approach. The results demonstrate that the proposed approach achieves superior quantitative and qualitative performance compared to the state-of-the-art methods.
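To illustrate why a (2+1)D decomposition can reduce the parameter count, the sketch below compares the weights of one full 3D convolution with those of a spatial convolution followed by a temporal one. The layer sizes are illustrative assumptions; the abstract does not state DSTnet's actual channel widths or kernel sizes, and biases are ignored.

```python
def conv3d_params(c_in, c_out, t, d):
    """Weights in a full t x d x d 3D convolution (bias ignored)."""
    return c_in * c_out * t * d * d


def conv2plus1d_params(c_in, c_mid, c_out, t, d):
    """Weights in a 1 x d x d spatial conv followed by a t x 1 x 1 temporal conv."""
    spatial = c_in * c_mid * d * d   # spatial stream: c_in -> c_mid channels
    temporal = c_mid * c_out * t     # temporal stream: c_mid -> c_out channels
    return spatial + temporal


if __name__ == "__main__":
    # Hypothetical layer: 64 channels in and out, 3 x 3 x 3 kernel.
    print(conv3d_params(64, 64, 3, 3))           # 110592
    print(conv2plus1d_params(64, 64, 64, 3, 3))  # 49152
```

The saving depends on the intermediate width `c_mid`; a sufficiently large `c_mid` would make the decomposed block as large as, or larger than, the full 3D convolution.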

    Hyperspectral Image Classification via a Novel Spectral–Spatial 3D ConvLSTM-CNN

    In recent years, deep learning-based models have produced encouraging results for hyperspectral image (HSI) classification. Specifically, Convolutional Long Short-Term Memory (ConvLSTM) has shown good performance for learning valuable features and modeling long-term dependencies in spectral data. However, it is less effective for learning spatial features, which are an integral part of hyperspectral images. Alternatively, convolutional neural networks (CNNs) can learn spatial features, but they are limited in handling long-term dependencies because they extract features locally. Considering these factors, this paper proposes an end-to-end Spectral-Spatial 3D ConvLSTM-CNN based Residual Network (SSCRN), which combines 3D ConvLSTM and 3D CNN for handling spectral and spatial information, respectively. The contribution of the proposed network is twofold. Firstly, it addresses the long-term dependencies of the spectral dimension using 3D ConvLSTM to effectively capture the information related to various ground materials. Secondly, it learns discriminative spatial features using a 3D CNN with residual blocks to accelerate the training process and alleviate overfitting. In addition, SSCRN uses batch normalization and dropout to regularize the network for smooth learning. The proposed framework is evaluated on three benchmark datasets widely used by the research community. The results confirm that SSCRN outperforms state-of-the-art methods with an overall accuracy of 99.17%, 99.67%, and 99.31% on the Indian Pines, Salinas, and Pavia University datasets, respectively. Moreover, these results were achieved with comparatively fewer epochs, which also confirms the fast learning capabilities of the SSCRN.

    Android-Based Verification System for Banknotes

    With the advancement in imaging technologies for scanning and printing, production of counterfeit banknotes has become cheaper, easier, and more common. The proliferation of counterfeit banknotes causes losses to banks, traders, and individuals involved in financial transactions. Hence, efficient and reliable techniques for detecting counterfeit banknotes are needed. With the availability of powerful smartphones, it has become possible to perform complex computations and image processing tasks on these phones. In addition, the number of smartphone users has increased greatly and continues to grow. This is a great motivating factor for researchers and developers to propose innovative mobile-based solutions. In this study, a novel technique for verification of Pakistani banknotes is developed, targeting smartphones running the Android platform. The proposed technique is based on statistical features and the surface roughness of a banknote, representing different properties of the banknote, such as paper material, printing ink, paper quality, and surface roughness. The selection of these features is motivated by X-ray Diffraction (XRD) and Scanning Electron Microscopy (SEM) analysis of genuine and counterfeit banknotes. In this regard, two important areas of the banknote, i.e., the serial number and flag portions, were considered since these portions showed the maximum difference between genuine and counterfeit banknotes. The analysis confirmed that genuine and counterfeit banknotes are very different in terms of the printing process, the ingredients used in preparation of banknotes, and the quality of the paper. After extracting the discriminative set of features, a support vector machine is used for classification. The experimental results confirm the high accuracy of the proposed technique.

    Melanoma Classification from Dermoscopy Images Using Ensemble of Convolutional Neural Networks

    Human skin is the most exposed part of the human body and needs constant protection and care from heat, light, dust, and direct exposure to harmful radiation, such as UV rays. Skin cancer is one of the dangerous diseases found in humans. Melanoma is a form of skin cancer that begins in the cells (melanocytes) that control the pigment in human skin. Early detection and diagnosis of skin cancer, such as melanoma, is necessary to reduce the death rate due to skin cancer. In this paper, acral lentiginous melanoma, a type of melanoma, is classified against benign nevi. The proposed stacked ensemble method for melanoma classification uses different pre-trained models, such as Xception, Inceptionv3, InceptionResNet-V2, DenseNet121, and DenseNet201, by employing the concept of transfer learning and fine-tuning. The pre-trained CNN architectures for transfer learning are selected from the models with the highest top-1 and top-5 accuracies on ImageNet. A novel stacked ensemble-based framework is presented to improve generalizability and increase robustness by fusing fine-tuned pre-trained CNN models for acral lentiginous melanoma classification. The performance of the proposed method is evaluated by experimenting on a Figshare benchmark dataset. The impact of applying different augmentation techniques has also been analyzed through extensive experimentation. The results confirm that the proposed method outperforms state-of-the-art techniques and achieves an accuracy of 97.93%.
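The fusion step of a stacked ensemble can be sketched as follows: the class probabilities predicted by each base model are stacked into one feature vector per image, and a small meta-learner is trained on those vectors. This is a minimal sketch under assumptions: the abstract does not say which meta-learner the authors use, so a plain logistic regression trained by gradient descent stands in for it, and the probabilities below are made-up toy values rather than outputs of the fine-tuned CNNs.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_meta_learner(stacked, labels, lr=0.5, epochs=2000):
    """Fit logistic regression on stacked base-model probabilities (batch gradient descent)."""
    n, n_features = len(stacked), len(stacked[0])
    weights, bias = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for x, y in zip(stacked, labels):
            err = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x))) - y
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def predict(weights, bias, x):
    """Return 1 (melanoma) or 0 (benign nevus) for one stacked probability vector."""
    return 1 if sigmoid(bias + sum(w * xi for w, xi in zip(weights, x))) >= 0.5 else 0


# Toy melanoma probabilities from three hypothetical base models, one row per image.
STACKED = [[0.9, 0.8, 0.7], [0.8, 0.6, 0.9], [0.7, 0.9, 0.6],
           [0.2, 0.3, 0.1], [0.1, 0.4, 0.3], [0.3, 0.2, 0.2]]
LABELS = [1, 1, 1, 0, 0, 0]

if __name__ == "__main__":
    w, b = fit_meta_learner(STACKED, LABELS)
    print([predict(w, b, x) for x in STACKED])
```

In practice the meta-learner would be trained on held-out predictions of the base models, so that it does not simply learn how well they fit their own training data.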

    Human action recognition using deep rule-based classifier

    In recent years, numerous techniques have been proposed for human activity recognition (HAR) from images and videos. These techniques can be divided into two major categories: handcrafted and deep learning. Deep learning-based models have produced remarkable results for HAR. However, these models have several shortcomings, such as the requirement for a massive amount of training data, lack of transparency, offline nature, and poor interpretability of their internal parameters. In this paper, a new approach for HAR is proposed, which consists of an interpretable, self-evolving, and self-organizing set of 0-order IF...THEN rules. This approach is entirely data-driven and non-parametric; thus, prototypes are identified automatically during the training process. To demonstrate the effectiveness of the proposed method, a set of high-level features is obtained using a pre-trained deep convolutional neural network model, and a recently introduced deep rule-based (DRB) classifier is applied for classification. Experiments are performed on the challenging benchmark dataset UCF50; the results confirm that the proposed approach outperforms state-of-the-art methods. In addition, an ablation study is conducted to demonstrate the efficacy of the proposed approach by comparing the performance of the DRB classifier with four state-of-the-art classifiers. This analysis revealed that the DRB classifier could perform better than state-of-the-art classifiers, even with limited training samples.
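A 0-order IF...THEN rule of this kind can be read as "IF (x is like prototype 1) OR (x is like prototype 2) OR ... THEN class c", with one rule per class and the most strongly firing rule deciding the label. The sketch below illustrates only that decision step, under assumptions: the prototypes are fixed by hand rather than identified automatically during training, the Gaussian-like similarity is an illustrative choice, and the two-dimensional feature vectors and class labels are hypothetical stand-ins for deep CNN features.

```python
import math


def rule_firing_strength(x, prototypes):
    """Strength of 'IF (x ~ p1) OR (x ~ p2) ...': similarity to the best-matching prototype."""
    return max(math.exp(-sum((a - b) ** 2 for a, b in zip(x, p))) for p in prototypes)


def classify(x, rules):
    """Winner-takes-all over per-class rules; `rules` maps a label to its prototype list."""
    return max(rules, key=lambda label: rule_firing_strength(x, rules[label]))


# Hypothetical prototypes: one rule per action class.
RULES = {
    "walking": [[0.0, 0.0], [1.0, 0.0]],
    "jumping": [[5.0, 5.0]],
}

if __name__ == "__main__":
    print(classify([0.9, 0.1], RULES))  # walking
    print(classify([4.0, 5.0], RULES))  # jumping
```

Because each decision traces back to a specific prototype, the matched training sample can be shown alongside the prediction, which is the sense in which such rules are interpretable.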