31 research outputs found

    HiMFR: A Hybrid Masked Face Recognition Through Face Inpainting

    Full text link
    To recognize a masked face, one possible solution is to restore the occluded part of the face first and then apply a face recognition method. Inspired by recent image inpainting methods, we propose an end-to-end hybrid masked face recognition system, namely HiMFR, consisting of three significant parts: a masked face detector, face inpainting, and face recognition. The masked face detector module applies a pretrained Vision Transformer (ViT_b32) to detect whether a face is covered with a mask or not. The inpainting module uses a fine-tuned image inpainting model based on a Generative Adversarial Network (GAN) to restore faces. Finally, the hybrid face recognition module, based on ViT with an EfficientNetB3 backbone, recognizes the faces. We have implemented and evaluated our proposed method on four different publicly available datasets: CelebA, SSDMNV2, MAFA, and Pubfig83, along with our locally collected small dataset, namely Face5. Comprehensive experimental results show the efficacy of the proposed HiMFR method with competitive performance. Code is available at https://github.com/mdhosen/HiMFR. Comment: 7 pages, 6 figures, International Conference on Pattern Recognition Workshop: Deep Learning for Visual Detection and Recognition.
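The detect-restore-recognize flow described above can be sketched as a minimal pipeline. The stage functions below are stand-ins, not the authors' models: a real system would plug in the pretrained ViT_b32 detector, the GAN inpainter, and the ViT + EfficientNetB3 recognizer.

```python
def detect_mask(face):
    """Stand-in for the ViT_b32 masked-face detector."""
    return face.get("masked", False)

def inpaint(face):
    """Stand-in for the GAN-based inpainting module: 'removes' the mask."""
    restored = dict(face)
    restored["masked"] = False
    return restored

def recognize(face):
    """Stand-in for the ViT + EfficientNetB3 recognizer."""
    return face.get("identity", "unknown")

def himfr_pipeline(face):
    # Restore the occluded region first, then recognize.
    if detect_mask(face):
        face = inpaint(face)
    return recognize(face)

print(himfr_pipeline({"identity": "alice", "masked": True}))  # -> alice
```

Unmasked faces skip the inpainting stage and go straight to recognition.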

    Masked Face Inpainting Through Residual Attention UNet

    Full text link
    Realistic restoration of images with high-texture areas, such as removing face masks, is challenging. State-of-the-art deep learning-based methods fail to guarantee high fidelity, cause training instability due to the vanishing gradient problem (i.e., weights in the initial layers are updated only slightly), and suffer spatial information loss. They also depend on an intermediary stage such as segmentation, meaning they require an external mask. This paper proposes a blind masked face inpainting method using a residual attention UNet to remove the face mask and restore the face with fine details while minimizing the gap with the ground-truth face structure. A residual block feeds information to the next layer and directly into layers about two hops away to solve the vanishing gradient problem. Besides, the attention unit helps the model focus on the relevant mask region, reducing resources and making the model faster. Extensive experiments on the publicly available CelebA dataset show the feasibility and robustness of our proposed model. Code is available at https://github.com/mdhosen/Mask-Face-Inpainting-Using-Residual-Attention-Unet. Comment: 5 pages, 8 figures, Innovations in Intelligent Systems and Applications Conference.
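The two-hop residual skip mentioned above can be illustrated with a toy numeric sketch (not the authors' network): the block's input is added both into the next layer and into a layer about two hops ahead, so a signal path survives even when early weights are near zero.

```python
import numpy as np

def layer(x, w):
    # Toy "layer": a single weight followed by ReLU.
    return np.maximum(0.0, w * x)

def residual_two_hop(x, w1, w2, w3):
    h1 = layer(x, w1) + x   # skip into the next layer
    h2 = layer(h1, w2)
    h3 = layer(h2, w3) + x  # skip into the layer ~two hops away
    return h3

# With all weights at zero (an extreme of vanishing gradients),
# the input still reaches the output through the skips.
print(residual_two_hop(1.0, 0.0, 0.0, 0.0))  # -> 1.0
```

Without the skip connections the same zero-weight stack would output 0.0, which is the degenerate case the residual design avoids.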

    Denoising Diffusion Probabilistic Model for Retinal Image Generation and Segmentation

    Full text link
    Experts use retinal images and vessel trees to detect and diagnose various eye, blood circulation, and brain-related diseases. However, manual segmentation of retinal images is a time-consuming process that requires high expertise and is difficult due to privacy issues. Many methods have been proposed to segment retinal images, but the need for large retinal image datasets limits the performance of these methods. Several methods synthesize images with deep learning models based on Generative Adversarial Networks (GANs), but these generate only limited sample variety. This paper proposes a novel Denoising Diffusion Probabilistic Model (DDPM), which outperforms GANs in image synthesis. We developed a Retinal Trees (ReTree) dataset consisting of retinal images and corresponding vessel trees, and a segmentation network based on DDPM trained with images from the ReTree dataset. In the first stage, we develop a two-stage DDPM that generates vessel trees from random numbers drawn from a standard normal distribution. Later, the model is guided to generate fundus images from given vessel trees and a random distribution. The proposed dataset has been evaluated quantitatively and qualitatively. Quantitative evaluation metrics include the Frechet Inception Distance (FID) score, Jaccard similarity coefficient, Cohen's kappa, Matthews Correlation Coefficient (MCC), precision, recall, F1-score, and accuracy. We trained the vessel segmentation model with synthetic data to validate our dataset's efficiency and tested it on authentic data. Our developed dataset and source code are available at https://github.com/AAleka/retree. Comment: International Conference on Computational Photography 2023 (ICCP 2023).
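For readers unfamiliar with DDPMs, the standard forward (noising) process that such models are trained to reverse can be sketched as follows; the schedule values are common illustrative defaults, not the paper's settings.

```python
import numpy as np

def ddpm_forward(x0, t, betas, rng):
    # q(x_t | x_0) = N(sqrt(abar_t) * x_0, (1 - abar_t) * I)
    abar = np.prod(1.0 - betas[: t + 1])    # cumulative alpha-bar
    eps = rng.standard_normal(x0.shape)     # standard-normal noise
    xt = np.sqrt(abar) * x0 + np.sqrt(1.0 - abar) * eps
    return xt, eps

rng = np.random.default_rng(0)
betas = np.linspace(1e-4, 0.02, 1000)       # widely used linear schedule
x0 = np.ones((4, 4))                        # stand-in "image"
xt, eps = ddpm_forward(x0, 999, betas, rng) # at t = T, nearly pure noise
```

Sampling runs this process in reverse: a network predicts `eps` at each step, which is how the paper's model turns standard-normal noise into vessel trees and then fundus images.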

    Saliency-aware Stereoscopic Video Retargeting

    Full text link
    Stereo video retargeting aims to resize a video to a desired aspect ratio. The quality of retargeted videos can be significantly impacted by the stereo video's spatial, temporal, and disparity coherence, all of which can be affected by the retargeting process. Due to the lack of a publicly accessible annotated dataset, there is little research on deep learning-based methods for stereo video retargeting. This paper proposes an unsupervised deep learning-based stereo video retargeting network. Our model first detects the salient objects, then shifts and warps all objects so as to minimize the distortion of the salient parts of the stereo frames. We use 1D convolution for shifting the salient objects and design a stereo video Transformer to assist the retargeting process. To train the network, we use the parallax attention mechanism to fuse the left and right views and feed the retargeted frames to a reconstruction module that reverses the retargeted frames to the input frames. Therefore, the network is trained in an unsupervised manner. Extensive qualitative and quantitative experiments and ablation studies on the KITTI stereo 2012 and 2015 datasets demonstrate the efficiency of the proposed method over existing state-of-the-art methods. The code is available at https://github.com/z65451/SVR/. Comment: 8 pages excluding references, CVPRW conference.
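The idea of shifting pixels with a 1D convolution can be illustrated with a hand-built delta kernel (a learned shifting layer could express this as one special case; this is not the authors' implementation): a kernel whose single peak sits off-centre translates a row horizontally.

```python
import numpy as np

def shift_row_1d(row, offset, width=5):
    # An off-centre delta kernel translates the row by `offset` pixels.
    kernel = np.zeros(width)
    kernel[width // 2 + offset] = 1.0
    return np.convolve(row, kernel, mode="same")

row = np.array([0.0, 0.0, 1.0, 0.0, 0.0])  # one "salient" pixel at index 2
shifted = shift_row_1d(row, 1)             # pixel moves to index 3
```

A network that predicts such per-column offsets can move salient objects smoothly while squeezing less important regions.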

    A New Dataset and Transformer for Stereoscopic Video Super-Resolution

    Full text link
    Stereo video super-resolution (SVSR) aims to enhance the spatial resolution of a low-resolution video by reconstructing the high-resolution video. The key challenges in SVSR are preserving stereo-consistency and temporal-consistency, without which viewers may experience 3D fatigue. There are several notable works on stereoscopic image super-resolution, but there is little research on stereo video super-resolution. In this paper, we propose a novel Transformer-based model for SVSR, namely Trans-SVSR. Trans-SVSR comprises two key novel components: a spatio-temporal convolutional self-attention layer and an optical flow-based feed-forward layer that discovers the correlation across different video frames and aligns the features. The parallax attention mechanism (PAM), which uses cross-view information to account for significant disparities, is used to fuse the stereo views. Due to the lack of a benchmark dataset suitable for the SVSR task, we collected a new stereoscopic video dataset, SVSR-Set, containing 71 full high-definition (HD) stereo videos captured using a professional stereo camera. Extensive experiments on the collected dataset, along with two other datasets, demonstrate that Trans-SVSR can achieve competitive performance compared to the state-of-the-art methods. Project code and additional results are available at https://github.com/H-deep/Trans-SVSR/. Comment: Conference on Computer Vision and Pattern Recognition (CVPR 2022).
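A parallax-attention-style fusion can be sketched numerically (assumed shapes, not the Trans-SVSR implementation): because stereo disparity lies along the epipolar (width) axis, each left-view pixel attends over all horizontal positions of the same row in the right view.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def parallax_attention(left, right):
    # left, right: (H, W, C) feature maps. For each row h, left position w
    # attends over every right position v in the same row.
    scores = np.einsum("hwc,hvc->hwv", left, right)          # (H, W, W)
    attn = softmax(scores / np.sqrt(left.shape[-1]), axis=-1)
    return np.einsum("hwv,hvc->hwc", attn, right)            # fused (H, W, C)

H, W, C = 2, 8, 4
rng = np.random.default_rng(1)
left = rng.standard_normal((H, W, C))
right = rng.standard_normal((H, W, C))
fused = parallax_attention(left, right)  # right features warped to left view
```

Restricting attention to one row at a time keeps the cost at O(W^2) per row instead of O((HW)^2) for full 2D attention, which is the point of exploiting epipolar geometry.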

    CANAL DISH (CD), THE NEW ANTIMICROBIAL TESTING APPARATUS

    Get PDF
    Objective: This article aims to theoretically introduce a new antimicrobial testing apparatus called the Canal Dish (CD). Methods: We have designed two types of CD: the Circular CD (CCD) and the Square CD (SCD). Internally, the CCD is an 80 mm diameter circular plate, while the SCD is an 80×80 mm square plate. Both contain two 40×2 mm parallel travelling canals extending from the CD centre, which has a radius of 3 mm. Canals are 6 mm in depth. Results: The features of the CCD and SCD allow for various sizes, low media consumption, the inclusion of multiple microorganisms and/or test samples/doses, and ease of handling; they therefore offer clarity, rapidity, and economy. Conclusion: The CD may replace currently used Petri dishes due to its cost-effectiveness, rapidity, ease of handling, and wider range of applicability. Keywords: Antimicrobial assay, Canal dish, Circular, Square, New apparatus.
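The "low media consumption" claim can be checked with back-of-envelope arithmetic using the canal dimensions quoted above (two canals of roughly 40×2×6 mm); the 4 mm Petri-dish fill depth is an assumption for comparison, not a figure from the abstract.

```python
import math

# Media volume in the two canals (mm^3 -> ml).
canal_ml = 2 * (40 * 2 * 6) / 1000.0

# Media volume for an 80 mm Petri dish filled to an assumed 4 mm depth.
petri_ml = math.pi * (80 / 2) ** 2 * 4 / 1000.0

print(round(canal_ml, 2), round(petri_ml, 1))  # -> 0.96 20.1
```

Under these assumptions the canals would use roughly one-twentieth of the agar a conventionally filled Petri dish requires.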

    MACERATION-VORTEX-TECHNIQUE (MVT), A RAPID AND NEW EXTRACTION METHOD IN PHYTO-PHARMACOLOGICAL SCREENING

    Get PDF
    Extraction is the process of preparing extracts from biological materials (plant/animal/microorganism) and lies at the heart of drug discovery and development science. This article aims to propose a new, rapid, economical, and easy extraction method, the Maceration-Vortex-Technique (MVT). For this, 2-5 g of powdered material is soaked in 5-10 ml of the solvent recommended for extraction. A clean, small amber-coloured glass bottle of 20-25 ml capacity is needed for this purpose. The powdered material is mixed with the solvent, shaken vigorously for 1.5-2 h, and vortexed for 5 min. The extract is collected by immediate filtration through filter paper (Whatman no. 1) and then concentrated and/or solvent-partitioned. The MVT allows extraction of plant materials within 3 h. Only a small amount of material is needed, along with a small quantity of solvent, which marks the economy of this extraction process. In conclusion, rapidity in the extraction process means rapidity in the screening of biological materials. The MVT may be one of the speediest and most economical extraction processes. Keywords: Extraction, Maceration-Vortex-Technique, New method, Phyto-pharmacological screening.
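A quick sanity check of the figures quoted above, using the upper ends of the stated ranges (filtration and handling time are neglected, which is an assumption):

```python
# Active protocol time: up to 2 h shaking plus 5 min vortexing.
shake_h_max = 2.0
vortex_min = 5
total_h = shake_h_max + vortex_min / 60.0  # ~2.08 h, within the 3 h claim

# Solvent-to-material ratio at the stated extremes (ml per g).
ratios = (5 / 2, 10 / 5)  # 2.5 and 2.0 ml solvent per gram
print(round(total_h, 2), ratios)
```

So the stated ranges imply roughly 2-2.5 ml of solvent per gram of material and an active time comfortably under the claimed 3 h.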
