9 research outputs found

    RT-MonoDepth: Real-time Monocular Depth Estimation on Embedded Systems

    Depth sensing is a crucial function of unmanned aerial vehicles and autonomous vehicles. Due to the small size and simple structure of monocular cameras, there has been growing interest in depth estimation from a single RGB image. However, state-of-the-art monocular CNN-based depth estimation methods rely on fairly complex deep neural networks that are too slow for real-time inference on embedded platforms. This paper addresses the problem of real-time depth estimation on embedded systems. We propose two efficient and lightweight encoder-decoder network architectures, RT-MonoDepth and RT-MonoDepth-S, to reduce computational complexity and latency. Our methodologies demonstrate that it is possible to achieve accuracy similar to prior state-of-the-art work on depth estimation at a faster inference speed. The proposed networks, RT-MonoDepth and RT-MonoDepth-S, run at 18.4 and 30.5 FPS on the NVIDIA Jetson Nano and at 253.0 and 364.1 FPS on the NVIDIA Jetson AGX Orin for a single RGB image of resolution 640×192, and achieve near state-of-the-art accuracy on the KITTI dataset. To the best of the authors' knowledge, this paper achieves the best accuracy and fastest inference speed among existing fast monocular depth estimation methods. Comment: 8 pages, 5 figures
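    The abstract gives no implementation details, so the following is only a minimal PyTorch sketch of the kind of lightweight encoder-decoder it describes; the layer sizes, channel counts, and the name TinyDepthNet are illustrative assumptions, not the RT-MonoDepth architecture.

```python
# Minimal sketch of a lightweight encoder-decoder depth network (PyTorch).
# Channel counts, depths, and the name "TinyDepthNet" are illustrative
# assumptions; this is NOT the RT-MonoDepth architecture from the paper.
import torch
import torch.nn as nn

def conv_block(c_in, c_out, stride=1):
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, kernel_size=3, stride=stride, padding=1),
        nn.BatchNorm2d(c_out),
        nn.ReLU(inplace=True),
    )

class TinyDepthNet(nn.Module):
    def __init__(self):
        super().__init__()
        # Encoder: progressively downsample the RGB input.
        self.enc1 = conv_block(3, 32, stride=2)    # 1/2 resolution
        self.enc2 = conv_block(32, 64, stride=2)   # 1/4
        self.enc3 = conv_block(64, 128, stride=2)  # 1/8
        # Decoder: upsample back and fuse encoder skip connections.
        self.dec2 = conv_block(128 + 64, 64)
        self.dec1 = conv_block(64 + 32, 32)
        self.head = nn.Conv2d(32, 1, kernel_size=3, padding=1)
        self.up = nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False)

    def forward(self, x):
        e1 = self.enc1(x)
        e2 = self.enc2(e1)
        e3 = self.enc3(e2)
        d2 = self.dec2(torch.cat([self.up(e3), e2], dim=1))
        d1 = self.dec1(torch.cat([self.up(d2), e1], dim=1))
        # Sigmoid disparity in (0, 1); invert/scale to metric depth as needed.
        return torch.sigmoid(self.head(self.up(d1)))

if __name__ == "__main__":
    net = TinyDepthNet().eval()
    with torch.no_grad():
        depth = net(torch.randn(1, 3, 192, 640))  # KITTI-style 640x192 input
    print(depth.shape)  # torch.Size([1, 1, 192, 640])
```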

    GMC-IQA: Exploiting Global-correlation and Mean-opinion Consistency for No-reference Image Quality Assessment

    Due to the subjective nature of image quality assessment (IQA), assessing which image has better quality among a sequence of images is more reliable than assigning an absolute mean opinion score to an image. Thus, IQA models are evaluated with global correlation consistency (GCC) metrics such as PLCC and SROCC, rather than mean opinion consistency (MOC) metrics such as MAE and MSE. However, most existing methods define their loss functions with MOC metrics, because GCC metrics are infeasible to compute during training. In this work, we construct a novel loss function and network to exploit Global-correlation and Mean-opinion Consistency, forming the GMC-IQA framework. Specifically, we propose a novel GCC loss by defining a pairwise preference-based rank estimation that addresses the non-differentiability of SROCC, and by introducing a queue mechanism that retains previous data to approximate the global statistics of the whole dataset. Moreover, we propose a mean-opinion network that integrates diverse opinion features to alleviate the randomness of weight learning and enhance model robustness. Experiments indicate that our method outperforms SOTA methods on multiple authentic datasets with higher accuracy and better generalization. We also adapt the proposed loss to various networks, which brings better performance and more stable training.
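    As a sketch of the idea behind such a GCC loss, the snippet below implements a differentiable pairwise-ranking surrogate for SROCC combined with a queue of past predictions; the queue size, the softplus pairwise penalty, and all names are illustrative assumptions rather than the exact GMC-IQA formulation.

```python
# Sketch of a differentiable pairwise-ranking surrogate for SROCC with a
# memory queue of past predictions (PyTorch). All details are illustrative
# assumptions, not the exact GMC-IQA loss.
import torch

class PairwiseRankQueueLoss:
    def __init__(self, queue_size=1024):
        self.queue_size = queue_size
        self.pred_queue = torch.empty(0)
        self.mos_queue = torch.empty(0)

    def __call__(self, pred, mos):
        # Compare current predictions against the current batch plus queued history.
        all_pred = torch.cat([pred, self.pred_queue.to(pred.device)])
        all_mos = torch.cat([mos, self.mos_queue.to(mos.device)])

        # A pair is correctly ranked when the sign of the predicted difference
        # matches the sign of the ground-truth MOS difference.
        dp = pred[:, None] - all_pred[None, :]   # predicted differences
        dm = mos[:, None] - all_mos[None, :]     # ground-truth differences
        sign = torch.sign(dm)
        mask = sign != 0                         # ignore tied pairs

        # Soft, differentiable penalty for mis-ordered pairs.
        loss = torch.nn.functional.softplus(-sign * dp)[mask].mean()

        # Update the queue with detached values (no gradients through history).
        self.pred_queue = torch.cat([self.pred_queue, pred.detach().cpu()])[-self.queue_size:]
        self.mos_queue = torch.cat([self.mos_queue, mos.detach().cpu()])[-self.queue_size:]
        return loss

# Usage: loss_fn = PairwiseRankQueueLoss(); loss = loss_fn(model(x).squeeze(-1), mos)
```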

    Multiparameter Space Decision Voting and Fusion Features for Facial Expression Recognition

    Developing an effective facial expression recognition (FER) method remains a research hotspot in artificial intelligence. In this paper, we propose a facial expression recognition method based on a multiparameter fusion feature space and decision voting. First, the parameters of the fusion feature space are determined from the cross-validation recognition accuracy of the Multiscale Block Local Binary Pattern Uniform Histogram (MB-LBPUH) descriptor over the training samples. Using these parameters, we build several fusion feature spaces with multiclass linear discriminant analysis (LDA). In these spaces, fusion features composed of MB-LBPUH and Histogram of Oriented Gradients (HOG) features represent the different facial expressions. Finally, to resolve patterns that are hard to classify because of similar expression classes, a nearest neighbor-based decision voting strategy is designed to predict the classification results. In experiments on the JAFFE, CK+, and TFEID datasets, the proposed model clearly outperformed existing algorithms.
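    A rough sense of the fusion pipeline can be sketched with off-the-shelf components: a plain uniform LBP histogram stands in for the MB-LBPUH descriptor, and a single k-nearest-neighbour classifier stands in for the decision-voting strategy; block sizes, radii, and names are illustrative assumptions rather than the paper's exact configuration.

```python
# Sketch of an LBP + HOG fusion pipeline with LDA projection and a
# nearest-neighbour classifier (scikit-image / scikit-learn).
import numpy as np
from skimage.feature import local_binary_pattern, hog
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.neighbors import KNeighborsClassifier

def fused_features(gray_face, P=8, R=1):
    """Concatenate a uniform-LBP histogram with a HOG descriptor."""
    lbp = local_binary_pattern(gray_face, P, R, method="uniform")
    lbp_hist, _ = np.histogram(lbp, bins=P + 2, range=(0, P + 2), density=True)
    hog_vec = hog(gray_face, orientations=9, pixels_per_cell=(8, 8),
                  cells_per_block=(2, 2))
    return np.concatenate([lbp_hist, hog_vec])

def train(faces, labels, n_neighbors=3):
    """faces: list of 2-D grayscale arrays; labels: expression class ids."""
    X = np.stack([fused_features(f) for f in faces])
    lda = LinearDiscriminantAnalysis().fit(X, labels)   # fusion feature space
    knn = KNeighborsClassifier(n_neighbors=n_neighbors).fit(lda.transform(X), labels)
    return lda, knn

def predict(lda, knn, face):
    return knn.predict(lda.transform(fused_features(face)[None, :]))[0]
```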

    STDC-Flow: large displacement flow field estimation using similarity transformation-based dense correspondence

    To improve the accuracy and robustness of optical flow computation under large displacements and motion occlusions, the authors present a large displacement flow field estimation approach using similarity transformation-based dense correspondence, named STDC-Flow. First, they compute an initial nearest-neighbour field from the similarity transformation-based dense correspondence of two consecutive frames, extract the consistent regions as a robust nearest-neighbour field, and label the inconsistent regions as occlusion areas. Second, they improve a non-local total variation L1-norm optical flow model by using the occlusion information to modify the weighted median filtering optimisation. Third, they fuse the robust nearest-neighbour field and the flow field computed by the improved variational model into the final flow field using the quadratic pseudo-boolean optimisation (QPBO) fusion algorithm. Finally, the authors compare the proposed STDC-Flow method with several state-of-the-art approaches, including variational and deep learning-based optical flow models, on the MPI-Sintel and KITTI evaluation databases. The comparison results demonstrate that STDC-Flow computes flow fields with high accuracy and, in particular, handles large displacements and motion occlusions well.
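    The occlusion labelling step relies on the idea that forward and backward correspondences should agree outside occlusions. A minimal NumPy sketch of such a forward-backward consistency check is shown below; the threshold and the nearest-neighbour sampling are illustrative assumptions, not the exact STDC-Flow criterion.

```python
# Sketch of a forward-backward consistency check that splits a dense
# correspondence/flow field into a robust part and likely occlusion areas.
import numpy as np

def consistency_mask(flow_fw, flow_bw, tau=1.0):
    """flow_fw, flow_bw: (H, W, 2) arrays of (dx, dy) displacements.

    Returns a boolean mask that is True where the forward flow is consistent
    with the backward flow (robust region) and False in likely occlusions.
    """
    H, W, _ = flow_fw.shape
    ys, xs = np.mgrid[0:H, 0:W]

    # Where does each pixel land in the second frame (rounded to nearest pixel)?
    x2 = np.clip(np.rint(xs + flow_fw[..., 0]), 0, W - 1).astype(int)
    y2 = np.clip(np.rint(ys + flow_fw[..., 1]), 0, H - 1).astype(int)

    # Forward flow plus the backward flow sampled at the target location
    # should roughly cancel for non-occluded pixels.
    residual = flow_fw + flow_bw[y2, x2]
    return np.linalg.norm(residual, axis=-1) < tau

# Usage: robust = consistency_mask(fw, bw); occlusions = ~robust
```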

    A microfluidic serial dilutor (MSD): Design optimization and application to tuning of liposome nanoparticle preparation

    Dilution of a sample over several orders of magnitude in concentration is a routine procedure in many analytical, bioanalytical, diagnostics, and formulation laboratories. Microfluidics offers the opportunity to automate the dilution procedure, and many successful designs can be found in the literature, yet fast microfluidic generation of a dilution series spanning several orders of magnitude has so far remained unaddressed. This study realized a microfluidic serial dilutor (MSD) with up to 4 outlets, able to rapidly generate a serial dilution of a sample over up to 4 orders of magnitude. The core design, based on cascade dilution of sample into diluent at a 1:10 volume ratio, was fully characterized experimentally and using Computational Fluid Dynamics (CFD). The MSD was interfaced with a new absorbance measuring device based on a laser diode and spectral sensor, which enabled evaluating the dilution performance and confirming the effectiveness of the design. The MSD was applied to the synthesis of liposome nanoparticles by the solvent diffusion method, enabling fine-tuning of the synthesis of a popular drug and vaccine delivery microcarrier over a wide range of sizes by simply adjusting the flow rate and flow rate ratio. The MSD design proved simple and reliable, operated at reduced pressure drops, and delivered highly reproducible results.
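    Interpreting the 1:10 cascade as a tenfold dilution per stage, a short calculation shows how four stages span four orders of magnitude; the function below is only an illustration, since the actual device sets the ratio through channel geometry and flow rates rather than software.

```python
# Illustrative calculation of a tenfold cascade serial dilution.
# Interpreting the 1:10 ratio as a tenfold dilution per stage is an assumption.

def cascade_dilution(c0, ratio=10, stages=4):
    """Concentration delivered at each outlet of a cascade dilutor.

    Each stage dilutes the previous stage's output by `ratio`, so the
    n-th outlet delivers c0 / ratio**n.
    """
    return [c0 / ratio**n for n in range(1, stages + 1)]

print(cascade_dilution(1.0))  # [0.1, 0.01, 0.001, 0.0001] -> 4 orders of magnitude
```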

    Controllable facial attribute editing via Gaussian mixture model disentanglement

    Generative adversarial networks (GANs) have made much progress in high-quality, realistic facial image synthesis in recent years. However, despite their powerful generation ability, it remains difficult for users to modify the desired attributes of a generated image while keeping the others unchanged. Disentangling the latent space of pre-trained GANs is therefore essential for controllable image synthesis. In this paper, a novel controllable facial attribute editing algorithm based on a Gaussian mixture model (GMM) representation is proposed. First, we assume that the latent variables associated with each facial attribute lie in a subspace of the whole latent manifold composed of a fixed number of learned features, and that each attribute subspace can be modeled by a GMM. Then, to avoid unintended changes during attribute editing, a coordinate accumulation strategy with orthogonal regularization is introduced to enhance the independence of distinct attribute subspaces, which helps improve the controllability of attribute editing. In addition, a resampling strategy is used to improve the stability of the model. Qualitative and quantitative experimental results show that the proposed method achieves state-of-the-art performance on facial attribute editing and improves the controllability of desired attribute editing.
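    As a sketch of the GMM-based subspace idea, the snippet below fits a Gaussian mixture to latent codes of images showing an attribute and nudges a latent code toward its most likely component; the component count, the linear interpolation step, and all names are illustrative assumptions, not the paper's editing algorithm.

```python
# Sketch of modelling an attribute-specific latent subspace with a Gaussian
# mixture model and nudging a latent code toward a chosen component
# (scikit-learn). All details are illustrative assumptions.
import numpy as np
from sklearn.mixture import GaussianMixture

def fit_attribute_gmm(latents_with_attribute, n_components=3, seed=0):
    """latents_with_attribute: (N, D) latent codes of images showing the attribute."""
    return GaussianMixture(n_components=n_components, random_state=seed).fit(
        latents_with_attribute)

def edit_latent(z, gmm, strength=0.5):
    """Move latent z toward the mean of its most likely mixture component."""
    k = gmm.predict(z[None, :])[0]
    return (1.0 - strength) * z + strength * gmm.means_[k]

# Usage (with a pre-trained GAN generator `G`, assumed available):
#   gmm = fit_attribute_gmm(latent_codes_of_smiling_faces)
#   edited_image = G(edit_latent(z, gmm, strength=0.6))
```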