RT-MonoDepth: Real-time Monocular Depth Estimation on Embedded Systems
Depth sensing is a crucial function of unmanned aerial vehicles and
autonomous vehicles. Due to the small size and simple structure of monocular
cameras, there has been a growing interest in depth estimation from a single
RGB image. However, state-of-the-art monocular CNN-based depth estimation
methods using fairly complex deep neural networks are too slow for real-time
inference on embedded platforms. This paper addresses the problem of real-time
depth estimation on embedded systems. We propose two efficient and lightweight
encoder-decoder network architectures, RT-MonoDepth and RT-MonoDepth-S, to
reduce computational complexity and latency. Our methodologies demonstrate that
it is possible to achieve similar accuracy as prior state-of-the-art works on
depth estimation at a faster inference speed. Our proposed networks,
RT-MonoDepth and RT-MonoDepth-S, run at 18.4 and 30.5 FPS on the NVIDIA Jetson
Nano and at 253.0 and 364.1 FPS on the NVIDIA Jetson AGX Orin on a single RGB image of
resolution 640×192, and achieve accuracy comparable to the state of the art on
the KITTI dataset. To the best of the authors' knowledge, this paper achieves
the best accuracy and fastest inference speed among existing fast
monocular depth estimation methods.

Comment: 8 pages, 5 figures
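FPS figures like those above are typically obtained by timing repeated forward passes on the device after a warm-up phase. A minimal sketch of such a benchmark loop, where the `infer` callable and `frame` input are placeholders rather than the paper's code:

```python
import time

def measure_fps(infer, frame, n_warmup=10, n_runs=100):
    """Average frames per second of an inference callable.

    `infer` stands in for a depth network's forward pass and `frame`
    for a 640x192 RGB input; both are illustrative placeholders.
    """
    for _ in range(n_warmup):              # warm-up runs, excluded from timing
        infer(frame)
    start = time.perf_counter()
    for _ in range(n_runs):
        infer(frame)
    elapsed = time.perf_counter() - start
    return n_runs / elapsed
```

Warm-up runs matter on embedded GPUs because the first inferences include kernel compilation and memory allocation overhead that would otherwise skew the average.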
GMC-IQA: Exploiting Global-correlation and Mean-opinion Consistency for No-reference Image Quality Assessment
Due to the subjective nature of image quality assessment (IQA), assessing
which image has better quality among a sequence of images is more reliable than
assigning an absolute mean opinion score for an image. Thus, IQA models are
evaluated by global correlation consistency (GCC) metrics like PLCC and SROCC,
rather than mean opinion consistency (MOC) metrics like MAE and MSE. However,
most existing methods adopt MOC metrics to define their loss functions, due to
the infeasible computation of GCC metrics during training. In this work, we
construct a novel loss function and network to exploit Global-correlation and
Mean-opinion Consistency, forming a GMC-IQA framework. Specifically, we propose
a novel GCC loss by defining a pairwise preference-based rank estimation to
solve the non-differentiable problem of SROCC and introducing a queue mechanism
to reserve previous data to approximate the global results of the whole data.
Moreover, we propose a mean-opinion network, which integrates diverse opinion
features to alleviate the randomness of weight learning and enhance the model
robustness. Experiments indicate that our method outperforms SOTA methods on
multiple authentic datasets with higher accuracy and generalization. We also
adapt the proposed loss to various networks, which brings better performance
and more stable training.
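The core idea of the GCC loss, making a rank correlation differentiable through pairwise preferences, can be sketched as follows. The sigmoid-based soft rank and the Pearson-correlation surrogate for SROCC are a common construction and an assumption here, not necessarily the paper's exact formulation:

```python
import numpy as np

def soft_rank(x, tau=0.1):
    """Differentiable rank estimate via pairwise preferences."""
    diff = x[None, :] - x[:, None]           # pairwise score differences
    pref = 1.0 / (1.0 + np.exp(diff / tau))  # soft P(x_i ranked above x_j)
    return pref.sum(axis=1)                  # soft rank of each item

def gcc_loss(pred, mos, tau=0.1):
    """1 minus Pearson correlation of soft ranks: an SROCC surrogate."""
    rp, rm = soft_rank(pred, tau), soft_rank(mos, tau)
    rp, rm = rp - rp.mean(), rm - rm.mean()
    return 1.0 - (rp @ rm) / (np.linalg.norm(rp) * np.linalg.norm(rm) + 1e-8)
```

As `tau` shrinks, the soft ranks approach hard ranks, so minimizing this loss approximately maximizes SROCC; the queue mechanism described above would extend the batch with stored previous predictions before computing the loss.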
Multiparameter Space Decision Voting and Fusion Features for Facial Expression Recognition
Obtaining a valid facial expression recognition (FER) method remains a research hotspot in artificial intelligence. In this paper, we propose a multiparameter fusion feature space and decision-voting-based classification for facial expression recognition. First, the parameters of the fusion feature space are determined according to the cross-validation recognition accuracy of the Multiscale Block Local Binary Pattern Uniform Histogram (MB-LBPUH) descriptor filtered over the training samples. According to these parameters, we build various fusion feature spaces by employing multiclass linear discriminant analysis (LDA). In these spaces, fusion features composed of MB-LBPUH and Histogram of Oriented Gradients (HOG) features represent different facial expressions. Finally, to resolve the hard-to-classify patterns caused by similar expression classes, a nearest neighbor-based decision-voting strategy is designed to predict the classification results. In experiments on the JAFFE, CK+, and TFEID datasets, the proposed model clearly outperformed existing algorithms.
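The final step, one nearest-neighbor decision per fusion feature space followed by a majority vote, can be sketched as below. The array shapes and the plain 1-NN Euclidean rule are illustrative assumptions:

```python
import numpy as np
from collections import Counter

def nn_vote(spaces_train, labels, query_per_space):
    """Majority vote of 1-NN decisions across several feature spaces.

    spaces_train: list of (n_samples, d_k) arrays, one per fusion
    feature space; query_per_space: the matching per-space feature
    vectors of one test face. Names and shapes are illustrative.
    """
    votes = []
    for X, q in zip(spaces_train, query_per_space):
        nn = np.argmin(np.linalg.norm(X - q, axis=1))  # 1-NN in this space
        votes.append(labels[nn])
    return Counter(votes).most_common(1)[0][0]         # majority label
```

Voting across spaces helps because an expression that is ambiguous in one parameterization of the MB-LBPUH/HOG fusion may still be separable in another.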
STDC-Flow: large displacement flow field estimation using similarity transformation-based dense correspondence
In order to improve the accuracy and robustness of optical flow computation under large displacements and motion occlusions, the authors present in this study a large displacement flow field estimation approach using similarity transformation-based dense correspondence, named the STDC-Flow approach. First, the authors compute an initial nearest-neighbour field from the similarity transformation-based dense correspondence of two consecutive frames, then extract the consistent regions as the robust nearest-neighbour field and label the inconsistent regions as occlusion areas. Second, they improve a non-local total variation with L1 norm optical flow model by using the occlusion information to modify the weighted median filtering optimisation. Third, they fuse the robust nearest-neighbour field and the flow field computed by the improved variational optical flow model into the final flow field by using the quadratic pseudo-boolean optimisation fusion algorithm. Finally, the authors compare the proposed STDC-Flow method with several state-of-the-art approaches, including variational and deep learning-based optical flow models, on the MPI-Sintel and KITTI evaluation databases. The comparison results demonstrate that the proposed STDC-Flow method computes flow fields with high accuracy and is especially capable of dealing with large displacements and motion occlusions.
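Splitting a correspondence field into consistent regions and occlusion areas is commonly done with a forward-backward round-trip check; a minimal sketch, where the round-trip criterion and tolerance are assumptions rather than the paper's exact rule:

```python
import numpy as np

def consistent_mask(fwd, bwd, tol=1.0):
    """Label pixels whose forward field round-trips via the backward field.

    fwd, bwd: (H, W, 2) displacement fields (dx, dy). A pixel is kept
    as 'robust' if following fwd and then bwd returns near the start;
    the remaining pixels are candidate occlusion areas.
    """
    H, W, _ = fwd.shape
    ys, xs = np.mgrid[0:H, 0:W]
    x2 = np.clip(xs + fwd[..., 0], 0, W - 1).astype(int)   # target column
    y2 = np.clip(ys + fwd[..., 1], 0, H - 1).astype(int)   # target row
    back = bwd[y2, x2]                      # backward displacement at target
    err = np.hypot(fwd[..., 0] + back[..., 0],
                   fwd[..., 1] + back[..., 1])
    return err <= tol                       # True = consistent / robust
```

Occluded pixels have no true correspondence in the second frame, so their forward match cannot round-trip, which is why the inconsistent regions serve as the occlusion estimate fed into the variational model.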
A microfluidic serial dilutor (MSD): Design optimization and application to tuning of liposome nanoparticle preparation
Dilution of a sample over several orders of magnitude in concentration is a routine procedure in many analytical, bioanalytical, diagnostics, and formulation laboratories. Microfluidics offers the opportunity to automate the dilution procedure, with many successful designs to be found in the literature, yet fast microfluidic generation of a dilution series over several orders of magnitude has so far remained unaddressed. This study realized a microfluidic serial dilutor (MSD) with up to 4 outlets, able to rapidly generate a serial dilution of a sample over up to 4 orders of magnitude. The core design, based on a cascade dilution of sample to diluent in a 1:10 volume ratio, was fully characterized experimentally and using Computational Fluid Dynamics (CFD). The MSD was interfaced with a new absorbance-measuring device based on a laser diode and spectral sensor, which enabled evaluating the dilution performance and confirming the effectiveness of the design. The MSD was applied to the synthesis of liposome nanoparticles by the solvent diffusion method, enabling fine-tuning of the size of a popular drug and vaccine delivery microcarrier over a wide range by simply adjusting the flow rate and flow rate ratio. The MSD design proved simple and reliable, operating at reduced pressure drops and delivering highly reproducible results.
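The cascade arithmetic is straightforward: if each stage dilutes the previous one tenfold (the interpretation of the 1:10 ratio assumed here), four outlets span four orders of magnitude. A minimal sketch:

```python
def cascade_dilution(c0, factor=10, stages=4):
    """Outlet concentrations of a serial cascade dilutor.

    c0: inlet sample concentration; each stage is assumed to dilute
    the previous stage's output by `factor`, so outlet k carries
    c0 / factor**k. A simplification of the MSD's 1:10 cascade.
    """
    return [c0 / factor**k for k in range(1, stages + 1)]
```

With the default parameters, a sample entering at concentration `c0` exits the four outlets at `c0/10`, `c0/100`, `c0/1000`, and `c0/10000`, i.e. the four-decade series the device was designed to produce in a single pass.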
Controllable facial attribute editing via Gaussian mixture model disentanglement
Generative adversarial networks (GANs) have made much progress in high-quality, realistic facial image synthesis in recent years. However, compared with their powerful generation ability, it is difficult for users to modify the desired attributes of a generated image while preserving the others. Disentangling the latent space of pre-trained GANs is therefore essential for controllable image synthesis. In this paper, a novel controllable facial attribute editing algorithm based on a Gaussian mixture model (GMM) representation is proposed. First, we assume that the latent variables with respect to each facial attribute lie in a subspace of the whole latent manifold composed of a fixed number of learned features, and that each attribute subspace can be modeled by a GMM. Then, to avoid unintended changes during attribute editing, a coordinate accumulation strategy with orthogonal regularization is introduced to enhance the independence of distinct attribute subspaces, which helps improve the controllability of attribute editing. In addition, a resampling strategy is utilized to improve the stability of the model. Qualitative and quantitative experimental results show that the proposed method achieves state-of-the-art performance on facial attribute editing and improves the controllability of desired attribute edits.
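The orthogonal-regularization idea, penalizing overlap between distinct attribute subspaces, can be sketched as a sum of squared cross-subspace inner products. The matrix shapes and this particular scalar form are illustrative assumptions, not the paper's exact regularizer:

```python
import numpy as np

def orthogonal_penalty(bases):
    """Penalty encouraging distinct attribute subspaces to be independent.

    bases: list of (d, k) matrices, each spanning one attribute
    subspace of the d-dimensional latent space. The penalty sums
    squared inner products between basis vectors of *different*
    subspaces, so minimizing it pushes the subspaces toward
    mutual orthogonality.
    """
    total = 0.0
    for i in range(len(bases)):
        for j in range(i + 1, len(bases)):
            cross = bases[i].T @ bases[j]    # (k_i, k_j) cross-correlations
            total += float(np.sum(cross ** 2))
    return total
```

A zero penalty means every editing direction for one attribute is orthogonal to every direction for the others, which is exactly the condition under which moving along one attribute's subspace leaves the remaining attributes untouched.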