Amortized Variational Inference: Towards the Mathematical Foundation and Review
The core principle of Variational Inference (VI) is to convert the
statistical inference problem of computing complex posterior probability
densities into a tractable optimization problem. This property enables VI to be
faster than several sampling-based techniques. However, the traditional VI
algorithm is not scalable to large data sets and is unable to readily infer
out-of-bounds data points without re-running the optimization process. Recent
developments in the field, like stochastic-, black box- and amortized-VI, have
helped address these issues. Generative modeling tasks nowadays widely make use
of amortized VI for its efficiency and scalability, as it utilizes a
parameterized function to learn the approximate posterior density parameters.
With this paper, we review the mathematical foundations of various VI
techniques to form the basis for understanding amortized VI. Additionally, we
provide an overview of the recent trends that address several issues of
amortized VI, such as the amortization gap, generalization issues, inconsistent
representation learning, and posterior collapse. Finally, we analyze alternate
divergence measures that improve VI optimization.
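For orientation, the quantities named in this abstract can be written out in the standard notation. This is a sketch, not the paper's own formulation: the symbols \theta (model parameters), \lambda (per-data-point variational parameters), and f_\phi (the shared inference function) are assumptions introduced here.

```latex
% Evidence lower bound (ELBO) that VI maximizes for one observation x
\log p_\theta(x) \;\ge\; \mathcal{L}(\theta, \lambda; x)
  = \mathbb{E}_{q(z;\lambda)}\big[\log p_\theta(x, z) - \log q(z;\lambda)\big]

% Classical VI: optimize a separate \lambda for every x.
% Amortized VI: predict \lambda = f_\phi(x) with one shared parameterized function.

% Amortization gap: the ELBO lost by predicting \lambda instead of optimizing it
\operatorname{gap}(x)
  = \max_{\lambda}\, \mathcal{L}(\theta, \lambda; x)
  - \mathcal{L}\big(\theta, f_\phi(x); x\big) \;\ge\; 0
```

Under this reading, a new data point needs only one evaluation of f_\phi rather than a fresh optimization, which is the efficiency argument the abstract makes.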
FastCLIPstyler: Optimisation-free Text-based Image Style Transfer Using Style Representations
In recent years, language-driven artistic style transfer has emerged as a new
type of style transfer technique, eliminating the need for a reference style
image by using natural language descriptions of the style. The first model to
achieve this, called CLIPstyler, has demonstrated impressive stylisation
results. However, its lengthy optimisation procedure at runtime for each query
limits its suitability for many practical applications. In this work, we
present FastCLIPstyler, a generalised text-based image style transfer model
capable of stylising images in a single forward pass for arbitrary text inputs.
Furthermore, we introduce EdgeCLIPstyler, a lightweight model designed for
compatibility with resource-constrained devices. Through quantitative and
qualitative comparisons with state-of-the-art approaches, we demonstrate that
our models achieve superior stylisation quality based on measurable metrics
while offering significantly improved runtime efficiency, particularly on edge
devices.
Comment: Accepted at the 2024 IEEE/CVF Winter Conference on Applications of
Computer Vision (WACV 2024).
Development of a face mask detection pipeline for mask-wearing monitoring in the era of the COVID-19 pandemic: A modular approach
During the SARS-CoV-2 pandemic, mask-wearing became an effective tool to
prevent spreading and contracting the virus. The ability to monitor the
mask-wearing rate in the population would be useful for determining public
health strategies against the virus. However, artificial intelligence
technologies for detecting face masks have not been deployed at a large scale
in real life to measure the mask-wearing rate in public. In this paper, we
present a two-step face mask detection approach consisting of two separate
modules: 1) face detection and alignment and 2) face mask classification. This
approach allowed us to experiment with different combinations of face detection
and face mask classification modules. More specifically, we experimented with
PyramidKey and RetinaFace as face detectors while maintaining a lightweight
backbone for the face mask classification module. Moreover, we also provide a
relabeled annotation of the test set of the AIZOO dataset, where we rectified
the incorrect labels for some face images. The evaluation results on the AIZOO
and Moxa 3K datasets showed that the proposed face mask detection pipeline
surpassed the state-of-the-art methods. The proposed pipeline also yielded a
higher mAP on the relabeled test set of the AIZOO dataset than the original
test set. Since we trained the proposed model using in-the-wild face images, we
can successfully deploy our model to monitor the mask-wearing rate using public
CCTV images.
Comment: Accepted at the 19th International Joint Conference on Computer
Science and Software Engineering (JCSSE 2022).
Fast Separable Gabor Filter for Fingerprint Enhancement
Abstract. Because a two-dimensional Gabor filter can be separated into a one-dimensional Gaussian low-pass filter and a one-dimensional Gaussian band-pass filter in the perpendicular direction, a new set of separable Gabor filters is implemented for fingerprint enhancement. This separable Gabor filtering runs approximately 2.6 times faster than conventional Gabor filtering with comparable enhancement results. This alternative fingerprint enhancement scheme is very promising for fast practical implementation in the near future.
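The separability this abstract relies on can be illustrated with a small NumPy sketch. It covers only the axis-aligned case (ridge orientation 0), where the separation is exact. The function names, the parameter values, and the naive 2D reference convolution are assumptions made for illustration, not the paper's implementation, which also has to handle arbitrary orientations.

```python
import numpy as np

def gabor_1d_kernels(sigma_x, sigma_y, freq, half_width):
    """Return the two 1D factors of an axis-aligned even-symmetric Gabor filter."""
    t = np.arange(-half_width, half_width + 1, dtype=float)
    # Band-pass factor: Gaussian envelope modulated by a cosine (across the ridges).
    band_pass = np.exp(-t**2 / (2 * sigma_x**2)) * np.cos(2 * np.pi * freq * t)
    # Low-pass factor: plain Gaussian (along the ridges, perpendicular direction).
    low_pass = np.exp(-t**2 / (2 * sigma_y**2))
    return band_pass, low_pass

def filter_separable(image, band_pass, low_pass):
    """Apply the filter as two 1D passes: rows first, then columns."""
    rows = np.apply_along_axis(
        lambda r: np.convolve(r, band_pass, mode="same"), 1, image)
    return np.apply_along_axis(
        lambda c: np.convolve(c, low_pass, mode="same"), 0, rows)

def filter_full_2d(image, kernel_2d):
    """Naive zero-padded 2D convolution, used only as a reference."""
    kh, kw = kernel_2d.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(image, ((ph, ph), (pw, pw)))
    flipped = kernel_2d[::-1, ::-1]  # convolution flips the kernel
    out = np.zeros(image.shape, dtype=float)
    for i in range(image.shape[0]):
        for j in range(image.shape[1]):
            out[i, j] = np.sum(padded[i:i + kh, j:j + kw] * flipped)
    return out

# Hypothetical parameters; the image must be at least as large as the kernel.
band_pass, low_pass = gabor_1d_kernels(sigma_x=4.0, sigma_y=4.0, freq=0.1, half_width=5)
kernel_2d = np.outer(low_pass, band_pass)  # rows index y, columns index x

rng = np.random.default_rng(0)
image = rng.random((32, 32))

separable_result = filter_separable(image, band_pass, low_pass)
full_result = filter_full_2d(image, kernel_2d)
```

For a kernel of width k, the 2D pass costs k² multiplications per pixel while the two 1D passes cost 2k. The speedup measured in practice depends on the implementation, so this sketch makes no claim about reproducing the 2.6 times figure.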