8 research outputs found

Amortized Variational Inference: Towards the Mathematical Foundation and Review

Author: Ganguly Ankush
Jain Sanjana
Watchareeruetai Ukrit
Publication date: 22/09/2022

The core principle of Variational Inference (VI) is to convert the statistical inference problem of computing complex posterior probability densities into a tractable optimization problem. This property enables VI to be faster than several sampling-based techniques. However, the traditional VI algorithm is not scalable to large data sets and is unable to readily infer out-of-bounds data points without re-running the optimization process. Recent developments in the field, such as stochastic, black-box, and amortized VI, have helped address these issues. Generative modeling tasks nowadays widely make use of amortized VI for its efficiency and scalability, as it utilizes a parameterized function to learn the approximate posterior density parameters. With this paper, we review the mathematical foundations of various VI techniques to form the basis for understanding amortized VI. Additionally, we provide an overview of the recent trends that address several issues of amortized VI, such as the amortization gap, generalization issues, inconsistent representation learning, and posterior collapse. Finally, we analyze alternate divergence measures that improve VI optimization.
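
As a rough illustration of the optimization problem this abstract refers to, the standard evidence lower bound (ELBO) can be written as follows. The notation is an assumption for this sketch and is not taken from the paper: x is an observation, z a latent variable, p_theta the generative model, q the approximate posterior, lambda_i the variational parameters of data point x_i, and f_phi a hypothetical inference network.

```latex
% ELBO: maximizing it over the variational parameters tightens the bound on log p(x).
\log p_\theta(x) \;\ge\; \mathcal{L}(\theta, \lambda; x)
  = \mathbb{E}_{q_\lambda(z)}\big[\log p_\theta(x, z) - \log q_\lambda(z)\big]
  = \log p_\theta(x) - \mathrm{KL}\big(q_\lambda(z) \,\|\, p_\theta(z \mid x)\big)

% Traditional VI: one separate optimization per data point x_i.
\lambda_i^{*} = \arg\max_{\lambda} \; \mathcal{L}(\theta, \lambda; x_i)

% Amortized VI: a shared parameterized function predicts the parameters.
\lambda_i = f_\phi(x_i), \qquad
\phi^{*} = \arg\max_{\phi} \; \sum_{i} \mathcal{L}\big(\theta, f_\phi(x_i); x_i\big)
```

Under these assumptions, a new data point only needs one evaluation of f_phi rather than a fresh optimization, which is the scalability benefit the abstract describes.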

arXiv.org e-Print Archive

FastCLIPstyler: Optimisation-free Text-based Image Style Transfer Using Style Representations

Author: Ganguly Ankush
Jain Sanjana
Noinongyao Pavit
Samacoits Aubin
Suresh Ananda Padhmanabhan
Watchareeruetai Ukrit
Publication date: 14/11/2023

In recent years, language-driven artistic style transfer has emerged as a new type of style transfer technique, eliminating the need for a reference style image by using natural language descriptions of the style. The first model to achieve this, called CLIPstyler, has demonstrated impressive stylisation results. However, its lengthy optimisation procedure at runtime for each query limits its suitability for many practical applications. In this work, we present FastCLIPstyler, a generalised text-based image style transfer model capable of stylising images in a single forward pass for arbitrary text inputs. Furthermore, we introduce EdgeCLIPstyler, a lightweight model designed for compatibility with resource-constrained devices. Through quantitative and qualitative comparisons with state-of-the-art approaches, we demonstrate that our models achieve superior stylisation quality based on measurable metrics while offering significantly improved runtime efficiency, particularly on edge devices. Comment: Accepted at the 2024 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV 2024)

arXiv.org e-Print Archive

Development of a face mask detection pipeline for mask-wearing monitoring in the era of the COVID-19 pandemic: A modular approach

Author: Boonmanunt Suparee
Earp Samuel W. F.
Ganguly Ankush
Kitiyakara Taya
Sommana Benjaphan
Thammasudjarit Ratchainant
Watchareeruetai Ukrit
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/08/2022

During the SARS-CoV-2 pandemic, mask-wearing became an effective tool to prevent spreading and contracting the virus. The ability to monitor the mask-wearing rate in the population would be useful for determining public health strategies against the virus. However, artificial intelligence technologies for detecting face masks have not been deployed at a large scale in real life to measure the mask-wearing rate in public. In this paper, we present a two-step face mask detection approach consisting of two separate modules: 1) face detection and alignment and 2) face mask classification. This approach allowed us to experiment with different combinations of face detection and face mask classification modules. More specifically, we experimented with PyramidKey and RetinaFace as face detectors while maintaining a lightweight backbone for the face mask classification module. Moreover, we also provide a relabeled annotation of the test set of the AIZOO dataset, where we rectified the incorrect labels for some face images. The evaluation results on the AIZOO and Moxa 3K datasets showed that the proposed face mask detection pipeline surpassed the state-of-the-art methods. The proposed pipeline also yielded a higher mAP on the relabeled test set of the AIZOO dataset than the original test set. Since we trained the proposed model using in-the-wild face images, we can successfully deploy our model to monitor the mask-wearing rate using public CCTV images. Comment: Accepted at the 19th International Joint Conference on Computer Science and Software Engineering (JCSSE 2022)

arXiv.org e-Print Archive

Fast Separable Gabor Filter for Fingerprint Enhancement

Author: Sawasd Tantaratana
Ukrit Watchareeruetai
Vutipong Areekul
Publication venue: Springer
Publication date: 01/01/2004

Abstract. Since a two-dimensional Gabor filter can be separated into a one-dimensional Gaussian low-pass filter and a one-dimensional Gaussian band-pass filter in the perpendicular direction, a new set of separable Gabor filters is implemented for fingerprint enhancement. This separable Gabor filtering runs approximately 2.6 times faster than conventional Gabor filtering with comparable enhancement results. This alternative fingerprint enhancement scheme is very promising for practical fast implementation in the near future.
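
The separability described in this abstract can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the authors' implementation: it covers a single axis-aligned orientation, and the function names, kernel size, sigma, and frequency values are all hypothetical choices made for the example.

```python
import numpy as np

def gabor_1d_factors(size=11, sigma=3.0, freq=0.1):
    """Return the two 1-D factors of an axis-aligned Gabor kernel.

    size, sigma and freq are illustrative values, not taken from the paper.
    """
    half = size // 2
    t = np.arange(-half, half + 1, dtype=float)
    gauss = np.exp(-t**2 / (2.0 * sigma**2))
    band_pass = gauss * np.cos(2.0 * np.pi * freq * t)  # Gaussian-windowed cosine
    low_pass = gauss                                     # plain Gaussian
    return band_pass, low_pass

def filter_separable(image, band_pass, low_pass):
    """Apply the band-pass factor along rows, then the low-pass along columns."""
    rows = np.apply_along_axis(np.convolve, 1, image, band_pass, mode="same")
    return np.apply_along_axis(np.convolve, 0, rows, low_pass, mode="same")

def filter_full_2d(image, kernel):
    """Direct zero-padded 2-D convolution, used here only as a reference."""
    kh, kw = kernel.shape
    padded = np.pad(image, ((kh // 2, kh // 2), (kw // 2, kw // 2)))
    flipped = kernel[::-1, ::-1]
    out = np.empty(image.shape, dtype=float)
    for i in range(image.shape[0]):
        for j in range(image.shape[1]):
            out[i, j] = np.sum(padded[i:i + kh, j:j + kw] * flipped)
    return out

rng = np.random.default_rng(0)
image = rng.random((32, 32))          # stand-in for a fingerprint patch

band_pass, low_pass = gabor_1d_factors()
kernel_2d = np.outer(low_pass, band_pass)   # rows: low-pass, columns: band-pass

separable = filter_separable(image, band_pass, low_pass)
full = filter_full_2d(image, kernel_2d)

# Both routes give the same result, but the separable one needs about
# 2 * size multiplications per pixel instead of size * size.
print(np.allclose(separable, full))
```

For a kernel of width K, the two 1-D passes cost on the order of 2K multiplications per pixel against K squared for the full 2-D kernel. The measured speed-up reported in the abstract depends on the authors' own implementation and on how other orientations are handled, which this sketch does not address.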

CiteSeerX

Construction of image feature extractors based on multi-objective genetic programming with redundancy regulations

Author: Kudo Hiroaki
Matsumoto Tetsuya
Ohnishi Noboru
Takeuchi Yoshinori
Watchareeruetai Ukrit
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'

Institutional Repositories DataBase (IRDB)

Multi-Objective Genetic Programming with Redundancy-Regulations for Automatic Construction of Image Feature Extractors

Author: KUDO Hiroaki
MATSUMOTO Tetsuya
OHNISHI Noboru
TAKEUCHI Yoshinori
WATCHAREERUETAI Ukrit
Publication venue: 'The Institute of Electronics, Information and Communication Engineers'

Institutional Repositories DataBase (IRDB)

Acceleration of Genetic Programming by Hierarchical Structure Learning: A Case Study on Image Recognition Program Synthesis

Author: KUDO Hiroaki
MATSUMOTO Tetsuya
OHNISHI Noboru
TAKEUCHI Yoshinori
WATCHAREERUETAI Ukrit
Publication venue: 'The Institute of Electronics, Information and Communication Engineers'

Institutional Repositories DataBase (IRDB)