Compact color texture descriptor based on rank transform and product ordering in the RGB color space
Robust recognition and segmentation of human actions using HMMs with missing observations
This paper describes the integration of missing observation data with hidden Markov models to create a framework that is able to segment and classify individual actions from a stream of human motion using an incomplete 3D human pose estimation. Based on this framework, a model is trained to automatically segment and classify an activity sequence into its constituent subactions during inferencing. This is achieved by introducing action labels into the observation vector and setting these labels as missing data during inferencing, thus forcing the system to infer the probability of each action label. Additionally, missing data provides recognition-level support for occlusions and imperfect silhouette segmentation, permitting the use of a fast (real-time) pose estimation that delegates the burden of handling undetected limbs onto the action recognition system. Findings show that the use of missing data to segment activities is an accurate and elegant approach. Furthermore, action recognition can be accurate even when almost half of the pose feature data is missing due to occlusions, since not all of the pose data is important all of the time
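The label-as-missing-data idea in the abstract above can be illustrated with a small sketch. This is not the paper's model: it assumes a discrete HMM whose observation vector has two components (a pose symbol and an action label) that are conditionally independent given the hidden state, and every table and name below is a hypothetical toy value. A missing component (`None`) is marginalized out by contributing a factor of 1 to the emission likelihood, so the same code handles withheld labels and occluded pose features.

```python
# Toy sketch (hypothetical model, not the paper's): forward filtering in a
# discrete HMM where any component of the observation vector may be missing.

def emission_likelihood(state, obs, emit_tables):
    """P(observed components | state); None components are marginalized out."""
    lik = 1.0
    for table, value in zip(emit_tables, obs):
        if value is not None:          # a missing value contributes factor 1
            lik *= table[state][value]
    return lik

def forward_filter(obs_seq, start, trans, emit_tables):
    """Return the filtered state posterior P(state_t | obs_1..t) per frame."""
    n = len(start)
    posteriors, prev = [], None
    for obs in obs_seq:
        if prev is None:
            prior = list(start)
        else:
            prior = [sum(prev[i] * trans[i][j] for i in range(n)) for j in range(n)]
        unnorm = [prior[s] * emission_likelihood(s, obs, emit_tables) for s in range(n)]
        total = sum(unnorm)
        prev = [u / total for u in unnorm]
        posteriors.append(prev)
    return posteriors

def label_posterior(state_post, label_table):
    """P(label | observations so far), obtained by summing over hidden states."""
    n_labels = len(label_table[0])
    return [sum(p * label_table[s][l] for s, p in enumerate(state_post))
            for l in range(n_labels)]

# Assumed toy parameters: two hidden states, two pose symbols, two action labels.
start = [0.5, 0.5]
trans = [[0.9, 0.1], [0.1, 0.9]]
pose_table = [[0.9, 0.1], [0.2, 0.8]]    # P(pose symbol | state)
label_table = [[1.0, 0.0], [0.0, 1.0]]   # P(action label | state)

# At inference time the label is always missing; the third frame also has a
# missing pose symbol, standing in for an occlusion.
obs_seq = [(0, None), (0, None), (None, None), (1, None), (1, None)]

posteriors = forward_filter(obs_seq, start, trans, [pose_table, label_table])
label_probs = [label_posterior(p, label_table) for p in posteriors]
decoded = [max(range(len(lp)), key=lp.__getitem__) for lp in label_probs]
print(decoded)
```

On this toy sequence the most probable label changes between the third and fourth frames, which is the segmentation boundary; the fully occluded third frame is carried by the transition model alone.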
Improving End-to-End Text Image Translation From the Auxiliary Text Translation Task
End-to-end text image translation (TIT), which aims at translating the source
language embedded in images to the target language, has attracted intensive
attention in recent research. However, data sparsity limits the performance of
end-to-end text image translation. Multi-task learning is a non-trivial way to
alleviate this problem via exploring knowledge from complementary related
tasks. In this paper, we propose a novel text translation enhanced text image
translation, which trains the end-to-end model with text translation as an
auxiliary task. By sharing model parameters and multi-task training, our model
is able to take full advantage of easily-available large-scale text parallel
corpus. Extensive experimental results show our proposed method outperforms
existing end-to-end methods, and the joint multi-task learning with both text
translation and recognition tasks achieves better results, proving translation
and recognition auxiliary tasks are complementary. Comment: Accepted at the 26th International Conference on Pattern Recognition
(ICPR 2022)
Video enhancement using adaptive spatio-temporal connective filter and piecewise mapping
This paper presents a novel video enhancement system based on an adaptive spatio-temporal connective (ASTC) noise filter and an adaptive piecewise mapping function (APMF). For ill-exposed or very noisy videos, we first introduce a novel local image statistic to identify impulse noise pixels, and then incorporate it into the classical bilateral filter to form ASTC, which aims to reduce a mixture of the two most common types of noise, Gaussian and impulse, in both the spatial and temporal directions. After noise removal, we enhance the video contrast with APMF based on the statistical information of frame segmentation results. Experimental results demonstrate that, for diverse low-quality videos corrupted by mixed noise, underexposure, overexposure, or any mixture of the above, the proposed system can automatically produce satisfactory results.
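For orientation, here is a minimal sketch of the classical bilateral filter that the abstract above builds on, reduced to a 1D signal. It is not the ASTC filter: the paper's impulse-detection statistic and temporal extension are not described in the abstract and are not reproduced here. The function name, parameter names, and default values are illustrative assumptions.

```python
import math

def bilateral_filter_1d(signal, radius=2, sigma_space=1.0, sigma_range=10.0):
    """Classical bilateral filter on a 1D list of intensities.

    Each output sample is a weighted mean of its neighbours, where the weight
    is the product of a spatial Gaussian (distance in index) and a range
    Gaussian (difference in intensity from the centre sample).
    """
    out = []
    for i, center in enumerate(signal):
        num = 0.0
        den = 0.0
        lo = max(0, i - radius)
        hi = min(len(signal), i + radius + 1)
        for j in range(lo, hi):
            w_space = math.exp(-((i - j) ** 2) / (2 * sigma_space ** 2))
            w_range = math.exp(-((signal[j] - center) ** 2) / (2 * sigma_range ** 2))
            w = w_space * w_range
            num += w * signal[j]
            den += w
        out.append(num / den)   # den >= 1 because the centre sample has weight 1
    return out

# A noisy step edge: small fluctuations around 10, then around 100.
noisy_step = [10, 12, 9, 11, 10, 100, 102, 99, 101, 100]
smoothed = bilateral_filter_1d(noisy_step)
print([round(v, 2) for v in smoothed])
```

The range weight keeps samples on opposite sides of the step from averaging together, so the edge survives while the small fluctuations shrink. The same property means a plain bilateral filter tends to leave an isolated impulse pixel largely untouched, which is consistent with the abstract's choice to add a separate impulse-noise statistic.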
Robust Recognition and Segmentation of Human Actions Using HMMs with Missing Observations (EURASIP Journal on Applied Signal Processing 2005:13, 2110–2126, © 2005 Hindawi Publishing Corporation)
This paper describes the integration of missing observation data with hidden Markov models to create a framework that is able to segment and classify individual actions from a stream of human motion using an incomplete 3D human pose estimation. Based on this framework, a model is trained to automatically segment and classify an activity sequence into its constituent subactions during inferencing. This is achieved by introducing action labels into the observation vector and setting these labels as missing data during inferencing, thus forcing the system to infer the probability of each action label. Additionally, missing data provides recognition-level support for occlusions and imperfect silhouette segmentation, permitting the use of a fast (real-time) pose estimation that delegates the burden of handling undetected limbs onto the action recognition system. Findings show that the use of missing data to segment activities is an accurate and elegant approach. Furthermore, action recognition can be accurate even when almost half of the pose feature data is missing due to occlusions, since not all of the pose data is important all of the time