4 research outputs found
Model and Feature Selection in Hidden Conditional Random Fields with Group Regularization
Proceedings of: 8th International Conference on Hybrid Artificial Intelligence Systems (HAIS 2013). Salamanca, September 11-13, 2013.Sequence classification is an important problem in computer vision, speech analysis or computational biology. This paper presents a new training strategy for the Hidden Conditional Random Field sequence classifier incorporating model and feature selection. The standard Lasso regularization employed in the estimation of model parameters is replaced by overlapping group-L1 regularization. Depending on the configuration of the overlapping groups, model selection, feature selection,or both are performed. The sequence classifiers trained in this way have better predictive performance. The application of the proposed method in a human action recognition task confirms that fact.This work was supported in part by Projects MINECO TEC2012-37832-C02-01, CICYT TEC2011-28626-C02-02, CAM CONTEXTS (S2009/TIC-1485)Publicad
With a Little Help from my Temporal Context: Multimodal Egocentric Action Recognition
In egocentric videos, actions occur in quick succession. We capitalise on the
action's temporal context and propose a method that learns to attend to
surrounding actions in order to improve recognition performance. To incorporate
the temporal context, we propose a transformer-based multimodal model that
ingests video and audio as input modalities, with an explicit language model
providing action sequence context to enhance the predictions. We test our
approach on EPIC-KITCHENS and EGTEA datasets reporting state-of-the-art
performance. Our ablations showcase the advantage of utilising temporal context
as well as incorporating audio input modality and language model to rescore
predictions. Code and models at: https://github.com/ekazakos/MTCN.Comment: Accepted at BMVC 202