Benchmarking Data Efficiency and Computational Efficiency of Temporal Action Localization Models
In temporal action localization, given an input video, the goal is to predict
which actions it contains, where they begin, and where they end. Training and
testing current state-of-the-art deep learning models requires access to large
amounts of data and computational power. However, gathering such data is
challenging and computational resources might be limited. This work explores
and measures how current deep temporal action localization models perform in
settings constrained by the amount of data or computational power. We measure
data efficiency by training each model on a subset of the training set. We find
that TemporalMaxer outperforms other models in data-limited settings.
Furthermore, we recommend TriDet when training time is limited. To test the
efficiency of the models during inference, we pass videos of different lengths
through each model. We find that TemporalMaxer requires the least computational
resources, likely due to its simple architecture.
Comment: Accepted to the CVEU workshop at ICCV 202
Efficient Temporal Action Localization model development practices: A review and analysis of models and a guide of best methods
Temporal Action Localization (TAL) is an important problem in computer vision, with uses in video surveillance and recommendation, healthcare, entertainment, and human-computer interaction. Because TAL is inherently data-heavy, it has been bound by the availability of computing power, which has slowed its pace of innovation. This work aims to accelerate the development of TAL models by conducting a short review of the state of the art in TAL and by providing extensive data about the data and compute efficiency of the latest models. By studying how TAL models perform in limited-data and limited-compute settings, we find that using less data than is available often helps when iterating on a model quickly, while in some cases TAL is constrained by the limited amount of data. Finally, we provide general guidelines that form a simple framework for efficient TAL model development.
CSE3000 Research Project
Computer Science and Engineerin