1,105 research outputs found

    Large-Scale Video Classification with Convolutional Neural Networks

    Full text link
    Convolutional Neural Networks (CNNs) have been es-tablished as a powerful class of models for image recog-nition problems. Encouraged by these results, we pro-vide an extensive empirical evaluation of CNNs on large-scale video classification using a new dataset of 1 million YouTube videos belonging to 487 classes. We study multi-ple approaches for extending the connectivity of the a CNN in time domain to take advantage of local spatio-temporal information and suggest a multiresolution, foveated archi-tecture as a promising way of speeding up the training. Our best spatio-temporal networks display significant per-formance improvements compared to strong feature-based baselines (55.3 % to 63.9%), but only a surprisingly mod-est improvement compared to single-frame models (59.3% to 60.9%). We further study the generalization performance of our best model by retraining the top layers on the UCF-101 Action Recognition dataset and observe significant per-formance improvements compared to the UCF-101 baseline model (63.3 % up from 43.9%). 1
    • …
    corecore