
    Rethinking Spatiotemporal Feature Learning: Speed-Accuracy Trade-offs in Video Classification

    Despite the steady progress in video analysis led by the adoption of convolutional neural networks (CNNs), the relative improvement has been less drastic than that in 2D static image classification. Three main challenges exist: spatial (image) feature representation, temporal information representation, and model/computation complexity. It was recently shown by Carreira and Zisserman that 3D CNNs, inflated from 2D networks and pretrained on ImageNet, could be a promising way for spatial and temporal representation learning. However, as for model/computation complexity, 3D CNNs are much more expensive than 2D CNNs and prone to overfitting. We seek a balance between speed and accuracy by building an effective and efficient video classification system through systematic exploration of critical network design choices. In particular, we show that it is possible to replace many of the 3D convolutions with low-cost 2D convolutions. Rather surprisingly, the best result (in both speed and accuracy) is achieved when replacing the 3D convolutions at the bottom of the network, suggesting that temporal representation learning on high-level semantic features is more useful. Our conclusion generalizes to datasets with very different properties. When combined with several other cost-effective designs, including separable spatial/temporal convolution and feature gating, our approach yields an effective video classification system that produces very competitive results on several action classification benchmarks (Kinetics, Something-something, UCF101 and HMDB), as well as two action detection (localization) benchmarks (JHMDB and UCF101-24). Comment: ECCV 2018 camera read

    Music 2025 : The Music Data Dilemma: issues facing the music industry in improving data management

    © Crown Copyright 2019. 'Music 2025' investigates the infrastructure issues around the management of digital data in an increasingly stream-driven industry. The findings are the culmination of over 50 interviews with high-profile music industry representatives across the sector and reflect key issues as well as areas of consensus and contrasting views. The findings reveal that, whilst there are great examples of data initiatives across the value chain, there are opportunities to improve efficiency and interoperability.

    Stochastic Prediction of Multi-Agent Interactions from Partial Observations

    We present a method that learns to integrate temporal information, from a learned dynamics model, with ambiguous visual information, from a learned vision model, in the context of interacting agents. Our method is based on a graph-structured variational recurrent neural network (Graph-VRNN), which is trained end-to-end to infer the current state of the (partially observed) world, as well as to forecast future states. We show that our method outperforms various baselines on two sports datasets, one based on real basketball trajectories, and one generated by a soccer game engine. Comment: ICLR 2019 camera read

    T Cell Responses during Acute Respiratory Virus Infection

    This article is made available for unrestricted research re-use and secondary analysis in any form or by any means with acknowledgement of the original source. These permissions are granted for the duration of the World Health Organization (WHO) declaration of COVID-19 as a global pandemic. The T cell response is an integral and essential part of the host immune response to acute virus infection. Each viral pathogen has unique, frequently nuanced, aspects to its replication, which affects the host response and as a consequence the capacity of the virus to produce disease. There are, however, common features to the T cell response to viruses, which produce acute limited infection. This is true whether virus replication is restricted to a single site, for example, the respiratory tract (RT), CNS etc., or replication is in multiple sites throughout the body. In describing below the acute T cell response to virus infection, we employ acute virus infection of the RT as a convenient model to explore this process of virus infection and the host response. We divide the process into three phases: the induction (initiation) of the response, the expression of antiviral effector activity resulting in virus elimination, and the resolution of inflammation with restoration of tissue homeostasis.

    An Optical-Infrared Study of the Young Multipolar Planetary Nebula NGC 6644

    High-resolution HST imaging of the compact planetary nebula NGC 6644 has revealed two pairs of bipolar lobes and a central ring lying close to the plane of the sky. From mid-infrared imaging obtained with the Gemini Telescope, we have found a dust torus which is oriented nearly perpendicular to one pair of the lobes. We suggest that NGC 6644 is a multipolar nebula and have constructed a 3-D model which allows the visualization of the object from different lines of sight. These results suggest that NGC 6644 may have intrinsic structures similar to those of other multipolar nebulae and that the phenomenon of multipolar nebulosity may be more common than previously believed. Comment: 31 pages, 13 figures, accepted for publication in Ap

    Materials Story of Sir John Soane’s Life

    Based on research about cognitive architecture in mental and physical space, this paper describes a project which mapped phases of Sir John Soane’s life story onto particular building materials relevant to both the museum space and his life.