Rethinking Spatiotemporal Feature Learning: Speed-Accuracy Trade-offs in Video Classification
Despite the steady progress in video analysis led by the adoption of
convolutional neural networks (CNNs), the relative improvement has been less
drastic than that in 2D static image classification. Three main challenges
exist: spatial (image) feature representation, temporal information
representation, and model/computation complexity. It was recently shown by
Carreira and Zisserman that 3D CNNs, inflated from 2D networks and pretrained
on ImageNet, could be a promising way for spatial and temporal representation
learning. However, as for model/computation complexity, 3D CNNs are much more
expensive than 2D CNNs and prone to overfit. We seek a balance between speed
and accuracy by building an effective and efficient video classification system
through systematic exploration of critical network design choices. In
particular, we show that it is possible to replace many of the 3D convolutions
by low-cost 2D convolutions. Rather surprisingly, the best result (in both speed
and accuracy) is achieved when replacing the 3D convolutions at the bottom of
the network, suggesting that temporal representation learning on high-level
semantic features is more useful. Our conclusion generalizes to datasets with
very different properties. When combined with several other cost-effective
designs including separable spatial/temporal convolution and feature gating,
this yields an effective video classification system that
produces very competitive results on several action classification benchmarks
(Kinetics, Something-something, UCF101 and HMDB), as well as two action
detection (localization) benchmarks (JHMDB and UCF101-24).
Comment: ECCV 2018 camera read
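As a rough illustration of why the separable spatial/temporal convolution mentioned in the abstract is cost-effective, the sketch below counts the weights of a full 3D convolution and of a spatial convolution followed by a temporal one. The kernel size of 3, the channel width of 64, and the function names are assumptions chosen for illustration, not values or code from the paper.

```python
def conv3d_params(c_in, c_out, kt, kh, kw):
    """Weight count of a 3D convolution with a kt x kh x kw kernel (bias ignored)."""
    return c_in * c_out * kt * kh * kw


def separable_params(c_in, c_out, k=3):
    """Weight count of a 1 x k x k spatial convolution followed by a
    k x 1 x 1 temporal convolution (assumed factorization, bias ignored)."""
    spatial = conv3d_params(c_in, c_out, 1, k, k)
    temporal = conv3d_params(c_out, c_out, k, 1, 1)
    return spatial + temporal


# Hypothetical layer: 64 input channels, 64 output channels.
full = conv3d_params(64, 64, 3, 3, 3)
separable = separable_params(64, 64)
print(f"full 3D: {full}, separable: {separable}, ratio: {separable / full:.2f}")
```

Under these assumptions the separable layer needs fewer than half the weights of the full 3D layer; the actual savings in any given network depend on its channel widths and kernel sizes.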
Music 2025: The Music Data Dilemma: issues facing the music industry in improving data management
© Crown Copyright 2019. ‘Music 2025’ investigates the infrastructure issues around the management of digital data in an increasingly stream-driven industry. The findings are the culmination of over 50 interviews with high-profile music industry representatives across the sector and reflect key issues as well as areas of consensus and contrasting views. The findings reveal that, whilst there are great examples of data initiatives across the value chain, there are opportunities to improve efficiency and interoperability.
Stochastic Prediction of Multi-Agent Interactions from Partial Observations
We present a method that learns to integrate temporal information, from a
learned dynamics model, with ambiguous visual information, from a learned
vision model, in the context of interacting agents. Our method is based on a
graph-structured variational recurrent neural network (Graph-VRNN), which is
trained end-to-end to infer the current state of the (partially observed)
world, as well as to forecast future states. We show that our method
outperforms various baselines on two sports datasets, one based on real
basketball trajectories, and one generated by a soccer game engine.
Comment: ICLR 2019 camera read
T Cell Responses during Acute Respiratory Virus Infection
The T cell response is an integral and essential part of the host immune response to acute virus infection. Each viral pathogen has unique, frequently nuanced, aspects to its replication, which affect the host response and, as a consequence, the capacity of the virus to produce disease. There are, however, common features to the T cell response to viruses that produce acute limited infection. This is true whether virus replication is restricted to a single site, for example the respiratory tract (RT) or the CNS, or occurs at multiple sites throughout the body. In describing below the acute T cell response to virus infection, we employ acute virus infection of the RT as a convenient model to explore this process of virus infection and the host response. We divide the process into three phases: the induction (initiation) of the response, the expression of antiviral effector activity resulting in virus elimination, and the resolution of inflammation with restoration of tissue homeostasis.
An Optical-Infrared Study of the Young Multipolar Planetary Nebula NGC 6644
High-resolution HST imaging of the compact planetary nebula NGC 6644 has
revealed two pairs of bipolar lobes and a central ring lying close to the plane
of the sky. From mid-infrared imaging obtained with the Gemini Telescope, we
have found a dust torus which is oriented nearly perpendicular to one pair of
the lobes. We suggest that NGC 6644 is a multipolar nebula and have constructed
a 3-D model which allows the visualization of the object from different lines
of sight. These results suggest that NGC 6644 may have intrinsic structures
similar to those of other multipolar nebulae and that the phenomenon of
multipolar nebulosity may be more common than previously believed.
Comment: 31 pages, 13 figures, accepted for publication in Ap
Materials Story of Sir John Soane’s Life
Based on research about cognitive architecture in mental and physical space, this paper describes a project which mapped phases of Sir John Soane’s life story onto particular building materials relevant to both the museum space and his life.