53 research outputs found
Statistical analysis of trajectories on Riemannian manifolds: Bird migration, hurricane tracking and video surveillance
We consider the statistical analysis of trajectories on Riemannian manifolds
that are observed under arbitrary temporal evolutions. Past methods rely on
cross-sectional analysis, with the given temporal registration, and
consequently may lose the mean structure and artificially inflate observed
variances. We introduce a quantity that provides both a cost function for
temporal registration and a proper distance for comparison of trajectories.
This distance is used to define statistical summaries, such as sample means and
covariances, of synchronized trajectories and "Gaussian-type" models to capture
their variability at discrete times. It is invariant to identical time-warpings
(or temporal reparameterizations) of trajectories. This is based on a novel
mathematical representation of trajectories, termed transported square-root
vector field (TSRVF), and the $\mathbb{L}^2$ norm on the space of TSRVFs. We
illustrate this framework using three representative
manifolds ($\mathbb{S}^2$, $\mathbb{SE}(2)$, and the shape space of planar
contours), involving both simulated and real data. In particular, we
demonstrate: (1) improvements in mean structures and significant reductions in
cross-sectional variances using real data sets, (2) statistical modeling for
capturing variability in aligned trajectories, and (3) evaluating random
trajectories under these models. Experimental results concern bird migration,
hurricane tracking and video surveillance. Comment: Published at http://dx.doi.org/10.1214/13-AOAS701 in the Annals of
Applied Statistics (http://www.imstat.org/aoas/) by the Institute of
Mathematical Statistics (http://www.imstat.org)
Elastic Functional Coding of Riemannian Trajectories
Visual observations of dynamic phenomena, such as human actions, are often
represented as sequences of smoothly-varying features. In cases where the
feature spaces can be structured as Riemannian manifolds, the corresponding
representations become trajectories on manifolds. Analysis of these
trajectories is challenging due to non-linearity of underlying spaces and
high-dimensionality of trajectories. In vision problems, given the nature of
physical systems involved, these phenomena are better characterized on a
low-dimensional manifold compared to the space of Riemannian trajectories. For
instance, if one does not impose physical constraints of the human body, in
data involving human action analysis, the resulting representation space will
have highly redundant features. Learning an effective, low-dimensional
embedding for action representations will have a huge impact in the areas of
search and retrieval, visualization, learning, and recognition. The difficulty
lies in inherent non-linearity of the domain and temporal variability of
actions that can distort any traditional metric between trajectories. To
overcome these issues, we use the framework based on transported square-root
velocity fields (TSRVF); this framework has several desirable properties,
including a rate-invariant metric and vector space representations. We propose
to learn an embedding such that each action trajectory is mapped to a single
point in a low-dimensional Euclidean space, and the trajectories that differ
only in temporal rates map to the same point. We utilize the TSRVF
representation, and accompanying statistical summaries of Riemannian
trajectories, to extend existing coding methods such as PCA, KSVD and Label
Consistent KSVD to Riemannian trajectories or more generally to Riemannian
functions. Comment: Under major revision at IEEE T-PAMI, 201
Video Event Understanding with Pattern Theory
We propose a combinatorial approach built on Grenander’s pattern theory to generate semantic interpretations of video events of human activities. The basic units of representations, termed generators, are linked with each other using pairwise connections, termed bonds, that satisfy predefined relations. Different generators are specified for different levels, from (image) features at the bottom level to (human) actions at the highest, providing a rich representation of items in a scene. The resulting configurations of connected generators provide scene interpretations; the inference goal is to parse given video data and generate high-probability configurations. The probabilistic structures are imposed using energies that have contributions from both data (classification scores) and prior information (ontological constraints and concept co-occurrence frequency values). The search for optimal configurations is based on an MCMC, simulated-annealing algorithm that uses simple moves to propose configuration changes and to accept/reject them according to the posterior energy. In contrast to current graphical methods, this framework does not preselect a neighborhood structure but tries to infer it from the data. This framework can potentially handle clutter, i.e. objects/actions that are not related to the main activity, and can infer actions despite some unobserved components. We evaluated the performance of our pattern-theoretic framework using video segments from the YouCook dataset. We verified an overall improvement of more than 50% in recall and 100% in precision compared to a purely machine-learning-based approach.
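The accept/reject search described in this abstract can be illustrated with a generic simulated-annealing loop. The following is only a minimal sketch under stated assumptions, not the authors' implementation: the toy energy, the single-label flip move, the geometric cooling schedule, and all function names (`accept`, `anneal`, `toy_energy`, `toy_propose`) are hypothetical stand-ins for the generator/bond configurations and posterior energy used in the paper.

```python
import math
import random


def accept(delta_energy, temperature, rng):
    """Metropolis rule: always accept downhill moves, sometimes accept uphill ones."""
    if delta_energy <= 0:
        return True
    return rng.random() < math.exp(-delta_energy / temperature)


def anneal(energy, propose, state, steps=2000, t_start=1.0, t_end=0.01, seed=0):
    """Run simulated annealing and return the lowest-energy state visited."""
    rng = random.Random(seed)
    current, current_e = state, energy(state)
    best, best_e = current, current_e
    for k in range(steps):
        # Geometric cooling from t_start to t_end (an assumed schedule).
        temperature = t_start * (t_end / t_start) ** (k / max(steps - 1, 1))
        candidate = propose(current, rng)
        candidate_e = energy(candidate)
        if accept(candidate_e - current_e, temperature, rng):
            current, current_e = candidate, candidate_e
            if current_e < best_e:
                best, best_e = current, current_e
    return best, best_e


def toy_energy(labels):
    """Toy energy: number of adjacent positions whose labels disagree."""
    return sum(1 for a, b in zip(labels, labels[1:]) if a != b)


def toy_propose(labels, rng):
    """Simple move: flip one randomly chosen binary label."""
    i = rng.randrange(len(labels))
    flipped = list(labels)
    flipped[i] = 1 - flipped[i]
    return tuple(flipped)


initial = (0, 1, 0, 1, 0, 1)
best, best_e = anneal(toy_energy, toy_propose, initial)
print(best, best_e)
```

Tracking the best state separately from the current state means the returned energy can never exceed the initial one, even though the chain itself is allowed to move uphill at high temperature.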
Two-person Graph Convolutional Network for Skeleton-based Human Interaction Recognition
Graph convolutional networks (GCNs) have been the predominant methods in
skeleton-based human action recognition, including human-human interaction
recognition. However, when dealing with interaction sequences, current
GCN-based methods simply split the two-person skeleton into two discrete graphs
and perform graph convolution separately as done for single-person action
classification. Such operations ignore rich interactive information and hinder
effective spatial inter-body relationship modeling. To overcome the above
shortcoming, we introduce a novel unified two-person graph to represent
inter-body and intra-body correlations between joints. Experiments show
accuracy improvements in recognizing both interactions and individual actions
when utilizing the proposed two-person graph topology. In addition, we design
several graph labeling strategies to supervise the model to learn discriminant
spatial-temporal interactive features. Finally, we propose a two-person graph
convolutional network (2P-GCN). Our model achieves state-of-the-art results on
four benchmarks of three interaction datasets: SBU, interaction subsets of
NTU-RGB+D and NTU-RGB+D 120.
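As a rough illustration of a unified two-person graph, the sketch below builds one adjacency matrix that holds both intra-body edges (bones of each skeleton) and inter-body edges (links between joints of different people). The abstract does not say which inter-body links are used, so the 5-joint toy skeleton, the hand-to-hand links, and the function name `two_person_adjacency` are all assumptions made for this example.

```python
def two_person_adjacency(num_joints, bones, inter_links):
    """Build a symmetric (2J x 2J) adjacency matrix for two skeletons.

    Joints 0..J-1 belong to person 1 and J..2J-1 to person 2.
    """
    n = 2 * num_joints
    adj = [[0] * n for _ in range(n)]

    def connect(i, j):
        adj[i][j] = 1
        adj[j][i] = 1

    for i in range(n):
        adj[i][i] = 1  # self-loops, as is common in graph convolutions

    for a, b in bones:
        connect(a, b)                            # intra-body edge, person 1
        connect(a + num_joints, b + num_joints)  # intra-body edge, person 2

    for a, b in inter_links:
        connect(a, b + num_joints)               # inter-body edge

    return adj


# Hypothetical toy skeleton: 0=torso, 1=head, 2=left hand, 3=right hand, 4=feet.
BONES = [(0, 1), (0, 2), (0, 3), (0, 4)]
# Assumed inter-body links: every hand of person 1 to every hand of person 2.
INTER = [(2, 2), (2, 3), (3, 2), (3, 3)]

adj = two_person_adjacency(5, BONES, INTER)
for row in adj:
    print(row)
```

Splitting the skeletons into two separate graphs corresponds to leaving the off-diagonal blocks of this matrix at zero; the unified graph fills them in.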
A deep transfer learning network for structural condition identification with limited real-world training data
Structural condition identification based on monitoring data is important for
automatic civil infrastructure asset management. Nevertheless, the monitoring
data is almost always insufficient, because the real-time monitoring data of a
structure only reflects a limited number of structural conditions, while the
number of possible structural conditions is infinite. With insufficient
monitoring data, the identification performance may significantly degrade. This
study aims to tackle this challenge by proposing a deep transfer learning (TL)
approach for structural condition identification. It effectively integrates
physics-based and data-driven methods, by generating various training data
based on the calibrated finite element (FE) model, pretraining a deep learning
(DL) network, and transferring its embedded knowledge to the real
monitoring/testing domain. Its performance is demonstrated in a challenging
case, vibration-based condition identification of steel frame structures with
bolted connection damage. The results show that even though the training data
are from a different domain and with different types of labels, intrinsic
physics can be learned through the pretraining process, and the TL results can
be clearly improved, with the identification accuracy increasing from 81.8% to
89.1%. The comparative studies show that SHMnet with three convolutional layers
stands out as the pretraining DL architecture, with 21.8% and 25.5% higher
identification accuracy values over the other two networks, VGGnet-16 and
ResNet-18. The findings of this study advance the potential application of the
proposed approach towards expert-level condition identification based on
limited real-world training data.
Hierarchical Dense Correlation Distillation for Few-Shot Segmentation-Extended Abstract
Few-shot semantic segmentation (FSS) aims to form class-agnostic models
segmenting unseen classes with only a handful of annotations. Previous methods
limited to the semantic feature and prototype representation suffer from coarse
segmentation granularity and train-set overfitting. In this work, we design
Hierarchically Decoupled Matching Network (HDMNet) mining pixel-level support
correlation based on the transformer architecture. The self-attention modules
are used to assist in establishing hierarchical dense features, as a means to
accomplish the cascade matching between query and support features. Moreover,
we propose a matching module to reduce train-set overfitting and introduce
correlation distillation leveraging semantic correspondence from coarse
resolution to boost fine-grained segmentation. In experiments on the COCO
dataset, our method achieves 50.0% mIoU in the one-shot setting and 56.0% in
the five-shot setting. The code will be available on the
project website. We hope our work can benefit broader industrial applications
where novel classes with limited annotations are required to be decently
identified. Comment: Accepted to CVPR 2023 VISION Workshop, Oral. The extended abstract of
Hierarchical Dense Correlation Distillation for Few-Shot Segmentation. arXiv
admin note: substantial text overlap with arXiv:2303.1465
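For readers unfamiliar with the mIoU figures quoted above, the following is a minimal sketch of mean intersection-over-union on flattened label maps. It assumes the plain per-class definition (intersection divided by union, averaged over the classes that occur); the exact evaluation protocol used for the reported few-shot results may differ, and the function name `mean_iou` is hypothetical.

```python
def mean_iou(pred, target, class_ids):
    """Average per-class IoU over classes present in pred or target."""
    ious = []
    for c in class_ids:
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:  # skip classes absent from both maps
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


# Six pixels, two classes (0 = background, 1 = foreground).
pred = [0, 0, 1, 1, 1, 0]
target = [0, 1, 1, 1, 0, 0]
print(mean_iou(pred, target, class_ids=[0, 1]))  # 0.5
```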
Rate-invariant analysis of covariance trajectories
Statistical analysis of dynamic systems, such as videos and dynamic functional connectivity, is often translated into a problem of analyzing trajectories of relevant features, particularly covariance matrices. As an example, in video-based action recognition, a natural mathematical representation of activity videos is as parameterized trajectories on the set of symmetric, positive-definite matrices (SPDMs). The variable execution-rates of actions, implying arbitrary parameterizations of trajectories, complicate their analysis and classification. To handle this challenge, we represent covariance trajectories using transported square-root vector fields (TSRVFs), constructed by parallel translating scaled-velocity vectors of trajectories to their starting points. The space of such representations forms a vector bundle on the SPDM manifold. Using a natural Riemannian metric on this vector bundle, we approximate geodesic paths and geodesic distances between trajectories in the quotient space of this vector bundle. This metric is invariant to the action of the reparameterization group, and leads to a rate-invariant analysis of trajectories. In the process, we remove the parameterization variability and temporally register trajectories during analysis. We demonstrate this framework in multiple contexts, using both generative statistical models and discriminative data analysis. The latter is illustrated using several applications involving video-based action recognition and dynamic functional connectivity analysis.
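The TSRVF construction (scale each velocity vector by the square root of its norm, then parallel translate it to the starting point) can be sketched on a simpler manifold than the SPDM space. The example below is an illustration under explicit assumptions, not the paper's method: it uses the unit sphere in place of the SPDM manifold, forward finite differences for the velocity, and parallel transport along the shortest great-circle arc to the starting point. Transport along the trajectory itself can give a different result on a curved space. All function names are hypothetical.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def norm(u):
    return math.sqrt(dot(u, u))


def parallel_transport(v, p, q):
    """Move tangent vector v from p to q along the shortest arc (requires p != -q)."""
    scale = dot(v, q) / (1.0 + dot(p, q))
    return [vi - scale * (pi + qi) for vi, pi, qi in zip(v, p, q)]


def tsrvf(path, reference):
    """Discrete TSRVF of a path of unit vectors sampled uniformly on [0, 1]."""
    dt = 1.0 / (len(path) - 1)
    fields = []
    for here, nxt in zip(path, path[1:]):
        # Forward difference, projected onto the tangent space at `here`.
        raw = [(b - a) / dt for a, b in zip(here, nxt)]
        radial = dot(raw, here)
        velocity = [r - radial * h for r, h in zip(raw, here)]
        speed = norm(velocity)
        # Square-root scaling: velocity / sqrt(|velocity|); zero stays zero.
        scaled = [x / math.sqrt(speed) for x in velocity] if speed > 0 else velocity
        fields.append(parallel_transport(scaled, here, reference))
    return fields


# Quarter great circle from (1, 0, 0) to (0, 1, 0), traversed at constant speed.
path = [[math.cos(0.5 * math.pi * k / 50), math.sin(0.5 * math.pi * k / 50), 0.0]
        for k in range(51)]
fields = tsrvf(path, path[0])
print(fields[0], fields[-1])
```

For this constant-speed arc every transported vector lands in the tangent space at the starting point with the same length, which is the property that lets trajectories be compared as curves in a single vector space.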
Statistical analysis of the community lockdown for COVID-19 pandemic
As the global pandemic of the COVID-19 continues, the statistical modeling and analysis of the spreading process of COVID-19 have attracted widespread attention. Various propagation simulation models have been proposed to predict the spread of the epidemic and the effectiveness of related control measures. These models play an indispensable role in understanding the complex dynamic situation of the epidemic. Most existing work studies the spread of epidemic at two levels including population and agent. However, there is no comprehensive statistical analysis of community lockdown measures and corresponding control effects. This paper performs a statistical analysis of the effectiveness of community lockdown based on the Agent-Level Pandemic Simulation (ALPS) model. We propose a statistical model to analyze multiple variables affecting the COVID-19 pandemic, which include the timings of implementing and lifting lockdown, the crowd mobility, and other factors. Specifically, a motion model followed by ALPS and related basic assumptions is discussed first. Then the model has been evaluated using the real data of COVID-19. The simulation study and comparison with real data have validated the effectiveness of our model.
