Leveraging triplet loss for unsupervised action segmentation
© 2023 IEEE. In this paper, we propose a novel, fully unsupervised framework that learns action representations suitable for the action segmentation task from the single input video itself, without requiring any training data. Our method is a deep metric learning approach rooted in a shallow network with a triplet loss operating on similarity distributions, and a novel triplet selection strategy that effectively models temporal and semantic priors to discover actions in the new representational space. Under these circumstances, we successfully recover temporal boundaries in the learned action representations with higher quality than existing unsupervised approaches. The proposed method is evaluated on two widely used benchmark datasets for the action segmentation task, and it achieves competitive performance by applying a generic clustering algorithm to the learned representations. This work was supported by the project PID2019-110977GA-I00 funded by MCIN/AEI/10.13039/501100011033 and by "ESF Investing in your future". Peer reviewed. Postprint (author's final draft).
Comment: Accepted to the Workshop on Learning with Limited Labelled Data, in
conjunction with CVPR 2023.
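For context, the core idea behind a triplet loss can be sketched with the standard margin formulation. This is a simplification for illustration only: the paper's loss operates on similarity distributions rather than raw embedding distances, and the function name and inputs here are hypothetical.

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard margin-based triplet loss on embedding vectors.

    Pulls the anchor toward the positive and pushes it away from the
    negative until the negative is at least `margin` farther (in squared
    Euclidean distance) than the positive.
    """
    d_pos = np.sum((anchor - positive) ** 2)  # anchor-positive distance
    d_neg = np.sum((anchor - negative) ** 2)  # anchor-negative distance
    return float(max(0.0, d_pos - d_neg + margin))
```

In the action-segmentation setting described above, the triplet selection strategy would decide which frames serve as anchors, positives, and negatives so that temporally close, semantically similar frames are pulled together in the learned space.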
Graph Spectral Image Processing
The recent advent of graph signal processing (GSP) has spurred intensive study
of signals that live naturally on irregular data kernels described by graphs
(e.g., social networks, wireless sensor networks). Although a digital image
contains pixels that reside on a regularly sampled 2D grid, if one can design
an appropriate underlying graph connecting pixels with weights that reflect the
image structure, then one can interpret the image (or image patch) as a signal
on a graph and apply GSP tools to process and analyze the signal in the graph
spectral domain. In this article, we overview recent graph spectral techniques
in GSP specifically for image/video processing. The topics covered include
image compression, image restoration, image filtering, and image segmentation.
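The pipeline the abstract describes (graph construction, graph Fourier transform via the Laplacian eigenbasis, spectral filtering) can be sketched on a toy four-pixel path graph. This is a minimal illustration, not any specific method from the article; in practice the edge weights would reflect image structure, e.g. intensity similarity between neighboring pixels.

```python
import numpy as np

# 4-pixel path graph: unit-weight edges between adjacent pixels
W = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
L = np.diag(W.sum(axis=1)) - W      # combinatorial graph Laplacian
evals, U = np.linalg.eigh(L)        # eigenvectors form the graph Fourier basis

signal = np.array([1.0, 2.0, 10.0, 2.5])  # pixel intensities with an outlier
coeffs = U.T @ signal                     # graph Fourier transform
h = np.exp(-2.0 * evals)                  # low-pass spectral filter kernel
smoothed = U @ (h * coeffs)               # filter, then inverse transform
```

Because the zero-frequency (constant) eigenvector passes through the filter unchanged, the mean intensity is preserved while high-frequency variation, such as the outlier pixel, is attenuated.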
PointASNL: Robust Point Clouds Processing using Nonlocal Neural Networks with Adaptive Sampling
Raw point cloud data inevitably contain outliers or noise introduced during
acquisition by 3D sensors or by reconstruction algorithms. In this paper, we
present a novel end-to-end network for robust point cloud processing, named
PointASNL, which can deal with noisy point clouds effectively. The key
component of our approach is the adaptive sampling (AS) module. It first
re-weights the neighbors around the initial points obtained by farthest point
sampling (FPS), and then adaptively adjusts the sampled points beyond the
entire point cloud. Our AS module not only benefits feature learning on point
clouds but also mitigates the biasing effect of outliers. To further capture
both neighborhood and long-range dependencies of each sampled point, we propose
a local-nonlocal (L-NL) module inspired by the nonlocal operation. The L-NL
module makes the learning process insensitive to noise. Extensive experiments
verify the robustness and superiority of our approach on point cloud processing
tasks across synthetic, indoor, and outdoor data, with or without noise.
Specifically, PointASNL achieves state-of-the-art robust performance on
classification and segmentation tasks on all datasets, and significantly
outperforms previous methods on the real-world outdoor SemanticKITTI dataset,
which contains considerable noise. Our code is released at
https://github.com/yanx27/PointASNL.
Comment: To appear in CVPR 2020. Also seen in
http://kaldir.vc.in.tum.de/scannet_benchmark
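The farthest point sampling (FPS) step that the AS module refines is the standard greedy algorithm: repeatedly pick the point farthest from the points already chosen. A minimal NumPy sketch (not the paper's implementation, which runs batched on GPU tensors):

```python
import numpy as np

def farthest_point_sampling(points, k):
    """Greedily select k well-spread points from an (n, d) array.

    Maintains, for every point, its squared distance to the nearest
    already-chosen point, and picks the point maximizing that distance.
    """
    n = points.shape[0]
    chosen = [0]                      # start from an arbitrary point
    dist = np.full(n, np.inf)         # distance to nearest chosen point
    for _ in range(k - 1):
        d = np.sum((points - points[chosen[-1]]) ** 2, axis=1)
        dist = np.minimum(dist, d)    # update with the latest pick
        chosen.append(int(np.argmax(dist)))
    return np.array(chosen)
```

On clean data this yields an even coverage of the cloud; as the abstract notes, a single far-away outlier maximizes the farthest-distance criterion and is therefore likely to be selected, which is the bias the adaptive sampling module is designed to correct.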