Hierarchically Self-Supervised Transformer for Human Skeleton Representation Learning
Despite the success of fully-supervised human skeleton sequence modeling,
utilizing self-supervised pre-training for skeleton sequence representation
learning has been an active field because acquiring task-specific skeleton
annotations at large scales is difficult. Recent studies focus on learning
video-level temporal and discriminative information using contrastive learning,
but overlook the hierarchical spatial-temporal nature of human skeletons.
Different from such superficial supervision at the video level, we propose a
self-supervised hierarchical pre-training scheme incorporated into a
hierarchical Transformer-based skeleton sequence encoder (Hi-TRS), to
explicitly capture spatial, short-term, and long-term temporal dependencies at
frame, clip, and video levels, respectively. To evaluate the proposed
self-supervised pre-training scheme with Hi-TRS, we conduct extensive
experiments covering three skeleton-based downstream tasks including action
recognition, action detection, and motion prediction. Under both supervised and
semi-supervised evaluation protocols, our method achieves state-of-the-art
performance. Additionally, we demonstrate that the prior knowledge learned by
our model in the pre-training stage has strong transfer capability for
different downstream tasks.
Comment: Accepted to ECCV 202
VOX: an extensible natural language processor
VOX is a Natural Language Processor whose knowledge can be extended by interaction with a user. VOX consists of a text analyzer and an extensibility system that share a knowledge base. The extensibility system lets the user add vocabulary, concepts, phrases, events, and scenarios to the knowledge base. The analyzer uses information obtained in this way to understand previously unhandled text. The underlying knowledge representation of VOX, called Conceptual Grammar, has been developed to meet the severe requirements of extensibility. Conceptual Grammar uniformly represents syntactic and semantic information, and permits modular addition of knowledge.
What does semantic tiling of the cortex tell us about semantics?
Recent use of voxel-wise modeling in cognitive neuroscience suggests that semantic maps tile the cortex. Although this impressive research establishes distributed cortical areas active during the conceptual processing that underlies semantics, it tells us little about the nature of this processing. While mapping concepts between Marr's computational and implementation levels to support neural encoding and decoding, this approach ignores Marr's algorithmic level, central for understanding the mechanisms that implement cognition, in general, and conceptual processing, in particular. Following decades of research in cognitive science and neuroscience, what do we know so far about the representation and processing mechanisms that implement conceptual abilities? Most basically, much is known about the mechanisms associated with: (1) features and frame representations, (2) grounded, abstract, and linguistic representations, (3) knowledge-based inference, (4) concept composition, and (5) conceptual flexibility. Rather than explaining these fundamental representation and processing mechanisms, semantic tiles simply provide a trace of their activity over a relatively short time period within a specific learning context. Establishing the mechanisms that implement conceptual processing in the brain will require more than mapping it to cortical (and sub-cortical) activity, with process models from cognitive science likely to play central roles in specifying the intervening mechanisms. More generally, neuroscience will not achieve its basic goals until it establishes algorithmic-level mechanisms that contribute essential explanations to how the brain works, going beyond simply establishing the brain areas that respond to various task conditions.
Passive sentences and structural parsing
Traditional language parsing is mainly based on generative grammar for English. Because English and Chinese belong to different language families, a grammar alone is not sufficient for Chinese parsing, although it remains important. Passive sentences in English and Chinese share some similarities but also differ. In this paper, we first introduce sememe analysis into Chinese parsing. Second, we compare passive sentences in English and Chinese with respect to sentence pattern, semantic relations, and other aspects from the viewpoint of knowledge graph theory. We find that using sememe analysis in Chinese parsing makes Chinese passive sentences easy to handle.
A graph theoretical analysis of certain aspects of Bahasa Indonesia
In this paper, the theory of knowledge graphs is applied to some characteristic features of the Indonesian language. The features considered are the active and passive forms of verbs and the derived noun.
Nomad, a naval message understanding system
We are building systems to automatically analyze Navy messages. Such messages typically are terse and use many abbreviations and Navy jargon. As a result, they are more difficult to understand than everyday English. The NOMAD system interacts with a message sender to ensure that only unambiguous and reasonably correct messages are generated. The VOX system will allow a human tutor to interactively extend the knowledge base of NOMAD.
A Probabilistic Logic Programming Event Calculus
We present a system for recognising human activity given a symbolic
representation of video content. The input of our system is a set of
time-stamped short-term activities (STA) detected on video frames. The output
is a set of recognised long-term activities (LTA), which are pre-defined
temporal combinations of STA. The constraints on the STA that, if satisfied,
lead to the recognition of an LTA, have been expressed using a dialect of the
Event Calculus. In order to handle the uncertainty that naturally occurs in
human activity recognition, we adapted this dialect to a state-of-the-art
probabilistic logic programming framework. We present a detailed evaluation and
comparison of the crisp and probabilistic approaches through experimentation on
a benchmark dataset of human surveillance videos.
Comment: Accepted for publication in the Theory and Practice of Logic
Programming (TPLP) journal
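
The crisp case described in this abstract can be illustrated with a small sketch. It is not the authors' system or their Event Calculus dialect: it only shows the general idea of inertia, where a long-term activity starts to hold after an initiating short-term activity and stops after a terminating one. The function name `holds_intervals`, the activity names, and the sample stream are all hypothetical, and the probabilistic extension is not modelled.

```python
def holds_intervals(events, initiators, terminators):
    """Return the intervals (start, end) during which an activity holds.

    events: iterable of (timestamp, name) pairs for detected short-term
    activities. The activity starts to hold at an initiating event and
    stops at the next terminating event (a simplified form of inertia).
    An end of None means the activity still holds when the stream ends.
    """
    intervals = []
    start = None
    for t, name in sorted(events):
        if start is None and name in initiators:
            start = t
        elif start is not None and name in terminators:
            intervals.append((start, t))
            start = None
    if start is not None:
        intervals.append((start, None))
    return intervals


# Hypothetical short-term activity stream, for illustration only.
stas = [(1, "walking"), (3, "active"), (5, "inactive"),
        (8, "active"), (9, "walking")]

# Hypothetical long-term activity: initiated by "active",
# terminated by "inactive" or "walking".
print(holds_intervals(stas, {"active"}, {"inactive", "walking"}))
```

In a probabilistic setting each detection would carry a confidence value and the result would be a probability per time point rather than a hard interval; that step depends on the chosen inference framework and is left out here.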