149 research outputs found
Symmetry of Planar Four-Body Convex Central Configurations
We study the relationship between the masses and the geometric properties of central configurations. We prove that in the planar four-body problem, a convex central configuration is symmetric with respect to one diagonal if and only if the masses of the two particles on the other diagonal are equal. If these two masses are unequal, then the less massive one is closer to the former diagonal. Finally, we extend these results to the case of non-planar central configurations of five particles.
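For context, a configuration of point masses is central when the gravitational acceleration on each body points toward the center of mass with a common scaling factor. The following is the standard defining condition (not quoted from the paper, but the usual definition it builds on):

```latex
% Central configuration condition for masses m_i at positions x_i
% (G = 1), with c the center of mass and lambda > 0 a multiplier
% common to all bodies:
\sum_{j \neq i} \frac{m_j \,(x_j - x_i)}{\lVert x_j - x_i \rVert^{3}}
  = -\lambda \,(x_i - c),
\qquad
c = \frac{\sum_k m_k x_k}{\sum_k m_k}, \quad i = 1, \dots, n.
```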
VS-TransGRU: A Novel Transformer-GRU-based Framework Enhanced by Visual-Semantic Fusion for Egocentric Action Anticipation
Egocentric action anticipation is a challenging task that aims to predict
future actions in advance from current and historical observations in the
first-person view. Most existing methods focus on improving the model
architecture and loss function based on visual input and recurrent neural
networks to boost anticipation performance. However, these methods, which
merely consider visual information and rely on a single network architecture,
gradually reach a performance plateau. To fully understand what has been
observed and to adequately capture the dependencies between current
observations and future actions, we propose a novel visual-semantic fusion
enhanced and Transformer-GRU-based action anticipation framework. Firstly,
high-level semantic information is introduced for the first time to improve
action anticipation performance. We propose to use the semantic features
generated based on the class labels or directly from the visual observations to
augment the original visual features. Secondly, an effective visual-semantic
fusion module is proposed to bridge the semantic gap and fully exploit the
complementarity of different modalities. Thirdly, to take advantage of both
parallel and autoregressive models, we design a Transformer-based encoder for
long-term sequential modeling and a GRU-based decoder for flexible iterative
decoding. Extensive experiments on two large-scale first-person view datasets,
i.e., EPIC-Kitchens and EGTEA Gaze+, validate the effectiveness of our proposed
method, which achieves new state-of-the-art performance, outperforming previous
approaches by a large margin.
Comment: 12 pages, 7 figures
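The encoder/decoder split described above lends itself to a compact sketch. Below is a minimal, hypothetical PyTorch rendering of a Transformer encoder over fused visual-semantic features with a GRU decoder for autoregressive rollout; the concatenation-based fusion, module names, and dimensions are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class TransGRUAnticipator(nn.Module):
    """Minimal sketch of a Transformer-encoder / GRU-decoder anticipation
    model. Fusion by concatenation and all dimensions are assumptions."""

    def __init__(self, feat_dim=1024, sem_dim=300, hidden=512,
                 n_heads=8, n_layers=4, n_classes=1000):
        super().__init__()
        # Visual-semantic fusion: concatenate and project (the paper
        # proposes a dedicated fusion module; this is a stand-in).
        self.fuse = nn.Linear(feat_dim + sem_dim, hidden)
        enc_layer = nn.TransformerEncoderLayer(
            d_model=hidden, nhead=n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc_layer, num_layers=n_layers)
        self.decoder = nn.GRU(hidden, hidden, batch_first=True)
        self.classifier = nn.Linear(hidden, n_classes)

    def forward(self, vis_feats, sem_feats, n_future=4):
        # vis_feats: (B, T, feat_dim); sem_feats: (B, T, sem_dim)
        x = self.fuse(torch.cat([vis_feats, sem_feats], dim=-1))
        memory = self.encoder(x)                # parallel long-term modeling
        h = memory[:, -1:, :].transpose(0, 1).contiguous()  # (1, B, hidden)
        step = memory[:, -1:, :]                # seed with last observation
        preds = []
        for _ in range(n_future):               # flexible iterative decoding
            step, h = self.decoder(step, h)
            preds.append(self.classifier(step))
        return torch.cat(preds, dim=1)          # (B, n_future, n_classes)
```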
MixCycle: Mixup Assisted Semi-Supervised 3D Single Object Tracking with Cycle Consistency
3D single object tracking (SOT) is an indispensable part of automated
driving. Existing approaches rely heavily on large, densely labeled datasets.
However, annotating point clouds is both costly and time-consuming. Inspired by
the great success of cycle tracking in unsupervised 2D SOT, we introduce the
first semi-supervised approach to 3D SOT. Specifically, we introduce two
cycle-consistency strategies for supervision: 1) self-tracking cycles, which
leverage labels to help the model converge better in the early stages of
training; 2) forward-backward cycles, which strengthen the tracker's robustness
to motion variations and the template noise caused by the template update
strategy. Furthermore, we propose a data augmentation strategy named SOTMixup
to improve the tracker's robustness to point cloud diversity. SOTMixup
generates training samples by sampling points in two point clouds with a mixing
rate and assigns a reasonable loss weight for training according to the mixing
rate. The resulting MixCycle approach generalizes to appearance matching-based
trackers. On the KITTI benchmark, based on the P2B tracker, MixCycle trained
with a fraction of the labels outperforms P2B trained with the full label set,
and still achieves a precision improvement when using even fewer labels. Our
code will be released at
\url{https://github.com/Mumuqiao/MixCycle}.
Comment: Accepted by ICCV 2023
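The SOTMixup idea of sampling from two point clouds with a mixing rate and weighting the loss by that rate can be sketched as follows. This is a minimal interpretation under stated assumptions (uniform mixing rate, sampling with replacement only when a cloud is too small), not the paper's exact procedure.

```python
import numpy as np

def sot_mixup(pc_a, pc_b, n_points=1024, rng=None):
    """Hypothetical SOTMixup-style augmentation: draw a mixing rate,
    sample points from two clouds accordingly, and return the rate so
    the training loss can be weighted by it (as in standard mixup)."""
    rng = rng or np.random.default_rng()
    lam = float(rng.uniform())                       # mixing rate in [0, 1]
    n_a = int(round(lam * n_points))
    idx_a = rng.choice(len(pc_a), size=n_a, replace=len(pc_a) < n_a)
    idx_b = rng.choice(len(pc_b), size=n_points - n_a,
                       replace=len(pc_b) < n_points - n_a)
    mixed = np.concatenate([pc_a[idx_a], pc_b[idx_b]], axis=0)
    return mixed, lam

# Usage: weight the loss by the mixing rate, e.g.
#   loss = lam * loss_on_target_a + (1 - lam) * loss_on_target_b
```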
Driving Simulator Validity of Driving Behavior in Work Zones
Driving simulation is an efficient, safe, and data-collection-friendly method for examining driving behavior in a controlled environment. However, the validity of a driving simulator varies with the simulator type and the driving scenario. The purpose of this research is to verify driving simulator validity for driving behavior research in work zones. A field experiment and a corresponding simulation experiment were conducted to collect behavioral data. Indicators such as speed, car-following distance, and reaction delay time were chosen to examine the absolute and relative validity of the driving simulator. In particular, a survival analysis method was proposed in this research to examine the validity of reaction delay time. The results indicate the following: (1) Most indicators are valid for driving behavior research in the work zone; for example, spot speed, car-following distance, headway, and reaction delay time show absolute validity. (2) The standard deviation of the car-following distance shows relative validity. Consistent with previous research, some driving behaviors appear to be more aggressive in the simulation environment.
Document type: Article
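The survival-analysis treatment of reaction delay time can be illustrated with a minimal Kaplan-Meier estimator, where a driver who never reacts within the observation window counts as a censored observation. The code and data below are hypothetical, not drawn from the study.

```python
import numpy as np

def kaplan_meier(times, reacted):
    """Kaplan-Meier survival estimate over reaction delay times.
    `reacted` marks whether a reaction was observed within the window;
    unobserved reactions are treated as censored."""
    times = np.asarray(times, dtype=float)
    reacted = np.asarray(reacted, dtype=bool)
    s, curve = 1.0, []
    for t in np.unique(times[reacted]):
        at_risk = np.sum(times >= t)             # drivers not yet reacted
        events = np.sum((times == t) & reacted)  # reactions at time t
        s *= 1.0 - events / at_risk
        curve.append((t, s))
    return curve

# Hypothetical example: delays in seconds, 0 = censored (no reaction).
print(kaplan_meier([1.2, 1.5, 2.0, 2.4, 3.0], [1, 1, 1, 0, 1]))
```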
Boosting Multi-view Stereo with Late Cost Aggregation
Pairwise matching cost aggregation is a crucial step for modern
learning-based Multi-view Stereo (MVS). Prior works adopt an early aggregation
scheme, which sums pairwise costs into an intermediate cost volume. However,
our analysis shows that this process can degrade informative pairwise
matchings, thereby blocking the depth network from fully utilizing the
original geometric matching
cues. To address this challenge, we present a late aggregation approach that
allows for aggregating pairwise costs throughout the network feed-forward
process, achieving accurate estimations with only minor changes to the plain
CasMVSNet. Instead of building an intermediate cost by weighted sum, late
aggregation preserves all pairwise costs along a distinct view channel. This
enables the succeeding depth network to fully utilize the crucial geometric
cues without loss of cost fidelity. Grounded in the new aggregation scheme, we
propose further techniques addressing view order dependence inside the
preserved cost, handling flexible testing views, and improving the depth
filtering process. Despite its technical simplicity, our method improves
significantly upon the baseline cascade-based approach, achieving results
comparable to state-of-the-art methods with favorable computational overhead.
Comment: Code and models are available at https://github.com/Wuuu3511/LAMVSNE
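The contrast between early and late aggregation can be sketched in a few lines of PyTorch. Tensor shapes are assumptions for illustration; the point is that the early scheme collapses the view dimension by a weighted sum, while the late scheme keeps each pairwise cost along a distinct view channel.

```python
import torch

def early_aggregation(pairwise_costs, weights):
    # Prior scheme: weighted sum collapses the view dimension into one
    # intermediate cost volume. Assumed shapes (illustrative):
    # pairwise_costs (B, V, C, D, H, W), weights (B, V, 1, D, H, W).
    return (pairwise_costs * weights).sum(dim=1)       # (B, C, D, H, W)

def late_aggregation(pairwise_costs):
    # Late scheme, as the abstract describes it: keep every pairwise
    # cost along a distinct view channel so the depth network sees the
    # raw geometric matching cues and aggregates them itself.
    b, v, c, d, h, w = pairwise_costs.shape
    return pairwise_costs.reshape(b, v * c, d, h, w)   # views as channels
```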
Open-Vocabulary Video Anomaly Detection
Video anomaly detection (VAD) with weak supervision has achieved remarkable
performance in utilizing video-level labels to discriminate whether a video
frame is normal or abnormal. However, current approaches are inherently limited
to a closed-set setting and may struggle in open-world applications where there
can be anomaly categories in the test data unseen during training. A few recent
studies attempt to tackle a more realistic setting, open-set VAD, which aims to
detect unseen anomalies given seen anomalies and normal videos. However, such a
setting focuses on predicting frame anomaly scores, having no ability to
recognize the specific categories of anomalies, despite the fact that this
ability is essential for building more informed video surveillance systems.
This paper takes a step further and explores open-vocabulary video anomaly
detection (OVVAD), in which we aim to leverage pre-trained large models to
detect and categorize seen and unseen anomalies. To this end, we propose a
model that decouples OVVAD into two mutually complementary tasks --
class-agnostic detection and class-specific classification -- and jointly
optimizes both tasks. Particularly, we devise a semantic knowledge injection
module to introduce semantic knowledge from large language models for the
detection task, and design a novel anomaly synthesis module to generate pseudo
unseen anomaly videos with the help of large vision generation models for the
classification task. The injected semantic knowledge and synthesized anomalies
substantially extend our model's capability in detecting and categorizing a
variety of seen and unseen anomalies. Extensive experiments on three
widely-used benchmarks demonstrate our model achieves state-of-the-art
performance on the OVVAD task.
Comment: Submitted
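A minimal sketch of the described decoupling, under the assumption that frame features and category-name embeddings live in a shared space (as with CLIP-style encoders): a class-agnostic head scores anomaly likelihood, while classification is cosine similarity against text embeddings, so unseen categories only require new category names. Dimensions and the scoring rule are assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class OVVADHead(nn.Module):
    """Sketch of two decoupled tasks: a class-agnostic detector for
    frame-level anomaly scores, and open-vocabulary classification by
    cosine similarity to category-name text embeddings."""

    def __init__(self, feat_dim=512):
        super().__init__()
        self.detector = nn.Sequential(            # class-agnostic branch
            nn.Linear(feat_dim, 256), nn.ReLU(), nn.Linear(256, 1))

    def forward(self, frame_feats, text_embeds):
        # frame_feats: (T, feat_dim); text_embeds: (K, feat_dim), one
        # per anomaly category name -- unseen classes just add rows.
        anomaly_score = torch.sigmoid(self.detector(frame_feats))  # (T, 1)
        sims = F.normalize(frame_feats, dim=-1) @ \
               F.normalize(text_embeds, dim=-1).T                  # (T, K)
        return anomaly_score, sims
```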
S3C: Semi-Supervised VQA Natural Language Explanation via Self-Critical Learning
The VQA Natural Language Explanation (VQA-NLE) task aims to explain the
decision-making process of VQA models in natural language. Unlike traditional
attention or gradient analysis, free-text rationales can be easier to
understand and gain users' trust. Existing methods mostly use post-hoc or
self-rationalization models to obtain a plausible explanation. However, these
frameworks are bottlenecked by the following challenges: 1) the reasoning
process is not faithfully reflected in the generated explanation, which
suffers from logical inconsistency; and 2) human-annotated explanations are
expensive and time-consuming to collect. In this paper, we propose a new
Semi-Supervised
VQA-NLE method via Self-Critical Learning (S3C), which evaluates candidate
explanations with answering rewards to improve the logical consistency between
answers and rationales. With a semi-supervised learning framework, S3C can
benefit from a large number of samples without human-annotated explanations.
Extensive automatic metrics and human evaluations both demonstrate the
effectiveness of our method. Meanwhile, the framework achieves new
state-of-the-art performance on the two VQA-NLE datasets.
Comment: CVPR 2023
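The self-critical idea of scoring candidate explanations with an answering reward resembles REINFORCE with a greedy baseline; a minimal sketch under that assumption (not the exact S3C objective) follows.

```python
import torch

def self_critical_loss(logprobs, sampled_reward, greedy_reward):
    """REINFORCE-with-baseline sketch: `logprobs` (B,) are summed
    log-probabilities of a sampled explanation, the rewards (B,) score
    each explanation by how well it supports the correct answer, and
    the greedily decoded explanation serves as the baseline."""
    advantage = (sampled_reward - greedy_reward).detach()
    return -(advantage * logprobs).mean()

# Hypothetical usage: the reward could be the VQA model's probability
# of the ground-truth answer conditioned on the generated rationale.
```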