25 research outputs found

    Spatial-Temporal Fusion Graph Neural Networks for Traffic Flow Forecasting

    Spatial-temporal data forecasting of traffic flow is a challenging task because of complicated spatial dependencies and dynamic trends of temporal patterns between different roads. Existing frameworks typically utilize a given spatial adjacency graph and sophisticated mechanisms for modeling spatial and temporal correlations. However, limited representations of the given spatial graph structure, with incomplete adjacent connections, may restrict the effective spatial-temporal dependency learning of those models. To overcome those limitations, our paper proposes Spatial-Temporal Fusion Graph Neural Networks (STFGNN) for traffic flow forecasting. STFGNN can effectively learn hidden spatial-temporal dependencies through a novel fusion operation over various spatial and temporal graphs, which are generated by a data-driven method. Meanwhile, by integrating this fusion graph module and a novel gated convolution module into a unified layer, STFGNN can handle long sequences. Experimental results on several public traffic datasets demonstrate that our method consistently achieves state-of-the-art performance compared with other baselines.
    Comment: 8 pages, 3 figures, to be published in AAAI 2021. arXiv admin note: text overlap with arXiv:1903.00919 by other authors

    Who Is the Rightful Owner? Young Children's Ownership Judgments in Different Transfer Contexts

    This study aimed to examine whether Chinese preschoolers understand that ownership can be transferred in different contexts. The study participants were 3- to 5-year-old Chinese children (n = 96) and adults (n = 34). With four scenarios that contained different transfer types (giving, stealing, losing, and abandoning), participants were asked four questions about ownership. The results indicated that preschoolers' ability to distinguish legitimate ownership transfers from illegitimate ownership transfers improved with age. Three-year-olds understood that ownership cannot be transferred in a stealing context, but the appropriate understanding of ownership was not attained until 4 years old in a giving context and 5 years old in losing and abandoning contexts, which is similar to the adults' performance. In addition to the first possessor bias (a tendency to judge the first possessor as the owner) found in previous studies, 3-year-olds also displayed a loan bias (a tendency to believe everything that is transferred should be returned) in the study. The findings suggest that the developmental trajectories of preschoolers' understanding of ownership transfers varied across different contexts, which may relate to children's ability to consider the role of intent in determining ownership and parents' disciplinary behavior. Both cross-cultural similarities and differences are discussed.

    Positive-Negative Momentum: Manipulating Stochastic Gradient Noise to Improve Generalization

    It is well known that stochastic gradient noise (SGN) acts as implicit regularization for deep learning and is essential for both the optimization and the generalization of deep networks. Some works attempted to artificially simulate SGN by injecting random noise to improve deep learning. However, it turned out that injected simple random noise cannot work as well as SGN, which is anisotropic and parameter-dependent. To simulate SGN at low computational cost and without changing the learning rate or batch size, we propose the Positive-Negative Momentum (PNM) approach, a powerful alternative to conventional Momentum in classic optimizers. The PNM method maintains two approximately independent momentum terms, so that the magnitude of SGN can be controlled explicitly by adjusting the momentum difference. We theoretically prove the convergence guarantee and the generalization advantage of PNM over Stochastic Gradient Descent (SGD). By incorporating PNM into two conventional optimizers, SGD with Momentum and Adam, our extensive experiments empirically verify the significant advantage of the PNM-based variants over the corresponding conventional Momentum-based optimizers.
    Comment: ICML 2021; 20 pages; 13 figures; Key Words: deep learning theory, optimizer, momentum, generalization, gradient noise
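    As a rough illustration of the mechanism described in the abstract, the sketch below keeps two momentum buffers that are updated on alternating steps and combines them with a positive and a negative coefficient; the gap between the two buffers is what injects extra, controllable gradient noise. The function name, hyperparameter names, and update details are illustrative assumptions, not the authors' exact implementation.

    ```python
    import numpy as np

    def pnm_sgd(grad_fn, w, lr=0.1, beta=0.9, beta0=1.0, steps=100):
        """Toy Positive-Negative Momentum sketch (illustrative, not the
        paper's exact algorithm): two momentum buffers are updated on
        alternating steps, and the descent direction combines them with
        coefficients (1 + beta0) and -beta0. The buffer difference acts
        as parameter-dependent noise whose size is set by beta0."""
        m = [np.zeros_like(w), np.zeros_like(w)]
        for t in range(steps):
            cur, other = t % 2, (t + 1) % 2
            # only the current buffer sees this step's gradient
            m[cur] = beta * m[cur] + (1 - beta) * grad_fn(w)
            d = (1 + beta0) * m[cur] - beta0 * m[other]
            w = w - lr * d
        return w

    # minimize f(w) = 0.5 * ||w||^2, whose gradient is w itself
    w_final = pnm_sgd(lambda w: w, np.array([5.0, -3.0]))
    ```

    On this noiseless quadratic both buffers converge to the same value, the negative term cancels, and the iterate contracts toward the minimum; with minibatch gradients the two buffers differ and the difference supplies the extra noise.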

    Amata: An Annealing Mechanism for Adversarial Training Acceleration

    Despite their empirical success in various domains, deep neural networks have been revealed to be vulnerable to maliciously perturbed input data that severely degrades their performance; such perturbations are known as adversarial attacks. To counter adversarial attacks, adversarial training, formulated as a form of robust optimization, has been demonstrated to be effective. However, adversarial training incurs substantial computational overhead compared with standard training. To reduce this cost, we propose an annealing mechanism, Amata, that lowers the overhead associated with adversarial training. The proposed Amata is provably convergent, well motivated from the lens of optimal control theory, and can be combined with existing acceleration methods to further enhance performance. It is demonstrated that on standard datasets, Amata can achieve similar or better robustness with around 1/3 to 1/2 of the computational time of traditional methods. In addition, Amata can be incorporated into other adversarial training acceleration algorithms (e.g., YOPO, Free, Fast, and ATTA), which leads to further reductions in computational time on large-scale problems.
    Comment: accepted by AAAI
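    The annealing idea can be sketched as a schedule that starts adversarial training with cheap, coarse inner maximization and gradually strengthens it over the course of training. The linear schedule, the function name, and the 1.25 budget factor below are illustrative assumptions, not the paper's exact annealing rule.

    ```python
    def amata_schedule(epoch, total_epochs, k_min=1, k_max=10, eps=8 / 255):
        """Illustrative annealing schedule for adversarial training:
        early epochs use few PGD inner steps (cheap, coarse inner
        maximization), late epochs use many (accurate, expensive).
        Returns (inner step count k, per-step size alpha)."""
        frac = epoch / max(total_epochs - 1, 1)
        k = round(k_min + frac * (k_max - k_min))
        # shrink the per-step size as k grows, keeping the total inner
        # perturbation budget (roughly 1.25 * eps) about constant
        alpha = 1.25 * eps / k
        return k, alpha

    k0, a0 = amata_schedule(0, 10)   # first epoch: 1 big inner step
    k9, a9 = amata_schedule(9, 10)   # last epoch: 10 small inner steps
    ```

    The total inner-loop work across training is then roughly the average of `k_min` and `k_max` per batch, rather than `k_max` throughout, which is where the 1/3-to-1/2 time saving would come from under this kind of schedule.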

    Towards Making Deep Transfer Learning Never Hurt

    Transfer learning has frequently been used to improve deep neural network training by incorporating the weights of a pre-trained network as the starting point of optimization for regularization. While deep transfer learning can usually boost performance with better accuracy and faster convergence, transferring weights from inappropriate networks hurts the training procedure and may even lead to lower accuracy. In this paper, we view deep transfer learning as minimizing a linear combination of the empirical loss and a regularizer based on pre-trained weights, where the regularizer may keep the training procedure from lowering the empirical loss when the two terms have conflicting descent directions (i.e., derivatives). Following this view, we propose a novel strategy for making regularization-based Deep Transfer learning Never Hurt (DTNH): at each training iteration, it computes the derivatives of the two terms separately, then re-estimates a new descent direction that does not hurt empirical loss minimization while preserving the regularization effects of the pre-trained weights. Extensive experiments have been conducted using common transfer learning regularizers, such as L2-SP and knowledge distillation, on top of a wide range of deep transfer learning benchmarks, including Caltech, MIT Indoor 67, CIFAR-10, and ImageNet. The empirical results show that the proposed descent direction estimation strategy DTNH always improves the performance of deep transfer learning tasks based on all the above regularizers, even when transferring pre-trained weights from inappropriate networks. In all cases, DTNH improves on state-of-the-art regularizers, with 0.1%--7% higher accuracy across the experiments.
    Comment: 10 pages
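    The descent-direction re-estimation can be sketched as a gradient-projection step: when the regularizer's gradient conflicts with the empirical gradient (negative inner product), its conflicting component is removed so that the combined step never increases the empirical loss to first order. This is a sketch of the general idea under that assumption; the paper's exact re-estimation rule may differ.

    ```python
    import numpy as np

    def dtnh_direction(g_emp, g_reg):
        """Illustrative DTNH-style direction re-estimation: if the
        regularizer gradient g_reg conflicts with the empirical-loss
        gradient g_emp (their inner product is negative), subtract the
        conflicting component of g_reg before summing, so the combined
        direction has a non-negative inner product with g_emp."""
        dot = np.dot(g_emp, g_reg)
        if dot < 0:
            # project out the part of g_reg that opposes g_emp
            g_reg = g_reg - (dot / np.dot(g_emp, g_emp)) * g_emp
        return g_emp + g_reg

    # conflicting case: raw sum [0, 1] would stall the empirical loss,
    # the re-estimated direction keeps a descent component along g_emp
    d = dtnh_direction(np.array([1.0, 0.0]), np.array([-1.0, 1.0]))
    ```

    When the two gradients already agree, the function reduces to the ordinary sum, so the regularization effect is preserved whenever it does not fight the empirical objective.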

    Spatial-temporal fusion graph neural networks for traffic flow forecasting

    Spatial-temporal data forecasting of traffic flow is a challenging task because of complicated spatial dependencies and dynamic trends of temporal patterns between different roads. Existing frameworks typically utilize a given spatial adjacency graph and sophisticated mechanisms for modeling spatial and temporal correlations. However, limited representations of the given spatial graph structure, with incomplete adjacent connections, may restrict the effective spatial-temporal dependency learning of those models. Furthermore, existing methods fall short when handling complicated spatial-temporal data: they usually utilize separate modules for spatial and temporal correlations, or they only use independent components to capture localized or global heterogeneous dependencies. To overcome those limitations, our paper proposes a novel Spatial-Temporal Fusion Graph Neural Networks (STFGNN) model for traffic flow forecasting. First, a data-driven method of generating a "temporal graph" is proposed to compensate for correlations that the spatial graph may not reflect. STFGNN can effectively learn hidden spatial-temporal dependencies through a novel fusion operation over various spatial and temporal graphs, treated for different time periods in parallel. Meanwhile, by integrating this fusion graph module and a novel gated convolution module into a unified layer, STFGNN can handle long sequences by learning more spatial-temporal dependencies as layers are stacked. Experimental results on several public traffic datasets demonstrate that our method consistently achieves state-of-the-art performance compared with other baselines.
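    A minimal sketch of the two graph-construction ideas above: a data-driven temporal graph built from series similarity (plain correlation here, as a cheap stand-in for whatever similarity a real system might use), and a block-matrix fusion of the spatial and temporal graphs so that a single graph operation can mix both kinds of dependency. Function names, the top-k rule, and the fusion layout are illustrative assumptions, not the paper's exact construction.

    ```python
    import numpy as np

    def temporal_graph(X, top_k=3):
        """Data-driven temporal graph: connect each node to the top_k
        nodes whose historical series are most similar to its own.
        X has shape (T, N): T time steps, N road-network nodes."""
        C = np.corrcoef(X.T)           # (N, N) pairwise series similarity
        np.fill_diagonal(C, -np.inf)   # exclude self-similarity
        A_t = np.zeros_like(C)
        for i in range(C.shape[0]):
            A_t[i, np.argsort(C[i])[-top_k:]] = 1.0
        return A_t

    def fuse_graphs(A_s, A_t):
        """Fuse the spatial graph A_s and temporal graph A_t into one
        block adjacency over two copies of the node set; identity
        blocks tie each node to its own copy, so one graph operation
        propagates along both kinds of edges."""
        n = A_s.shape[0]
        eye = np.eye(n)
        return np.block([[A_s, eye], [eye, A_t]])

    rng = np.random.default_rng(0)
    X = rng.standard_normal((50, 5))   # 50 time steps, 5 toy nodes
    A_t = temporal_graph(X, top_k=3)
    F = fuse_graphs(np.eye(5), A_t)    # (10, 10) fused adjacency
    ```

    The fused matrix can then be fed to any standard graph-convolution layer; stacking such layers is what lets spatial and temporal dependencies mix over longer horizons.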

    The Role of Social Value Orientation in Chinese Adolescents' Moral Emotion Attribution

    Previous studies have explored the role of cognitive factors and sympathy in children's development of moral emotion attribution, but the effect of personal dispositional factors on adolescents' moral emotion expectancy has been neglected. In this study, we address this issue by testing adolescents' moral emotion attribution with different social value orientations (SVO). Eight hundred and eighty Chinese adolescents were classified into proselfs, prosocials and mixed types in SVO and asked to indicate their moral emotions in four moral contexts (prosocial, antisocial, failing to act prosocially (FAP) and resisting antisocial impulse (RAI)). The findings revealed an obvious contextual effect in adolescents' moral emotion attribution and the effect depends on SVO. Prosocials evaluated more positively than proselfs and mixed types in the prosocial and RAI contexts, but proselfs evaluated more positively than prosocials and mixed types in the antisocial and FAP contexts. The findings indicate that individual differences in adolescents' moral emotion attribution have roots in their social value orientation, and suggest the role of dispositional factors in the processing of moral emotion.