143 research outputs found

Comparing the Performance of Random Forest, SVM and Their Variants for ECG Quality Assessment Combined with Nonlinear Features

Author: Liu Chengyu
Wei Shoushui
Zhang Li
Zhang Yatao
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 27/04/2018

To evaluate the performance of nonlinear features and of iterative and non-iterative classification algorithms (i.e., kernel support vector machine (KSVM), random forest (RaF), least squares SVM (LS-SVM) and multi-surface proximal SVM based oblique RaF (ORaF)) for ECG quality assessment, we compared the four algorithms on 7 feature schemes built from 27 linear and nonlinear features, including four features derived from a new encoding Lempel–Ziv complexity (ELZC) and 23 other features. The seven feature schemes are: the first, consisting of 7 waveform features; the second, consisting of 15 waveform and frequency features; the third, consisting of 19 waveform, frequency and approximate entropy (ApEn) features; the fourth, consisting of 19 waveform, frequency and permutation entropy (PE) features; the fifth, consisting of 19 waveform, frequency and ELZC features; the sixth, consisting of 23 waveform, frequency, PE and ELZC features; and the last, consisting of all 27 features. Up to 1500 mobile ECG recordings from the Physionet/Computing in Cardiology Challenge 2011 were used in this study. Three indices, i.e., sensitivity (Se), specificity (Sp) and accuracy (Acc), were used to evaluate the performance of the classifiers on each of the seven feature schemes. The experimental results indicated that PE and ELZC can help improve the performance of the four classifiers for assessing ECG quality. Using all features except the ApEn features gave the best performance for each classifier. For this sixth scheme, the LS-SVM yielded the highest Acc of 92.20% on hidden test data, as well as a relatively high Acc of 93.60% on training data. Compared with the other classifiers, the LS-SVM classifier also demonstrated superior generalization ability.

Northumbria Research Link

Crossref
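
The three indices named in the ECG quality assessment abstract above (Se, Sp, Acc) follow from the counts of a binary confusion matrix. The sketch below is only an illustration: the function name, the toy labels, and the choice of which class counts as "positive" are assumptions, since the abstract does not state them.

```python
def quality_metrics(y_true, y_pred, positive=1):
    """Return (Se, Sp, Acc) for binary labels.

    Assumption: `positive` marks the class of interest; the abstract
    does not say whether that is acceptable or unacceptable recordings.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)

    se = tp / (tp + fn) if (tp + fn) else float("nan")   # sensitivity
    sp = tn / (tn + fp) if (tn + fp) else float("nan")   # specificity
    acc = (tp + tn) / len(pairs) if pairs else float("nan")
    return se, sp, acc


# Toy labels, invented for illustration only (not data from the study).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 0, 1, 0, 1]
se, sp, acc = quality_metrics(y_true, y_pred)
```

Reporting Se and Sp alongside Acc matters when the two classes are imbalanced, because Acc alone can look high while one class is mostly misclassified.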

Context-aware multi-head self-attentional neural network model for next location prediction

Author: Hong Ye
Raubal Martin
Schindler Konrad
Zhang Yatao
Publication date: 21/08/2023

Accurate activity location prediction is a crucial component of many mobility applications and is particularly required to develop personalized, sustainable transportation systems. Despite the widespread adoption of deep learning models, next location prediction models lack a comprehensive discussion and integration of mobility-related spatio-temporal contexts. Here, we utilize a multi-head self-attentional (MHSA) neural network that learns location transition patterns from historical location visits, their visit time and activity duration, as well as their surrounding land use functions, to infer an individual's next location. Specifically, we adopt point-of-interest data and latent Dirichlet allocation for representing locations' land use contexts at multiple spatial scales, generate embedding vectors of the spatio-temporal features, and learn to predict the next location with an MHSA network. Through experiments on two large-scale GNSS tracking datasets, we demonstrate that the proposed model outperforms other state-of-the-art prediction models, and reveal the contribution of various spatio-temporal contexts to the model's performance. Moreover, we find that the model trained on population data achieves higher prediction performance with fewer parameters than individual-level models due to learning from collective movement patterns. We also reveal that mobility conducted in the recent past and one week before has the largest influence on the current prediction, showing that learning from a subset of the historical mobility is sufficient to obtain an accurate location prediction result. We believe that the proposed model is vital for context-aware mobility prediction. The gained insights will help to understand location prediction models and promote their implementation for mobility applications. Comment: updated Discussion section; accepted by Transportation Research Part

arXiv.org e-Print Archive
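
The multi-head self-attention mechanism named in the next location prediction abstract above can be sketched in a few lines. This is a simplified illustration, not the authors' model: it omits the learned query/key/value and output projections (each head just attends over its own slice of the embedding), uses no causal mask, and the embedding values are invented.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attention(q, k, v):
    """Scaled dot-product attention over lists of equal-length vectors."""
    d = len(q[0])
    outputs, weights = [], []
    for qi in q:
        scores = [sum(a * b for a, b in zip(qi, kj)) / math.sqrt(d) for kj in k]
        w = softmax(scores)
        weights.append(w)
        outputs.append(
            [sum(wj * vj[c] for wj, vj in zip(w, v)) for c in range(len(v[0]))]
        )
    return outputs, weights


def multi_head_self_attention(x, num_heads):
    """Split each embedding into `num_heads` slices, attend per slice, concatenate.

    Assumption: identity projections stand in for the learned weight matrices.
    """
    d_model = len(x[0])
    if d_model % num_heads != 0:
        raise ValueError("embedding size must be divisible by num_heads")
    d_head = d_model // num_heads
    head_outputs = []
    for h in range(num_heads):
        part = [row[h * d_head:(h + 1) * d_head] for row in x]
        out, _ = attention(part, part, part)
        head_outputs.append(out)
    # Concatenate the per-head outputs for every position in the sequence.
    return [
        [value for head in head_outputs for value in head[t]]
        for t in range(len(x))
    ]


# Three hypothetical visit embeddings of size 4 (values are made up).
visits = [[1.0, 0.0, 0.5, 0.2], [0.0, 1.0, 0.1, 0.9], [1.0, 1.0, 0.3, 0.3]]
contextual = multi_head_self_attention(visits, num_heads=2)
```

Each output row is a weighted mix of all visits in the sequence, which is how an attention layer lets one historical visit inform the representation of another.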

Bifurcation characteristics of torsional-horizontal coupled vibration of rolling mill system

Author: Meng Zong
Shuang Liu
Yatao Shi
Yunpeng Zhang
Publication venue: 'JVE International Ltd.'
Publication date: 15/05/2017

In this paper, the static and dynamic bifurcations of a rolling mill system with torsional-horizontal coupled vibration are studied. First, the dynamic equation of the torsional-horizontal coupled vibration of the rolling mill main drive system is established. The rolling mill is driven by a gear pair, so the torsional vibration of the gear pair, the horizontal vibration and the friction factor are considered in the dynamic equation. Then the equivalent low-dimensional bifurcation equation, which can reveal the nonlinear characteristics of the system, is obtained using the Lyapunov-Schmidt method, and the static bifurcation characteristics of the system are studied using singularity theory. Lastly, the bifurcation condition and stability of the system with dynamic Hopf bifurcation are studied using the Hopf bifurcation theorem. Numerical simulation with actual parameter values confirms the analytical results.

Journal of Vibroengineering

Directory of Open Access Journals

Activity Cliff Prediction: Dataset and Benchmark

Author: Bian Yatao
Xie Ailin
Zhang Ziqiao
Zhao Bangyi
Zhou Shuigeng
Publication date: 15/02/2023

Activity cliffs (ACs), which are generally defined as pairs of structurally similar molecules that are active against the same bio-target but differ significantly in binding potency, are of great importance to drug discovery. To date, the AC prediction problem, i.e., predicting whether a pair of molecules exhibits the AC relationship, has not been fully explored. In this paper, we first introduce ACNet, a large-scale dataset for AC prediction. ACNet curates over 400K Matched Molecular Pairs (MMPs) against 190 targets, including over 20K MMP-cliffs and 380K non-AC MMPs, and provides five subsets for model development and evaluation. Then, we propose a baseline framework to benchmark the predictive performance of molecular representations encoded by deep neural networks for AC prediction, and 16 models are evaluated in experiments. Our experimental results show that deep learning models can achieve good performance when trained on tasks with an adequate amount of data, while the imbalanced, low-data and out-of-distribution features of the ACNet dataset still make it challenging for deep neural networks to cope with. In addition, the traditional ECFP method shows a natural advantage on MMP-cliff prediction, and outperforms the deep learning models on most of the data subsets. To the best of our knowledge, our work constructs the first large-scale dataset for AC prediction, which may stimulate the study of AC prediction models and prompt further breakthroughs in AI-aided drug discovery. The code and dataset can be accessed at https://drugai.github.io/ACNet/

arXiv.org e-Print Archive

Hierarchical Few-Shot Object Detection: Problem, Benchmark and Method

Author: Bian Yatao
Guan Jihong
Wang Yang
Zhang Chenbo
Zhang Lu
Zhang Yinglu
Zhou Jiaogen
Zhou Shuigeng
Publication date: 08/10/2022

Few-shot object detection (FSOD) aims to detect objects from only a few examples. However, existing FSOD methods do not consider the hierarchical fine-grained category structures of objects that exist widely in real life. For example, animals are taxonomically classified into orders, families, genera, species, etc. In this paper, we propose and solve a new problem called hierarchical few-shot object detection (Hi-FSOD), which aims to detect objects with hierarchical categories in the FSOD paradigm. To this end, on the one hand, we build the first large-scale and high-quality Hi-FSOD benchmark dataset, HiFSOD-Bird, which contains 176,350 wild-bird images falling into 1,432 categories. All the categories are organized into a 4-level taxonomy, consisting of 32 orders, 132 families, 572 genera and 1,432 species. On the other hand, we propose the first Hi-FSOD method, HiCLPL, in which a hierarchical contrastive learning approach is developed to constrain the feature space so that the feature distribution of objects is consistent with the hierarchical taxonomy and the model's generalization power is strengthened. Meanwhile, a probabilistic loss is designed to enable the child nodes to correct the classification errors of their parent nodes in the taxonomy. Extensive experiments on the benchmark dataset HiFSOD-Bird show that our method HiCLPL outperforms the existing FSOD methods. Comment: Accepted by ACM MM 202

arXiv.org e-Print Archive

Fairness-guided Few-shot Prompting for Large Language Models

Author: Bian Yatao
Fu Huazhu
Hu Qinghua
Liu Lemao
Ma Huan
Wu Bingzhe
Zhang Changqing
Zhang Shu
Zhang Zhirui
Zhao Peilin
Publication date: 25/03/2023

Large language models have demonstrated a surprising ability to perform in-context learning, i.e., these models can be directly applied to solve numerous downstream tasks by conditioning on a prompt constructed from a few input-output examples. However, prior research has shown that in-context learning can suffer from high instability due to variations in training examples, example order, and prompt formats. Therefore, the construction of an appropriate prompt is essential for improving the performance of in-context learning. In this paper, we revisit this problem from the view of predictive bias. Specifically, we introduce a metric to evaluate the predictive bias of a fixed prompt against labels or given attributes. Then we empirically show that prompts with higher bias always lead to unsatisfactory predictive quality. Based on this observation, we propose a novel search strategy based on greedy search to identify a near-optimal prompt for improving the performance of in-context learning. We perform comprehensive experiments with state-of-the-art mainstream models such as GPT-3 on various downstream tasks. Our results indicate that our method can enhance the model's in-context learning performance in an effective and interpretable manner.

arXiv.org e-Print Archive
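
The few-shot prompting abstract above describes a bias metric for a fixed prompt and a greedy search over demonstrations, without giving details. The sketch below is one possible reading under explicit assumptions: bias is measured as the entropy gap from a uniform label distribution, and `toy_label_probs` is a hypothetical stand-in for querying a language model (it merely counts demonstration labels). None of these names or choices are confirmed by the abstract.

```python
import math


def entropy(probs):
    """Shannon entropy (natural log) of a probability list."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def toy_label_probs(demos, labels, smoothing=1.0):
    """Hypothetical stand-in for a model's label distribution given a prompt.

    Assumption: a real system would query an LLM here; this toy version
    just returns smoothed label frequencies of the chosen demonstrations.
    """
    counts = {lab: smoothing for lab in labels}
    for _, lab in demos:
        counts[lab] += 1
    total = sum(counts.values())
    return [counts[lab] / total for lab in labels]


def bias(demos, labels, probs_fn=toy_label_probs):
    """Assumed bias score: maximal entropy minus observed entropy (0 = unbiased)."""
    return math.log(len(labels)) - entropy(probs_fn(demos, labels))


def greedy_prompt_search(candidates, labels, max_demos=4, probs_fn=toy_label_probs):
    """Add one demonstration at a time, always taking the lowest-bias choice."""
    selected, remaining = [], list(candidates)
    current = bias(selected, labels, probs_fn)
    while remaining and len(selected) < max_demos:
        best = min(remaining, key=lambda d: bias(selected + [d], labels, probs_fn))
        best_bias = bias(selected + [best], labels, probs_fn)
        # Always take a first demonstration; afterwards stop when bias stops dropping.
        if selected and best_bias >= current:
            break
        selected.append(best)
        remaining.remove(best)
        current = best_bias
    return selected, current


# Invented candidate demonstrations for a two-label sentiment task.
candidates = [
    ("great film", "pos"),
    ("loved it", "pos"),
    ("dull plot", "neg"),
    ("fine acting", "pos"),
]
chosen, final_bias = greedy_prompt_search(candidates, labels=["pos", "neg"])
```

Greedy selection evaluates only the remaining candidates at each step instead of every subset and ordering, which is why it can only promise a near-optimal prompt.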

Learning Invariant Molecular Representation in Latent Discrete Space

Author: Bian Yatao
Chen Hongyang
Chen Huajun
Ding Keyan
Lv Jingsong
Wang Xiao
Zhang Qiang
Zhuang Xiang
Publication date: 22/10/2023

Molecular representation learning lays the foundation for drug discovery. However, existing methods suffer from poor out-of-distribution (OOD) generalization, particularly when data for training and testing originate from different environments. To address this issue, we propose a new framework for learning molecular representations that exhibit invariance and robustness against distribution shifts. Specifically, we propose a strategy called "first-encoding-then-separation" to identify invariant molecule features in the latent space, which deviates from conventional practices. Prior to the separation step, we introduce a residual vector quantization module that mitigates over-fitting to training data distributions while preserving the expressivity of encoders. Furthermore, we design a task-agnostic self-supervised learning objective to encourage precise invariance identification, which makes our method widely applicable to a variety of tasks, such as regression and multi-label classification. Extensive experiments on 18 real-world molecular datasets demonstrate that our model achieves stronger generalization than state-of-the-art baselines in the presence of various distribution shifts. Our code is available at https://github.com/HICAI-ZJU/iMoLD

arXiv.org e-Print Archive