Comparing the Performance of Random Forest, SVM and Their Variants for ECG Quality Assessment Combined with Nonlinear Features
To evaluate the performance of nonlinear features and of iterative and non-iterative classification algorithms, i.e. kernel support vector machine (KSVM), random forest (RaF), least squares SVM (LS-SVM) and multi-surface proximal SVM based oblique RaF (ORaF), for ECG quality assessment, we compared the four algorithms on 7 feature schemes drawn from 27 linear and nonlinear features, comprising four features derived from a new encoding Lempel–Ziv complexity (ELZC) and 23 other features. The seven feature schemes are: the first, consisting of 7 waveform features; the second, consisting of 15 waveform and frequency features; the third, consisting of 19 waveform, frequency and approximate entropy (ApEn) features; the fourth, consisting of 19 waveform, frequency and permutation entropy (PE) features; the fifth, consisting of 19 waveform, frequency and ELZC features; the sixth, consisting of 23 waveform, frequency, PE and ELZC features; and the last, consisting of all 27 features. Up to 1500 mobile ECG recordings from the PhysioNet/Computing in Cardiology Challenge 2011 were employed in this study. Three indices, i.e. sensitivity (Se), specificity (Sp) and accuracy (Acc), were used to evaluate the performance of the classifiers on each of the seven feature schemes. The experimental results indicated that PE and ELZC can help improve the performance of the four classifiers in assessing ECG quality. Using all features except the ApEn features gave the best performance for each classifier. For this sixth scheme, the LS-SVM yielded the highest Acc of 92.20% on hidden test data, as well as a relatively high Acc of 93.60% on training data. Compared with the other classifiers, the LS-SVM classifier also demonstrated superior generalization ability.
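The three indices named in the abstract follow the usual confusion-matrix definitions. The sketch below is a minimal illustration only: the function name, the choice of which ECG quality class counts as "positive", and the counts are all assumptions, not values from the study.

```python
def quality_metrics(tp, fn, tn, fp):
    """Return (Se, Sp, Acc) from confusion-matrix counts.

    Assumption: one quality class is treated as "positive";
    the abstract does not say which one.
    """
    se = tp / (tp + fn) if (tp + fn) else 0.0   # sensitivity: TP / (TP + FN)
    sp = tn / (tn + fp) if (tn + fp) else 0.0   # specificity: TN / (TN + FP)
    total = tp + fn + tn + fp
    acc = (tp + tn) / total if total else 0.0   # accuracy: correct / all
    return se, sp, acc


# Illustrative counts only, not results from the paper.
se, sp, acc = quality_metrics(tp=90, fn=10, tn=80, fp=20)
print(f"Se={se:.2f} Sp={sp:.2f} Acc={acc:.2f}")
```

Reporting Se and Sp alongside Acc matters when the two quality classes are imbalanced, since accuracy alone can hide poor performance on the smaller class.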
Context-aware multi-head self-attentional neural network model for next location prediction
Accurate activity location prediction is a crucial component of many mobility
applications and is particularly required to develop personalized, sustainable
transportation systems. Despite the widespread adoption of deep learning
models, next location prediction models lack a comprehensive discussion and
integration of mobility-related spatio-temporal contexts. Here, we utilize a
multi-head self-attentional (MHSA) neural network that learns location
transition patterns from historical location visits, their visit time and
activity duration, as well as their surrounding land use functions, to infer an
individual's next location. Specifically, we adopt point-of-interest data and
latent Dirichlet allocation for representing locations' land use contexts at
multiple spatial scales, generate embedding vectors of the spatio-temporal
features, and learn to predict the next location with an MHSA network. Through
experiments on two large-scale GNSS tracking datasets, we demonstrate that the
proposed model outperforms other state-of-the-art prediction models, and reveal
the contribution of various spatio-temporal contexts to the model's
performance. Moreover, we find that the model trained on population data
achieves higher prediction performance with fewer parameters than
individual-level models due to learning from collective movement patterns. We
also reveal that mobility conducted in the recent past and one week before has the
largest influence on the current prediction, showing that learning from a
subset of the historical mobility is sufficient to obtain an accurate location
prediction result. We believe that the proposed model is vital for
context-aware mobility prediction. The gained insights will help to understand
location prediction models and promote their implementation for mobility
applications.
Comment: updated Discussion section; accepted by Transportation Research Part
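As a rough illustration of the multi-head self-attention mechanism the abstract refers to, the sketch below applies scaled dot-product attention to a short sequence of visit embeddings in plain Python. It is not the authors' model: identity projections stand in for the learned query, key, value and output matrices, no causal mask is applied, and the embedding values and function names are made up for the example.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_head(q, k, v):
    """Scaled dot-product attention for one head; q, k, v are lists of vectors."""
    d = len(q[0])
    out = []
    for qi in q:
        scores = [sum(a * b for a, b in zip(qi, kj)) / math.sqrt(d) for kj in k]
        weights = softmax(scores)
        out.append([sum(w * vj[c] for w, vj in zip(weights, v))
                    for c in range(len(v[0]))])
    return out


def multi_head_self_attention(x, num_heads):
    """Split each embedding into num_heads slices, attend per slice, concatenate.

    Assumption: identity projections replace the learned weight matrices.
    """
    dim = len(x[0])
    if dim % num_heads:
        raise ValueError("embedding size must be divisible by num_heads")
    size = dim // num_heads
    heads = []
    for h in range(num_heads):
        part = [row[h * size:(h + 1) * size] for row in x]
        heads.append(attention_head(part, part, part))
    return [[val for head in heads for val in head[t]] for t in range(len(x))]


# Hypothetical embeddings for three historical location visits.
visits = [[1.0, 0.0, 0.5, 0.5], [0.0, 1.0, 0.5, 0.5], [1.0, 1.0, 0.0, 1.0]]
context = multi_head_self_attention(visits, num_heads=2)
```

Each output row is a weighted average of all visit embeddings, which is how attention lets the representation of one visit draw on the rest of the history.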
Bifurcation characteristics of torsional-horizontal coupled vibration of rolling mill system
In this paper, the static bifurcation and dynamic bifurcation of a rolling mill system with torsional-horizontal coupled vibration are studied. Firstly, the dynamic equation of torsional-horizontal coupled vibration of the rolling mill main drive system is established. The rolling mill is driven by a gear pair, so the torsional vibration of the gear pair, the horizontal vibration and the friction factor are considered in the dynamic equation. Then the equivalent low-dimensional bifurcation equation, which can reveal the nonlinear characteristics of the system, is obtained using the Lyapunov-Schmidt method, and the static bifurcation characteristics of the system are studied using singularity theory. Lastly, the bifurcation condition and stability of the system with dynamic Hopf bifurcation are studied using the Hopf bifurcation theorem. Numerical simulation with actual parameter values confirms the analytical results.
Activity Cliff Prediction: Dataset and Benchmark
Activity cliffs (ACs), which are generally defined as pairs of structurally
similar molecules that are active against the same bio-target but significantly
different in the binding potency, are of great importance to drug discovery. To
date, the AC prediction problem, i.e., to predict whether a pair of
molecules exhibit the AC relationship, has not yet been fully explored. In this
paper, we first introduce ACNet, a large-scale dataset for AC prediction. ACNet
curates over 400K Matched Molecular Pairs (MMPs) against 190 targets, including
over 20K MMP-cliffs and 380K non-AC MMPs, and provides five subsets for model
development and evaluation. Then, we propose a baseline framework to benchmark
the predictive performance of molecular representations encoded by deep neural
networks for AC prediction, and 16 models are evaluated in experiments. Our
experimental results show that deep learning models can achieve good
performance when the models are trained on tasks with an adequate amount of data,
while the imbalanced, low-data and out-of-distribution features of the ACNet
dataset still make it challenging for deep neural networks to cope with. In
addition, the traditional ECFP method shows a natural advantage on MMP-cliff
prediction, and outperforms other deep learning models on most of the data
subsets. To the best of our knowledge, our work constructs the first
large-scale dataset for AC prediction, which may stimulate the study of AC
prediction models and prompt further breakthroughs in AI-aided drug discovery.
The code and dataset can be accessed at https://drugai.github.io/ACNet/
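The activity-cliff definition in the abstract (structurally similar pair, same target, large potency difference) can be sketched as a simple labelling rule. Everything specific here is an assumption for illustration: ACNet defines similarity through Matched Molecular Pairs, whereas this sketch substitutes Tanimoto similarity over fingerprint bit sets, and both thresholds and all function names are hypothetical.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto (Jaccard) similarity between two sets of fingerprint bit indices."""
    a, b = set(bits_a), set(bits_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def is_activity_cliff(bits_a, bits_b, potency_a, potency_b,
                      min_similarity=0.8, min_potency_gap=2.0):
    """Label a pair as an activity cliff under assumed thresholds.

    Potencies are taken to be on a log scale (e.g. pKi) against the same
    target; both thresholds are hypothetical defaults, not ACNet's criteria.
    """
    similar = tanimoto(bits_a, bits_b) >= min_similarity
    return similar and abs(potency_a - potency_b) >= min_potency_gap


# Toy fingerprints: the two molecules differ in a single bit.
mol_a = {1, 2, 3, 4, 5}
mol_b = {1, 2, 3, 4, 5, 6}
cliff = is_activity_cliff(mol_a, mol_b, potency_a=8.5, potency_b=5.9)
non_cliff = is_activity_cliff(mol_a, mol_b, potency_a=8.5, potency_b=8.0)
```

A rule like this produces far more non-cliff pairs than cliff pairs, which is consistent with the class imbalance the abstract reports for the dataset.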
Hierarchical Few-Shot Object Detection: Problem, Benchmark and Method
Few-shot object detection (FSOD) aims to detect objects from only a few examples.
However, existing FSOD methods do not consider hierarchical fine-grained
category structures of objects that exist widely in real life. For example,
animals are taxonomically classified into orders, families, genera and species
etc. In this paper, we propose and solve a new problem called hierarchical
few-shot object detection (Hi-FSOD), which aims to detect objects with
hierarchical categories in the FSOD paradigm. To this end, on the one hand, we
build the first large-scale and high-quality Hi-FSOD benchmark dataset
HiFSOD-Bird, which contains 176,350 wild-bird images falling into 1,432
categories. All the categories are organized into a 4-level taxonomy,
consisting of 32 orders, 132 families, 572 genera and 1,432 species. On the
other hand, we propose the first Hi-FSOD method HiCLPL, where a hierarchical
contrastive learning approach is developed to constrain the feature space so
that the feature distribution of objects is consistent with the hierarchical
taxonomy and the model's generalization power is strengthened. Meanwhile, a
probabilistic loss is designed to enable the child nodes to correct the
classification errors of their parent nodes in the taxonomy. Extensive
experiments on the benchmark dataset HiFSOD-Bird show that our method HiCLPL
outperforms the existing FSOD methods.
Comment: Accepted by ACM MM 202
Fairness-guided Few-shot Prompting for Large Language Models
Large language models have demonstrated surprising ability to perform
in-context learning, i.e., these models can be directly applied to solve
numerous downstream tasks by conditioning on a prompt constructed by a few
input-output examples. However, prior research has shown that in-context
learning can suffer from high instability due to variations in training
examples, example order, and prompt formats. Therefore, the construction of an
appropriate prompt is essential for improving the performance of in-context
learning. In this paper, we revisit this problem from the view of predictive
bias. Specifically, we introduce a metric to evaluate the predictive bias of a
fixed prompt against labels or given attributes. Then we empirically show
that prompts with higher bias always lead to unsatisfactory predictive quality.
Based on this observation, we propose a novel search strategy based on the
greedy search to identify the near-optimal prompt for improving the performance
of in-context learning. We perform comprehensive experiments with
state-of-the-art mainstream models such as GPT-3 on various downstream tasks.
Our results indicate that our method can enhance the model's in-context
learning performance in an effective and interpretable manner.
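The greedy search described above can be sketched as follows. This is a hedged illustration, not the authors' implementation: it assumes the bias metric is the entropy of the model's label distribution under a given prompt (higher entropy meaning less bias), and `label_probs_fn` is a hypothetical callable standing in for a query to the language model. The toy scorer below replaces the model entirely.

```python
import math


def entropy(probs):
    """Shannon entropy; used here as an assumed fairness score (higher = less biased)."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def greedy_prompt_search(candidates, label_probs_fn, max_examples=4):
    """Greedily add the demonstration that most increases the fairness score.

    label_probs_fn is a hypothetical callable mapping a list of demonstrations
    to the model's predicted label distribution for that prompt.
    """
    selected, best_score = [], float("-inf")
    remaining = list(candidates)
    while remaining and len(selected) < max_examples:
        scored = [(entropy(label_probs_fn(selected + [c])), c) for c in remaining]
        score, choice = max(scored, key=lambda pair: pair[0])
        if score <= best_score:
            break  # no remaining candidate improves fairness
        selected.append(choice)
        remaining.remove(choice)
        best_score = score
    return selected, best_score


def toy_label_probs(examples, labels=("pos", "neg")):
    """Stand-in for an LLM: smoothed label frequencies of the demonstrations."""
    counts = [1 + sum(1 for _, lab in examples if lab == label) for label in labels]
    total = sum(counts)
    return [c / total for c in counts]


pool = [("great", "pos"), ("awful", "neg"), ("fine", "pos")]
selected, score = greedy_prompt_search(pool, toy_label_probs)
```

With the toy scorer, the search stops once the prompt is label-balanced, because adding a third demonstration would skew the predicted distribution again.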
Learning Invariant Molecular Representation in Latent Discrete Space
Molecular representation learning lays the foundation for drug discovery.
However, existing methods suffer from poor out-of-distribution (OOD)
generalization, particularly when data for training and testing originate from
different environments. To address this issue, we propose a new framework for
learning molecular representations that exhibit invariance and robustness
against distribution shifts. Specifically, we propose a strategy called
"first-encoding-then-separation" to identify invariant molecule features in
the latent space, which deviates from conventional practices. Prior to the
separation step, we introduce a residual vector quantization module that
mitigates the over-fitting to training data distributions while preserving the
expressivity of encoders. Furthermore, we design a task-agnostic
self-supervised learning objective to encourage precise invariance
identification, which makes our method widely applicable to a variety of
tasks, such as regression and multi-label classification. Extensive experiments
on 18 real-world molecular datasets demonstrate that our model achieves
stronger generalization than state-of-the-art baselines in the presence of
various distribution shifts. Our code is available at
https://github.com/HICAI-ZJU/iMoLD
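The abstract does not spell out how the residual vector quantization module works, so the sketch below shows only the generic multi-stage form of the technique from the quantization literature: each stage snaps the remaining residual to its nearest codebook entry. The authors' module may well differ, and the codebooks, vector and function names are made up for illustration.

```python
def nearest(vector, codebook):
    """Index of the codebook entry closest to vector (squared Euclidean distance)."""
    def sq_dist(code):
        return sum((v - c) ** 2 for v, c in zip(vector, code))
    return min(range(len(codebook)), key=lambda i: sq_dist(codebook[i]))


def residual_quantize(vector, codebooks):
    """Generic multi-stage residual VQ: quantize, subtract, repeat on the residual."""
    residual = list(vector)
    quantized = [0.0] * len(vector)
    indices = []
    for codebook in codebooks:
        i = nearest(residual, codebook)
        code = codebook[i]
        indices.append(i)
        quantized = [q + c for q, c in zip(quantized, code)]  # running reconstruction
        residual = [r - c for r, c in zip(residual, code)]    # what is still unexplained
    return indices, quantized


# Hypothetical codebooks: a coarse stage followed by a fine correction stage.
codebooks = [
    [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]],
    [[0.0, 0.0], [0.25, -0.25], [-0.25, 0.25]],
]
indices, approx = residual_quantize([1.2, 0.8], codebooks)
print(indices, approx)
```

Restricting a representation to sums of a few discrete codes limits how closely it can fit any single training distribution, which is one plausible reading of the over-fitting claim in the abstract.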