Model-free Change-point Detection Using Modern Classifiers
In contemporary data analysis, it is increasingly common to work with
non-stationary complex datasets. These datasets typically extend beyond the
classical low-dimensional Euclidean space, making it challenging to detect
shifts in their distribution without relying on strong structural assumptions.
This paper introduces a novel offline change-point detection method that
leverages modern classifiers developed in the machine-learning community. With
suitable data splitting, the test statistic is constructed through sequential
computation of the Area Under the Curve (AUC) of a classifier, which is trained
on data segments on both ends of the sequence. It is shown that the resulting
AUC process attains its maximum at the true change-point location, which
facilitates the change-point estimation. The proposed method is characterized
by its complete nonparametric nature, significant versatility, considerable
flexibility, and absence of stringent assumptions pertaining to the underlying
data or any distributional shifts. Theoretically, we derive the limiting
pivotal distribution of the proposed test statistic under the null, as well as the
asymptotic behaviors under both local and fixed alternatives. The weak
consistency of the change-point estimator is provided. Extensive simulation
studies and the analysis of two real-world datasets illustrate the superior
performance of our approach compared to existing model-free change-point
detection methods.
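As a rough illustration of the idea described in this abstract, the sketch below scans an AUC curve over candidate split points. It is not the paper's test statistic: the "classifier" is a toy nearest-mean score fitted on the two ends of a univariate sequence, and the names `auc`, `auc_changepoint`, `m` and `min_seg` are hypothetical choices made for this example only.

```python
def auc(neg, pos):
    """Mann-Whitney estimate of P(score_pos > score_neg); ties count 1/2."""
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def auc_changepoint(x, m, min_seg=5):
    """Estimate a single change-point in the list x (toy sketch).

    The first m and last m observations are used only to fit the score;
    the AUC is then computed on the held-out middle part for every
    candidate split, and the split with the largest AUC is returned.
    """
    n = len(x)
    if n - 2 * m < 2 * min_seg:
        raise ValueError("sequence too short for the chosen m and min_seg")
    left, right, middle = x[:m], x[n - m:], x[m:n - m]
    mu_left = sum(left) / m
    mu_right = sum(right) / m
    # Assumption: a nearest-mean score stands in for a modern classifier.
    # Larger score means "looks more like the right end of the sequence".
    scores = [abs(v - mu_left) - abs(v - mu_right) for v in middle]
    best_k, best_auc = None, -1.0
    for k in range(min_seg, len(middle) - min_seg + 1):
        a = auc(scores[:k], scores[k:])  # before-split vs after-split
        if a > best_auc:
            best_k, best_auc = k, a
    # Convert the split inside the middle part to an index in x.
    return m + best_k, best_auc
```

Calling `auc_changepoint(x, m=30)` returns the estimated number of pre-change observations together with the maximal AUC. In practice the score would come from any trained classifier, and the quadratic-time AUC would be replaced by a rank-based computation.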
Two-Sample and Change-Point Inference for Non-Euclidean Valued Time Series
Data objects taking value in a general metric space have become increasingly
common in modern data analysis. In this paper, we study two important
statistical inference problems, namely, two-sample testing and change-point
detection, for such non-Euclidean data under temporal dependence. Typical
examples of non-Euclidean valued time series include yearly mortality
distributions, time-varying networks, and covariance matrix time series. To
accommodate unknown temporal dependence, we advance the self-normalization (SN)
technique (Shao, 2010) to the inference of non-Euclidean time series, which is
substantially different from the existing SN-based inference for functional
time series that reside in Hilbert space (Zhang et al., 2011). Theoretically,
we propose new regularity conditions that could be easier to check than those
in the recent literature, and derive the limiting distributions of the proposed
test statistics under both the null and local alternatives. For the change-point
detection problem, we also derive the consistency of the change-point location
estimator, and combine our proposed change-point test with wild binary
segmentation to perform multiple change-point estimation. Numerical simulations
demonstrate the effectiveness and robustness of our proposed tests compared
with existing methods in the literature. Finally, we apply our tests to
two-sample inference in mortality data and change-point detection in
cryptocurrency data.
Testing Serial Independence of Object-Valued Time Series
We propose a novel method for testing serial independence of object-valued
time series in metric spaces, which is more general than Euclidean or Hilbert
spaces. The proposed method is fully nonparametric, free of tuning parameters,
and can capture all nonlinear pairwise dependence. The key concept used in this
paper is the distance covariance in metric spaces, which is extended to auto
distance covariance for object-valued time series. Furthermore, we propose a
generalized spectral density function to account for pairwise dependence at all
lags and construct a Cramer-von Mises type test statistic. New theoretical
arguments are developed to establish the asymptotic behavior of the test
statistic. A wild bootstrap is also introduced to obtain the critical values of
the non-pivotal limiting null distribution. Extensive numerical simulations and
two real data applications are conducted to illustrate the effectiveness and
versatility of our proposed method.
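The building block named in this abstract, distance covariance applied to a series and its own lagged copy, can be sketched with nothing more than a user-supplied metric. The code below uses the standard V-statistic form of squared sample distance covariance (double-centered pairwise distance matrices). The function names and the `dist` argument are assumptions for illustration, and the paper's actual test statistic, which aggregates over all lags through a generalized spectral density, is not reproduced here.

```python
def _double_center(d):
    """Subtract row and column means and add back the grand mean."""
    n = len(d)
    row = [sum(r) / n for r in d]
    col = [sum(d[i][j] for i in range(n)) / n for j in range(n)]
    grand = sum(row) / n
    return [[d[i][j] - row[i] - col[j] + grand for j in range(n)]
            for i in range(n)]


def dcov2(x, y, dist):
    """Squared sample distance covariance (V-statistic) of paired samples.

    x and y are equal-length lists of objects; dist(p, q) is any metric
    on those objects (hypothetical argument supplied by the caller).
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    n = len(x)
    a = _double_center([[dist(p, q) for q in x] for p in x])
    b = _double_center([[dist(p, q) for q in y] for p in y])
    return sum(a[i][j] * b[i][j] for i in range(n) for j in range(n)) / (n * n)


def auto_dcov2(series, lag, dist):
    """Distance covariance between the series and its lag-shifted copy."""
    if not 0 < lag < len(series):
        raise ValueError("lag must satisfy 0 < lag < len(series)")
    return dcov2(series[:-lag], series[lag:], dist)
```

For real-valued data one could pass `dist=lambda p, q: abs(p - q)`; for networks or distributions, any metric on those objects (for example a Wasserstein distance) would take its place.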
SNSeg: An R Package for Time Series Segmentation via Self-Normalization
Time series segmentation aims to identify potential change-points in a
sequence of temporally dependent data, so that the original sequence can be
partitioned into several homogeneous subsequences. It is useful for modeling
and predicting non-stationary time series and is widely applied in natural and
social sciences. Existing segmentation methods primarily focus on only one type
of parameter change, such as mean or variance, and they typically depend on
laborious tuning or smoothing parameters, which can be challenging to choose in
practice. The self-normalization based change-point estimation framework SNCP
by Zhao et al. (2022), however, offers users more flexibility and convenience
as it allows for change-point estimation of different types of parameters (e.g.
mean, variance, quantile and autocovariance) in a unified fashion, and requires
little tuning effort. In this paper, the R package SNSeg is introduced to
implement SNCP for segmentation of univariate and multivariate time series. An
extension of SNCP, named SNHD, is also designed and implemented for
change-point estimation in the mean vector of high-dimensional time series. The
estimated change-points as well as segmented time series are available with
graphical tools. Detailed examples of SNSeg are given in simulations of
multivariate autoregressive processes with change-points.
Matrix GARCH Model: Inference and Application
Matrix-variate time series data are widely available in applications.
However, no attempt has been made to study their conditional heteroskedasticity
that is often observed in economic and financial data. To address this gap, we
propose a novel matrix generalized autoregressive conditional
heteroskedasticity (GARCH) model to capture the dynamics of conditional row and
column covariance matrices of matrix time series. The key innovation of the
matrix GARCH model is the use of a univariate GARCH specification for the trace
of conditional row or column covariance matrix, which allows for the
identification of conditional row and column covariance matrices. Moreover, we
introduce a quasi maximum likelihood estimator (QMLE) for model estimation and
develop a portmanteau test for model diagnostic checking. Simulation studies
are conducted to assess the finite-sample performance of the QMLE and
portmanteau test. To handle large dimensional matrix time series, we also
propose a matrix factor GARCH model. Finally, we demonstrate the superiority of
the matrix GARCH and matrix factor GARCH models over existing multivariate
GARCH-type models in volatility forecasting and portfolio allocations using
three applications on credit default swap prices, global stock sector indices,
and future prices.
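The abstract does not state the model equations, so the sketch below shows only the textbook univariate GARCH(1,1) variance recursion, h_t = omega + alpha * y_{t-1}^2 + beta * h_{t-1}, which is the kind of specification the abstract says is applied to the trace of a conditional covariance matrix. How the scalar series `y` would be derived from a matrix observation is left open, and the function name and the starting value `h0` are assumptions.

```python
def garch11_variance(y, omega, alpha, beta, h0):
    """Conditional variances from a standard univariate GARCH(1,1) recursion.

    Returns [h_0, h_1, ..., h_T] where h_0 = h0 is an assumed starting
    value and h_{t+1} = omega + alpha * y_t**2 + beta * h_t.
    """
    if omega <= 0 or alpha < 0 or beta < 0:
        raise ValueError("need omega > 0, alpha >= 0, beta >= 0")
    h = [h0]
    for obs in y:
        h.append(omega + alpha * obs * obs + beta * h[-1])
    return h
```

With `alpha + beta < 1` the recursion is covariance-stationary in the usual sense, and `omega / (1 - alpha - beta)` is a common choice for `h0`.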
In-house deep environmental sentience for smart homecare solutions toward ageing society.
With an increasing number of elderly people needing home care around the clock, care workers cannot keep up with the demand of providing maximum support to those who require it. As the medical costs of home care increase and the quality of care suffers as a result of staff shortages, a solution is desperately needed to make the valuable care time of these workers more efficient. This paper proposes a system that uses currently available deep learning resources to produce a base system that could address many of the problems that care homes and staff face today. Transfer learning was conducted on a deep convolutional neural network to recognize common household objects. This system showed promising results, with an accuracy, sensitivity and specificity of 90.6%, 0.90977 and 0.99668 respectively. Real-time applications were also considered, with the system achieving a maximum speed of 19.6 FPS on an MSI GTX 1060 GPU with 4GB of VRAM allocated.
CodeKGC: Code Language Model for Generative Knowledge Graph Construction
Current generative knowledge graph construction approaches usually fail to
capture structural knowledge by simply flattening natural language into
serialized texts or a specification language. However, large generative
language models trained on structured data such as code have demonstrated
impressive capability in understanding natural language for structural
prediction and reasoning tasks. Intuitively, we address the task of generative
knowledge graph construction with a code language model: given a code-format
natural language input, the target is to generate triples which can be
represented as code completion tasks. Specifically, we develop schema-aware
prompts that effectively utilize the semantic structure within the knowledge
graph. As code inherently possesses structure, such as class and function
definitions, it serves as a useful model for prior semantic structural
knowledge. Furthermore, we employ a rationale-enhanced generation method to
boost the performance. Rationales provide intermediate steps, thereby improving
knowledge extraction abilities. Experimental results indicate that the proposed
approach can obtain better performance on benchmark datasets compared with
baselines. Code and datasets are available in
https://github.com/zjunlp/DeepKE/tree/main/example/llm.
Comment: Work in progress
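To make the notion of a schema-aware, code-format prompt more concrete, the sketch below renders a schema as Python class definitions and leaves an open constructor call for a code language model to complete with triples. Every name here (`build_code_prompt`, `Entity`, `Relation`, `Triple`, `Extract`) is a hypothetical stand-in chosen for illustration and should not be read as the prompt format used in the linked repository.

```python
def build_code_prompt(text, entity_types, relation_types):
    """Render a schema and an input sentence as a code-completion prompt."""
    lines = [
        "class Entity:",
        "    def __init__(self, name: str):",
        "        self.name = name",
        "",
        "class Relation:",
        "    def __init__(self, name: str):",
        "        self.name = name",
        "",
    ]
    # One subclass per schema type, so the schema is visible as code.
    for t in entity_types:
        lines += [f"class {t}(Entity):", "    pass", ""]
    for r in relation_types:
        lines += [f"class {r}(Relation):", "    pass", ""]
    lines += [
        "class Triple:",
        "    def __init__(self, head: Entity, relation: Relation, tail: Entity):",
        "        self.head, self.relation, self.tail = head, relation, tail",
        "",
        "class Extract:",
        "    def __init__(self, triples):",
        "        self.triples = triples",
        "",
        f"text = {text!r}",
        # The prompt ends mid-call; the model is expected to continue it
        # with Triple(...) entries.
        "extract = Extract([",
    ]
    return "\n".join(lines)
```

A call such as `build_code_prompt("Alice works at Acme.", ["Person", "Organization"], ["WorkFor"])` yields a string that could be sent to a code model. Parsing the completion back into triples is a separate step not shown here.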
Hybrid in Radiative Decays from Lattice QCD
We present the first theoretical prediction of the production rate of
light hybrid meson in radiative decays. In the
lattice QCD formalism with the pion mass MeV, the
related electromagnetic multipole form factors are extracted from the
three-point functions that involve necessarily quark annihilation diagrams,
which are calculated through the distillation method. The partial width of
is determined to be at the
mass GeV. If corresponds to the recently
observed in the process by BESIII, then the branching fraction is estimated to be , which implies
- …