6 research outputs found
An effective scheme for QoS estimation via alternating direction method-based matrix factorization
Accurately estimating unknown quality-of-service (QoS) data from historical records of Web-service invocations is vital for automatic service selection. This work presents an effective scheme for addressing this issue via alternating direction method-based matrix factorization. Its main idea consists of a) adopting the principle of the alternating direction method to decompose the task of building a matrix factorization-based QoS estimator into small subtasks, where each one trains a subset of the desired parameters based on the latest status of the whole parameter set; b) building an ensemble of diversified single models with sophisticated diversifying and aggregating mechanisms; and c) parallelizing the construction process of the ensemble to drastically reduce the time cost. Experimental results on two industrial QoS datasets demonstrate that the proposed scheme achieves more accurate QoS estimates than its peers at comparable computing time, owing to its practical parallelization.
This work was supported in part by the FDCT (Fundo para o Desenvolvimento das Ciências e da Tecnologia) under Grant 119/2014/A3; in part by the National Natural Science Foundation of China under Grants 61370150 and 61433014; in part by the Young Scientist Foundation of Chongqing under Grant cstc2014kjrc-qnrc40005; in part by the Chongqing Research Program of Basic Research and Frontier Technology under Grant cstc2015jcyjB0244; in part by the Postdoctoral Science Funded Project of Chongqing under Grant Xm2014043; in part by the Fundamental Research Funds for the Central Universities under Grant 106112015CDJXY180005; and in part by the Specialized Research Fund for the Doctoral Program of Higher Education under Grant 20120191120030.
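The alternating-update idea in a) can be sketched as a minimal alternating least-squares loop over the observed entries of a sparse QoS matrix, assuming a simple L2-regularized factorization (function names and parameters below are illustrative, not the paper's actual design):

```python
import numpy as np

def als_qos(R, mask, rank=2, reg=0.1, iters=50, seed=0):
    """Alternating least-squares sketch: split the factorization into two
    subtasks (user factors U, service factors V) and solve each against
    the latest value of the other. Illustrative only."""
    rng = np.random.default_rng(seed)
    m, n = R.shape
    U = rng.standard_normal((m, rank)) * 0.1
    V = rng.standard_normal((n, rank)) * 0.1
    I = np.eye(rank)
    for _ in range(iters):
        for i in range(m):  # update each row of U with V held fixed
            obs = mask[i] > 0
            if obs.any():
                Vo = V[obs]
                U[i] = np.linalg.solve(Vo.T @ Vo + reg * I, Vo.T @ R[i, obs])
        for j in range(n):  # update each row of V with the fresh U
            obs = mask[:, j] > 0
            if obs.any():
                Uo = U[obs]
                V[j] = np.linalg.solve(Uo.T @ Uo + reg * I, Uo.T @ R[obs, j])
    return U, V
```

A missing QoS entry (i, j) is then estimated as `U[i] @ V[j]`; the per-row solves are the small, independently trainable subtasks that lend themselves to the parallelization the abstract describes.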
A Dynamic Linear Bias Incorporation Scheme for Nonnegative Latent Factor Analysis
High-Dimensional and Incomplete (HDI) data is commonly encountered in big
data-related applications like social network service systems, which concern
the limited interactions among numerous nodes. Knowledge acquisition
from HDI data is a vital issue in the domain of data science due to the rich
embedded patterns like node behaviors, where the fundamental task is to
perform HDI data representation learning. Nonnegative Latent Factor Analysis
(NLFA) models have proven superior in addressing this issue, where a linear
bias incorporation (LBI) scheme is important for preventing training
overshooting and fluctuation, as well as keeping the model from premature
convergence. However, existing LBI schemes are all static ones whose linear
biases are fixed, which significantly restricts the scalability of the
resultant NLFA model and results in a loss of representation learning ability
on HDI data. Motivated by these observations, this paper
innovatively presents the dynamic linear bias incorporation (DLBI) scheme. It
first extends the linear bias vectors into matrices, and then builds a binary
weight matrix to switch the active/inactive states of the linear biases. Each
entry of the weight matrix switches between the binary states dynamically
according to the variation of the corresponding linear bias value, thereby
establishing the
dynamic linear biases for an NLFA model. Empirical studies on three HDI
datasets from real applications demonstrate that the proposed DLBI-based NLFA
model obtains higher representation accuracy than state-of-the-art
models do, as well as highly competitive computational efficiency.
Comment: arXiv admin note: substantial text overlap with arXiv:2306.03911,
arXiv:2302.12122, arXiv:2306.0364
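One way to read the switching mechanism is as a threshold on the bias value variation; the sketch below is a guess at such a rule (the threshold `eps`, the update form, and all names are assumptions for illustration, not the paper's exact scheme; nonnegativity constraints are ignored here):

```python
import numpy as np

def dlbi_update(B, grad_B, lr=0.05, eps=1e-3):
    """One sketch step of a dynamic linear-bias update: the bias matrix B
    moves along its gradient, and the binary weight matrix W marks as
    active (1.0) only those biases whose values changed by more than eps.
    Illustrative threshold rule only."""
    B_new = B - lr * grad_B
    W = (np.abs(B_new - B) > eps).astype(float)
    return B_new, W
```

The effective bias entering a prediction would then be `W * B_new`, so stalled biases stop influencing the model while remaining eligible for re-activation if their gradients grow again.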
Proximal Symmetric Non-negative Latent Factor Analysis: A Novel Approach to Highly-Accurate Representation of Undirected Weighted Networks
An Undirected Weighted Network (UWN) is commonly found in big data-related
applications. Note that the information connected with its nodes and edges
can be expressed as a Symmetric, High-Dimensional and Incomplete
(SHDI) matrix. However, existing models fail to model either its intrinsic
symmetry or its low data density, resulting in limited model scalability or
representation learning ability. To address this issue, a Proximal
Symmetric Nonnegative Latent-factor-analysis (PSNL) model is proposed. It
incorporates a proximal term into a symmetry-aware and data-density-oriented
objective function for high representation accuracy. Then an adaptive
Alternating Direction Method of Multipliers (ADMM)-based learning scheme is
implemented through a Tree-structured Parzen Estimators (TPE) method for
high computational efficiency. Empirical studies on four UWNs demonstrate that
PSNL achieves higher accuracy than state-of-the-art models, as well as
highly competitive computational efficiency.
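The role of a proximal term can be illustrated with a single projected update: adding (rho/2)·||Y − X||² to the local model shrinks the effective step size, one source of the stability such models aim for (a toy sketch under those assumptions, not PSNL's actual ADMM update):

```python
import numpy as np

def proximal_nonneg_step(X, grad, lr=0.1, rho=1.0):
    """One proximal-regularized, nonnegativity-projected gradient step.
    Minimizing grad.(Y-X) + (1/(2*lr))||Y-X||^2 + (rho/2)||Y-X||^2 over
    Y >= 0 yields a plain projected step with the shrunken step size
    lr/(1 + lr*rho); rho=0 recovers ordinary projected gradient."""
    step = lr / (1.0 + lr * rho)
    return np.maximum(X - step * grad, 0.0)
```

Larger rho makes each update more conservative, which damps oscillation at the cost of slower single-step progress.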
An Incomplete Tensor Tucker decomposition based Traffic Speed Prediction Method
In intelligent transportation systems, missing data are common and inevitable,
while complete and valid traffic speed data are of great importance to
intelligent transportation systems. A latent factorization-of-tensors (LFT)
model is one of the most attractive approaches to missing traffic data
recovery owing to its fine scalability. An LFT model is usually optimized via
a stochastic gradient descent (SGD) solver; however, SGD-based LFT models
suffer from slow convergence. To address this issue, this work integrates
the unique advantages of the proportional-integral-derivative (PID) controller
into a Tucker decomposition based LFT model. It adopts two-fold ideas: a)
adopting Tucker decomposition to build an LFT model for better recovery
accuracy; and b) incorporating the instance error adjusted by PID control
theory into the SGD solver to effectively improve the convergence rate. Our
experimental studies on two major city traffic road speed datasets show that
the proposed model achieves significant efficiency gain and highly competitive
prediction accuracy.
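Idea b), feeding a PID-adjusted instance error into SGD, can be sketched as a tiny controller that wraps each raw error before it enters the gradient computation (the gains kp/ki/kd are illustrative defaults, not the paper's tuned values):

```python
class PIDError:
    """Adjust each instance error with proportional, integral, and
    derivative terms, so the SGD update sees the error's accumulated
    history (I term) and its recent change (D term), not just its
    current value (P term). Sketch with illustrative gains."""

    def __init__(self, kp=1.0, ki=0.01, kd=0.1):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0  # running sum of past errors
        self.prev = 0.0      # previous error, for the derivative term

    def adjust(self, err):
        self.integral += err
        out = (self.kp * err
               + self.ki * self.integral
               + self.kd * (err - self.prev))
        self.prev = err
        return out
```

In an SGD update for a Tucker-form LFT model, `pid.adjust(err)` would replace the raw instance error when computing each stochastic gradient.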
An ADRC-Incorporated Stochastic Gradient Descent Algorithm for Latent Factor Analysis
A high-dimensional and incomplete (HDI) matrix contains many complex
interactions between numerous nodes. A stochastic gradient descent (SGD)-based
latent factor analysis (LFA) model is remarkably effective in extracting
valuable information from an HDI matrix. However, such a model commonly
encounters the problem of slow convergence because a standard SGD algorithm
only considers the current learning error to compute the stochastic gradient
without considering the historical and future state of the learning error. To
address this critical issue, this paper innovatively proposes an
ADRC-incorporated SGD (ADS) algorithm that refines the instance learning
error by considering its historical and future states, following the principle
of an active disturbance rejection control (ADRC) controller. With it, an
ADS-based LFA model is further achieved for fast
and accurate latent factor analysis on an HDI matrix. Empirical studies on two
HDI datasets demonstrate that the proposed model outperforms the
state-of-the-art LFA models in terms of computational efficiency and accuracy
for predicting the missing data of an HDI matrix.
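The refinement of the instance error with historical and future states might look like the following, where an exponential moving average stands in for the historical state and a one-step linear extrapolation for the future state (both the structure and the weights are assumptions for illustration; ADRC's actual extended state observer is more involved):

```python
class ADRCError:
    """Refine each instance error using its history (an exponential
    moving average) and a crude one-step 'future' estimate (linear
    extrapolation from the previous error). Weights a/b/c and the
    smoothing factor beta are illustrative assumptions."""

    def __init__(self, a=0.6, b=0.3, c=0.1, beta=0.9):
        self.a, self.b, self.c, self.beta = a, b, c, beta
        self.hist = 0.0  # EMA of past errors (historical state)
        self.prev = 0.0  # previous error, for extrapolation

    def refine(self, err):
        self.hist = self.beta * self.hist + (1 - self.beta) * err
        future = err + (err - self.prev)  # one-step linear extrapolation
        out = self.a * err + self.b * self.hist + self.c * future
        self.prev = err
        return out
```

As with the PID-based scheme above, the refined value would replace the raw instance error inside each SGD step of the LFA model.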
An Online Sparse Streaming Feature Selection Algorithm
Online streaming feature selection (OSFS), which conducts feature selection
in an online manner, plays an important role in dealing with high-dimensional
data. In many real applications such as intelligent healthcare platform,
streaming features always have some missing data, which raises a crucial
challenge in conducting OSFS, i.e., how to establish the uncertain relationship
between sparse streaming features and labels. Unfortunately, existing OSFS
algorithms never consider such an uncertain relationship. To fill this gap,
this paper proposes an online sparse streaming feature selection with
uncertainty (OS2FSU) algorithm. OS2FSU consists of two main parts: 1) latent
factor analysis is utilized to pre-estimate the missing data in sparse
streaming features before conducting feature selection, and 2) fuzzy logic and
neighborhood rough set are employed to alleviate the uncertainty between
estimated streaming features and labels during conducting feature selection. In
the experiments, OS2FSU is compared with five state-of-the-art OSFS algorithms
on six real datasets. The results demonstrate that OS2FSU outperforms its
competitors when missing data are encountered in OSFS.
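The two-stage structure of OS2FSU can be sketched as follows, with a basic SGD-trained latent factor model for the imputation stage and a plain correlation filter standing in for the fuzzy-logic/rough-set selection stage (the filter and all names are placeholders; the paper's second stage is considerably more sophisticated):

```python
import numpy as np

def impute_then_select(X, mask, y, rank=2, iters=200, lr=0.01, k=2, seed=0):
    """Sketch of the two-stage idea: (1) a latent factor model, trained
    by SGD on the observed entries, imputes the missing entries of the
    streaming-feature matrix X; (2) a simple absolute-correlation filter
    ranks the completed features against the labels y and returns the
    indices of the top-k features. Illustrative only."""
    rng = np.random.default_rng(seed)
    m, n = X.shape
    U = rng.random((m, rank)) * 0.1
    V = rng.random((n, rank)) * 0.1
    rows, cols = np.nonzero(mask)
    for _ in range(iters):  # stage 1: SGD on observed entries only
        for i, j in zip(rows, cols):
            e = X[i, j] - U[i] @ V[j]
            U[i] += lr * e * V[j]
            V[j] += lr * e * U[i]
    X_full = np.where(mask > 0, X, U @ V.T)  # fill only the gaps
    # stage 2: rank features by |correlation| with the labels
    scores = [abs(np.corrcoef(X_full[:, j], y)[0, 1]) for j in range(n)]
    return np.argsort(scores)[::-1][:k]
```

Separating imputation from selection means the second stage always operates on a complete matrix, which is the property the abstract's two-part design relies on.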