Data collaboration analysis for distributed datasets
In this paper, we propose a data collaboration analysis method for
distributed datasets. The proposed method performs centralized machine learning
while the training datasets and models remain distributed over several
institutions. Recently, data have become large and distributed as the costs of
data collection have decreased. If we can centralize these distributed datasets
and analyze them as one dataset, we expect to obtain novel insights and achieve
higher prediction performance than individual analyses on each distributed
dataset. However, it is generally difficult to centralize the original datasets
due to their huge size or privacy concerns. To avoid these difficulties, we
propose a data collaboration analysis method that does not share the original
datasets. The proposed method centralizes only intermediate representations,
constructed individually at each institution, instead of the original datasets.
Comment: 7 pages
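The core idea — centralizing individually constructed intermediate representations rather than the raw data — can be sketched as follows. This is a minimal illustration, not the paper's algorithm: the use of truncated SVD as each party's local map, the dimensions, and all variable names are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def intermediate_representation(X, dim):
    """Each party builds a low-dimensional representation of its own
    data (here via truncated SVD) and shares only this result; the
    original matrix X never leaves the party."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return X @ Vt[:dim].T          # n_samples x dim

# Two institutions hold disjoint sample sets over the same 20 features.
X1 = rng.normal(size=(50, 20))     # stays at party 1
X2 = rng.normal(size=(80, 20))     # stays at party 2

# Only the intermediate representations are centralized.
Z1 = intermediate_representation(X1, dim=5)
Z2 = intermediate_representation(X2, dim=5)
Z = np.vstack([Z1, Z2])            # centralized collaborative dataset

print(Z.shape)                     # (130, 5): compact, and not the raw data
```

The centralized matrix `Z` can then be analyzed as one dataset, while each `Xi` stays local.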
Interpretable collaborative data analysis on distributed data
This paper proposes an interpretable non-model-sharing collaborative data
analysis method as one of the federated learning systems, an emerging
technology for analyzing distributed data. Analyzing distributed data is
essential in many applications, such as medical, financial, and manufacturing
data analyses, due to privacy and confidentiality concerns. In addition, the
interpretability of the obtained model plays an important role in practical
applications of federated learning systems. By centralizing intermediate
representations, which are constructed individually in each party, the proposed
method obtains an interpretable model, achieving collaborative analysis without
revealing the individual data or the learning models distributed over the local
parties. Numerical experiments indicate that the proposed method achieves
better recognition performance than individual analysis on artificial and
real-world problems.
Comment: 16 pages, 3 figures, 3 tables
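Intermediate representations constructed independently by each party live in different coordinate systems, so a center must make them comparable before joint analysis. One common device is a shared anchor dataset that every party transforms with its own local map; the sketch below illustrates that general idea. The anchor mechanism, the random projections standing in for local maps, and the least-squares alignment are illustrative assumptions, not necessarily the paper's construction.

```python
import numpy as np

rng = np.random.default_rng(1)

# A shareable anchor dataset (e.g., public or synthetic) seen by all parties.
anchor = rng.normal(size=(30, 10))

# Each party's private local transformation (here: a random projection).
F1 = rng.normal(size=(10, 4))
F2 = rng.normal(size=(10, 4))

A1 = anchor @ F1                   # party 1's view of the anchor
A2 = anchor @ F2                   # party 2's view of the anchor

# The center learns a map G2 aligning party 2's representation with
# party 1's, via least squares on the anchor representations only.
G2, *_ = np.linalg.lstsq(A2, A1, rcond=None)

# Party 2's private data, once locally transformed, can now be mapped
# into the common space without revealing F2 or the raw data.
X2_local = rng.normal(size=(25, 10))
X2_common = (X2_local @ F2) @ G2
print(X2_common.shape)             # (25, 4)
```

Because only anchor images and intermediate representations are exchanged, neither the raw data nor the local maps are revealed.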
Projection method for interior eigenproblems of linear nonsquare matrix pencils
Eigensolvers involving complex moments can determine all the eigenvalues in a
given region in the complex plane and the corresponding eigenvectors of a
regular linear matrix pencil. The complex moment acts as a filter for
extracting eigencomponents of interest from random vectors or matrices. This
study extends a projection method for regular eigenproblems to the singular
nonsquare case, thus replacing the standard matrix inverse in the resolvent
with the pseudoinverse. The extended method involves complex moments given by
the contour integrals of generalized resolvents associated with nonsquare
matrices. We establish conditions such that the method gives all finite
eigenvalues in a prescribed region in the complex plane. In numerical
computations, the contour integrals are approximated using numerical
quadratures. The primary cost lies in the solutions of linear least squares
problems that arise from quadrature points, and they can be readily
parallelized in practice. Numerical experiments on large matrix pencils
illustrate the method. The new method is more robust and efficient than
previous methods, and experimental results suggest that it would be even more
efficient in parallelized settings. Notably, the proposed method does not fail
in cases involving pairs of extremely close eigenvalues, and it overcomes the
issue of problem size.
Comment: 20 pages, 2 figures
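The complex-moment filter can be sketched for the square regular case (pencil with B = I), where the resolvent is an ordinary inverse; the paper's nonsquare extension via the pseudoinverse is not reproduced here. The test matrix, contour, number of quadrature nodes, and moment count below are all illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(2)

# Diagonal test matrix: eigenvalues 1 and 2 lie inside the contour
# below; eigenvalue 10 lies outside and should be filtered out.
A = np.diag([1.0, 2.0, 10.0])
n = 3
V = rng.normal(size=(n, 2))                    # random probe block

# Unit circle around 1.5, discretized by the trapezoidal rule, which
# converges exponentially on a circular contour.
N = 64
theta = 2 * np.pi * np.arange(N) / N
z = 1.5 + np.exp(1j * theta)
w = np.exp(1j * theta) / N                     # quadrature weight dz/(2*pi*i)

# Complex moments S_k = (1/(2*pi*i)) \oint z^k (zI - A)^{-1} V dz, k = 0, 1.
S0 = np.zeros((n, 2), dtype=complex)
S1 = np.zeros((n, 2), dtype=complex)
for zj, wj in zip(z, w):
    Y = np.linalg.solve(zj * np.eye(n) - A, V)  # one linear solve per node;
    S0 += wj * Y                                # these solves are independent
    S1 += wj * zj * Y                           # and trivially parallelizable

# The moments act as a filter: their columns span the eigenspace of the
# eigenvalues inside the contour.  Rayleigh-Ritz recovers those eigenvalues.
U, s, _ = np.linalg.svd(np.hstack([S0, S1]), full_matrices=False)
Q = U[:, s > 1e-8 * s[0]]                      # numerical-rank truncation
ritz = np.linalg.eigvals(Q.conj().T @ A @ Q)
print(np.sort(ritz.real))                      # approximately [1., 2.]
```

The per-node linear solves are the dominant cost, matching the abstract's observation that the least squares problems at the quadrature points can be solved in parallel.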
Multiclass spectral feature scaling method for dimensionality reduction
Irregular features disrupt the desired classification. In this paper, we
consider aggressively modifying scales of features in the original space
according to the label information to form well-separated clusters in
low-dimensional space. The proposed method exploits spectral clustering to
derive scaling factors that are used to modify the features. Specifically, we
reformulate the Laplacian eigenproblem of the spectral clustering as an
eigenproblem of a linear matrix pencil whose eigenvector contains the scaling
factors. Numerical experiments show that the proposed method outperforms
well-established supervised dimensionality reduction methods on toy problems
with more samples than features and on real-world problems with more features
than samples.
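The Laplacian eigenproblem that spectral clustering rests on can be sketched as follows; written as the generalized eigenproblem L y = λ D y, it becomes a standard eigenproblem after symmetrizing with D^{-1/2}. This is a minimal illustration of that ingredient only — the paper's matrix-pencil reformulation and the derivation of the feature scaling factors are not reproduced, and the toy data and parameters are assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)

# Two well-separated clusters of points on a line.
x = np.concatenate([rng.normal(0.0, 0.1, 5), rng.normal(5.0, 0.1, 5)])

# Gaussian affinity matrix, degree matrix, and graph Laplacian L = D - W.
W = np.exp(-(x[:, None] - x[None, :]) ** 2)
np.fill_diagonal(W, 0.0)
D = np.diag(W.sum(axis=1))
L = D - W

# Spectral clustering solves L y = lambda * D y; symmetrizing with
# D^{-1/2} turns it into a standard symmetric eigenproblem.
Dm = np.diag(1.0 / np.sqrt(np.diag(D)))
vals, vecs = np.linalg.eigh(Dm @ L @ Dm)       # eigenvalues in ascending order
y = Dm @ vecs[:, 1]                            # Fiedler (second) eigenvector

labels = (y > 0).astype(int)                   # its sign pattern splits clusters
print(labels)
```

The sign pattern of the second eigenvector cleanly separates the two clusters; the abstract's method rewrites this eigenproblem as a linear matrix pencil so that the eigenvector instead yields per-feature scaling factors.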