
    Data collaboration analysis for distributed datasets

    In this paper, we propose a data collaboration analysis method for distributed datasets. The proposed method realizes centralized machine learning while the training datasets and models remain distributed over several institutions. Data have recently become large and distributed as the cost of data collection has decreased. If we could centralize these distributed datasets and analyse them as one dataset, we would expect to obtain novel insights and achieve higher prediction performance than individual analyses of each distributed dataset. However, it is generally difficult to centralize the original datasets because of their huge size or privacy concerns. To avoid these difficulties, the proposed method works without sharing the original datasets: it centralizes only intermediate representations, constructed individually at each institution, instead of the original datasets. Comment: 7 pages
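
    As a rough illustration of this idea, the sketch below (not taken from the paper) builds each institution's intermediate representation with PCA and aligns the representations through a shareable anchor dataset; the PCA maps, the anchor data, and the least-squares alignment are assumptions made for the sketch, not necessarily the paper's exact construction.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy distributed setting: two institutions hold private samples of the
# same 20 features; only the anchor data may be shared in raw form.
X1 = rng.normal(size=(100, 20))       # institution 1's private data
X2 = rng.normal(size=(120, 20))       # institution 2's private data
X_anchor = rng.normal(size=(50, 20))  # shareable anchor data (e.g., synthetic)

def pca_map(X, k):
    """Return a linear map onto the top-k principal directions of X."""
    mu = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - mu, full_matrices=False)
    V = Vt[:k].T
    return lambda Z: (Z - mu) @ V

k = 5
f1, f2 = pca_map(X1, k), pca_map(X2, k)

# Each institution shares only intermediate representations, never raw data.
Y1, Y2 = f1(X1), f2(X2)              # private data, mapped locally
A1, A2 = f1(X_anchor), f2(X_anchor)  # anchor data, mapped by each institution

# Collaboration step: learn linear maps G_i aligning every institution's
# anchor representation to a common target (here, institution 1's).
G1 = np.linalg.lstsq(A1, A1, rcond=None)[0]  # ~ identity
G2 = np.linalg.lstsq(A2, A1, rcond=None)[0]

# Centralized analysis now runs on the pooled, aligned representations.
Y = np.vstack([Y1 @ G1, Y2 @ G2])
print(Y.shape)  # (220, 5): one dataset, without sharing the originals
```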

    Interpretable collaborative data analysis on distributed data

    This paper proposes an interpretable, non-model-sharing collaborative data analysis method as one of the federated learning systems, an emerging technology for analyzing distributed data. Analyzing distributed data is essential in many applications, such as medical, financial, and manufacturing data analyses, because of privacy and confidentiality concerns. In addition, the interpretability of the obtained model plays an important role in practical applications of federated learning systems. By centralizing intermediate representations, which are constructed individually in each party, the proposed method obtains an interpretable model and achieves collaborative analysis without revealing either the individual data or the learning models distributed over the local parties. Numerical experiments indicate that the proposed method achieves better recognition performance on artificial and real-world problems than individual analysis. Comment: 16 pages, 3 figures, 3 tables
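
    The interpretability claim can be illustrated as follows: when the party-side map and the alignment map are both linear, a linear model trained centrally on the intermediate representations pulls back to coefficients on each party's original features. The sketch below assumes a PCA projection V, an identity alignment G, and central ridge regression; these are illustrative choices, not the paper's prescribed components.

```python
import numpy as np

rng = np.random.default_rng(1)

# One party's private data and labels; the party map (PCA projection V)
# and the alignment map (G, identity here) are both linear, so a linear
# model trained on the centralized representation stays interpretable.
n, d, k = 200, 10, 4
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = X @ w_true + 0.1 * rng.normal(size=n)

Xc = X - X.mean(axis=0)
V = np.linalg.svd(Xc, full_matrices=False)[2][:k].T  # party-side map
G = np.eye(k)                                        # alignment map

Z = Xc @ V @ G   # intermediate representation sent to the center

# Central ridge regression in the collaboration space.
lam = 1e-2
w_z = np.linalg.solve(Z.T @ Z + lam * np.eye(k), Z.T @ y)

# Pull the model back to the original features: y ~ Z w_z = Xc (V G w_z),
# so V @ G @ w_z gives feature-level coefficients the party can inspect.
w_x = V @ G @ w_z
print(np.round(w_x, 2))
```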

    Projection method for interior eigenproblems of linear nonsquare matrix pencils

    Eigensolvers involving complex moments can determine all the eigenvalues in a given region of the complex plane, and the corresponding eigenvectors, of a regular linear matrix pencil. The complex moment acts as a filter that extracts the eigencomponents of interest from random vectors or matrices. This study extends a projection method for regular eigenproblems to the singular nonsquare case, replacing the standard matrix inverse in the resolvent with the pseudoinverse. The extended method involves complex moments given by contour integrals of generalized resolvents associated with nonsquare matrices. We establish conditions under which the method gives all finite eigenvalues in a prescribed region of the complex plane. In numerical computations, the contour integrals are approximated by numerical quadrature. The primary cost lies in solving the linear least squares problems that arise at the quadrature points, and these solves can be readily parallelized in practice. Numerical experiments on large matrix pencils illustrate the method: it is more robust and efficient than previous methods, and the experimental results suggest it would be still more efficient in parallel settings. Notably, the proposed method does not fail in cases involving pairs of extremely close eigenvalues, and it scales to large problem sizes. Comment: 20 pages, 2 figures
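
    A minimal numerical sketch of the mechanism follows: the contour integral of the generalized resolvent pinv(zB - A), approximated by a trapezoidal rule on a circle, filters out eigencomponents whose eigenvalues lie outside the region, and a least-squares Rayleigh-Ritz step extracts the eigenvalues inside. The constructed test pencil and this particular extraction step are assumptions for illustration, not the paper's exact algorithm.

```python
import numpy as np

rng = np.random.default_rng(2)
m, n, L, N = 12, 8, 4, 32  # pencil size m x n, probe width L, quadrature points N

# Nonsquare test pencil (A, B) with known finite eigenvalues:
# A = C @ D, B = C, so A x = lam B x exactly when D x = lam x.
lam_true = np.array([0.2, 0.5, 0.9, 2.0, 3.0, -1.5, 4.0, -2.5])
Xv = rng.normal(size=(n, n))
D = Xv @ np.diag(lam_true) @ np.linalg.inv(Xv)
C = rng.normal(size=(m, n))
A, B = C @ D, C

# Circle of center gamma and radius rho: encloses 0.2, 0.5, 0.9 only.
gamma, rho = 0.5, 0.6
V = rng.normal(size=(n, L))
z = gamma + rho * np.exp(2j * np.pi * (np.arange(N) + 0.5) / N)

# Complex moments S_p ~ (1/2 pi i) * contour integral of
# (z - gamma)^p pinv(zB - A) B V dz, via the trapezoidal rule.
S_blocks = []
for p in range(2):
    Sp = np.zeros((n, L), dtype=complex)
    for zj in z:
        R = np.linalg.pinv(zj * B - A)  # generalized resolvent (pseudoinverse)
        Sp += (zj - gamma) ** (p + 1) * (R @ (B @ V))
    S_blocks.append(Sp / N)
S = np.hstack(S_blocks)

# Numerical-rank basis of the filtered subspace, then a least-squares
# Rayleigh-Ritz step: solve (B Q) T = (A Q) and take eigenvalues of T.
U, sv, _ = np.linalg.svd(S, full_matrices=False)
Q = U[:, sv > 1e-8 * sv[0]]
T = np.linalg.lstsq(B @ Q, A @ Q, rcond=None)[0]
ritz = np.linalg.eigvals(T)
print(np.sort_complex(ritz[np.abs(ritz - gamma) < rho]))  # ~ 0.2, 0.5, 0.9
```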

    Multiclass spectral feature scaling method for dimensionality reduction

    Irregular features disrupt the desired classification. In this paper, we consider aggressively modifying the scales of the features in the original space according to the label information so as to form well-separated clusters in a low-dimensional space. The proposed method exploits spectral clustering to derive the scaling factors used to modify the features. Specifically, we reformulate the Laplacian eigenproblem of spectral clustering as an eigenproblem of a linear matrix pencil whose eigenvector contains the scaling factors. Numerical experiments show that the proposed method outperforms well-established supervised dimensionality reduction methods on toy problems with more samples than features and on real-world problems with more features than samples.
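
    To make the pipeline concrete, the sketch below derives feature scaling factors from the leading eigenvector of a d-by-d linear matrix pencil built from label-informed within-class and between-class graph Laplacians. This pencil is an illustrative stand-in chosen for the sketch; the paper's exact pencil formulation is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy labeled data: 2 informative features plus 3 noise features.
n_per, d = 40, 5
X = np.vstack([
    rng.normal([0, 0, 0, 0, 0], 1.0, size=(n_per, d)),
    rng.normal([3, 3, 0, 0, 0], 1.0, size=(n_per, d)),
])
y = np.repeat([0, 1], n_per)

# Label-informed graphs: within-class and between-class adjacency.
same = (y[:, None] == y[None, :]).astype(float)
np.fill_diagonal(same, 0.0)
diff = 1.0 - same
np.fill_diagonal(diff, 0.0)

def laplacian(W):
    return np.diag(W.sum(axis=1)) - W

Lw, Lb = laplacian(same), laplacian(diff)

# Feature scales from the d x d pencil (X^T Lb X, X^T Lw X): its leading
# eigenvector weights features that separate the classes.
A = X.T @ Lb @ X
B = X.T @ Lw @ X + 1e-8 * np.eye(d)  # small shift keeps B invertible
evals, evecs = np.linalg.eig(np.linalg.solve(B, A))
s = np.abs(evecs[:, np.argmax(evals.real)].real)
s /= s.max()
print(np.round(s, 2))  # informative features 0 and 1 get the largest scales

X_scaled = X * s  # scaled features, ready for a spectral embedding step
```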