    Improved balanced incomplete factorization

    [EN] In this paper we improve the BIF algorithm, which simultaneously computes the LU factors (direct factors) of a given matrix and their inverses (inverse factors). This algorithm was introduced in [R. Bru, J. Marín, J. Mas, and M. Tůma, SIAM J. Sci. Comput., 30 (2008), pp. 2302–2318]. The improvements are based on a deeper understanding of the inverse Sherman–Morrison (ISM) decomposition, and they provide a new insight into the BIF decomposition. In particular, it is shown that a slight algorithmic reformulation of the basic algorithm implies that the direct and inverse factors numerically influence each other even without any dropping for incompleteness. Algorithmically, the nonsymmetric version of the improved BIF algorithm is formulated. Numerical experiments show very high robustness of the incomplete implementation of the algorithm used for preconditioning nonsymmetric linear systems. Received by the editors January 26, 2009; accepted for publication (in revised form) by V. Simoncini June 1, 2010; published electronically August 12, 2010. This work was supported by Spanish grant MTM 2007-64477, by project IAA100300802 of the Grant Agency of the Academy of Sciences of the Czech Republic, and partially also by the International Collaboration Support M100300902 of AS CR. Bru García, R.; Marín Mateos-Aparicio, J.; Mas Marí, J.; Tuma, M. (2010). Improved balanced incomplete factorization. SIAM Journal on Matrix Analysis and Applications. 31(5):2431-2452. https://doi.org/10.1137/090747804

    Balanced incomplete factorization preconditioner with pivoting

    [EN] In this work we study pivoting strategies for the preconditioner presented in Bru (SIAM J Sci Comput 30(5):2302-2318, 2008), which computes the LU factorization of a matrix A. This preconditioner is based on the Inverse Sherman-Morrison (ISM) decomposition [Preconditioning sparse nonsymmetric linear systems with the Sherman-Morrison formula. Bru (SIAM J Sci Comput 25(2):701-715, 2003)], which uses recursion formulas derived from the Sherman-Morrison formula to obtain the direct and inverse LU factors of a matrix. We present a modification of the ISM decomposition that allows for pivoting, and hence the computation of preconditioners for any nonsingular matrix. While the ISM algorithm computes only a new pair of vectors at a given step, the new pivoting algorithm also modifies all the remaining vectors from k + 1 to n in the k-th step. Thus, it can be seen as a right-looking version of the ISM decomposition. The results of numerical experiments with ill-conditioned and highly indefinite matrices arising from different applications show the robustness of the new algorithm, since it is able to solve problems that cannot be solved otherwise. Open Access funding provided thanks to the CRUE-CSIC agreement with Springer Nature. The work was supported by Conselleria de Innovacion, Universidades, Ciencia y Sociedad Digital, Generalitat Valenciana (CIAICO/2021/162). Marín Mateos-Aparicio, J.; Mas Marí, J. (2023). Balanced incomplete factorization preconditioner with pivoting. Revista de la Real Academia de Ciencias Exactas Físicas y Naturales Serie A Matemáticas. 117(1). https://doi.org/10.1007/s13398-022-01334-1
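
    The Sherman-Morrison formula that the ISM decomposition builds on can be illustrated with a single rank-one update. The sketch below is not the ISM or BIF algorithm itself: it only shows how a system with A0 + u v^T can be solved using solves with A0 alone. The names solve, dot, and sherman_morrison_solve and the small test matrix are assumptions introduced for illustration.

```python
def solve(A, b):
    """Solve A x = b by dense Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]  # augmented copy [A | b]
    for k in range(n):
        # Pick the largest pivot in column k and swap it into place.
        p = max(range(k, n), key=lambda i: abs(M[i][k]))
        M[k], M[p] = M[p], M[k]
        for i in range(k + 1, n):
            f = M[i][k] / M[k][k]
            for j in range(k, n + 1):
                M[i][j] -= f * M[k][j]
    # Back substitution.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def sherman_morrison_solve(A0, u, v, b):
    """Solve (A0 + u v^T) x = b using only solves with A0.

    Uses (A0 + u v^T)^{-1} b = y - z * (v^T y) / (1 + v^T z),
    with y = A0^{-1} b and z = A0^{-1} u.
    """
    y = solve(A0, b)
    z = solve(A0, u)
    r = 1.0 + dot(v, z)  # zero exactly when A0 + u v^T is singular
    if r == 0.0:
        raise ZeroDivisionError("rank-one update makes the matrix singular")
    c = dot(v, y) / r
    return [yi - c * zi for yi, zi in zip(y, z)]


# Hypothetical example data (not from the paper).
A0 = [[4.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 5.0]]
u = [1.0, 2.0, 0.0]
v = [0.0, 1.0, 1.0]
b = [1.0, 2.0, 3.0]

x = sherman_morrison_solve(A0, u, v, b)

# Check against the explicitly formed updated matrix A = A0 + u v^T.
A = [[A0[i][j] + u[i] * v[j] for j in range(3)] for i in range(3)]
residual = max(abs(dot(A[i], x) - b[i]) for i in range(3))
```

    Here A0 is diagonal, so each solve with A0 is trivial. That is the appeal of building a factorization from a simple starting matrix plus rank-one corrections, although the recursion the papers actually use is more elaborate than this single step.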

    Low-rank updates of balanced incomplete factorization preconditioners

    [EN] Let Ax = b be a large and sparse system of linear equations where A is a nonsingular matrix. An approximate solution is frequently obtained by applying preconditioned iterations. Consider the matrix B = A + PQ^T where P, Q ∈ R^{n×k} are full rank matrices. In this work, we study the problem of updating a previously computed preconditioner for A in order to solve the updated linear system Bx = b by preconditioned iterations. In particular, we propose a method for updating a Balanced Incomplete Factorization preconditioner. The strategy is based on the computation of an approximate Inverse Sherman-Morrison decomposition for an equivalent augmented linear system. Approximation properties of the preconditioned matrix and an analysis of the computational cost of the algorithm are studied.

    Federated Knowledge Graph Completion via Latent Embedding Sharing and Tensor Factorization

    Knowledge graphs (KGs), which consist of triples, are inherently incomplete and always require a completion procedure to predict missing triples. In real-world scenarios, KGs are distributed across clients, complicating completion tasks due to privacy restrictions. Many frameworks have been proposed to address the issue of federated knowledge graph completion. However, the existing frameworks, including FedE, FedR, and FKGE, have certain limitations: FedE poses a risk of information leakage, FedR's optimization efficacy diminishes when there is minimal overlap among relations, and FKGE suffers from computational costs and mode collapse issues. To address these issues, we propose Federated Latent Embedding Sharing Tensor factorization (FLEST), a novel approach that uses federated tensor factorization for KG completion. FLEST decomposes the embedding matrix and enables sharing of latent dictionary embeddings to lower privacy risks. Empirical results demonstrate FLEST's effectiveness and efficiency, offering a balanced solution between performance and privacy. FLEST expands the application of federated tensor factorization in KG completion tasks. Comment: Accepted by ICDM 202

    Updating preconditioners for modified least squares problems

    Principles for problem aggregation and assignment in medium scale multiprocessors

    One of the most important issues in parallel processing is the mapping of workload to processors. This paper considers a large class of problems having a high degree of potential fine-grained parallelism, and execution requirements that are either not predictable or too costly to predict. The main issues in mapping such a problem onto medium-scale multiprocessors are those of aggregation and assignment. We study a method of parameterized aggregation that makes few assumptions about the workload. The mapping of aggregate units of work onto processors is uniform, and exploits locality of workload intensity to balance the unknown workload. In general, a finer aggregate granularity leads to a better balance at the price of increased communication/synchronization costs; the aggregation parameters can be adjusted to find a reasonable granularity. The effectiveness of this scheme is demonstrated on three model problems: an adaptive one-dimensional fluid dynamics problem with message passing, a sparse triangular linear system solver on both a shared-memory and a message-passing machine, and a two-dimensional time-driven battlefield simulation employing message passing. Using the model problems, the tradeoffs between balanced workload and communication/synchronization costs are studied. Finally, an analytical model is used to explain why the method balances workload and minimizes the variance in system behavior.
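
    The granularity tradeoff described above can be made concrete with a toy model. The sketch below assumes that "uniform" mapping means dealing fixed-size blocks of consecutive work items to processors round-robin; the abstract does not specify the mapping, so this choice, the function names, and the synthetic workload are all assumptions made for illustration.

```python
def wrapped_assignment(work, block, procs):
    """Aggregate consecutive work items into blocks of size `block` and
    deal the blocks to `procs` processors round-robin.

    Returns the total load assigned to each processor.
    """
    load = [0.0] * procs
    for start in range(0, len(work), block):
        owner = (start // block) % procs  # uniform, workload-oblivious mapping
        load[owner] += sum(work[start:start + block])
    return load


def imbalance(load):
    """Ratio of the heaviest processor's load to the mean load (1.0 is perfect)."""
    return max(load) / (sum(load) / len(load))


# Hypothetical workload: 64 items whose intensity is concentrated in one
# contiguous region, i.e. neighbouring items have similar cost.
work = [1.0] * 48 + [10.0] * 16

coarse = wrapped_assignment(work, block=16, procs=4)  # 4 blocks in total
fine = wrapped_assignment(work, block=2, procs=4)     # 32 blocks in total

print("coarse:", coarse, "imbalance", imbalance(coarse))
print("fine:  ", fine, "imbalance", imbalance(fine))
```

    With coarse blocks, the expensive region lands on a single processor. With fine blocks, it is spread across all four, at the cost of eight times as many blocks, which stands in here for the extra communication and synchronization the abstract mentions.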
