22 research outputs found

    Parallel two-stage algorithms for solving the PageRank problem

    In this work we present parallel algorithms based on the use of two-stage methods for solving the PageRank problem as a linear system. Different parallel versions of these methods are explored and their convergence properties are analyzed. The parallel implementation has been developed using a mixed MPI/OpenMP model to exploit parallelism beyond a single level. In order to investigate and analyze the proposed parallel algorithms, we have used several realistic large datasets. The numerical results show that the proposed algorithms can speed up the time to converge with respect to the parallel Power algorithm and behave better than other well-known techniques. This research was supported by the Spanish Ministry of Economy and Competitiveness (MINECO) and the European Commission (FEDER funds) under Grant Number TIN2015-66972-C5-4-R.
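
As a rough illustration of the two-stage idea described in this abstract, the following serial sketch solves the PageRank linear system (I - alpha P^T) x = (1 - alpha) v with an outer block splitting and a fixed number of inner Jacobi sweeps, then compares the result with the power method. The four-page graph, the block partition, the choice of Jacobi as inner iteration, and all function names are assumptions made for illustration; the paper's MPI/OpenMP implementation and its specific splittings are not reproduced here.

```python
def build_system(out_links, alpha):
    """Return A = I - alpha * P^T and b = (1 - alpha) * v for a uniform v."""
    n = len(out_links)
    A = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for j, targets in enumerate(out_links):
        for i in targets:
            A[i][j] -= alpha / len(targets)
    b = [(1.0 - alpha) / n] * n
    return A, b


def two_stage(A, b, blocks, inner_steps=2, tol=1e-12, max_outer=1000):
    """Outer block splitting of A with a few inner Jacobi sweeps per block."""
    n = len(b)
    block_of = {i: k for k, block in enumerate(blocks) for i in block}
    x = list(b)
    for outer in range(1, max_outer + 1):
        # Outer stage: move the couplings between blocks to the right-hand side.
        c = [b[i] - sum(A[i][j] * x[j] for j in range(n)
                        if block_of[j] != block_of[i])
             for i in range(n)]
        # Inner stage: approximate the block-diagonal solve with Jacobi sweeps.
        y = list(x)
        for _ in range(inner_steps):
            y = [(c[i] - sum(A[i][j] * y[j] for j in range(n)
                             if j != i and block_of[j] == block_of[i])) / A[i][i]
                 for i in range(n)]
        change = max(abs(yi - xi) for yi, xi in zip(y, x))
        x = y
        if change < tol:
            return x, outer
    return x, max_outer


def power_method(out_links, alpha, steps=500):
    """Reference solution: x <- alpha * P^T x + (1 - alpha) * v, uniform v."""
    n = len(out_links)
    x = [1.0 / n] * n
    for _ in range(steps):
        nxt = [(1.0 - alpha) / n] * n
        for j, targets in enumerate(out_links):
            for i in targets:
                nxt[i] += alpha * x[j] / len(targets)
        x = nxt
    return x


# Hypothetical graph: out_links[j] lists the pages that page j links to.
# Every page has at least one out-link and no self-link.
out_links = [[1, 2], [2], [0, 3], [0]]
alpha = 0.85

A, b = build_system(out_links, alpha)
x_two_stage, outer_iterations = two_stage(A, b, blocks=[[0, 1], [2, 3]])
x_power = power_method(out_links, alpha)

print("two-stage:", [round(value, 6) for value in x_two_stage])
print("power:    ", [round(value, 6) for value in x_power])
print("outer iterations:", outer_iterations)
```

In a parallel setting each block would typically be assigned to a different process, which is why the inner sweeps only touch entries within one block.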

    Steady-state analysis of Google-like stochastic matrices

    Ankara: The Department of Computer Engineering and the Institute of Engineering and Science of Bilkent University, 2007. Thesis (Master's) -- Bilkent University, 2007. Includes bibliographical references (leaves 93-97). Many search engines use a two-step process to retrieve pages related to a user's query from the web. In the first step, traditional text processing is performed to find all pages matching the given query terms. Due to the massive size of the web, this step can result in thousands of retrieved pages. In the second step, many search engines sort the list of retrieved pages according to some ranking criterion to make it manageable for the user. One popular way to create this ranking is to exploit additional information inherent in the web due to its hyperlink structure. One successful and well-publicized link-based ranking system is PageRank, the ranking system used by the Google search engine. The dynamically changing matrices reflecting the hyperlink structure of the web and used by Google in ranking pages are not only very large, but they are also sparse, reducible, stochastic matrices with some zero rows. Ranking pages amounts to solving for the steady-state vectors of linear combinations of these matrices with appropriately chosen rank-1 matrices. The method of choice for this task appears to be the power method. Certain improvements have been obtained using techniques such as quadratic extrapolation and iterative aggregation. In this thesis, we propose iterative methods based on various block partitionings, including those with triangular diagonal blocks obtained using cutsets, for the computation of the steady-state vector of such stochastic matrices. The proposed iterative methods, together with the power and quadratically extrapolated power methods, are coded into a software tool. Experimental results on benchmark matrices show that it is possible to recommend Gauss-Seidel for easier web problems and block Gauss-Seidel with partitionings based on a block upper triangular form in the remaining problems, although it takes about twice as much memory as the quadratically extrapolated power method. Noyan, Gökçe Nil. M.S.

    Geometric Learning on Graph Structured Data

    Graphs provide a ubiquitous and universal data structure that can be applied in many domains such as social networks, biology, chemistry, physics, and computer science. In this thesis we focus on two fundamental paradigms in graph learning: representation learning and similarity learning over graph-structured data. Graph representation learning aims to learn embeddings for nodes by integrating topological and feature information of a graph. Graph similarity learning uses similarity functions that compute the similarity between pairs of graphs in a vector space. We address several challenging issues in these two paradigms, designing powerful yet efficient and theoretically guaranteed machine learning models that can leverage rich topological structural properties of real-world graphs. This thesis is structured into two parts. In the first part of the thesis, we present how to develop powerful Graph Neural Networks (GNNs) for graph representation learning from three different perspectives: (1) spatial GNNs, (2) spectral GNNs, and (3) diffusion GNNs. We discuss the model architecture, representational power, and convergence properties of these GNN models. Specifically, we first study how to develop expressive yet efficient and simple message-passing aggregation schemes that can go beyond the Weisfeiler-Leman test (1-WL). We propose a generalized message-passing framework by incorporating graph structural properties into an aggregation scheme. Then, we introduce a new local isomorphism hierarchy on neighborhood subgraphs. We further develop a novel neural model, namely GraphSNN, and theoretically prove that this model is more expressive than the 1-WL test. After that, we study how to build an effective and efficient graph convolution model with spectral graph filters. In this study, we propose a spectral GNN model, called DFNets, which incorporates a novel spectral graph filter, namely feedback-looped filters. As a result, this model can provide better localization on neighborhoods while achieving fast convergence and linear memory requirements. Finally, we study how to capture the rich topological information of a graph using graph diffusion. We propose a novel GNN architecture with dynamic PageRank, based on a learnable transition matrix. We explore two variants of this GNN architecture, a forward-Euler solution and an invariable feature solution, and theoretically prove that our forward-Euler GNN architecture is guaranteed to converge to a stationary distribution. In the second part of this thesis, we introduce a new optimal transport distance metric on graphs in a regularized learning framework for graph kernels. This optimal transport distance metric can preserve both local and global structures between graphs during the transport, in addition to preserving features and their local variations. Furthermore, we propose two strongly convex regularization terms to theoretically guarantee the convergence and numerical stability in finding an optimal assignment between graphs. One regularization term is used to regularize a Wasserstein distance between graphs in the same ground space. This helps to preserve the local clustering structure on graphs by relaxing the optimal transport problem to be a cluster-to-cluster assignment between locally connected vertices. The other regularization term is used to regularize a Gromov-Wasserstein distance between graphs across different ground spaces based on degree-entropy KL divergence. This helps to improve the matching robustness of an optimal alignment to preserve the global connectivity structure of graphs. We have evaluated our optimal transport-based graph kernel using different benchmark tasks. The experimental results show that our models considerably outperform all the state-of-the-art methods in all benchmark tasks.
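
To make the 1-WL baseline mentioned in this abstract concrete, here is a minimal sketch of Weisfeiler-Leman color refinement. It is not the thesis's GraphSNN model; the function name and the example graphs are assumptions chosen for illustration. The sketch shows a classic pair of non-isomorphic graphs (a 6-cycle and two disjoint triangles) that 1-WL cannot tell apart, which is the kind of limitation more expressive aggregation schemes aim to overcome.

```python
from collections import Counter


def wl_color_histogram(adjacency, rounds=3):
    """Run 1-WL color refinement and return the multiset of final colors.

    Colors are kept as nested tuples rather than hashed integers so that
    histograms computed on different graphs can be compared directly.
    """
    colors = {node: 0 for node in adjacency}
    for _ in range(rounds):
        # New color = own color plus the sorted multiset of neighbor colors.
        colors = {
            node: (colors[node],
                   tuple(sorted(colors[nb] for nb in adjacency[node])))
            for node in adjacency
        }
    return Counter(colors.values())


# Two non-isomorphic graphs in which every node has exactly two neighbors.
hexagon = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
two_triangles = {0: [1, 2], 1: [0, 2], 2: [0, 1],
                 3: [4, 5], 4: [3, 5], 5: [3, 4]}

# A pair that 1-WL does separate: a 3-node path versus a triangle.
path = {0: [1], 1: [0, 2], 2: [1]}
triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}

same_for_regular_pair = (
    wl_color_histogram(hexagon) == wl_color_histogram(two_triangles)
)
same_for_path_and_triangle = (
    wl_color_histogram(path) == wl_color_histogram(triangle)
)

print("hexagon vs two triangles indistinguishable:", same_for_regular_pair)
print("path vs triangle indistinguishable:", same_for_path_and_triangle)
```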

    Resiliency in numerical algorithm design for extreme scale simulations

    This work is based on the seminar titled ‘Resiliency in Numerical Algorithm Design for Extreme Scale Simulations’ held March 1–6, 2020, at Schloss Dagstuhl, that was attended by all the authors. Advanced supercomputing is characterized by very high computation speeds at the cost of involving an enormous amount of resources and costs. A typical large-scale computation running for 48 h on a system consuming 20 MW, as predicted for exascale systems, would consume a million kWh, corresponding to about 100k Euro in energy cost for executing 10^23 floating-point operations. It is clearly unacceptable to lose the whole computation if any of the several million parallel processes fails during the execution. Moreover, if a single operation suffers from a bit-flip error, should the whole computation be declared invalid? What about the notion of reproducibility itself: should this core paradigm of science be revised and refined for results that are obtained by large-scale simulation? Naive versions of conventional resilience techniques will not scale to the exascale regime: with a main memory footprint of tens of Petabytes, synchronously writing checkpoint data all the way to background storage at frequent intervals will create intolerable overheads in runtime and energy consumption. Forecasts show that the mean time between failures could be lower than the time to recover from such a checkpoint, so that large calculations at scale might not make any progress if robust alternatives are not investigated. More advanced resilience techniques must be devised. The key may lie in exploiting both advanced system features as well as specific application knowledge. Research will face two essential questions: (1) what are the reliability requirements for a particular computation and (2) how do we best design the algorithms and software to meet these requirements? While the analysis of use cases can help understand the particular reliability requirements, the construction of remedies is currently wide open. One avenue would be to refine and improve on system- or application-level checkpointing and rollback strategies in the case an error is detected. Developers might use fault notification interfaces and flexible runtime systems to respond to node failures in an application-dependent fashion. Novel numerical algorithms or more stochastic computational approaches may be required to meet accuracy requirements in the face of undetectable soft errors. These ideas constituted an essential topic of the seminar. The goal of this Dagstuhl Seminar was to bring together a diverse group of scientists with expertise in exascale computing to discuss novel ways to make applications resilient against detected and undetected faults. In particular, participants explored the role that algorithms and applications play in the holistic approach needed to tackle this challenge. This article gathers a broad range of perspectives on the role of algorithms, applications and systems in achieving resilience for extreme scale simulations. The ultimate goal is to spark novel ideas and encourage the development of concrete solutions for achieving such resilience holistically. Peer Reviewed. Article signed by 36 authors: Emmanuel Agullo, Mirco Altenbernd, Hartwig Anzt, Leonardo Bautista-Gomez, Tommaso Benacchio, Luca Bonaventura, Hans-Joachim Bungartz, Sanjay Chatterjee, Florina M. Ciorba, Nathan DeBardeleben, Daniel Drzisga, Sebastian Eibl, Christian Engelmann, Wilfried N. Gansterer, Luc Giraud, Dominik Göddeke, Marco Heisig, Fabienne Jezequel, Nils Kohl, Xiaoye Sherry Li, Romain Lion, Miriam Mehl, Paul Mycek, Michael Obersteiner, Enrique S. Quintana-Ortiz, Francesco Rizzi, Ulrich Rude, Martin Schulz, Fred Fung, Robert Speck, Linda Stals, Keita Teranishi, Samuel Thibault, Dominik Thonnes, Andreas Wagner and Barbara Wohlmuth. Postprint (author's final draft)
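
The energy and operation-count figures quoted in this abstract can be checked with a few lines of arithmetic. The sketch below uses the 48 h runtime and 20 MW power draw from the text; the electricity price of 0.10 EUR per kWh and the sustained rate of 10^18 floating-point operations per second are assumptions introduced here to reproduce the quoted orders of magnitude, not values stated by the authors.

```python
# Figures quoted in the abstract.
power_mw = 20          # system power draw in megawatts
runtime_h = 48         # wall-clock duration in hours

# Assumptions for illustration only (not stated in the abstract).
price_eur_per_kwh = 0.10          # assumed electricity price
sustained_flops_per_s = 10**18    # assumed sustained exascale rate

energy_kwh = power_mw * 1000 * runtime_h            # MW -> kW, times hours
cost_eur = energy_kwh * price_eur_per_kwh
total_flops = sustained_flops_per_s * runtime_h * 3600

print(f"energy: {energy_kwh:,} kWh")
print(f"cost:   {cost_eur:,.0f} EUR")
print(f"work:   {total_flops:.3e} floating-point operations")
```

Under these assumptions the energy is just under a million kWh, the cost is close to 100k Euro, and the operation count is on the order of 10^23, in line with the abstract.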

    Author index to volumes 301–400

    Parametric controllability of the personalized PageRank: Classic model vs biplex approach

    Measures of centrality in networks defined by means of matrix algebra, like PageRank-type centralities, have been used for over 70 years. Recently, new extensions of PageRank have been formulated and may include a personalization (or teleportation) vector. It is accepted that one of the key issues for any centrality measure formulation is to what extent someone can control its variability. In this paper, we compare the limits of variability of two centrality measures for complex networks that we call classic PageRank (PR) and biplex approach PageRank (BPR). Both centrality measures depend on the so-called damping parameter alpha that controls the quantity of teleportation. Our first result is that the intersection of the intervals of variation of both centrality measures is always a nonempty set. Our second result is that when alpha is lower than 0.48 (and, therefore, the ranking is highly affected by teleportation effects), the upper limits of PR are more controllable than the upper limits of BPR; on the contrary, when alpha is greater than 0.5 (and we recall that the usual PageRank algorithm uses the value 0.85), the upper limits of PR are less controllable than the upper limits of BPR, provided certain mild assumptions on the local structure of the graph hold. Regarding the lower limits of variability, we give a result for small values of alpha. We illustrate the results with some analytical networks and also with a real Facebook network. This work has been partially supported by the Spanish Ministry of Science, Innovation and Universities under Project Nos. PGC2018-101625-B-I00, MTM2016-76808-P, and MTM2017-84194-P (AEI/FEDER, UE). Flores, J.; García, E.; Pedroche Sánchez, F.; Romance, M. (2020). Parametric controllability of the personalized PageRank: Classic model vs biplex approach. Chaos: An Interdisciplinary Journal of Nonlinear Science. 30(2):1-15.
    https://doi.org/10.1063/1.5128567
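
The notion of "limits of variability" in this abstract can be illustrated for the classic model only. Because classic personalized PageRank depends linearly on the personalization vector v, the score of each node for any probability vector v is a weighted average of the scores obtained when all teleportation goes to a single node, so it lies between the smallest and largest of those scores. The sketch below checks this on a small hypothetical graph; the graph, the function names, and the iteration count are assumptions, and the biplex approach (BPR) and the paper's specific bounds are not reproduced.

```python
def personalized_pagerank(out_links, v, alpha=0.85, steps=300):
    """Iterate x <- alpha * P^T x + (1 - alpha) * v for a fixed step count."""
    x = list(v)
    for _ in range(steps):
        nxt = [(1.0 - alpha) * vi for vi in v]
        for j, targets in enumerate(out_links):
            share = alpha * x[j] / len(targets)
            for i in targets:
                nxt[i] += share
        x = nxt
    return x


def variability_bounds(out_links, alpha=0.85):
    """Per-node minimum and maximum score over single-node personalizations."""
    n = len(out_links)
    extremes = [
        personalized_pagerank(
            out_links, [1.0 if i == k else 0.0 for i in range(n)], alpha)
        for k in range(n)
    ]
    lower = [min(scores[j] for scores in extremes) for j in range(n)]
    upper = [max(scores[j] for scores in extremes) for j in range(n)]
    return lower, upper


# Hypothetical graph without dangling nodes: out_links[j] = pages j links to.
out_links = [[1, 2], [2], [0, 3], [0]]
alpha = 0.85

lower, upper = variability_bounds(out_links, alpha)
uniform = personalized_pagerank(out_links, [0.25] * 4, alpha)
skewed = personalized_pagerank(out_links, [0.7, 0.1, 0.1, 0.1], alpha)

for node in range(4):
    print(node, round(lower[node], 4), round(uniform[node], 4),
          round(skewed[node], 4), round(upper[node], 4))
```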

    Acceleration Methods

    This monograph covers some recent advances in a range of acceleration techniques frequently used in convex optimization. We first use quadratic optimization problems to introduce two key families of methods, namely momentum and nested optimization schemes. They coincide in the quadratic case to form the Chebyshev method. We discuss momentum methods in detail, starting with the seminal work of Nesterov, and structure convergence proofs using a few master templates, such as that for optimized gradient methods, which provide the key benefit of showing how momentum methods optimize convergence guarantees. We further cover proximal acceleration, at the heart of the Catalyst and Accelerated Hybrid Proximal Extragradient frameworks, using similar algorithmic patterns. Common acceleration techniques rely directly on the knowledge of some of the regularity parameters in the problem at hand. We conclude by discussing restart schemes, a set of simple techniques for reaching nearly optimal convergence rates while adapting to unobserved regularity parameters. Comment: Published in Foundations and Trends in Optimization (see https://www.nowpublishers.com/article/Details/OPT-036)
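
As a small illustration of why momentum helps on quadratic problems, the sketch below compares plain gradient descent with Polyak's heavy-ball method on a diagonal quadratic. Heavy-ball is used here as one familiar momentum scheme; it is not claimed to be the monograph's presentation of the Chebyshev method. Both methods are tuned with the smallest and largest curvature values (the regularity parameters), and the test problem, step counts, and function names are assumptions chosen for illustration.

```python
import math


def gradient(x, eigs):
    """Gradient of f(x) = 0.5 * sum(eigs[i] * x[i]**2), minimized at x = 0."""
    return [lam * xi for lam, xi in zip(eigs, x)]


def gradient_descent(x0, eigs, steps):
    mu, L = min(eigs), max(eigs)
    step = 2.0 / (L + mu)  # classical step size for a quadratic
    x = list(x0)
    for _ in range(steps):
        g = gradient(x, eigs)
        x = [xi - step * gi for xi, gi in zip(x, g)]
    return x


def heavy_ball(x0, eigs, steps):
    mu, L = min(eigs), max(eigs)
    root_l, root_mu = math.sqrt(L), math.sqrt(mu)
    step = 4.0 / (root_l + root_mu) ** 2
    beta = ((root_l - root_mu) / (root_l + root_mu)) ** 2  # momentum weight
    prev, x = list(x0), list(x0)
    for _ in range(steps):
        g = gradient(x, eigs)
        nxt = [xi - step * gi + beta * (xi - pi)
               for xi, gi, pi in zip(x, g, prev)]
        prev, x = x, nxt
    return x


def norm(x):
    return math.sqrt(sum(xi * xi for xi in x))


# Hypothetical test problem with condition number L / mu = 100.
eigs = [1.0, 10.0, 100.0]
x0 = [1.0, 1.0, 1.0]

gd_err = norm(gradient_descent(x0, eigs, 200))
hb_err = norm(heavy_ball(x0, eigs, 200))

print("gradient descent distance to optimum:", gd_err)
print("heavy-ball distance to optimum:      ", hb_err)
```

Both step rules need the curvature bounds mu and L in advance, which is the dependence on regularity parameters that restart schemes are meant to relax.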

    Algebraic Multigrid for Markov Chains and Tensor Decomposition

    The majority of this thesis is concerned with the development of efficient and robust numerical methods based on adaptive algebraic multigrid to compute the stationary distribution of Markov chains. It is shown that classical algebraic multigrid techniques can be applied in an exact interpolation scheme framework to compute the stationary distribution of irreducible, homogeneous Markov chains. A quantitative analysis shows that algebraically smooth multiplicative error is locally constant along strong connections in a scaled system operator, which suggests that classical algebraic multigrid coarsening and interpolation can be applied to the class of nonsymmetric irreducible singular M-matrices with zero column sums. Acceleration schemes based on fine-level iterant recombination and over-correction of the coarse-grid correction are developed to improve the rate of convergence and scalability of simple adaptive aggregation multigrid methods for Markov chains. Numerical tests over a wide range of challenging nonsymmetric test problems demonstrate the effectiveness of the proposed multilevel method and the acceleration schemes. This thesis also investigates the application of adaptive algebraic multigrid techniques for computing the canonical decomposition of higher-order tensors. The canonical decomposition is formulated as a least squares optimization problem, for which local minimizers are computed by solving the first-order optimality equations. The proposed multilevel method consists of two phases: an adaptive setup phase that uses a multiplicative correction scheme in conjunction with bootstrap algebraic multigrid interpolation to build the necessary operators on each level, and a solve phase that uses additive correction cycles based on the full approximation scheme to efficiently obtain an accurate solution. The alternating least squares method, which is a standard one-level iterative method for computing the canonical decomposition, is used as the relaxation scheme. Numerical tests show that for certain test problems arising from the discretization of high-dimensional partial differential equations on regular lattices, the proposed multilevel method significantly outperforms the standard alternating least squares method when a high level of accuracy is required.