69 research outputs found

    Supremum-Norm Convergence for Step-Asynchronous Successive Overrelaxation on M-matrices

    Step-asynchronous successive overrelaxation updates the values contained in a single vector using the usual Gauss-Seidel-like weighted rule, but arbitrarily mixing old and new values, the only constraint being temporal coherence: you cannot use a value before it has been computed. We show that given a nonnegative real matrix $A$, a $\sigma\geq\rho(A)$ and a vector $\boldsymbol w>0$ such that $A\boldsymbol w\leq\sigma\boldsymbol w$, every iteration of step-asynchronous successive overrelaxation for the problem $(sI-A)\boldsymbol x=\boldsymbol b$, with $s>\sigma$, reduces geometrically the $\boldsymbol w$-norm of the current error by a factor that we can compute explicitly. Then, we show that given a $\sigma>\rho(A)$ it is in principle always possible to compute such a $\boldsymbol w$. This property makes it possible to estimate the supremum norm of the absolute error at each iteration without any additional hypothesis on $A$, even when $A$ is so large that computing the product $A\boldsymbol x$ is feasible, but estimating the supremum norm of $(sI-A)^{-1}$ is not.
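
    The weighted update rule described in this abstract can be illustrated with a minimal sketch. It runs plain sequential sweeps (one admissible ordering, since each update only uses values already computed) for $(sI-A)\boldsymbol x=\boldsymbol b$ and tracks the $\boldsymbol w$-norm of the error, taken here as $\max_i |e_i|/w_i$. The 3x3 matrix, the choice $s=2$, the relaxation weight `omega=1.0`, and the function names are illustrative assumptions, not taken from the paper, and the paper's explicit reduction factor is not reproduced.

```python
def w_norm(v, w):
    """Weighted supremum norm: max_i |v_i| / w_i (w must be positive)."""
    return max(abs(vi) / wi for vi, wi in zip(v, w))


def sor_sweep(A, b, x, s, omega=1.0):
    """One in-place sequential SOR sweep for (s*I - A) x = b.

    Each coordinate uses the newest available values of the others,
    which respects temporal coherence.
    """
    n = len(b)
    for i in range(n):
        off_diag = sum(A[i][j] * x[j] for j in range(n) if j != i)
        gauss_seidel_value = (b[i] + off_diag) / (s - A[i][i])
        x[i] = (1.0 - omega) * x[i] + omega * gauss_seidel_value
    return x


# Hypothetical toy problem: A is nonnegative with spectral radius sqrt(2).
A = [[0.0, 1.0, 0.0],
     [1.0, 0.0, 1.0],
     [0.0, 1.0, 0.0]]
sigma = 2 ** 0.5
w = [1.0, 2 ** 0.5, 1.0]   # satisfies A w = sigma * w, so A w <= sigma * w
s = 2.0                    # s > sigma

x_true = [1.0, -2.0, 3.0]
b = [s * x_true[i] - sum(A[i][j] * x_true[j] for j in range(3))
     for i in range(3)]

x = [0.0, 0.0, 0.0]
errors = []
for _ in range(60):
    errors.append(w_norm([xi - ti for xi, ti in zip(x, x_true)], w))
    sor_sweep(A, b, x, s)
errors.append(w_norm([xi - ti for xi, ti in zip(x, x_true)], w))

for k in (0, 1, 2, 5, 10):
    print(f"sweep {k:2d}: w-norm of error = {errors[k]:.3e}")
```

    In this toy run the true solution is known, so the error can be measured directly; the point of the result above is that a bound on it is available even when the solution is not.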

    Tensor Spectral Clustering for Partitioning Higher-order Network Structures

    Spectral graph theory-based methods represent an important class of tools for studying the structure of networks. Spectral methods are based on a first-order Markov chain derived from a random walk on the graph and thus they cannot take advantage of important higher-order network substructures such as triangles, cycles, and feed-forward loops. Here we propose a Tensor Spectral Clustering (TSC) algorithm that allows for modeling higher-order network structures in a graph partitioning framework. Our TSC algorithm allows the user to specify which higher-order network structures (cycles, feed-forward loops, etc.) should be preserved by the network clustering. Higher-order network structures of interest are represented using a tensor, which we then partition by developing a multilinear spectral method. Our framework can be applied to discovering layered flows in networks as well as graph anomaly detection, which we illustrate on synthetic networks. In directed networks, a higher-order structure of particular interest is the directed 3-cycle, which captures feedback loops in networks. We demonstrate that our TSC algorithm produces large partitions that cut fewer directed 3-cycles than standard spectral clustering algorithms. Comment: SDM 201
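
    The evaluation criterion mentioned in this abstract, the number of directed 3-cycles cut by a partition, can be sketched as follows. This is not the TSC algorithm itself, only a brute-force counter for small graphs; the edge list, the partitions, and the function names are hypothetical.

```python
from itertools import permutations


def directed_3_cycles(edges):
    """Return each directed 3-cycle a->b->c->a once, as a tuple with a smallest."""
    edge_set = set(edges)
    nodes = {u for edge in edges for u in edge}
    cycles = set()
    for a, b, c in permutations(nodes, 3):
        # Requiring a to be the smallest node removes the three rotations
        # of the same cycle while keeping opposite orientations distinct.
        if a < b and a < c:
            if (a, b) in edge_set and (b, c) in edge_set and (c, a) in edge_set:
                cycles.add((a, b, c))
    return cycles


def cut_3_cycles(cycles, part_of):
    """Count cycles whose three nodes do not all lie in the same part."""
    return sum(1 for cyc in cycles if len({part_of[v] for v in cyc}) > 1)


# Hypothetical toy graph: two 3-cycles joined by a third, overlapping one.
edges = [(0, 1), (1, 2), (2, 0),
         (3, 4), (4, 5), (5, 3),
         (2, 3), (3, 1)]
cycles = directed_3_cycles(edges)

balanced = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
skewed = {0: "A", 1: "A", 2: "B", 3: "B", 4: "B", 5: "B"}

print(len(cycles), cut_3_cycles(cycles, balanced), cut_3_cycles(cycles, skewed))
```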

    Virtual web for PageRank computing

    Dr. Xinhua Zhuang, Dissertation Supervisor. Includes vita. Field of study: Computer science. "May 2018." The enormous size and fast-evolving nature of the World Wide Web demand ever more efficient PageRank updating algorithms. Web evolution involves two kinds of change: (1) link structure modification; (2) page insertion/deletion. When the web evolution is restricted to link insertion/deletion, we demonstrate, theoretically and experimentally, the benefit of using the previous PageRank to initialize the current PageRank computation. When page insertion/deletion occurs, how to effectively use the previous PageRank information to facilitate the current PageRank computation has long been a challenge. To tackle the general case, a so-called "virtual web" is introduced by adding the inserted nodes to the previous web along with some specific "in-home" link structure, where in-links from the previous web and out-links to the previous web are excluded. Through the virtual web, we are able to work out a virtual initialization, which can be efficiently used to calculate the current PageRank. The introduced virtual initialization is "unbiased" in that it assumes the least under the available knowledge. The virtual web is then integrated with the Power-Iteration and Gauss-Southwell methods to solve the node insertion/deletion problem; the resulting methods are named the Virtual Web Power-Iteration (VWPI) method and the Virtual Web Gauss-Southwell (VWGS) method, respectively. Further, we propose an optimized approach based on the VWGS method for updating node insertions. The experimental results show that the VWGS algorithm significantly outperforms the conventional PageRank computation based on the original model.
On the Twitter-2010 dataset with 42M nodes and 1.5B edges, for a perturbation of 400k node and 14 million link insertions plus deletions at one time, our algorithm is about 20 times faster in number of iterations and 3 times faster in running time than the Gauss-Southwell method starting from scratch. On the soc-LiveJournal dataset with up to 20% node insertion, the optimized VWGS method achieved a further 28% gain compared to the original VWGS method. Compared with the prior work of Ohsaka et al. [32], our method is 1800x faster per link insertion/deletion on the Twitter-2010 dataset under a similar experimental environment. Includes bibliographical references (pages 90-93)
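
    The link-only case described in this abstract, reusing the previous PageRank to initialize the current computation, can be sketched with an ordinary power iteration. This does not implement the virtual web, VWPI, or VWGS; the four-page graph, the damping factor `alpha=0.85`, the tolerance, and the function name are illustrative assumptions.

```python
def pagerank(out_links, alpha=0.85, init=None, tol=1e-10, max_iter=1000):
    """Power iteration for PageRank; returns (rank dict, iterations used).

    out_links maps every node to the list of nodes it links to.
    init, if given, is a starting distribution over the same nodes.
    """
    nodes = sorted(out_links)
    n = len(nodes)
    rank = dict(init) if init is not None else {v: 1.0 / n for v in nodes}
    for iteration in range(1, max_iter + 1):
        # Mass on pages without out-links is spread uniformly.
        dangling = sum(rank[v] for v in nodes if not out_links[v])
        new_rank = {v: (1.0 - alpha) / n + alpha * dangling / n for v in nodes}
        for v in nodes:
            for u in out_links[v]:
                new_rank[u] += alpha * rank[v] / len(out_links[v])
        change = sum(abs(new_rank[v] - rank[v]) for v in nodes)
        rank = new_rank
        if change < tol:
            return rank, iteration
    return rank, max_iter


# Hypothetical web before and after inserting the single link b -> d.
old_web = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
new_web = {"a": ["b", "c"], "b": ["c", "d"], "c": ["a"], "d": ["c"]}

old_rank, _ = pagerank(old_web)
cold_rank, cold_iters = pagerank(new_web)                 # uniform start
warm_rank, warm_iters = pagerank(new_web, init=old_rank)  # previous PageRank

print("iterations from uniform start: ", cold_iters)
print("iterations from previous ranks:", warm_iters)
```

    Both runs converge to the same vector; only the starting point, and hence the iteration count, differs. Page insertion or deletion changes the node set, so `old_rank` cannot be passed in directly, which is the gap the virtual initialization is meant to fill.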

    Accurate Measures of Vaccination and Concerns of Vaccine Holdouts from Web Search Logs

    To design effective vaccine policies, policymakers need detailed data about who has been vaccinated, who is holding out, and why. However, existing data in the US are insufficient: reported vaccination rates are often delayed or missing, and surveys of vaccine hesitancy are limited by high-level questions and self-report biases. Here, we show how large-scale search engine logs and machine learning can be leveraged to fill these gaps and provide novel insights about vaccine intentions and behaviors. First, we develop a vaccine intent classifier that can accurately detect when a user is seeking the COVID-19 vaccine on search. Our classifier demonstrates strong agreement with CDC vaccination rates, with correlations above 0.86, and estimates vaccine intent rates to the level of ZIP codes in real time, allowing us to pinpoint more granular trends in vaccine seeking across regions, demographics, and time. To investigate vaccine hesitancy, we use our classifier to identify two groups, vaccine early adopters and vaccine holdouts. We find that holdouts, compared to early adopters matched on covariates, are 69% more likely to click on untrusted news sites. Furthermore, we organize 25,000 vaccine-related URLs into a hierarchical ontology of vaccine concerns, and we find that holdouts are far more concerned about vaccine requirements, vaccine development and approval, and vaccine myths, and even within holdouts, concerns vary significantly across demographic groups. Finally, we explore the temporal dynamics of vaccine concerns and vaccine seeking, and find that key indicators emerge when individuals convert from holding out to preparing to accept the vaccine

    A method to evaluate the reliability of social media data for social network analysis

    In order to study the effects of Online Social Network (OSN) activity on real-world offline events, researchers need access to OSN data, the reliability of which has particular implications for social network analysis. This relates not only to the completeness of any collected dataset, but also to constructing meaningful social and information networks from them. In this multidisciplinary study, we consider the question of constructing traditional social networks from OSN data and then present a measurement case study showing how the reliability of OSN data affects social network analyses. To this end we developed a systematic comparison methodology, which we applied to two parallel datasets we collected from Twitter. We found considerable differences in datasets collected with different tools and that these variations significantly alter the results of subsequent analyses. Our results lead to a set of guidelines for researchers planning to collect online data streams to infer social networks. Derek Weber, Mehwish Nasim, Lewis Mitchell, Lucia Falzo

    Exploring Algorithmic Literacy for College Students: An Educator’s Roadmap

    Research shows that college students are largely unaware of the impact of algorithms on their everyday lives. Also, most university students are not being taught about algorithms as part of the regular curriculum. This exploratory, qualitative study aimed to explore subject-matter experts’ insights and perceptions of the knowledge components, coping behaviors, and pedagogical considerations to aid faculty in teaching algorithmic literacy to college students. Eleven individual, semi-structured interviews and one focus group were conducted with scholars and teachers of critical algorithm studies and related fields. Findings suggested three sets of knowledge components that would contribute to students’ algorithmic literacy: general characteristics and distinguishing traits of algorithms, key domains in everyday life using algorithms (including the potential benefits and risks), and ethical considerations for the use and application of algorithms. Findings also suggested five behaviors that students could use to help them better cope with algorithmic systems and nine teaching strategies to help improve students’ algorithmic literacy. Suggestions also surfaced for alternative forms of assessment, potential placement in the curriculum, and how to distinguish between basic algorithmic awareness compared to algorithmic literacy. Recommendations for expanding on the current Association of College and Research Libraries’ Framework for Information Literacy for Higher Education (2016) to more explicitly include algorithmic literacy were presented