69 research outputs found
Supremum-Norm Convergence for Step-Asynchronous Successive Overrelaxation on M-matrices
Step-asynchronous successive overrelaxation updates the values contained in a
single vector using the usual Gauss-Seidel-like weighted rule, but arbitrarily
mixing old and new values, the only constraint being temporal coherence: you
cannot use a value before it has been computed. We show that given a
nonnegative real matrix $M$, a $\sigma \geq \rho(M)$ and a vector $w > 0$ such that $Mw \leq \sigma w$, every iteration of
step-asynchronous successive overrelaxation for the problem $(sI - M)x = b$, with $s > \sigma$, reduces geometrically the $w$-norm of the current error by a factor that we can compute explicitly. Then,
we show that given a $\sigma > \rho(M)$ it is in principle always possible to
compute such a $w$. This property makes it possible to estimate the
supremum norm of the absolute error at each iteration without any additional
hypothesis on $M$, even when $M$ is so large that computing the product
$Mx$ is feasible, but estimating the supremum norm of
$(sI - M)^{-1}$ is not.
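As a rough illustration of the kind of iteration the abstract above discusses, the following sketch runs plain in-place (Gauss-Seidel-ordered) successive overrelaxation, which is only the simplest special case of a step-asynchronous schedule. It assumes a system of the form (sI - M)x = b with M nonnegative and s larger than its spectral radius, and it tracks the error in a weighted supremum norm. The names `sor_sweep` and `w_norm` and the toy matrix are hypothetical and are not taken from the paper; the paper's explicit contraction factor is not reproduced here.

```python
def sor_sweep(M, s, b, x, omega=1.0):
    """One in-place sweep of SOR for (s*I - M) x = b; omega=1 is Gauss-Seidel."""
    n = len(b)
    for i in range(n):
        # Row i of (s*I - M) has diagonal s - M[i][i] and off-diagonal entries -M[i][j].
        off_diag = sum(-M[i][j] * x[j] for j in range(n) if j != i)
        gauss_seidel_value = (b[i] - off_diag) / (s - M[i][i])
        # Weighted mix of the old value and the Gauss-Seidel update.
        x[i] = (1.0 - omega) * x[i] + omega * gauss_seidel_value
    return x


def w_norm(v, w):
    """Weighted supremum norm: max_i |v_i| / w_i, for a positive vector w."""
    return max(abs(vi) / wi for vi, wi in zip(v, w))


# Toy instance (hypothetical): M has spectral radius 1 and M w = w for w = (1, 1).
M = [[0.0, 1.0], [1.0, 0.0]]
w = [1.0, 1.0]
s = 2.0
x_true = [1.0, 2.0]
b = [s * x_true[i] - sum(M[i][j] * x_true[j] for j in range(2)) for i in range(2)]

x = [0.0, 0.0]
errors = [w_norm([xi - ti for xi, ti in zip(x, x_true)], w)]
for _ in range(10):
    sor_sweep(M, s, b, x)
    errors.append(w_norm([xi - ti for xi, ti in zip(x, x_true)], w))
print(errors)  # the weighted error shrinks at every sweep on this instance
```

On this small instance the exact solution is known, so the error can be measured directly; the point of the abstract is that a suitable weight vector allows the error to be bounded when the solution is not known.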
Tensor Spectral Clustering for Partitioning Higher-order Network Structures
Spectral graph theory-based methods represent an important class of tools for
studying the structure of networks. Spectral methods are based on a first-order
Markov chain derived from a random walk on the graph and thus they cannot take
advantage of important higher-order network substructures such as triangles,
cycles, and feed-forward loops. Here we propose a Tensor Spectral Clustering
(TSC) algorithm that allows for modeling higher-order network structures in a
graph partitioning framework. Our TSC algorithm allows the user to specify
which higher-order network structures (cycles, feed-forward loops, etc.) should
be preserved by the network clustering. Higher-order network structures of
interest are represented using a tensor, which we then partition by developing
a multilinear spectral method. Our framework can be applied to discovering
layered flows in networks as well as graph anomaly detection, which we
illustrate on synthetic networks. In directed networks, a higher-order
structure of particular interest is the directed 3-cycle, which captures
feedback loops in networks. We demonstrate that our TSC algorithm produces
large partitions that cut fewer directed 3-cycles than standard spectral
clustering algorithms. Comment: SDM 201
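The quantity used for comparison in the abstract above, the number of directed 3-cycles cut by a partition, can be sketched as follows. This is not the TSC algorithm itself, only a minimal way to score a given partition; the function names and the toy graph are hypothetical.

```python
def directed_3cycles(edges):
    """Return all directed 3-cycles u->v->w->u, each as a tuple rotated so
    that its smallest node comes first (so every cycle is counted once)."""
    succ = {}
    for u, v in edges:
        succ.setdefault(u, set()).add(v)
    cycles = set()
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        for w in succ.get(v, ()):
            if w != u and w != v and u in succ.get(w, ()):
                k = min(u, v, w)
                if k == u:
                    cycles.add((u, v, w))
                elif k == v:
                    cycles.add((v, w, u))
                else:
                    cycles.add((w, u, v))
    return cycles


def cut_3cycles(edges, part):
    """Count directed 3-cycles whose nodes do not all lie in the same part."""
    return sum(1 for c in directed_3cycles(edges)
               if len({part[n] for n in c}) > 1)


# Toy graph (hypothetical): two 3-cycles plus one 3-cycle bridging them.
edges = [(0, 1), (1, 2), (2, 0),      # cycle 0 -> 1 -> 2 -> 0
         (3, 4), (4, 5), (5, 3),      # cycle 3 -> 4 -> 5 -> 3
         (0, 2), (2, 3), (3, 0)]      # adds cycle 0 -> 2 -> 3 -> 0
part_a = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}
part_b = {0: 0, 1: 0, 2: 1, 3: 1, 4: 1, 5: 1}
print(cut_3cycles(edges, part_a), cut_3cycles(edges, part_b))
```

A partition that keeps each triangle on one side cuts only the bridging cycle, while shifting node 2 to the other side also cuts the first triangle.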
Virtual web for PageRank computing
Dr. Xinhua Zhuang, Dissertation Supervisor. Includes vita. Field of study: Computer science. "May 2018." The enormous size and fast-evolving nature of the World Wide Web demand ever more efficient PageRank updating algorithms. Web evolution is of two kinds: (1) link structure modification; (2) page insertion/deletion. When the web evolution is restricted to link insertion/deletion, we demonstrate, theoretically and experimentally, the benefit of using the previous PageRank to initialize the current PageRank computation. When page insertion/deletion occurs, how to effectively use the previous PageRank information to facilitate the current PageRank computation has long been a challenge. To tackle the general case, a so-called "virtual web" is introduced by adding the inserted nodes to the previous web along with some specific "in-home" link structure, where in-links from the previous web and out-links to the previous web are excluded. Through the virtual web, we are able to work out a virtual initialization, which can be efficiently used to calculate the current PageRank. The introduced virtual initialization is "unbiased" in that it assumes the least under the available knowledge. The virtual web is then integrated with the Power-Iteration and Gauss-Southwell methods to solve the node insertion/deletion problem; the resulting methods are named the Virtual Web Power-Iteration (VWPI) method and the Virtual Web Gauss-Southwell (VWGS) method, respectively. Further, we propose an optimized approach based on the VWGS method for updating node insertions. The experimental results show that the VWGS algorithm significantly outperforms the conventional PageRank computation based on the original model.
On the Twitter-2010 dataset with 42M nodes and 1.5B edges, for a perturbation of 400k node and 14 million link insertions plus deletions at one time, our algorithm is about 20 times faster in number of iterations and 3 times faster in running time than the Gauss-Southwell method starting from scratch. On the soc-LiveJournal dataset with up to 20% node insertion, the optimized VWGS method achieves a further 28% gain over the original VWGS method. Compared with the prior work of Ohsaka et al. [32], our method is 1800x faster per link insertion/deletion on the Twitter-2010 dataset under a similar experimental environment. Includes bibliographical references (pages 90-93).
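The link-insertion case described in the abstract above (reusing the previous PageRank vector as the starting point) can be illustrated with a small power-iteration sketch. It covers only warm starting after a link insertion; it does not implement the virtual web, VWPI, or VWGS, whose details are not given here. The function name `pagerank`, the damping value, the tolerance, and the toy graph are all assumptions made for illustration.

```python
def pagerank(out_links, alpha=0.85, init=None, tol=1e-10, max_iter=1000):
    """Power iteration for PageRank with uniform teleportation.
    Returns (rank dict, number of iterations used)."""
    nodes = sorted(out_links)
    n = len(nodes)
    p = dict(init) if init is not None else {v: 1.0 / n for v in nodes}
    for it in range(1, max_iter + 1):
        nxt = {v: (1.0 - alpha) / n for v in nodes}
        for u in nodes:
            # A node without out-links spreads its rank uniformly.
            targets = out_links[u] or nodes
            share = alpha * p[u] / len(targets)
            for v in targets:
                nxt[v] += share
        delta = sum(abs(nxt[v] - p[v]) for v in nodes)  # L1 change
        p = nxt
        if delta < tol:
            return p, it
    return p, max_iter


# Toy graph (hypothetical): nodes 1..5 link to hub 0, which links to node 1.
old_graph = {0: [1], 1: [0], 2: [0], 3: [0], 4: [0], 5: [0]}
old_rank, _ = pagerank(old_graph)

# Insert one link 5 -> 1, then recompute from scratch and from the old ranks.
new_graph = dict(old_graph)
new_graph[5] = [0, 1]
cold_rank, cold_iters = pagerank(new_graph)
warm_rank, warm_iters = pagerank(new_graph, init=old_rank)
print(cold_iters, warm_iters)
```

On this instance the warm start reaches the same tolerance in fewer iterations because the previous ranks are already close to the new solution; how large the saving is on real graphs depends on the size of the perturbation.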
Accurate Measures of Vaccination and Concerns of Vaccine Holdouts from Web Search Logs
To design effective vaccine policies, policymakers need detailed data about
who has been vaccinated, who is holding out, and why. However, existing data in
the US are insufficient: reported vaccination rates are often delayed or
missing, and surveys of vaccine hesitancy are limited by high-level questions
and self-report biases. Here, we show how large-scale search engine logs and
machine learning can be leveraged to fill these gaps and provide novel insights
about vaccine intentions and behaviors. First, we develop a vaccine intent
classifier that can accurately detect when a user is seeking the COVID-19
vaccine on search. Our classifier demonstrates strong agreement with CDC
vaccination rates, with correlations above 0.86, and estimates vaccine intent
rates to the level of ZIP codes in real time, allowing us to pinpoint more
granular trends in vaccine seeking across regions, demographics, and time. To
investigate vaccine hesitancy, we use our classifier to identify two groups,
vaccine early adopters and vaccine holdouts. We find that holdouts, compared to
early adopters matched on covariates, are 69% more likely to click on untrusted
news sites. Furthermore, we organize 25,000 vaccine-related URLs into a
hierarchical ontology of vaccine concerns, and we find that holdouts are far
more concerned about vaccine requirements, vaccine development and approval,
and vaccine myths, and even within holdouts, concerns vary significantly across
demographic groups. Finally, we explore the temporal dynamics of vaccine
concerns and vaccine seeking, and find that key indicators emerge when
individuals convert from holding out to preparing to accept the vaccine.
A method to evaluate the reliability of social media data for social network analysis
To study the effects of Online Social Network (OSN) activity on real-world offline events, researchers need access to OSN data, the reliability of which has particular implications for social network analysis. This relates not only to the completeness of any collected dataset, but also to constructing meaningful social and information networks from it. In this multidisciplinary study, we consider the question of constructing traditional social networks from OSN data and then present a measurement case study showing how the reliability of OSN data affects social network analyses. To this end we developed a systematic comparison methodology, which we applied to two parallel datasets we collected from Twitter. We found considerable differences between datasets collected with different tools, and found that these variations significantly alter the results of subsequent analyses. Our results lead to a set of guidelines for researchers planning to collect online data streams to infer social networks. Derek Weber, Mehwish Nasim, Lewis Mitchell, Lucia Falzo
Exploring Algorithmic Literacy for College Students: An Educator’s Roadmap
Research shows that college students are largely unaware of the impact of algorithms on their everyday lives. Also, most university students are not being taught about algorithms as part of the regular curriculum. This exploratory, qualitative study examined subject-matter experts’ insights and perceptions of the knowledge components, coping behaviors, and pedagogical considerations to aid faculty in teaching algorithmic literacy to college students. Eleven individual, semi-structured interviews and one focus group were conducted with scholars and teachers of critical algorithm studies and related fields. Findings suggested three sets of knowledge components that would contribute to students’ algorithmic literacy: general characteristics and distinguishing traits of algorithms, key domains in everyday life using algorithms (including the potential benefits and risks), and ethical considerations for the use and application of algorithms. Findings also suggested five behaviors that students could use to help them better cope with algorithmic systems and nine teaching strategies to help improve students’ algorithmic literacy. Suggestions also surfaced for alternative forms of assessment, potential placement in the curriculum, and how to distinguish basic algorithmic awareness from algorithmic literacy. Recommendations were presented for expanding the current Association of College and Research Libraries’ Framework for Information Literacy for Higher Education (2016) to more explicitly include algorithmic literacy.