
    StructMatrix: large-scale visualization of graphs by means of structure detection and dense matrices

    Given a large-scale graph with millions of nodes and edges, how can we reveal macro patterns of interest, such as cliques, bipartite cores, stars, and chains? Furthermore, how can we visualize such patterns together, gaining insights from the graph to support wise decision-making? Although there are many algorithmic and visual techniques to analyze graphs, none of the existing approaches is able to present the structural information of graphs at large scale. Hence, this paper describes StructMatrix, a methodology aimed at highly scalable visual inspection of graph structures with the goal of revealing macro patterns of interest. StructMatrix combines algorithmic structure detection and adjacency matrix visualization to present cardinality, distribution, and relationship features of the structures found in a given graph. We performed experiments on real, large-scale graphs with up to one million nodes and millions of edges. StructMatrix revealed that graphs of high relevance (e.g., Web, Wikipedia and DBLP) have characterizations that reflect the nature of their corresponding domains; our findings have not been seen in the literature so far. We expect that our technique will bring deeper insights into large graph mining, leveraging their use for decision making. Comment: To appear: 8 pages, paper to be published at the Fifth IEEE ICDM Workshop on Data Mining in Networks, 2015, as Hugo Gualdron, Robson Cordeiro, Jose Rodrigues (2015) StructMatrix: Large-scale visualization of graphs by means of structure detection and dense matrices. In: The Fifth IEEE ICDM Workshop on Data Mining in Networks, 1--8, IEE

    The most controversial topics in Wikipedia: A multilingual and geographical analysis

    We present, visualize and analyse the similarities and differences between the controversial topics related to "edit wars" identified in 10 different language versions of Wikipedia. After a brief review of the related work, we describe the methods developed to locate, measure, and categorize the controversial topics in the different languages. We present visualizations of the degree of overlap between the top 100 lists of most controversial articles in different languages and of the content related to geographical locations. We discuss what the presented analysis and visualizations can tell us about the multicultural aspects of Wikipedia and practices of peer-production. Our results indicate that Wikipedia is more than just an encyclopaedia; it is also a window into convergent and divergent social-spatial priorities, interests and preferences. Comment: This is a draft of a book chapter to be published in 2014 by Scarecrow Press. Please cite as: Yasseri T., Spoerri A., Graham M., and Kertész J., The most controversial topics in Wikipedia: A multilingual and geographical analysis. In: Fichman P., Hara N., editors, Global Wikipedia: International and cross-cultural issues in online collaboration. Scarecrow Press (2014

    vrmlgen: An R Package for 3D Data Visualization on the Web

    The 3-dimensional representation and inspection of complex data is a frequently used strategy in many data analysis domains. Existing data mining software often lacks functionality that would enable users to explore 3D data interactively, especially if one wishes to make dynamic graphical representations directly viewable on the web. In this paper we present vrmlgen, a software package for the statistical programming language R to create 3D data visualizations in web formats like the Virtual Reality Markup Language (VRML) and LiveGraphics3D. vrmlgen can be used to generate 3D charts and bar plots, scatter plots with density estimation contour surfaces, and visualizations of height maps, 3D object models and parametric functions. For greater flexibility, the user can also access low-level plotting methods through a unified interface and freely group different function calls together to create new higher-level plotting methods. Additionally, we present a web tool allowing users to visualize 3D data online and test some of vrmlgen's features without the need to install any software on their computer.

    Social media analytics: a survey of techniques, tools and platforms

    This paper is written for (social science) researchers seeking to analyze the wealth of social media now available. It presents a comprehensive review of software tools for social networking media, wikis, really simple syndication feeds, blogs, newsgroups, chat and news feeds. For completeness, it also includes introductions to social media scraping, storage, data cleaning and sentiment analysis. Although principally a review, the paper also provides a methodology and a critique of social media tools. Analyzing social media, in particular Twitter feeds for sentiment analysis, has become a major research and business activity due to the availability of web-based application programming interfaces (APIs) provided by Twitter, Facebook and News services. This has led to an ‘explosion’ of data services, software tools for scraping and analysis and social media analytics platforms. It is also a research area undergoing rapid change and evolution due to commercial pressures and the potential for using social media data for computational (social science) research. Using a simple taxonomy, this paper provides a review of leading software tools and how to use them to scrape, cleanse and analyze the spectrum of social media. In addition, it discusses the requirement of an experimental computational environment for social media research and presents as an illustration the system architecture of a social media (analytics) platform built by University College London. The principal contribution of this paper is to provide an overview (including code fragments) for scientists seeking to utilize social media scraping and analytics either in their research or business. The data retrieval techniques presented in this paper are valid at the time of writing (June 2014), but they are subject to change since social media data scraping APIs are rapidly changing.