37,943 research outputs found

    A Unified Community Detection, Visualization and Analysis method

    Full text link
    Community detection in social graphs has attracted researchers' interest for a long time. With the widespread of social networks on the Internet it has recently become an important research domain. Most contributions focus upon the definition of algorithms for optimizing the so-called modularity function. In the first place interest was limited to unipartite graph inputs and partitioned community outputs. Recently bipartite graphs, directed graphs and overlapping communities have been investigated. Few contributions embrace at the same time the three types of nodes. In this paper we present a method which unifies commmunity detection for the three types of nodes and at the same time merges partitionned and overlapping communities. Moreover results are visualized in such a way that they can be analyzed and semantically interpreted. For validation we experiment this method on well known simple benchmarks. It is then applied to real data in three cases. In two examples of photos sets with tagged people we reveal social networks. A second type of application is of particularly interest. After applying our method to Human Brain Tractography Data provided by a team of neurologists, we produce clusters of white fibers in accordance with other well known clustering methods. Moreover our approach for visualizing overlapping clusters allows better understanding of the results by the neurologist team. These last results open up the possibility of applying community detection methods in other domains such as data analysis with original enhanced performances.Comment: Submitted to Advances in Complex System

    Flow-based Influence Graph Visual Summarization

    Full text link
    Visually mining a large influence graph is appealing yet challenging. People are amazed by pictures of newscasting graph on Twitter, engaged by hidden citation networks in academics, nevertheless often troubled by the unpleasant readability of the underlying visualization. Existing summarization methods enhance the graph visualization with blocked views, but have adverse effect on the latent influence structure. How can we visually summarize a large graph to maximize influence flows? In particular, how can we illustrate the impact of an individual node through the summarization? Can we maintain the appealing graph metaphor while preserving both the overall influence pattern and fine readability? To answer these questions, we first formally define the influence graph summarization problem. Second, we propose an end-to-end framework to solve the new problem. Our method can not only highlight the flow-based influence patterns in the visual summarization, but also inherently support rich graph attributes. Last, we present a theoretic analysis and report our experiment results. Both evidences demonstrate that our framework can effectively approximate the proposed influence graph summarization objective while outperforming previous methods in a typical scenario of visually mining academic citation networks.Comment: to appear in IEEE International Conference on Data Mining (ICDM), Shen Zhen, China, December 201

    Heliophysics Event Knowledgebase for the Solar Dynamics Observatory and Beyond

    Get PDF
    The immense volume of data generated by the suite of instruments on SDO requires new tools for efficient identifying and accessing data that is most relevant to research investigations. We have developed the Heliophysics Events Knowledgebase (HEK) to fill this need. The HEK system combines automated data mining using feature-detection methods and high-performance visualization systems for data markup. In addition, web services and clients are provided for searching the resulting metadata, reviewing results, and efficiently accessing the data. We review these components and present examples of their use with SDO data.Comment: 17 pages, 4 figure

    Big Data and Analysis of Data Transfers for International Research Networks Using NetSage

    Get PDF
    Modern science is increasingly data-driven and collaborative in nature. Many scientific disciplines, including genomics, high-energy physics, astronomy, and atmospheric science, produce petabytes of data that must be shared with collaborators all over the world. The National Science Foundation-supported International Research Network Connection (IRNC) links have been essential to enabling this collaboration, but as data sharing has increased, so has the amount of information being collected to understand network performance. New capabilities to measure and analyze the performance of international wide-area networks are essential to ensure end-users are able to take full advantage of such infrastructure for their big data applications. NetSage is a project to develop a unified, open, privacy-aware network measurement, and visualization service to address the needs of monitoring today's high-speed international research networks. NetSage collects data on both backbone links and exchange points, which can be as much as 1Tb per month. This puts a significant strain on hardware, not only in terms storage needs to hold multi-year historical data, but also in terms of processor and memory needs to analyze the data to understand network behaviors. This paper addresses the basic NetSage architecture, its current data collection and archiving approach, and details the constraints of dealing with this big data problem of handling vast amounts of monitoring data, while providing useful, extensible visualization to end users

    GraphCombEx: A Software Tool for Exploration of Combinatorial Optimisation Properties of Large Graphs

    Full text link
    We present a prototype of a software tool for exploration of multiple combinatorial optimisation problems in large real-world and synthetic complex networks. Our tool, called GraphCombEx (an acronym of Graph Combinatorial Explorer), provides a unified framework for scalable computation and presentation of high-quality suboptimal solutions and bounds for a number of widely studied combinatorial optimisation problems. Efficient representation and applicability to large-scale graphs and complex networks are particularly considered in its design. The problems currently supported include maximum clique, graph colouring, maximum independent set, minimum vertex clique covering, minimum dominating set, as well as the longest simple cycle problem. Suboptimal solutions and intervals for optimal objective values are estimated using scalable heuristics. The tool is designed with extensibility in mind, with the view of further problems and both new fast and high-performance heuristics to be added in the future. GraphCombEx has already been successfully used as a support tool in a number of recent research studies using combinatorial optimisation to analyse complex networks, indicating its promise as a research software tool

    The Data Big Bang and the Expanding Digital Universe: High-Dimensional, Complex and Massive Data Sets in an Inflationary Epoch

    Get PDF
    Recent and forthcoming advances in instrumentation, and giant new surveys, are creating astronomical data sets that are not amenable to the methods of analysis familiar to astronomers. Traditional methods are often inadequate not merely because of the size in bytes of the data sets, but also because of the complexity of modern data sets. Mathematical limitations of familiar algorithms and techniques in dealing with such data sets create a critical need for new paradigms for the representation, analysis and scientific visualization (as opposed to illustrative visualization) of heterogeneous, multiresolution data across application domains. Some of the problems presented by the new data sets have been addressed by other disciplines such as applied mathematics, statistics and machine learning and have been utilized by other sciences such as space-based geosciences. Unfortunately, valuable results pertaining to these problems are mostly to be found only in publications outside of astronomy. Here we offer brief overviews of a number of concepts, techniques and developments, some "old" and some new. These are generally unknown to most of the astronomical community, but are vital to the analysis and visualization of complex datasets and images. In order for astronomers to take advantage of the richness and complexity of the new era of data, and to be able to identify, adopt, and apply new solutions, the astronomical community needs a certain degree of awareness and understanding of the new concepts. One of the goals of this paper is to help bridge the gap between applied mathematics, artificial intelligence and computer science on the one side and astronomy on the other.Comment: 24 pages, 8 Figures, 1 Table. Accepted for publication: "Advances in Astronomy, special issue "Robotic Astronomy

    Detection of Side Chain Rearrangements Mediating the Motions of Transmembrane Helices in Molecular Dynamics Simulations of G Protein-Coupled Receptors.

    Get PDF
    Structure and dynamics are essential elements of protein function. Protein structure is constantly fluctuating and undergoing conformational changes, which are captured by molecular dynamics (MD) simulations. We introduce a computational framework that provides a compact representation of the dynamic conformational space of biomolecular simulations. This method presents a systematic approach designed to reduce the large MD simulation spatiotemporal datasets into a manageable set in order to guide our understanding of how protein mechanics emerge from side chain organization and dynamic reorganization. We focus on the detection of side chain interactions that undergo rearrangements mediating global domain motions and vice versa. Side chain rearrangements are extracted from side chain interactions that undergo well-defined abrupt and persistent changes in distance time series using Gaussian mixture models, whereas global domain motions are detected using dynamic cross-correlation. Both side chain rearrangements and global domain motions represent the dynamic components of the protein MD simulation, and are both mapped into a network where they are connected based on their degree of coupling. This method allows for the study of allosteric communication in proteins by mapping out the protein dynamics into an intramolecular network to reduce the large simulation data into a manageable set of communities composed of coupled side chain rearrangements and global domain motions. This computational framework is suitable for the study of tightly packed proteins, such as G protein-coupled receptors, and we present an application on a seven microseconds MD trajectory of CC chemokine receptor 7 (CCR7) bound to its ligand CCL21
    corecore