Improved Optimal and Approximate Power Graph Compression for Clearer Visualisation of Dense Graphs
Drawings of highly connected (dense) graphs can be very difficult to read.
Power Graph Analysis offers an alternate way to draw a graph in which sets of
nodes with common neighbours are shown grouped into modules. An edge connected
to the module then implies a connection to each member of the module. Thus, the
entire graph may be represented with much less clutter and without loss of
detail. A recent experimental study has shown that such lossless compression of
dense graphs makes it easier to follow paths. However, computing optimal power
graphs is difficult. In this paper, we show that computing the optimal
power-graph with only one module is NP-hard and therefore likely NP-hard in the
general case. We give an ILP model for power graph computation and discuss why
ILP and CP techniques are poorly suited to the problem. Instead, we are able to
find optimal solutions much more quickly using a custom search method. We also
show how to restrict this type of search to allow only limited back-tracking to
provide a heuristic that has better speed and better results than previously
known heuristics.
Comment: Extended technical report accompanying the PacificVis 2013 paper of
the same name
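The lossless property described in this abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: it only expands a given power graph back into the plain edges it stands for. The function name `expand_power_edges`, the dictionary layout, and the toy graph are all assumptions made for illustration, and nested modules are not handled.

```python
from itertools import product


def expand_power_edges(modules, power_edges):
    """Expand power edges into the set of plain edges they imply.

    modules: dict mapping a (hypothetical) module name to its set of member nodes.
    power_edges: iterable of (endpoint, endpoint) pairs; an endpoint is either
        a module name or a single node. Nested modules are not supported.
    """
    edges = set()
    for a, b in power_edges:
        # An endpoint that is not a module name is treated as a single node.
        members_a = modules.get(a, {a})
        members_b = modules.get(b, {b})
        # An edge to a module implies an edge to each member of the module.
        for u, v in product(members_a, members_b):
            if u != v:
                edges.add(frozenset((u, v)))
    return edges


# Toy graph (assumed): nodes a, b, c all share the neighbours x and y.
original = {
    frozenset(e)
    for e in [("a", "x"), ("a", "y"), ("b", "x"),
              ("b", "y"), ("c", "x"), ("c", "y")]
}
modules = {"M1": {"a", "b", "c"}, "M2": {"x", "y"}}
power_edges = [("M1", "M2")]

restored = expand_power_edges(modules, power_edges)
```

In this toy case a single power edge between two modules stands for all six original edges, and expanding it recovers exactly the original edge set. Finding the best grouping into modules is the hard part that the paper addresses.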
Deconstructing Human Capital to Construct Hierarchical Nestedness
Modern economies generate immensely diverse complex goods and services by
coordinating efforts and know-how of people in vast networks that span across
the globe. This increasing complexity puts us under the pressure of acquiring
an ever-increasing specialized and yet diverse skill portfolio in order to stay
effective members of a complex economy. Here, we analyze the skill portfolios
of workers in an effort to understand the latent structure and evolution of
these portfolios. Analyzing the U.S. survey data (2003-2019) and 20 million
resumes, we uncover a tree structure of vertical skill dependencies such that
skills that only a few jobs need (specialized) are located at the leaves under
the broadly demanded (general) skills. The resulting structure exhibits an
unbalanced tree shape. The unbalanced shape allows the further categorization
of specialized skills: nested branching out of a deeply rooted sturdy trunk
reflecting a dense web of common prerequisites, and un-nested lacking such
support. Our longitudinal analyses show that individuals indeed become more
specialized, going down the nested paths as they move up the career ladder to
enjoy higher wage premiums. The specialization, however, is most likely
accompanied by demands for a higher level of general skills, and furthermore,
specialization without the strengthening of general skills is deprived of wage
premiums. We examine the geographic and demographic distribution of skills to
explain disparities in wealth. Finally, historical changes in occupation skill
requirements show these branches have become more fragmented over the decade,
suggesting the increasing labor gap.
Comment: 26 pages, 7 figures
Crk and CrkL adaptor proteins: networks for physiological and pathological signaling
The Crk adaptor proteins (Crk and CrkL) constitute an integral part of a network of essential signal transduction pathways in humans and other organisms that act as major convergence points in tyrosine kinase signaling. Crk proteins integrate signals from a wide variety of sources, including growth factors, extracellular matrix molecules, bacterial pathogens, and apoptotic cells. Mounting evidence indicates that dysregulation of Crk proteins is associated with human diseases, including cancer and susceptibility to pathogen infections. Recent structural work has yielded new and unusual insights into the regulation of Crk proteins, providing a rationale for how Crk can sense diverse signals and produce a myriad of biological responses.
Visual Analytics Methods for Exploring Geographically Networked Phenomena
The connections between different entities define different kinds of networks, and many such networked phenomena are influenced by their underlying geographical relationships. By integrating network and geospatial analysis, the goal is to extract information about interaction topologies and the relationships to related geographical constructs. In recent decades, much work has been done analyzing the dynamics of spatial networks; however, many challenges still remain in this field. First, the development of social media and transportation technologies has greatly reshaped the typologies of communications between different geographical regions. Second, the distance metrics used in spatial analysis should also be enriched with the underlying network information to develop accurate models.
Visual analytics provides methods for data exploration, pattern recognition, and knowledge discovery. However, despite the long history of geovisualizations and network visual analytics, little work has been done to develop visual analytics tools that focus specifically on geographically networked phenomena. This thesis develops a variety of visualization methods to present data values and geospatial network relationships, which enables users to interactively explore the data. Users can investigate the connections in both virtual networks and geospatial networks and the underlying geographical context can be used to improve knowledge discovery. The focus of this thesis is on social media analysis and geographical hotspots optimization. A framework is proposed for social network analysis to unveil the links between social media interactions and their underlying networked geospatial phenomena. This will be combined with a novel hotspot approach to improve hotspot identification and boundary detection with the networks extracted from urban infrastructure. Several real world problems have been analyzed using the proposed visual analytics frameworks. The primary studies and experiments show that visual analytics methods can help analysts explore such data from multiple perspectives and help the knowledge discovery process.
Dissertation/Thesis, Doctoral Dissertation, Computer Science 201
The flow of money and interests in policymaking
This dissertation comprises three papers that analyze the relationship between political money, elite interests and policies. The individual papers are connected through this overarching theme and a shared methodology. Each paper employs statistical methods on large-scale datasets with an emphasis on network analysis. The first paper investigates the relationship between the strength of elite connections and the success of renewable energy and emission reduction policies. Based on an original dataset created from social media accounts of the ministers in 34 countries, this analysis uses a stochastic block model and modularity analysis to compare the strength of connections between different types of elites. The quantitative analysis is complemented by in-depth interviews conducted in seven European countries. The second paper explores the relationship between the socio-political capital of state-level American politicians and their agenda-holding power in legislation. Using a very extensive dataset on campaign contribution records and state-level bill proposals in the United States, this paper employs survival analysis to explore that relationship. The third paper is a quantitative description of the large datasets on federal- and state-level campaign contribution records and state-level bill proposals. Using visualization, network analysis, and clustering, the last part of the dissertation uncovers some of the connections between big political donors, parties, the private sector, and legislation. The last paper in the dissertation also contains a typological identification section for donors and lawmakers. The goal of the dissertation is to expand the literature on elites, to explore what new stories can be told about political money in the United States, and to make use of large-scale datasets for more conclusive arguments in American politics and policy literature.
Integration and visualisation of clinical-omics datasets for medical knowledge discovery
In recent decades, the rise of various omics fields has flooded life sciences with unprecedented amounts of high-throughput data, which have transformed the way biomedical research is conducted. This trend will only intensify in the coming decades, as the cost of data acquisition will continue to decrease. Therefore, there is a pressing need to find novel ways to turn this ocean of raw data into waves of information and finally distil those into drops of translational medical knowledge. This is particularly challenging because of the incredible richness of these datasets, the humbling complexity of biological systems and the growing abundance of clinical metadata, which makes the integration of disparate data sources even more difficult.
Data integration has proven to be a promising avenue for knowledge discovery in biomedical research. Multi-omics studies allow us to examine a biological problem through different lenses using more than one analytical platform. These studies not only present tremendous opportunities for the deep and systematic understanding of health and disease, but they also pose new statistical and computational challenges. The work presented in this thesis aims to alleviate this problem with a novel pipeline for omics data integration.
Modern omics datasets are extremely feature rich and in multi-omics studies this complexity is compounded by a second or even third dataset. However, many of these features might be completely irrelevant to the studied biological problem or redundant in the context of others. Therefore, in this thesis, clinical metadata driven feature selection is proposed as a viable option for narrowing down the focus of analyses in biomedical research.
Our visual cortex has been fine-tuned through millions of years to become an outstanding pattern recognition machine. To leverage this incredible resource of the human brain, we need to develop advanced visualisation software that enables researchers to explore these vast biological datasets through illuminating charts and interactivity. Accordingly, a substantial portion of this PhD was dedicated to implementing truly novel visualisation methods for multi-omics studies.
Open Access