The State of the Art in Multilayer Network Visualization
Modelling relationships between entities in real-world systems with a simple
graph is a standard approach. However, reality is often better captured as
several interdependent subsystems (or layers). Recently, the concept of a
multilayer network model has emerged from the field of complex systems. This
model can be applied to a wide range of real-world datasets, with examples of
multilayer networks found in the life sciences, sociology, digital humanities,
and more. Within the domain of graph visualization, many systems visualize
datasets that have the characteristics of multilayer graphs. This report
provides a state-of-the-art review and a structured analysis of contemporary
multilayer network visualization, not only for researchers in visualization,
but also for those who aim to visualize multilayer networks in the domain of
complex systems, and for those developing visualization systems across
application domains. We have surveyed the visualization literature for
techniques suitable for multilayer graph visualization, as well as tools,
tasks, and analytic techniques from within application domains. The report
also identifies the outstanding challenges for multilayer graph visualization
and suggests future research directions for addressing them.
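The central object of the report, a multilayer network, can be illustrated with a minimal data structure: the same nodes may participate in several layers, each layer holding its own edges, with explicit couplings tying a node's appearances across layers together. The class and names below are a hypothetical sketch for illustration, not taken from the report.

```python
from collections import defaultdict


class MultilayerNetwork:
    """Minimal multilayer network: per-layer edge sets plus inter-layer couplings."""

    def __init__(self):
        # layer name -> set of undirected edges (u, v)
        self.layers = defaultdict(set)
        # couplings between node instances in different layers:
        # ((layer_a, u), (layer_b, v))
        self.couplings = set()

    def add_edge(self, layer, u, v):
        self.layers[layer].add((u, v))

    def add_coupling(self, layer_a, u, layer_b, v):
        self.couplings.add(((layer_a, u), (layer_b, v)))

    def degree(self, layer, node):
        """Number of edges touching `node` within a single layer."""
        return sum(1 for (u, v) in self.layers[layer] if node in (u, v))


# The same person appears in a social layer and a transport layer,
# with a coupling linking the two node instances.
net = MultilayerNetwork()
net.add_edge("social", "alice", "bob")
net.add_edge("transport", "alice", "carol")
net.add_coupling("social", "alice", "transport", "alice")
```

Visualization systems surveyed in the report must then decide how to render both the intra-layer edges and the inter-layer couplings, which is what distinguishes this setting from ordinary graph drawing.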
Unleashing the power of semantic text analysis: a complex systems approach
In the present information era, a huge amount of machine-readable data is available regarding scientific publications. Such an unprecedented wealth of data offers the opportunity to investigate science itself as a complex interacting system by means of quantitative approaches. These kinds of studies have the potential to provide new insights into the large-scale organization of science and the driving mechanisms underlying its evolution. A particularly important aspect of these data is the semantic information present within publications, as it grants access to the concepts used by scientists to describe their findings. Nevertheless, the presence of so-called buzzwords, i.e., terms that are not specific and are used indiscriminately in many contexts, hinders the emergence of the thematic organization of scientific articles.
In this Thesis, I present my original contributions to the problem of leveraging the semantic information contained in a corpus of documents. Specifically, I have developed an information-theoretic measure, based on the maximum entropy principle, to quantify the information content of scientific concepts. This measure provides an objective and powerful way to identify generic concepts acting as buzzwords, which increase the noise present in the semantic similarity between articles. I show that removing generic concepts is beneficial in terms of the sparsity of the similarity network, thus allowing the detection of communities of articles related to more specific themes. The same effect is observed when describing the corpus of articles in terms of topics, namely clusters of concepts of which the papers are a mixture. Moreover, I applied the method to a collection of web documents and obtained a similar effect despite their differences from scientific articles. Regarding scientific knowledge, another important aspect I examine is the temporal evolution of concept generality, as it may reveal typical patterns in the evolution of concepts and highlight the way in which they are consumed over time.
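The entropy-based measure is only described at a high level in the abstract. As an illustration of the general idea (and not the Thesis's actual formula), one could score a concept's genericity by the Shannon entropy of its normalized occurrence distribution across documents: a buzzword that spreads evenly over the corpus attains the maximum entropy log N, while a term confined to a single document scores zero.

```python
import math


def concept_entropy(counts):
    """Shannon entropy (in nats) of a concept's occurrence counts over documents.

    `counts[i]` is how often the concept appears in document i.
    High entropy ~ generic concept (buzzword); low entropy ~ specific concept.
    """
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    return -sum(p * math.log(p) for p in probs)


# A buzzword appearing evenly across 4 documents reaches the maximum log(4);
# a term concentrated in a single document has zero entropy.
buzzword_score = concept_entropy([5, 5, 5, 5])   # == log(4)
specific_score = concept_entropy([20, 0, 0, 0])  # == 0.0
```

Filtering out concepts above some entropy threshold would then sparsify the document-similarity network, in the spirit of the buzzword-removal step the abstract describes; the threshold and the exact normalization here are assumptions.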