Network communities within and across borders
We investigate the impact of borders on the topology of spatially embedded
networks. Territorial subdivisions and geographical borders
significantly hamper the geographical span of networks, thus playing a key role
in the formation of network communities. This is especially important in
scientific and technological policy-making, where it highlights the interplay between
the pressure for internationalization, which pushes towards a global innovation
system, and the administrative borders imposed by national and regional
institutions. In this study we introduce an outreach index to quantify the
impact of borders on the community structure and apply it to the case of the
European and US patent co-inventors networks. We find that (a) the US
connectivity decays as a power of distance, whereas we observe a faster
exponential decay for Europe; (b) European network communities essentially
correspond to nations and contiguous regions while US communities span multiple
states across the whole country without any characteristic geographic scale. We
confirm our findings by means of a set of simulations aimed at exploring the
relationship between different patterns of cross-border community structures
and the outreach index. Comment: Scientific Reports 4, 201
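The contrast in finding (a) between power-law and exponential decay can be illustrated with a minimal sketch. It is not the authors' procedure: it assumes connectivity has already been binned by distance, and it simply compares a straight-line fit in log-log space (power law) with one in semi-log space (exponential). The function names and the synthetic data are hypothetical.

```python
import math

def linear_fit(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b, r_squared)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = my - b * mx
    ss_res = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return a, b, 1.0 - ss_res / ss_tot

def compare_decay(distances, link_probs):
    """Fit p ~ d^(-alpha) and p ~ exp(-lam * d); report both fits."""
    log_p = [math.log(p) for p in link_probs]
    log_d = [math.log(d) for d in distances]
    _, slope_power, r2_power = linear_fit(log_d, log_p)    # log p vs log d
    _, slope_exp, r2_exp = linear_fit(distances, log_p)    # log p vs d
    return {
        "power_exponent": -slope_power,
        "power_r2": r2_power,
        "exp_rate": -slope_exp,
        "exp_r2": r2_exp,
    }

# Synthetic (made-up) connectivity curves, distances in km.
distances = [10, 20, 50, 100, 200, 500, 1000]
power_like = [d ** -1.5 for d in distances]
exp_like = [math.exp(-0.01 * d) for d in distances]

result_power = compare_decay(distances, power_like)
result_exp = compare_decay(distances, exp_like)
```

On real data the comparison would need care with binning and with the fitted distance range; the R-squared comparison here is only a rough diagnostic.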
Multimodal Classification of Urban Micro-Events
In this paper we seek methods to effectively detect urban micro-events. Urban
micro-events are events which occur in cities, have limited geographical
coverage and typically affect only a small group of citizens. Because of their
scale these are difficult to identify in most data sources. However, by using
citizen sensing to gather data, detecting them becomes feasible. The data
gathered by citizen sensing is often multimodal and, as a consequence, the
information required to detect urban micro-events is distributed over multiple
modalities. This makes it essential to have a classifier capable of combining
them. In this paper we explore several methods of creating such a classifier,
including early, late, hybrid fusion and representation learning using
multimodal graphs. We evaluate performance on a real world dataset obtained
from a live citizen reporting system. We show that a multimodal approach yields
higher performance than unimodal alternatives. Furthermore, we demonstrate that
our hybrid combination of early and late fusion with multimodal embeddings
performs best in classification of urban micro-events.
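The difference between early and late fusion mentioned in this abstract can be sketched as follows. This is an illustrative toy, not the paper's classifier: early fusion is shown as plain feature concatenation, late fusion as a weighted average of per-modality class probabilities, and all names and numbers are hypothetical.

```python
def early_fusion(*feature_vectors):
    """Early fusion: concatenate per-modality features into one vector
    that a single downstream classifier would consume."""
    fused = []
    for vec in feature_vectors:
        fused.extend(vec)
    return fused

def late_fusion(per_modality_probs, weights=None):
    """Late fusion: combine class probabilities produced by separate
    per-modality classifiers using a (weighted) average."""
    n = len(per_modality_probs)
    if weights is None:
        weights = [1.0 / n] * n
    n_classes = len(per_modality_probs[0])
    fused = [0.0] * n_classes
    for w, probs in zip(weights, per_modality_probs):
        for k, p in enumerate(probs):
            fused[k] += w * p
    total = sum(fused)
    return [p / total for p in fused]

# Hypothetical features for one citizen report (text and image modalities).
text_features = [0.2, 0.7]
image_features = [0.1, 0.9, 0.4]
fused_features = early_fusion(text_features, image_features)

# Hypothetical class probabilities from two unimodal classifiers.
text_probs = [0.6, 0.4]
image_probs = [0.2, 0.8]
fused_probs = late_fusion([text_probs, image_probs])
predicted_class = max(range(len(fused_probs)), key=fused_probs.__getitem__)
```

A hybrid scheme, as the abstract describes it, would combine both ideas, for example by feeding the early-fused representation and the unimodal scores into a final classifier.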
Community Detection from Location-Tagged Networks
Many real-world systems and web services can be represented as networks, such
as social networks and transportation networks. In the past decade, many
algorithms have been developed to detect the communities in a network using
connections between nodes. However, in many real-world networks, the locations
of nodes have great influence on the community structure. For example, in a
social network, more connections are established between geographically
proximate users. The impact of location on community structure has not been fully
investigated in the research literature. In this paper, we propose a community
detection method which takes locations of nodes into consideration. The goal is
to detect communities with both geographic proximity and network closeness. We
analyze the distribution of the distances between connected and unconnected
nodes to measure the influence of location on the network structure on two real
location-tagged social networks. We propose a method to determine if a
location-based community detection method is suitable for a given network. We
propose a new community detection algorithm that pushes the location
information into the community detection. We test our proposed method on both
synthetic data and real-world network datasets. The results show that the
communities detected by our method are distributed over a smaller area than
those found by traditional methods and have similar or higher tightness of network
connections.
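The analysis of distances between connected and unconnected nodes described in this abstract might look roughly like the sketch below. It assumes node locations are given as latitude/longitude pairs and uses great-circle (haversine) distance; the helper names and the four-node toy network are hypothetical and not taken from the paper.

```python
import math
from itertools import combinations
from statistics import median

def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))

def distance_split(locations, edges):
    """Return (distances of connected pairs, distances of unconnected pairs)."""
    edge_set = {frozenset(e) for e in edges}
    connected, unconnected = [], []
    for u, v in combinations(sorted(locations), 2):
        d = haversine_km(locations[u], locations[v])
        if frozenset((u, v)) in edge_set:
            connected.append(d)
        else:
            unconnected.append(d)
    return connected, unconnected

# Toy location-tagged network (hypothetical users placed in four cities).
locations = {
    "ams": (52.37, 4.90),
    "rot": (51.92, 4.48),
    "nyc": (40.71, -74.01),
    "bos": (42.36, -71.06),
}
edges = [("ams", "rot"), ("nyc", "bos")]

connected, unconnected = distance_split(locations, edges)
location_matters = median(connected) < median(unconnected)
```

If connected pairs are systematically closer than unconnected ones, as in this toy case, location plausibly shapes the network and a location-aware community detection method may be worth trying.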
Event detection in location-based social networks
With the advent of social networks and the rise of mobile technologies, users have become ubiquitous sensors capable of monitoring various real-world events in a crowd-sourced manner. Location-based social networks have proven to be faster than traditional media channels in reporting and geo-locating breaking news; e.g., Osama Bin Laden’s death was first confirmed on Twitter even before the announcement from the communication department at the White House. However, the deluge of user-generated data on these networks requires intelligent systems capable of identifying and characterizing such events in a comprehensive manner. The data mining community coined the term event detection to refer to the task of uncovering emerging patterns in data streams. Nonetheless, most data mining techniques do not reproduce the underlying data generation process, which hampers their ability to self-adapt in fast-changing scenarios. Because of this, we propose a probabilistic machine learning approach to event detection which explicitly models the data generation process and enables reasoning about the discovered events. To set forth the differences between both approaches, we present two techniques for the problem of event detection in Twitter: a data mining technique called Tweet-SCAN and a machine learning technique called Warble. We assess and compare both techniques on a dataset of tweets geo-located in the city of Barcelona during its annual festivities.
Last but not least, we present the algorithmic changes and data processing frameworks to scale up the proposed techniques to big data workloads. This work is partially supported by Obra Social “la Caixa”, by the Spanish Ministry of Science and Innovation under contract (TIN2015-65316), by the Severo Ochoa Program (SEV2015-0493), by SGR programs of the Catalan Government (2014-SGR-1051, 2014-SGR-118), Collectiveware (TIN2015-66863-C2-1-R) and BSC/UPC NVIDIA GPU Center of Excellence. We would also like to thank the reviewers for their constructive feedback. Peer Reviewed. Postprint (author's final draft)
Topology Analysis of International Networks Based on Debates in the United Nations
In complex, high dimensional and unstructured data it is often difficult to
extract meaningful patterns. This is especially the case when dealing with
textual data. Recent studies in machine learning, information theory and
network science have developed several novel instruments to extract the
semantics of unstructured data, and harness it to build a network of relations.
Such approaches serve as an efficient tool for dimensionality reduction and
pattern detection. This paper applies semantic network science to extract
ideological proximity in the international arena, by focusing on the data from
General Debates in the UN General Assembly on topics of high salience to
the international community. The UN General Debate corpus (UNGDC) covers all high-level
debates in the UN General Assembly from 1970 to 2014 and includes all UN member
states. The research proceeds in three main steps. First, Latent Dirichlet
Allocation (LDA) is used to extract the topics of the UN speeches, and
therefore semantic information. Each country is then assigned a vector
specifying the exposure to each of the topics identified. This intermediate
output is then used to construct a network of countries based on
information-theoretic metrics, where the links capture similar vectorial patterns in the
topic distributions. The topology of the networks is then analyzed through network
properties like density, path length and clustering. Finally, we identify
specific topological features of our networks using the map equation framework
to detect communities in our networks of countries.
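The step from per-country topic vectors to a network can be sketched as below. The abstract does not name the information-theoretic metric, so this example assumes Jensen-Shannon divergence and a simple distance threshold; the country labels, topic vectors, and threshold value are all hypothetical. The LDA step and the map equation community detection are not shown.

```python
import math
from itertools import combinations

def kl_divergence(p, q):
    """Kullback-Leibler divergence in bits; terms with p_i = 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def js_divergence(p, q):
    """Jensen-Shannon divergence (assumed metric): symmetric, in [0, 1] bits."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)

def build_network(topic_vectors, threshold):
    """Link two countries when their topic distributions are closer than
    `threshold`; the edge weight is a similarity, 1 - divergence."""
    edges = {}
    for a, b in combinations(sorted(topic_vectors), 2):
        d = js_divergence(topic_vectors[a], topic_vectors[b])
        if d <= threshold:
            edges[(a, b)] = 1.0 - d
    return edges

# Hypothetical topic exposures (each row sums to 1) for three countries.
topic_vectors = {
    "A": [0.7, 0.2, 0.1],
    "B": [0.6, 0.3, 0.1],
    "C": [0.1, 0.1, 0.8],
}
network = build_network(topic_vectors, threshold=0.1)
```

In this toy case only countries A and B are linked, because their topic distributions are nearly identical while C emphasizes a different topic.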