This thesis is motivated by the analysis, understanding, and prediction of human behaviour
through the study of people's digital fingerprints. Unlike a classical PhD thesis, which
picks a single topic and studies it in depth, we carried out a breadth-oriented
analysis of complex networks, such as those that humans create
through their relationships and interactions. The digital communities where humans interact
and form relationships are commonly called Online Social Networks. First, (i) we have
collected their interactions, in the form of the text messages users share with each other, in order to analyze the
sentiment and topic of such messages. We applied state-of-the-art techniques
for Natural Language Processing, widely developed and tested on English texts, to a collection
of Spanish Tweets and compared the results. Next, (ii) we focused on Topic Detection, creating
our own classifier and applying it to the same Tweets dataset. The contributions are twofold:
our classifier relies on text graphs built from the input text, and it achieved 70% accuracy,
outperforming previous results. After that, (iii) we analyzed the network structure (or
topology) and the associated data values to detect outliers. We hypothesize that in social networks there
is a large mass of users that behave similarly, while a small set of them behave
differently. Within this last group in particular, we try to separate users by high activity,
low activity, or any other parameter/feature that places them in a different kind of outlier set.
We aim to detect influential users in one of these outlier sets. We propose a new unsupervised
method, Massive Unsupervised Outlier Detection (MUOD), which labels the detected outliers as
shape, magnitude, or amplitude outliers, or a combination of those. We applied this method to a subset of
roughly 400 million Google+ users, automatically identifying and discriminating sets of outlier
users. Finally, (iv) we addressed the monitoring of real complex networks.
We created a framework to dynamically adapt the temporality of large-scale dynamic networks,
reducing compute overhead by at least 76%, data volume by 60% and overall cloud costs by at
least 54%, while always maintaining accuracy above 88%.

Doctoral Programme in Mathematical Engineering, Universidad Carlos III de Madrid.
Committee: President: Rosa María Benito Zafrilla. Secretary: Ángel Cuevas Rumín. Member: José Ernesto Jiménez Merin