Improving Individual Predictions using Social Networks Assortativity
Social networks are known to be assortative with
respect to many attributes, such as age, weight, wealth, level
of education, ethnicity and gender. This can be explained by
influence and homophily. Independently of its origin, this
assortativity gives us information about each node given its
neighbors. Assortativity can thus be used to improve individual
predictions in a broad range of situations, when data are missing
or inaccurate. This paper presents a general framework based on
probabilistic graphical models to exploit social network structures
for improving individual predictions of node attributes. Using
this framework, we quantify the assortativity range leading to an
accuracy gain in several situations. We finally show how specific
characteristics of the network can improve performance further.
For instance, the gender assortativity in real-world mobile phone
data changes significantly according to some communication
attributes. In this case, individual predictions with 75% accuracy
are improved by up to 3%.
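
As a rough illustration of how assortativity can sharpen an individual prediction, the sketch below updates a node's own predicted probability with its neighbors' labels. It is not the paper's framework: it assumes binary labels, neighbors that are conditionally independent given the node's label, and a single hypothetical parameter `same_prob` (the chance that a neighbor shares the node's label).

```python
import math

def posterior_with_neighbors(prior, neighbor_labels, same_prob):
    """Return P(label = 1) for a node after observing its neighbors' labels.

    prior           -- P(label = 1) from the node's own (individual) prediction
    neighbor_labels -- iterable of 0/1 labels observed on the neighbors
    same_prob       -- assumed probability that a neighbor shares the node's
                       label (0.5 means no assortativity); hypothetical parameter
    """
    # Work in log-odds so each neighbor contributes an additive update.
    log_odds = math.log(prior / (1.0 - prior))
    step = math.log(same_prob / (1.0 - same_prob))
    for label in neighbor_labels:
        log_odds += step if label == 1 else -step
    return 1.0 / (1.0 + math.exp(-log_odds))

# Usage: an uncertain individual prediction with two neighbors labeled 1.
p = posterior_with_neighbors(0.5, [1, 1], 0.6)
```

With `same_prob = 0.5` the neighbors carry no information and the function returns the prior unchanged, which matches the intuition that a gain is only possible above some level of assortativity.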
Sequences of purchases in credit card data reveal life styles in urban populations
Zipf-like distributions characterize a wide set of phenomena in physics,
biology, economics and social sciences. In human activities, Zipf's law describes,
for example, the frequency of word appearances in a text or of purchase types
in shopping patterns. In the latter, the uneven distribution of transaction
types is tied to the temporal sequences of purchases made by individuals.
In this work, we define a framework using a text compression technique on the
sequences of credit card purchases to detect ubiquitous patterns of collective
behavior. Clustering the consumers by the similarity of their purchase sequences,
we detect five consumer groups. Remarkably, on post hoc checking, individuals in each
group are also similar in their age, total expenditure, gender, and the
diversity of their social and mobility networks extracted from their mobile phone
records. By properly deconstructing transaction data with Zipf-like
distributions, this method uncovers sets of significant sequences that reveal
insights into collective human behavior.
Comment: 30 pages, 26 figures
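
To make the Zipf-like distribution of transaction types concrete, the following sketch builds a rank-frequency table and estimates the exponent with a least-squares fit on log-log axes. It does not reproduce the paper's compression-based method; the function names are hypothetical, and a plain log-log fit is only a crude estimator.

```python
import math
from collections import Counter

def rank_frequency(items):
    """Return [(rank, count), ...] with rank 1 for the most frequent item."""
    counts = sorted(Counter(items).values(), reverse=True)
    return list(enumerate(counts, start=1))

def zipf_exponent(items):
    """Estimate s in count ~ rank^(-s) by least squares on log-log values.

    Needs at least two distinct item types.
    """
    pairs = rank_frequency(items)
    xs = [math.log(rank) for rank, _ in pairs]
    ys = [math.log(count) for _, count in pairs]
    n = len(pairs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return -sxy / sxx

# Usage with made-up purchase categories whose counts fall off as 1/rank.
purchases = ["grocery"] * 12 + ["fuel"] * 6 + ["dining"] * 4 + ["taxi"] * 3
exponent = zipf_exponent(purchases)
```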
Inference of Socioeconomic Status in a Communication Graph
In this work, we examine the socio-economic correlations present among users in a mobile phone network in Mexico. First, we find that the distribution of income for a subset of users (for which we have income information given by a large bank in Mexico) follows closely, but not exactly, the income distribution for the whole population of Mexico.
We also show the existence of strong socio-economic homophily in the mobile phone network, where users linked in the network are more likely to have similar income. The main contribution of this work is that we leverage this homophily to propose a methodology, based on Bayesian statistics, to infer the socio-economic status of a large subset of users in the network (for which we have no banking information). With our proposed algorithm, we achieve an accuracy of 0.71 in a two-class classification problem (low and high income), which significantly outperforms a simpler method based on a frequentist approach. Finally, we extend the two-class classification problem to multiple classes by using the Dirichlet distribution.
Sociedad Argentina de Informática e Investigación Operativa (SADIO)
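
One simple way to read the Bayesian idea above is as a Dirichlet-multinomial update: counts of a user's neighbors in each income class are combined with prior pseudo-counts, and the user is assigned the class with the highest posterior mean. This is a minimal sketch under that assumption, not the authors' algorithm; `prior_alpha` and both function names are hypothetical.

```python
def dirichlet_posterior_mean(neighbor_counts, prior_alpha):
    """Posterior mean class probabilities under a Dirichlet prior.

    neighbor_counts -- number of labeled neighbors observed in each class
    prior_alpha     -- Dirichlet pseudo-counts, one per class (assumed values)
    """
    total = sum(neighbor_counts) + sum(prior_alpha)
    return [(c + a) / total for c, a in zip(neighbor_counts, prior_alpha)]

def classify(neighbor_counts, prior_alpha):
    """Return (index of the most probable class, posterior mean probabilities)."""
    probs = dirichlet_posterior_mean(neighbor_counts, prior_alpha)
    best = max(range(len(probs)), key=probs.__getitem__)
    return best, probs

# Two classes (index 0 = low income, 1 = high income) reduce to a Beta prior.
label, probs = classify([3, 1], [1.0, 1.0])
```

With a uniform prior, three low-income neighbors and one high-income neighbor give posterior means of 4/6 and 2/6, compared with the raw frequencies 3/4 and 1/4. The prior tempers estimates for users with few labeled neighbors.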
Temporal Social Coordination Through Social Networks
Temporal communication is mainly associated with the concept of time. A social network derived from a temporal environment is constantly changing; a communication link can be connected and disconnected very frequently. Furthermore, with communication technologies such as the cell phone, time itself has shifted from absolute to relative. Mobile communication is closely related to temporal communication because of its micro-coordination property and the constant establishment and breakage of links. To study the network in the temporal domain, we are constrained by the concept of relative time. As communication behaviour is highly dynamic, we expect new ties to form and existing ties to break over time. This differs markedly from social network studies conducted through self-report surveys, where the network remains relatively static for the duration of the survey. In our study, we are interested only in how a person expands their network. Thus we use an accumulated network structure to study the total links a person acquires over time and how this influences the network position
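
The accumulated network described here can be pictured as a graph in which ties are only ever added. The sketch below assumes the input is a list of hypothetical `(time, person_a, person_b)` contact events and records each person's distinct-contact count whenever it grows; it illustrates the idea and is not the study's actual procedure.

```python
from collections import defaultdict

def cumulative_degree(events):
    """Track how each person's number of distinct contacts grows over time.

    events -- iterable of (time, person_a, person_b) contact records
    Returns {person: [(time, degree), ...]}, one entry per newly acquired tie.
    """
    neighbors = defaultdict(set)
    history = defaultdict(list)
    for time, a, b in sorted(events, key=lambda event: event[0]):
        for person, other in ((a, b), (b, a)):
            # Ties accumulate: repeated contacts add nothing, and ties never drop.
            if other not in neighbors[person]:
                neighbors[person].add(other)
                history[person].append((time, len(neighbors[person])))
    return dict(history)

# Usage with made-up contact events; the repeated a-b contact is ignored.
growth = cumulative_degree([(1, "a", "b"), (2, "a", "b"), (3, "a", "c")])
```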