Contextual Subgraph Discovery With Mobility Models
Starting from a relational database that gathers information on people's mobility (such as origin/destination places, date and time, and means of transport) as well as demographic data, we adopt a graph-based representation that results from the aggregation of individual travels. In such a graph, the vertices are places or points of interest (POI) and the edges stand for the trips. Travel information as well as user demographics are labels associated with the edges. We tackle the problem of discovering exceptional contextual subgraphs, i.e., subgraphs related to a context (a restriction on the attribute values) that are unexpected according to a model. Previous work considers a simple model based on the number of trips associated with an edge, without taking into account its length or the surrounding demography. In this article, we consider richer models based on statistical physics and demonstrate their ability to capture complex phenomena which were previously ignored.
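The graph representation described in this abstract (places as vertices, trips aggregated into edges, a context as a restriction on attribute values) can be illustrated with a minimal sketch. This is not the authors' method: the function name `aggregate_trips`, the record fields (`origin`, `destination`, `transport`, `age_group`), and the sample data are all hypothetical, and the sketch only counts trips per edge, which corresponds to the simple model the abstract attributes to previous work.

```python
from collections import Counter


def aggregate_trips(trips, context=None):
    """Count trips per (origin, destination) edge.

    `context` is a dict of attribute -> required value; only trips whose
    labels match every entry are counted. An empty context keeps all trips.
    """
    context = context or {}
    counts = Counter()
    for trip in trips:
        if all(trip.get(key) == value for key, value in context.items()):
            counts[(trip["origin"], trip["destination"])] += 1
    return counts


# Hypothetical trip records; field names are assumptions for illustration.
trips = [
    {"origin": "A", "destination": "B", "transport": "bike", "age_group": "18-25"},
    {"origin": "A", "destination": "B", "transport": "car", "age_group": "18-25"},
    {"origin": "B", "destination": "C", "transport": "bike", "age_group": "26-40"},
]

all_edges = aggregate_trips(trips)                          # no restriction
bike_edges = aggregate_trips(trips, {"transport": "bike"})  # one context
```

Comparing the counts in `bike_edges` with what a mobility model would predict for the same edges is where an exceptional subgraph would be detected; that comparison step is deliberately left out here because the abstract does not specify the models.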
Name Disambiguation from link data in a collaboration graph using temporal and topological features
In a social community, multiple persons may share the same name, phone number,
or some other identifying attribute. This, along with other phenomena such as
name abbreviation, name misspelling, and human error, leads to erroneous
aggregation of records of multiple persons under a single reference. Such
mistakes degrade the performance of document retrieval, web search, and database
integration and, more importantly, cause improper attribution of credit (or blame).
The task of entity disambiguation partitions the records belonging to multiple
persons with the objective that each decomposed partition is composed of
records of a unique person. Existing solutions to this task use either
biographical attributes, or auxiliary features that are collected from external
sources, such as Wikipedia. However, for many scenarios, such auxiliary
features are not available, or they are costly to obtain. Moreover, attempting
to collect biographical or external data carries the risk of privacy
violation. In this work, we propose a method for solving the entity disambiguation
task from link information obtained from a collaboration network. Our method
does not intrude on privacy, as it uses only the time-stamped graph topology of an
anonymized network. Experimental results on two real-life academic
collaboration networks show that the proposed method has satisfactory
performance. Comment: The short version of this paper has been accepted to ASONAM 201
SgWalk: Location Recommendation by User Subgraph-Based Graph Embedding
The popularity of Location-based Social Networks (LBSNs) provides an opportunity to collect massive multi-modal datasets that contain geographical information, as well as time and social interactions. Such data is a useful resource for generating personalized location recommendations. This heterogeneous data can be further extended with notions of trust between users, the popularity of locations, and the expertise of users. Recently, the use of Heterogeneous Information Network (HIN) models and graph neural architectures has proven successful for recommendation problems. One limitation of such solutions is capturing the contextual relationships between the nodes in the heterogeneous network. In location recommendation, spatial context is a frequently used consideration, since users prefer to get recommendations within their spatial vicinity. To solve this challenging problem, we propose a novel HIN embedding technique, SgWalk, which explores the proximity between users and locations and generates location recommendations via subgraph-based node embedding. SgWalk follows four steps: building user subgraphs according to location context, generating random walk sequences over user subgraphs, learning embeddings of nodes in the LBSN graph, and generating location recommendations using the vector representations of the nodes. SgWalk differs from existing techniques that rely on meta-paths or bipartite graphs in that it utilizes the contextual user subgraph. In this way, it aims to capture contextual relationships among heterogeneous nodes more effectively. The recommendation accuracy of SgWalk is analyzed through extensive experiments conducted on benchmark datasets in terms of top-n location recommendations. The accuracy evaluation results indicate a minimum 23% (@5 recommendation) average improvement in accuracy compared to baseline techniques and the state-of-the-art heterogeneous graph embedding techniques in the literature.
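The second of the four steps listed in this abstract, generating random walk sequences over a user subgraph, can be sketched as follows. This is a generic uniform random walk, not the SgWalk implementation: the abstract does not specify how subgraphs are built or how walks are biased, so the function name `random_walks`, its parameters, and the toy subgraph are assumptions made for illustration only.

```python
import random


def random_walks(adjacency, walk_length, walks_per_node, seed=0):
    """Generate uniform random walks over a graph given as an adjacency dict.

    Each walk starts at a node and repeatedly moves to a uniformly chosen
    neighbour until it reaches `walk_length` nodes or hits a dead end.
    """
    rng = random.Random(seed)  # local generator so results are reproducible
    walks = []
    for start in sorted(adjacency):
        for _ in range(walks_per_node):
            walk = [start]
            while len(walk) < walk_length:
                neighbours = adjacency.get(walk[-1], [])
                if not neighbours:
                    break  # dead end: stop this walk early
                walk.append(rng.choice(neighbours))
            walks.append(walk)
    return walks


# Hypothetical user subgraph mixing user and location nodes.
user_subgraph = {
    "user:u1": ["loc:cafe", "loc:park"],
    "loc:cafe": ["user:u1", "user:u2"],
    "loc:park": ["user:u1"],
    "user:u2": ["loc:cafe"],
}

walks = random_walks(user_subgraph, walk_length=4, walks_per_node=2)
```

In walk-based embedding pipelines, sequences like these are typically passed to a skip-gram style model to learn node vectors; whether SgWalk does exactly that is not stated in the abstract.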
STORM-GAN: Spatio-Temporal Meta-GAN for Cross-City Estimation of Human Mobility Responses to COVID-19
Human mobility estimation is crucial during the COVID-19 pandemic because it
guides policymakers in designing non-pharmaceutical interventions.
While deep learning approaches outperform conventional estimation techniques on
tasks with abundant training data, the continuously evolving pandemic poses a
significant challenge to solving this problem due to data nonstationarity,
limited observations, and complex social contexts. Prior works on mobility
estimation either focus on a single city or lack the ability to model the
spatio-temporal dependencies across cities and time periods. To address these
issues, we make the first attempt to tackle the cross-city human mobility
estimation problem through a deep meta-generative framework. We propose a
Spatio-Temporal Meta-Generative Adversarial Network (STORM-GAN) model that
estimates dynamic human mobility responses under a set of social and policy
conditions related to COVID-19. Facilitated by a novel spatio-temporal
task-based graph (STTG) embedding, STORM-GAN is capable of learning shared
knowledge from a spatio-temporal distribution of estimation tasks and quickly
adapting to new cities and time periods with limited training samples. The STTG
embedding component is designed to capture the similarities among cities to
mitigate cross-task heterogeneity. Experimental results on real-world data show
that the proposed approach can greatly improve estimation performance and
outperform baselines. Comment: Accepted at the 22nd IEEE International Conference on Data Mining
(ICDM 2022) Full Paper
Identifying Crisis Response Communities in Online Social Networks for Compound Disasters: The Case of Hurricane Laura and Covid-19
Online social networks allow different agencies and the public to interact
and share the underlying risks and protective actions during major disasters.
This study revealed such crisis communication patterns during Hurricane Laura
compounded by the COVID-19 pandemic. Laura was one of the strongest (Category
4) hurricanes on record to make landfall in Cameron, Louisiana. Using the
Application Programming Interface (API), this study utilizes large-scale social
media data obtained from Twitter through the recently released academic track
that provides complete and unbiased observations. The data captured publicly
available tweets shared by active Twitter users from the vulnerable areas
threatened by Laura. Online social networks were built from the user influence
feature (mentions or tags), which allows notifying other users while posting a
tweet. Using network science theories and advanced community detection
algorithms, the study split these networks into twenty-one components of
various sizes, the largest of which contained eight well-defined communities.
Several natural language processing techniques (i.e., word clouds, bigrams,
topic modeling) were applied to the tweets shared by the users in these
communities to observe their risk-taking or risk-averse behavior during a major
compounding crisis. Social media accounts of local news media, radio,
universities, and popular sports pages were among those that were heavily involved
and interacted closely with local residents. In contrast, emergency management
and planning units in the area engaged less with the public. The findings of
this study provide novel insights into the design of efficient social media
communication guidelines to respond better in future disasters.
Context-Aware Personalized Point-of-Interest Recommendation System
The increasing volume of information has made it overwhelmingly challenging to extract relevant items manually. Fortunately, online systems such as e-commerce (e.g., Amazon) and location-based social networks (LBSNs) (e.g., Facebook), among many others, can track end users' browsing and consumption experiences. Such explicit experiences (e.g., ratings) and many implicit contexts (e.g., social, spatial, temporal, and categorical) are useful in preference elicitation and recommendation. As an emerging branch of information filtering, recommendation systems are already popular in many domains, such as movies (e.g., YouTube), music (e.g., Pandora), and Point-of-Interest (POI) (e.g., Yelp).
The POI domain has many contextual challenges (e.g., spatial (preference for nearby places), social (e.g., a friend's influence), temporal (e.g., popularity at a certain time), categorical (similar preferences for places in the same category), locality of POI, etc.) that can be crucial for an efficient recommendation. The user reviews shared across different social networks provide granularity in users' consumption experience. From the data mining and machine learning perspective, the following three research directions are identified and considered relevant to an efficient context-aware POI recommendation: (1) incorporation of major contexts into a single model and a detailed analysis of the impact of those contexts, (2) exploitation of user activity and location influence to model hierarchical preferences, and (3) exploitation of user reviews to formulate the aspect opinion relation and to generate explanations for recommendations.
This dissertation presents different machine learning and data mining-based solutions to address the above-mentioned research problems, including (1) recommendation models inspired by contextualized ranking and matrix factorization that incorporate the major contexts and help in the analysis of their importance, (2) hierarchical and matrix-factorization models that formulate users' activity and POI influences on different localities, model hierarchical preferences, and generate individual and sequence recommendations, and (3) graphical models inspired by natural language processing and neural networks to generate recommendations augmented with aspect-based explanations.
Analysing Human Mobility Patterns of Hiking Activities through Complex Network Theory
The exploitation of a high volume of geolocalized data from social sport
tracking applications of outdoor activities can be useful for natural resource
planning and for understanding human mobility patterns during leisure
activities. This geolocalized data represents the selection of hike activities
according to subjective and objective factors such as personal goals, personal
abilities, trail conditions or weather conditions. In our approach, human
mobility patterns are analysed from trajectories which are generated by hikers.
We propose the generation of the trail network identifying special points in
the overlap of trajectories. Trail crossings and trailheads define our network
and shape topological features. We analyse the trail network of the Balearic
Islands, as a case study, using complex weighted network theory. The
analysis is divided into the four seasons of the year to observe the impact of
weather conditions on the network topology. The number of visited places does
not decrease despite the large difference in the number of samples between the
seasons with the highest and lowest activity. The most significant variation in
the frequency and localization of activities, from inland regions to coastal
areas, occurs in the summer season. Finally, we compare our model
with other related studies where the network serves a different purpose. One
finding of our approach is the detection of regions of particular importance
where landscape interventions can be applied according to the communities. Comment: 20 pages, 9 figures, accepted