A Probabilistic Embedding Clustering Method for Urban Structure Detection
Urban structure detection is a basic task in urban geography. Clustering is a
core technique for detecting patterns of urban spatial structure, urban
functional regions, and so on. In the big data era, diverse urban sensing
datasets that record information such as human behaviour and social activity
suffer from the complexity of high dimensionality and high noise. Unfortunately,
state-of-the-art clustering methods do not handle high dimensionality and high
noise concurrently. In this paper, a probabilistic embedding clustering method
is proposed. First, we propose a Probabilistic Embedding Model (PEM) that finds
latent features in high-dimensional urban sensing data by learning a
probabilistic model. The latent features capture the essential patterns hidden
in high-dimensional data, and the probabilistic model reduces the uncertainty
caused by high noise. Second, by tuning its parameters, our model can discover
two kinds of urban structure, homophily and structural equivalence, that is,
communities with intensive interaction or with the same roles in the urban
structure. We evaluated the model in experiments on real-world data from
Shanghai (China), which showed that our method can discover both kinds of
urban structure.
Comment: 6 pages, 7 figures, ICSDM201
Why Do Cascade Sizes Follow a Power-Law?
We introduce a random directed acyclic graph and use it to model the
information diffusion network. Subsequently, we analyze the cascade generation
model (CGM) introduced by Leskovec et al. [19]. Until now, only empirical
studies of this model have been done. In this paper, we present the first
theoretical proof that the sizes of cascades generated by the CGM follow a
power-law distribution, which is consistent with multiple empirical analyses of
large social networks. We compared the assumptions of our model with the
Twitter social network and tested the goodness of approximation.
Comment: 8 pages, 7 figures, accepted to WWW 201
Distributed Downlink Resource Allocation in Cellular Networks through Spatial Adaptive Play
In this work, we develop mathematical and algorithmic tools for distributed resource allocation in the downlink of mobile cellular networks. Our algorithms perform power allocation, subcarrier selection and base station association simultaneously. We aim to maximize the aggregate utility of all users, where users' utilities can be arbitrary increasing functions of their throughputs; this allows us to capture both elastic and inelastic traffic. Our solution frames the problem as a potential game among users. We propose a highly scalable, asynchronous algorithm that provably converges to a Nash equilibrium of this game. This algorithm requires only local measurements, limited communication between neighboring nodes and limited computation. It may at times get stuck at a local maximum. To alleviate this problem, we propose an enhanced randomized algorithm based on spatial adaptive play that provably converges to a system-optimal resource allocation. We also present simulation results to illustrate the convergence and performance of the proposed algorithms.
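The abstract above gives no algorithmic detail, so the following is only a minimal sketch of the generic spatial adaptive play update on a toy potential game, not the authors' algorithm. It assumes a hypothetical channel-selection game (users on a ring, utility equal to minus the number of neighbours sharing the same channel) in place of the joint power, subcarrier and base station problem; the names `spatial_adaptive_play`, `local_utility`, `beta` and the ring topology are all illustrative assumptions.

```python
import math
import random


def count_conflicts(assignment, edges):
    """Number of edges whose two endpoints use the same channel."""
    return sum(1 for i, j in edges if assignment[i] == assignment[j])


def local_utility(i, channel, assignment, neighbors):
    """Toy utility of user i: minus the number of neighbours on `channel`."""
    return -sum(1 for j in neighbors[i] if assignment[j] == channel)


def spatial_adaptive_play(neighbors, n_channels, beta=5.0, steps=2000, seed=0):
    """Generic spatial adaptive play (log-linear) update on the toy game.

    At each step one randomly chosen user resamples its channel with
    probability proportional to exp(beta * utility), using only the
    current choices of its neighbours (local information).
    """
    rng = random.Random(seed)
    n = len(neighbors)
    assignment = [rng.randrange(n_channels) for _ in range(n)]
    for _ in range(steps):
        i = rng.randrange(n)
        utils = [local_utility(i, c, assignment, neighbors)
                 for c in range(n_channels)]
        weights = [math.exp(beta * u) for u in utils]
        assignment[i] = rng.choices(range(n_channels), weights=weights)[0]
    return assignment


# Hypothetical interference graph: a ring of six users, two channels.
n_users = 6
edges = [(i, (i + 1) % n_users) for i in range(n_users)]
neighbors = {i: [] for i in range(n_users)}
for i, j in edges:
    neighbors[i].append(j)
    neighbors[j].append(i)

final = spatial_adaptive_play(neighbors, n_channels=2)
print(final, count_conflicts(final, edges))
```

In this toy game the negative conflict count is an exact potential function, so a larger `beta` concentrates play on assignments with few conflicts, while a smaller `beta` explores more and makes it easier to leave a local maximum.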
Performance Comparison of Contention- and Schedule-based MAC Protocols in Urban Parking Sensor Networks
The network traffic model is a critical issue for urban applications, mainly
because of the diversity and node density involved. As wireless sensor networks
are closely tied to the development of smart cities, careful consideration of
the traffic model helps in choosing appropriate protocols and adapting network
parameters to reach the best performance on energy-latency tradeoffs. In this
paper, we compare the performance of two off-the-shelf medium access control
protocols on two different kinds of traffic models, and then evaluate their
application-end information delay and energy consumption while varying traffic
parameters and network density. From the simulation results, we highlight some
limits induced by network density and the occurrence frequency of event-driven
applications. For real-time urban services, protocol selection should be taken
into account, even dynamically, with special attention to the energy-delay
tradeoff. To this end, we provide several insights on parking sensor networks.
Comment: ACM International Workshop on Wireless and Mobile Technologies for
Smart Cities (WiMobCity) (2014
Application du contrôle pour garantir la performance des systèmes Big Data (Applying Control to Guarantee the Performance of Big Data Systems)
We are at the dawn of an enormous data explosion, and the amount of data that companies must process keeps growing. To meet this challenge, Google developed MapReduce, a parallel programming model that is becoming the de facto tool for Big Data analysis. Although it is already, to some extent, widely used in industry, guaranteeing the performance of such a complex system poses major problems, and managing it requires a high level of expertise. This paper addresses these challenges by proposing the first autonomous system that guarantees response-time constraints for a concurrent MapReduce workload. We develop the first dynamic model of a MapReduce cluster. In addition, a closed-loop controller is designed and implemented to guarantee a given response time. A feedforward controller is also added to improve the system's response in the presence of disturbances, here variations in the number of clients. The approach is validated online on a 40-node MapReduce cluster running an intensive Business Intelligence workload. Our experiments show that the resulting controller can guarantee the response-time constraints.
Time-Series Link Prediction Using Support Vector Machines
The prominence of social networks motivates developments in network analysis, such as link prediction, which deals with predicting the existence or emergence of links in a given network. The Vector Auto Regression (VAR) technique has been shown to be one of the best for time-series-based link prediction. One implementation of the VAR technique uses an unweighted adjacency matrix and five additional matrices based on the similarity metrics of Common Neighbors, Adamic-Adar, Jaccard's Coefficient, Preferential Attachment and the Resource Allocation Index. In our previous work, we proposed the use of Support Vector Machines (SVM) for this prediction task and, using the same set of matrices, obtained better results. A dataset from DBLP was used to test the performance of the VAR and SVM link prediction models for two lags. In this study, we extended the VAR and SVM models to three, four, and five lags, and found that both improved with more data from the lags. The VAR and SVM models achieved their highest ROC-AUC values of 84.96% and 86.32%, respectively, using five lags, compared with 84.26% and 84.98% using two lags. Moreover, we found that improving the predictive ability of both models is constrained by the difficulty of predicting new links, which we define as links that do not exist in any of the corresponding lags. Hence, we created separate VAR and SVM models for the prediction of new links. The highest ROC-AUC was still achieved by SVM with five lags, although at a lower value of 73.85%. The significant drop in the performance of the VAR and SVM predictors on new links indicates the need for more research in this problem space. Moreover, the results showed that SVM can be used as an alternative method for time-series-based link prediction.
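The five neighbourhood-based similarity metrics named in the abstract above have standard textbook definitions, which can be sketched as follows. This is only an illustration on a hypothetical five-node graph, not the authors' feature pipeline; the function name `similarity_scores`, the adjacency dictionary `adj` and the dictionary keys are assumptions made for the example, and the last score uses the standard Resource Allocation index.

```python
import math


def similarity_scores(adj, u, v):
    """Standard neighbourhood-based similarity scores for the node pair (u, v).

    `adj` maps each node to the set of its neighbours (undirected graph).
    """
    nu, nv = adj[u], adj[v]
    common = nu & nv
    union = nu | nv
    return {
        # Number of shared neighbours.
        "common_neighbors": len(common),
        # Shared neighbours divided by all neighbours of either node.
        "jaccard": len(common) / len(union) if union else 0.0,
        # Shared neighbours weighted by 1 / ln(degree).
        "adamic_adar": sum(1.0 / math.log(len(adj[z]))
                           for z in common if len(adj[z]) > 1),
        # Product of the two degrees.
        "preferential_attachment": len(nu) * len(nv),
        # Shared neighbours weighted by 1 / degree.
        "resource_allocation": sum(1.0 / len(adj[z]) for z in common),
    }


# Hypothetical toy graph with edges a-c, a-d, b-c, b-d, b-e.
adj = {
    "a": {"c", "d"},
    "b": {"c", "d", "e"},
    "c": {"a", "b"},
    "d": {"a", "b"},
    "e": {"b"},
}

scores = similarity_scores(adj, "a", "b")
print(scores)
```

Computing each score for every node pair in every time lag would give one similarity matrix per metric and lag, which is the kind of input the abstract describes for the VAR and SVM models.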