Event detection in location-based social networks
With the advent of social networks and the rise of mobile technologies, users have become ubiquitous sensors capable of monitoring various real-world events in a crowd-sourced manner. Location-based social networks have proven to be faster than traditional media channels in reporting and geo-locating breaking news; e.g., Osama Bin Laden’s death was first confirmed on Twitter even before the announcement from the communication department at the White House. However, the deluge of user-generated data on these networks requires intelligent systems capable of identifying and characterizing such events in a comprehensive manner. The data mining community coined the term event detection to refer to the task of uncovering emerging patterns in data streams. Nonetheless, most data mining techniques do not reproduce the underlying data generation process, which hampers their ability to self-adapt in fast-changing scenarios. Because of this, we propose a probabilistic machine learning approach to event detection which explicitly models the data generation process and enables reasoning about the discovered events. To set forth the differences between the two approaches, we present two techniques for the problem of event detection in Twitter: a data mining technique called Tweet-SCAN and a machine learning technique called Warble. We assess and compare both techniques on a dataset of tweets geo-located in the city of Barcelona during its annual festivities.
Last but not least, we present the algorithmic changes and data processing frameworks to scale up the proposed techniques to big data workloads. This work is partially supported by Obra Social “la Caixa”, by the Spanish Ministry of Science and Innovation under contract (TIN2015-65316), by the Severo Ochoa Program (SEV2015-0493), by SGR programs of the Catalan Government (2014-SGR-1051, 2014-SGR-118), Collectiveware (TIN2015-66863-C2-1-R) and BSC/UPC NVIDIA GPU Center of Excellence. We would also like to thank the reviewers for their constructive feedback. Peer Reviewed. Postprint (author's final draft).
Approximated and User Steerable tSNE for Progressive Visual Analytics
Progressive Visual Analytics aims at improving the interactivity in existing
analytics techniques by means of visualization as well as interaction with
intermediate results. One key method for data analysis is dimensionality
reduction, for example, to produce 2D embeddings that can be visualized and
analyzed efficiently. t-Distributed Stochastic Neighbor Embedding (tSNE) is a
well-suited technique for the visualization of high-dimensional data.
tSNE can create meaningful intermediate results but suffers from a slow
initialization that constrains its application in Progressive Visual Analytics.
We introduce a controllable tSNE approximation (A-tSNE), which trades off speed
and accuracy, to enable interactive data exploration. We offer real-time
visualization techniques, including a density-based solution and a Magic Lens
to inspect the degree of approximation. With this feedback, the user can decide
on local refinements and steer the approximation level during the analysis. We
demonstrate our technique with several datasets, in a real-world research
scenario and for the real-time analysis of high-dimensional streams to
illustrate its effectiveness for interactive data analysis.
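The speed/accuracy trade-off mentioned in this abstract can be illustrated with a small sketch. The abstract does not say how A-tSNE approximates the initialization, so everything below is an assumption made for illustration: the sketch truncates each point's Gaussian neighbourhood to its k nearest neighbours, uses a fixed bandwidth `sigma` instead of tSNE's per-point perplexity search, and the function names `conditional_similarities` and `truncation_error` are hypothetical. It only shows that a cheaper neighbourhood (smaller k) gives a coarser approximation of the similarities.

```python
import math
import random


def conditional_similarities(points, i, sigma=1.0, k=None):
    """Return {j: p_{j|i}} from a Gaussian kernel with fixed bandwidth sigma.

    Simplification (assumption): real tSNE tunes sigma per point to match a
    perplexity. If k is given, only the k nearest neighbours of point i keep
    non-zero probability mass.
    """
    # Squared Euclidean distance from point i to every other point.
    d2 = {j: sum((a - b) ** 2 for a, b in zip(points[i], p))
          for j, p in enumerate(points) if j != i}
    keep = sorted(d2, key=d2.get)
    if k is not None:
        keep = keep[:k]
    weights = {j: math.exp(-d2[j] / (2.0 * sigma ** 2)) for j in keep}
    total = sum(weights.values())
    return {j: w / total for j, w in weights.items()}


def truncation_error(points, i, k, sigma=1.0):
    """Total variation distance between exact and k-truncated p_{.|i}."""
    exact = conditional_similarities(points, i, sigma)
    approx = conditional_similarities(points, i, sigma, k)
    return 0.5 * sum(abs(exact[j] - approx.get(j, 0.0)) for j in exact)


# Synthetic demo data (hypothetical): 40 points in 3 dimensions.
rng = random.Random(0)
demo_points = [[rng.gauss(0.0, 1.0) for _ in range(3)] for _ in range(40)]
for k in (5, 20, 39):
    print(k, truncation_error(demo_points, 0, k))
```

Smaller k means fewer distances to keep and normalise but a larger deviation from the exact similarities; with k equal to the number of other points the truncation is a no-op.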
A Data-Aided Channel Estimation Scheme for Decoupled Systems in Heterogeneous Networks
Uplink/downlink (UL/DL) decoupling promises more flexible cell association
and higher throughput in heterogeneous networks (HetNets); however, it hampers
the acquisition of DL channel state information (CSI) in time-division-duplex
(TDD) systems because different base stations (BSs) are connected in UL and DL.
In this paper, we propose a novel data-aided (DA) channel estimation scheme to
address this problem by utilizing decoded UL data to exploit CSI from the
received UL data signal in decoupled HetNets where a massive multiple-input
multiple-output BS and dense small cell BSs are deployed. We analytically
estimate the bit error rate (BER) performance of the decoded UL data, which is
used to derive an approximated normalized mean square error (NMSE) expression
of the DA minimum mean square error (MMSE) estimator. A comparison with the
conventional least squares (LS) and MMSE estimators shows that the NMSE
performance of every estimator is determined by its signal-to-noise ratio
(SNR)-like term, and that the SNR-like term of the DA method contains an
increment consisting of UL data power, UL data length and BER values, which
suggests that the DA method outperforms the conventional ones in all scenarios.
Higher UL data power, longer UL data length and better BER performance lead to
more accurate estimated channels with the DA method. Numerical results verify
that the analytical BER and NMSE results are close to the simulated ones and
that a remarkable gain in both NMSE and DL rate can be achieved by the DA
method in multiple scenarios with different modulations.
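The statement that NMSE is governed by an SNR-like term can be made concrete with a minimal sketch for the two conventional estimators only. The model is an assumption, not taken from the paper: a scalar channel h ~ CN(0, 1) observed as z = h + n with n ~ CN(0, 1/snr_like), for which the textbook results are NMSE = 1/snr_like for LS and 1/(1 + snr_like) for MMSE. The abstract does not give the DA expression, so the DA increment is not modelled; the function name `simulate_nmse` is hypothetical.

```python
import random


def simulate_nmse(snr_like, trials=20000, seed=0):
    """Monte Carlo NMSE of LS and MMSE estimates of a scalar channel.

    Assumed model: h ~ CN(0, 1), observation z = h + n after pilot
    correlation, with n ~ CN(0, 1 / snr_like).
    Returns (nmse_ls, nmse_mmse), each as E|h - h_hat|^2 / E|h|^2.
    """
    rng = random.Random(seed)
    h_std = 0.5 ** 0.5                 # std per real dimension of h
    n_std = (0.5 / snr_like) ** 0.5    # std per real dimension of n
    err_ls = err_mmse = power = 0.0
    for _ in range(trials):
        h = complex(rng.gauss(0.0, h_std), rng.gauss(0.0, h_std))
        n = complex(rng.gauss(0.0, n_std), rng.gauss(0.0, n_std))
        z = h + n
        h_ls = z                                     # LS: take z as is
        h_mmse = z * snr_like / (1.0 + snr_like)     # MMSE: shrink z
        err_ls += abs(h - h_ls) ** 2
        err_mmse += abs(h - h_mmse) ** 2
        power += abs(h) ** 2
    return err_ls / power, err_mmse / power


for snr in (1.0, 10.0, 100.0):
    ls, mmse = simulate_nmse(snr)
    # Simulated values next to the closed forms 1/snr and 1/(1 + snr).
    print(snr, ls, 1.0 / snr, mmse, 1.0 / (1.0 + snr))
```

Under this assumed model, anything that enlarges the SNR-like term lowers the NMSE, which is consistent with the abstract's claim that an increment in that term favours the DA method.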
- …