
    Event detection in location-based social networks

    With the advent of social networks and the rise of mobile technologies, users have become ubiquitous sensors capable of monitoring various real-world events in a crowd-sourced manner. Location-based social networks have proven to be faster than traditional media channels in reporting and geo-locating breaking news; for instance, Osama Bin Laden’s death was first confirmed on Twitter, even before the announcement from the communication department at the White House. However, the deluge of user-generated data on these networks requires intelligent systems capable of identifying and characterizing such events comprehensively. The data mining community coined the term "event detection" to refer to the task of uncovering emerging patterns in data streams. Nonetheless, most data mining techniques do not model the underlying data generation process, which hampers their ability to self-adapt in fast-changing scenarios. For this reason, we propose a probabilistic machine learning approach to event detection that explicitly models the data generation process and enables reasoning about the discovered events. To set forth the differences between the two approaches, we present two techniques for event detection on Twitter: a data mining technique called Tweet-SCAN and a machine learning technique called Warble. We assess and compare both techniques on a dataset of tweets geo-located in the city of Barcelona during its annual festivities.
    Last but not least, we present the algorithmic changes and data processing frameworks required to scale the proposed techniques up to big data workloads.

    This work is partially supported by Obra Social “la Caixa”, by the Spanish Ministry of Science and Innovation under contract TIN2015-65316, by the Severo Ochoa Program (SEV2015-0493), by the SGR programs of the Catalan Government (2014-SGR-1051, 2014-SGR-118), by Collectiveware (TIN2015-66863-C2-1-R) and by the BSC/UPC NVIDIA GPU Center of Excellence. We would also like to thank the reviewers for their constructive feedback.

    Peer reviewed. Postprint (author's final draft).
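    Tweet-SCAN builds on density-based clustering of geo-located posts. As a rough, self-contained sketch of that idea (not the paper's actual model), one can cluster synthetic "tweets" in space and time with DBSCAN; all data and parameter values below are illustrative assumptions.

    ```python
    # Illustrative sketch of density-based event detection in the spirit
    # of Tweet-SCAN: cluster geo-located posts in space and time with
    # DBSCAN. Data and parameters are synthetic assumptions, not the
    # paper's actual model.
    import numpy as np
    from sklearn.cluster import DBSCAN

    rng = np.random.default_rng(0)
    # Synthetic "tweets": (latitude, longitude, hour-of-day) — one dense
    # burst around a festival venue plus uniform background chatter.
    event = rng.normal(loc=[41.39, 2.17, 21.0],
                       scale=[0.001, 0.001, 0.5], size=(40, 3))
    noise = rng.uniform(low=[41.35, 2.10, 0.0],
                        high=[41.45, 2.25, 24.0], size=(60, 3))
    X = np.vstack([event, noise])

    # Standardize so one unit means roughly the same "closeness" in
    # space and time, then run DBSCAN; label -1 marks background noise.
    X_scaled = (X - X.mean(axis=0)) / X.std(axis=0)
    labels = DBSCAN(eps=0.5, min_samples=10).fit_predict(X_scaled)

    n_events = len(set(labels)) - (1 if -1 in labels else 0)
    print("detected event clusters:", n_events)  # the festival burst
    ```

    The dense burst emerges as one cluster while the scattered background is flagged as noise, which is the behavior an event detector wants on a tweet stream.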

    Approximated and User Steerable tSNE for Progressive Visual Analytics

    Progressive Visual Analytics aims at improving the interactivity of existing analytics techniques by means of visualization of, and interaction with, intermediate results. One key method for data analysis is dimensionality reduction, for example, to produce 2D embeddings that can be visualized and analyzed efficiently. t-Distributed Stochastic Neighbor Embedding (tSNE) is a well-suited technique for the visualization of high-dimensional data. tSNE can create meaningful intermediate results but suffers from a slow initialization that constrains its application in Progressive Visual Analytics. We introduce a controllable tSNE approximation (A-tSNE), which trades off speed and accuracy, to enable interactive data exploration. We offer real-time visualization techniques, including a density-based solution and a Magic Lens to inspect the degree of approximation. With this feedback, the user can decide on local refinements and steer the approximation level during the analysis. We demonstrate our technique with several datasets, in a real-world research scenario and for the real-time analysis of high-dimensional streams, to illustrate its effectiveness for interactive data analysis.
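    A-tSNE itself is not available in scikit-learn; the sketch below only illustrates the kind of speed/accuracy knob it exposes, using the Barnes-Hut approximation in sklearn's TSNE, whose `angle` parameter controls how coarsely gradients are approximated (near 0 is slower and more exact, near 1 is faster and coarser). The dataset and parameter values are illustrative choices.

    ```python
    # Sketch of a speed/accuracy trade-off in tSNE via the Barnes-Hut
    # `angle` parameter (a stand-in for A-tSNE's steerable approximation,
    # which scikit-learn does not implement).
    from sklearn.datasets import load_digits
    from sklearn.manifold import TSNE

    X, _ = load_digits(return_X_y=True)
    X = X[:300]  # small subset so both runs finish quickly

    # Coarse, fast approximation: good enough for a first interactive look.
    emb_coarse = TSNE(n_components=2, angle=0.9, perplexity=30,
                      init="pca", random_state=0).fit_transform(X)
    # Finer approximation: what a user would steer towards when refining.
    emb_fine = TSNE(n_components=2, angle=0.2, perplexity=30,
                    init="pca", random_state=0).fit_transform(X)

    print(emb_coarse.shape, emb_fine.shape)  # a 2D position per sample
    ```

    In a progressive setting, the coarse embedding would be shown immediately and selectively refined, rather than recomputed from scratch as done here.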

    A Data-Aided Channel Estimation Scheme for Decoupled Systems in Heterogeneous Networks

    Uplink/downlink (UL/DL) decoupling promises more flexible cell association and higher throughput in heterogeneous networks (HetNets); however, it hampers the acquisition of DL channel state information (CSI) in time-division-duplex (TDD) systems, because a user may connect to different base stations (BSs) in the UL and the DL. In this paper, we propose a novel data-aided (DA) channel estimation scheme that addresses this problem by re-using decoded UL data to extract CSI from the received UL data signal, in decoupled HetNets where a massive multiple-input multiple-output (MIMO) BS and dense small-cell BSs are deployed. We analytically estimate the BER performance of the decoded UL data, which is then used to derive an approximate normalized mean square error (NMSE) expression for the DA minimum mean square error (MMSE) estimator. Compared with conventional least squares (LS) and MMSE estimation, we show that the NMSE performance of every estimator is determined by its signal-to-noise-ratio (SNR)-like term, and that the SNR-like term of the DA method gains an increment that depends on the UL data power, the UL data length and the BER, which implies that the DA method outperforms the conventional ones in all scenarios. Higher UL data power, longer UL data and better BER performance lead to more accurate channel estimates with the DA method. Numerical results verify that the analytical BER and NMSE results are close to the simulated ones, and that the DA method achieves a remarkable gain in both NMSE and DL rate in multiple scenarios with different modulations.
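    The paper derives closed-form NMSE expressions; the Monte-Carlo sketch below only illustrates the intuition behind the DA gain: re-using decoded UL data as extra "pilots" lengthens the effective training sequence and lowers the NMSE of an LMMSE channel estimate. Unlike the paper, it assumes a scalar channel and perfectly decoded data symbols, and all variable names and parameter values are illustrative.

    ```python
    # Monte-Carlo intuition for data-aided (DA) channel estimation:
    # treating decoded UL data as additional training symbols lengthens
    # the effective pilot sequence and reduces the NMSE. Simplifying
    # assumptions: scalar Rayleigh channel, unit-power symbols, and
    # error-free decoding of the data symbols.
    import numpy as np

    rng = np.random.default_rng(1)
    n_trials, n_pilot, n_data, snr_db = 2000, 8, 64, 5
    sigma2 = 10 ** (-snr_db / 10)  # noise variance for unit-power symbols

    def mc_nmse(train_len):
        """Average |h - h_hat|^2 of an LMMSE estimate from train_len symbols."""
        err = 0.0
        for _ in range(n_trials):
            h = (rng.standard_normal() + 1j * rng.standard_normal()) / np.sqrt(2)
            x = np.exp(1j * 2 * np.pi * rng.random(train_len))  # unit-power symbols
            n = np.sqrt(sigma2 / 2) * (rng.standard_normal(train_len)
                                       + 1j * rng.standard_normal(train_len))
            y = h * x + n
            # LMMSE estimate for E|h|^2 = 1: regularized matched filter.
            h_hat = (x.conj() @ y) / (np.vdot(x, x).real + sigma2)
            err += abs(h - h_hat) ** 2
        return err / n_trials  # E|h|^2 = 1, so this is already normalized

    nmse_pilot = mc_nmse(n_pilot)           # conventional: pilots only
    nmse_da = mc_nmse(n_pilot + n_data)     # DA: pilots + decoded data
    print(f"NMSE pilot-only: {nmse_pilot:.4f}, data-aided: {nmse_da:.4f}")
    ```

    The longer effective training window acts exactly like the SNR-like increment the abstract describes: more data power and a longer data block shrink the estimation error, here by roughly the ratio of the two training lengths.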