Distributed-based massive processing of activity logs for efficient user modeling in a Virtual Campus

Author: Caballé Llobet Santiago
Xhafa Xhafa Fatos
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2013

This paper reports on a multi-fold approach for building user models based on the identification of navigation patterns in a virtual campus, allowing the campus' usability to be adapted to the actual learners' needs and thus stimulating the learning experience. However, user modeling in this context implies constant processing and analysis of user interaction data during long-term learning activities, which produces huge amounts of valuable data, typically stored in server log files. Because the log files generated daily are large or very large, massive processing is a foremost step in extracting useful information. To this end, this work first studies the viability of processing large log data files of a real Virtual Campus using different distributed infrastructures. More precisely, we study the time performance of massive processing of daily log files implemented following the master-slave paradigm and evaluated using Cluster Computing and PlanetLab platforms. The study reveals the complexity and challenges of massive processing in the big data era, such as the need to carefully tune the log file processing in terms of the chunk size of log data to be processed at slave nodes, as well as the processing bottleneck in truly geographically distributed infrastructures caused by the communication overhead between the master and slave nodes. Then, an application of the massive processing approach, resulting in log data processed and stored in a well-structured format, is presented. We show how to extract knowledge from the log data analysis by using the WEKA framework for data mining, showing its usefulness for effectively building user models by identifying interesting navigation patterns of on-line learners. The study is motivated and conducted in the context of the actual data logs of the Virtual Campus of the Open University of Catalonia. Peer Reviewed. Postprint (author's final draft).
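The chunked master-slave processing described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the log line layout (`timestamp user page`), the function names, and the use of local threads in place of remote slave nodes are all assumptions made for illustration. The `chunk_size` parameter corresponds to the tuning knob the abstract mentions.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(lines, chunk_size):
    """Master step: cut the log into chunks of at most chunk_size lines."""
    return [lines[i:i + chunk_size] for i in range(0, len(lines), chunk_size)]


def process_chunk(chunk):
    """Worker step: count visits per (user, page) within one chunk.

    Assumes a hypothetical 'timestamp user page' layout; real campus
    logs will differ.
    """
    counts = Counter()
    for line in chunk:
        parts = line.split()
        if len(parts) != 3:
            continue  # skip lines that do not match the assumed layout
        _, user, page = parts
        counts[(user, page)] += 1
    return counts


def process_log(lines, chunk_size=2, workers=2):
    """Master loop: dispatch chunks to workers and merge partial counts."""
    chunks = split_into_chunks(lines, chunk_size)
    total = Counter()
    # Threads stand in for slave nodes here; a real deployment would
    # ship each chunk over the network, which adds communication time.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for partial in pool.map(process_chunk, chunks):
            total.update(partial)
    return total


sample_log = [
    "2013-01-01T10:00 alice /forum",
    "2013-01-01T10:01 bob /mail",
    "2013-01-01T10:02 alice /forum",
    "malformed line",
    "2013-01-01T10:03 alice /mail",
]
result = process_log(sample_log)
```

Because the per-chunk counts are merged by addition, the final result does not depend on `chunk_size`; only the balance between per-chunk work and dispatch overhead changes.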

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

The Oberta in open access

A Generalized Framework on Beamformer Design and CSI Acquisition for Single-Carrier Massive MIMO Systems in Millimeter Wave Channels

Author: Ayanoglu Ender
Guvensen Gokhan M.
Publication date: 01/01/2016

In this paper, we establish a general framework for reduced-dimensional channel state information (CSI) estimation and pre-beamformer design for frequency-selective massive multiple-input multiple-output (MIMO) systems employing single-carrier (SC) modulation in time division duplex (TDD) mode, exploiting the joint angle-delay domain channel sparsity at millimeter (mm) wave frequencies. First, based on a generic subspace projection that takes the joint angle-delay power profile and user grouping into account, the reduced-rank minimum mean square error (RR-MMSE) instantaneous CSI estimator is derived for spatially correlated wideband MIMO channels. Second, the statistical pre-beamformer design is considered for frequency-selective SC massive MIMO channels. We examine the dimension reduction problem and the construction of the subspace (beamspace) on which the RR-MMSE estimation can be realized as accurately as possible. Finally, a spatio-temporal domain correlator-type reduced-rank channel estimator is obtained as an approximation of the RR-MMSE estimate by carrying out least squares (LS) estimation in a proper reduced-dimensional beamspace. It is observed that the proposed techniques show remarkable robustness to pilot interference (or contamination), with a significant reduction in pilot overhead.
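The idea of LS estimation in a reduced-dimensional beamspace can be written out in generic form. The following is a textbook-style sketch, not the paper's derivation, and every symbol is an assumption introduced here: $\mathbf{y}$ is the received pilot observation, $\mathbf{S}$ a known pilot matrix, $\mathbf{h} \in \mathbb{C}^{N}$ the channel, $\mathbf{n}$ noise, and $\mathbf{U} \in \mathbb{C}^{N \times D}$ a beamspace basis with orthonormal columns and $D \ll N$.

```latex
% Assumed observation model (illustrative, not taken from the paper)
\mathbf{y} = \mathbf{S}\,\mathbf{h} + \mathbf{n},
\qquad \mathbf{h} \approx \mathbf{U}\,\mathbf{g},
\quad \mathbf{g} \in \mathbb{C}^{D}

% LS estimate of the D beamspace coefficients
\hat{\mathbf{g}}_{\mathrm{LS}}
  = \arg\min_{\mathbf{g}} \left\| \mathbf{y} - \mathbf{S}\mathbf{U}\mathbf{g} \right\|^{2}
  = \left( \mathbf{U}^{H}\mathbf{S}^{H}\mathbf{S}\mathbf{U} \right)^{-1}
    \mathbf{U}^{H}\mathbf{S}^{H}\mathbf{y}

% Map back to the full antenna domain
\hat{\mathbf{h}} = \mathbf{U}\,\hat{\mathbf{g}}_{\mathrm{LS}}
```

The inverse exists when $\mathbf{S}\mathbf{U}$ has full column rank. Under these assumptions only $D$ coefficients are estimated instead of $N$, which is one way to read the reduction in pilot overhead reported in the abstract.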

arXiv.org e-Print Archive

Crossref

OpenMETU (Middle East Technical University)

Off-Policy Evaluation of Probabilistic Identity Data in Lookalike Modeling

Author: Cotta Randell
Hu Mingyang
Jiang Dan
Liao Peizhou
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 03/01/2019

We evaluate the impact of probabilistically-constructed digital identity data collected from Sep. to Dec. 2017 (approx.), in the context of Lookalike-targeted campaigns. The backbone of this study is a large set of probabilistically-constructed "identities", represented as small bags of cookies and mobile ad identifiers with associated metadata, that are likely all owned by the same underlying user. The identity data allows us to generate "identity-based", rather than "identifier-based", user models, giving a fuller picture of the interests of the users underlying the identifiers. We employ off-policy techniques to evaluate the potential of identity-powered lookalike models without incurring the risk of allowing untested models to direct large amounts of ad spend or the large cost of performing A/B tests. We add to historical work on off-policy evaluation by noting a significant type of "finite-sample bias" that occurs for studies combining modestly-sized datasets and evaluation metrics involving rare events (e.g., conversions). We illustrate this bias using a simulation study that later informs the handling of inverse propensity weights in our analyses on real data. We demonstrate significant lift in identity-powered lookalikes versus an identity-ignorant baseline: on average ~70% lift in conversion rate. This rises to factors of ~(4-32)x for identifiers that have little data themselves but can be inferred to belong to users with substantial data to aggregate across identifiers. This implies that identity-powered user modeling is especially important in the context of identifiers having very short lifespans (i.e., frequently churned cookies). Our work motivates and informs the use of probabilistically-constructed identities in marketing. It also deepens the canon of examples in which off-policy learning has been employed to evaluate the complex systems of the internet economy. Comment: Accepted by WSDM 201
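Off-policy evaluation with inverse propensity weights, as mentioned in this abstract, can be sketched with the standard inverse propensity scoring (IPS) estimator. This is a generic illustration, not the authors' method: the function name, the log layout, and the optional weight clipping are assumptions, and clipping is only one common way of handling large weights.

```python
def ips_estimate(logs, target_prob, clip=None):
    """Estimate the mean reward of a target policy from logged data.

    logs: list of (action, reward, logging_prob) tuples, where
          logging_prob is the probability with which the logging
          policy chose that action (must be > 0).
    target_prob: function mapping an action to the probability that
          the target policy would choose it.
    clip: optional cap on each weight (an assumed variance-control
          choice; it introduces bias).
    """
    total = 0.0
    for action, reward, logging_prob in logs:
        weight = target_prob(action) / logging_prob
        if clip is not None:
            weight = min(weight, clip)
        total += weight * reward
    return total / len(logs)


# Toy logs: the logging policy picked "a" or "b" uniformly at random.
toy_logs = [
    ("a", 1, 0.5),
    ("b", 0, 0.5),
    ("a", 0, 0.5),
    ("b", 0, 0.5),
]


def always_a(action):
    """Hypothetical target policy that always chooses action 'a'."""
    return 1.0 if action == "a" else 0.0


unclipped = ips_estimate(toy_logs, always_a)
clipped = ips_estimate(toy_logs, always_a, clip=1.5)
```

In this toy data the unclipped estimate equals the observed conversion rate of action "a" (0.5), while clipping the weights at 1.5 pulls the estimate down to 0.375. With rare rewards and few samples, a handful of weighted events dominate the sum, which is the kind of setting in which the abstract reports finite-sample bias.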

arXiv.org e-Print Archive

Crossref