22,842 research outputs found

    Pyramid: Enhancing Selectivity in Big Data Protection with Count Featurization

    Protecting vast quantities of data poses a daunting challenge for the growing number of organizations that collect, stockpile, and monetize it. The ability to distinguish data that is actually needed from data collected "just in case" would help these organizations to limit the latter's exposure to attack. A natural approach might be to monitor data use and retain only the working set of in-use data in accessible storage; unused data can be evicted to a highly protected store. However, many of today's big data applications rely on machine learning (ML) workloads that are periodically retrained by accessing, and thus exposing to attack, the entire data store. Training set minimization methods, such as count featurization, are often used to limit the data needed to train ML workloads to improve performance or scalability. We present Pyramid, a limited-exposure data management system that builds upon count featurization to enhance data protection. As such, Pyramid uniquely introduces both the idea and proof-of-concept for leveraging training set minimization methods to instill rigor and selectivity into big data management. We integrated Pyramid into Spark Velox, a framework for ML-based targeting and personalization. We evaluate it on three applications and show that Pyramid approaches state-of-the-art models while training on less than 1% of the raw data.

    Parsimonious and Adaptive Contextual Information Acquisition in Recommender Systems

    Also published online by CEUR Workshop Proceedings (CEUR-WS.org, ISSN 1613-0073). Context-Aware Recommender System (CARS) models are trained on datasets of context-dependent user preferences (ratings and context information). Since the number of context-dependent preferences increases exponentially with the number of contextual factors, and certain contextual information is still hard to acquire automatically (e.g., the user's mood or for whom the user is buying the searched item), it is fundamental to identify and acquire those factors that truly influence the user preferences and the ratings. In particular, this ensures that (i) the user effort in specifying contextual information is kept to a minimum, and (ii) the system's performance is not negatively impacted by irrelevant contextual information. In this paper, we propose a novel method which, unlike existing ones, directly estimates the impact of context on rating predictions and adaptively identifies the contextual factors that are deemed to be useful to be elicited from the users. Our experimental evaluation shows that it compares favourably to various state-of-the-art context selection methods.

    Context-Aware Service Recommendation System for the Social Internet of Things

    The Social Internet of Things (SIoT) enables interconnected smart devices to share data and services, opening up opportunities for personalized service recommendations. However, existing research often overlooks crucial aspects that can enhance the accuracy and relevance of recommendations in the SIoT context. Specifically, existing techniques tend to consider the extraction of social relationships between devices and neglect the contextual presentation of service reviews. This study aims to address these gaps by exploring the contextual representation of each device-service pair. First, we propose a latent features combination technique that can capture latent feature interactions by aggregating the device-device relationships within the SIoT. Then, we leverage Factorization Machines to model higher-order feature interactions specific to each SIoT device-service pair to accomplish accurate rating prediction. Finally, we propose a service recommendation framework for SIoT based on review aggregation and feature learning processes. The experimental evaluation demonstrates the framework's effectiveness in improving service recommendation accuracy and relevance.

    Facing fear: Expression of fear facilitates processing of emotional information

    Evidence shows that manipulating the expressive component of fear can influence the processing of emotional information. Participants unobtrusively produced the expressive behaviors typical of fear, anger or happiness. Participants producing the expression of fear were faster at classifying verbal material with emotional content than participants producing the expressions of happiness or anger. These effects were especially pronounced for participants who were generally sensitive to their own bodily cues, as indicated by their degree of field-dependence measured by the Rod-and-Frame Task (Witkin & Asch, 1948). The results suggest that one way of eliciting the cognitive consequences of fear is by inducing the embodied expressive behavior.

    On the Relation between Sensitivity and Accuracy in In-context Learning

    In-context learning (ICL) suffers from oversensitivity to the prompt, which makes it unreliable in real-world scenarios. We study the sensitivity of ICL with respect to multiple types of perturbations. First, we find that label bias obscures true ICL sensitivity, and hence prior work may have significantly underestimated the true ICL sensitivity. Second, we observe a strong negative correlation between ICL sensitivity and accuracy, with sensitive predictions less likely to be correct. Motivated by these observations, we propose SenSel, a few-shot selective prediction method based on ICL sensitivity. Experiments on ten classification benchmarks show that SenSel consistently outperforms a commonly used confidence-based selective prediction baseline.

    Listening comprehension and strategy use: a longitudinal exploration

    This paper examines the development of strategy use over 6 months in two lower-intermediate learners of L2 French in secondary schools in England. These learners were selected from a larger sample on the basis of their scores on a recall protocol completed after listening to short passages at two time points: one was consistently a high scorer; the other, a low scorer. Qualitative data on these two learners’ strategic behaviour were gathered at the two time points from verbal reports made by learners while they were completing a multiple-choice listening task. Our results show a high degree of stability of strategy use over the time period, with pre-existing differences between the high and low scorer persisting. The theoretical and pedagogical implications of these findings are discussed.

    Revisiting the epistemology of fact-checking

    Joseph E. Uscinski and Ryden W. Butler (2013) argue that fact-checking should be condemned to the dustbin of history because the methods fact-checkers use to select statements, consider evidence, and render judgment fail to stand up to the rigors of scientific inquiry and threaten to stifle political debate. However, the premises upon which they build their arguments are flawed. By sampling from multiple “fact-checking agencies” that do not practice fact-checking on a regular basis in a consistent manner, they perpetuate the selection effects they criticize and thus undermine their own position. Furthermore, not only do their arguments suffer from overgeneralization, they also fail to offer empirical quantification to support some of their anecdotal criticisms. This rejoinder offers a study demonstrating a high level of consistency in fact-checking and argues that as long as unambiguous practices of deception continue, fact-checking has an important role to play in the United States and around the world.

    Using Grouped Linear Prediction and Accelerated Reinforcement Learning for Online Content Caching

    Proactive caching is an effective way to alleviate peak-hour traffic congestion by prefetching popular contents at the wireless network edge. Maximizing caching efficiency requires knowledge of the content popularity profile, which, however, is often unavailable in advance. In this paper, we first propose a new linear prediction model, named the grouped linear model (GLM), to estimate future content requests based on historical data. Unlike many existing works that assume a static content popularity profile, our model can adapt to the temporal variation of content popularity in practical systems due to the arrival of new contents and the dynamics of user preference. Based on the predicted content requests, we then propose a reinforcement learning approach with model-free acceleration (RLMA) for online cache replacement that takes into account both the cache hits and the replacement cost. This approach accelerates the learning process in non-stationary environments by generating imaginary samples for Q-value updates. Numerical results based on real-world traces show that the proposed prediction- and learning-based online caching policy outperforms all considered existing schemes. Comment: 6 pages, 4 figures, ICC 2018 workshop