
    Shortest Path Computation with No Information Leakage

    Shortest path computation is one of the most common queries in location-based services (LBSs). Although particularly useful, such queries raise serious privacy concerns. Exposing to a (potentially untrusted) LBS the client's position and her destination may reveal personal information, such as social habits, health condition, shopping preferences, lifestyle choices, etc. The only existing method for privacy-preserving shortest path computation follows the obfuscation paradigm; it prevents the LBS from inferring the source and destination of the query with a probability higher than a threshold. This implies, however, that the LBS still deduces some information (albeit not exact) about the client's location and her destination. In this paper we aim at strong privacy, where the adversary learns nothing about the shortest path query. We achieve this via established private information retrieval techniques, which we treat as black-box building blocks. Experiments on real, large-scale road networks assess the practicality of our schemes. Comment: VLDB201
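
    The abstract treats private information retrieval (PIR) as a black box. For readers unfamiliar with the idea, the following is a minimal sketch of the classic two-server XOR-based PIR scheme. It is an illustration only and not necessarily the protocol used in the paper; the function names, the integer records, and the assumption of two non-colluding servers holding identical copies of the database are all hypothetical choices made for this example.

```python
import secrets


def make_queries(n, index):
    """Client side: build two bit masks that differ only at `index`.

    Each mask on its own is uniformly random, so a single server
    learns nothing about which record the client wants.
    """
    mask_a = [secrets.randbelow(2) for _ in range(n)]
    mask_b = list(mask_a)
    mask_b[index] ^= 1  # flip exactly the bit of the wanted record
    return mask_a, mask_b


def answer(database, mask):
    """Server side: XOR together all records selected by the mask."""
    result = 0
    for record, bit in zip(database, mask):
        if bit:
            result ^= record
    return result


def retrieve(database, index):
    """Run the whole exchange; both 'servers' hold the same database here."""
    mask_a, mask_b = make_queries(len(database), index)
    # Records selected by both masks cancel out under XOR,
    # leaving only the record at `index`.
    return answer(database, mask_a) ^ answer(database, mask_b)


if __name__ == "__main__":
    # Hypothetical records, e.g. encoded entries of a precomputed path table.
    database = [11, 22, 33, 44]
    print(retrieve(database, 2))
```

    In a shortest-path setting, the records could stand for precomputed path data, so that the client fetches the entry for its source and destination without revealing which entry it asked for.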

    Memories for Life: A Review of the Science and Technology

    This paper discusses scientific, social and technological aspects of memory. Recent developments in our understanding of memory processes and mechanisms, and their digital implementation, have placed the encoding, storage, management and retrieval of information at the forefront of several fields of research. At the same time, the divisions between the biological, physical and the digital worlds seem to be dissolving. Hence opportunities for interdisciplinary research into memory are being created, between the life sciences, social sciences and physical sciences. Such research may benefit from immediate application into information management technology as a testbed. The paper describes one initiative, Memories for Life, as a potential common problem space for the various interested disciplines.

    Text Embeddings Reveal (Almost) As Much As Text

    How much private information do text embeddings reveal about the original text? We investigate the problem of embedding inversion, reconstructing the full text represented in dense text embeddings. We frame the problem as controlled generation: generating text that, when reembedded, is close to a fixed point in latent space. We find that although a naïve model conditioned on the embedding performs poorly, a multi-step method that iteratively corrects and re-embeds text is able to recover 92% of 32-token text inputs exactly. We train our model to decode text embeddings from two state-of-the-art embedding models, and also show that our model can recover important personal information (full names) from a dataset of clinical notes. Our code is available on GitHub: https://github.com/jxmorris12/vec2text. Comment: Accepted at EMNLP 202

    Repetition, pattern and the domestic: notes on the relationship between pattern and home-making

    Repetition constitutes the very essence of pattern. Repetition is also the basis of our most ordinary actions. Repetitive gestures are usually so integrated in our lives that we tend to take them for granted. It is only when repetition is excessive or absent that we become aware of its importance to us. Not least because of their everyday properties, pattern and repetition are also closely related to the domain of the domestic. On the one hand, patterned artifacts, such as wallpapers, rugs, latticed curtains, and other fabrics seem to operate naturally as signifiers of an idea of domesticity, denoting privacy, comfort and, eventually, also seclusion and confinement. On the other hand, the repetitive rituals of pattern fabrication bear strong resonance with the traditional routines of household maintenance: cleaning, sorting, laundering, and so on. Not only are both dependent on a logic of continuous reiteration, but they also tend to be considered equally mindless and prosaic, as their processes are often rated inferior in comparison to less repetitive forms of production. In “Repetition, Pattern, and the Domestic”, I investigate the foundations and implications of the identification between pattern and the home, drawing on material from historical, mythological, and psychological sources. This investigation aims to show how the repetitive mechanisms of pattern-making integrate the very dynamics of inhabitation, being essentially entangled, if sometimes inconspicuously, with the practice of spatial design.

    SimplyRetrieve: A Private and Lightweight Retrieval-Centric Generative AI Tool

    Large Language Model (LLM) based Generative AI systems have seen significant progress in recent years. Integrating a knowledge retrieval architecture allows for seamless integration of private data into publicly available Generative AI systems using pre-trained LLMs without requiring additional model fine-tuning. Moreover, the Retrieval-Centric Generation (RCG) approach, a promising future research direction that explicitly separates the roles of LLMs and retrievers in context interpretation and knowledge memorization, potentially leads to more efficient implementation. SimplyRetrieve is an open-source tool with the goal of providing a localized, lightweight, and user-friendly interface to these sophisticated advancements to the machine learning community. SimplyRetrieve features a GUI and API based RCG platform, assisted by a Private Knowledge Base Constructor and a Retrieval Tuning Module. By leveraging these capabilities, users can explore the potential of RCG for improving generative AI performance while maintaining privacy standards. The tool is available at https://github.com/RCGAI/SimplyRetrieve with an MIT license. Comment: 12 pages, 6 figures

    Talk the Walk: Synthetic Data Generation for Conversational Music Recommendation

    Recommendation systems are ubiquitous yet often difficult for users to control and adjust when recommendation quality is poor. This has motivated the development of conversational recommendation systems (CRSs), with control over recommendations provided through natural language feedback. However, building conversational recommendation systems requires conversational training data involving user utterances paired with items that cover a diverse range of preferences. Such data has proved challenging to collect scalably using conventional methods like crowdsourcing. We address this challenge in the context of item-set recommendation, noting the increasing attention to this task motivated by use cases like music, news and recipe recommendation. We present a new technique, TalkTheWalk, that synthesizes realistic high-quality conversational data by leveraging domain expertise encoded in widely available curated item collections, showing how these can be transformed into corresponding item set curation conversations. Specifically, TalkTheWalk generates a sequence of hypothetical yet plausible item sets returned by a system, then uses a language model to produce corresponding user utterances. Applying TalkTheWalk to music recommendation, we generate over one million diverse playlist curation conversations. A human evaluation shows that the conversations contain consistent utterances with relevant item sets, nearly matching the quality of small human-collected conversational data for this task. At the same time, when the synthetic corpus is used to train a CRS, it improves Hits@100 by 10.5 points on a benchmark dataset over standard baselines and is preferred over the top-performing baseline in an online evaluation.
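
    The abstract reports results in Hits@100 without defining the metric. The sketch below shows one common definition, under which a query counts as a hit if at least one relevant item appears among the top k ranked items, averaged over all queries. This definition is an assumption (the paper may use a variant, such as the fraction of relevant items retrieved), and the function names and toy data are hypothetical.

```python
def hits_at_k(ranked_items, relevant_items, k=100):
    """Return 1.0 if any relevant item is in the top-k of the ranking, else 0.0."""
    top_k = ranked_items[:k]
    return 1.0 if any(item in relevant_items for item in top_k) else 0.0


def mean_hits_at_k(rankings, relevant_sets, k=100):
    """Average Hits@k over a list of queries (one ranking per query)."""
    scores = [hits_at_k(r, rel, k) for r, rel in zip(rankings, relevant_sets)]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Toy example: two queries, each with a ranked list of track IDs.
    rankings = [["a", "b", "c"], ["d", "e", "f"]]
    relevant_sets = [{"c"}, {"x"}]
    print(mean_hits_at_k(rankings, relevant_sets, k=3))
```

    Under this definition, an improvement of 10.5 points would mean that the share of queries with at least one relevant item in the top 100 rose by 10.5 percentage points.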