Privacy-Preserving Multi-Document Summarization
State-of-the-art extractive multi-document summarization systems are usually
designed without any concern about privacy issues, meaning that all documents
are open to third parties. In this paper we propose a privacy-preserving
approach to multi-document summarization. Our approach enables other parties to
obtain summaries without learning anything else about the original documents'
content. We use a hashing scheme known as Secure Binary Embeddings to convert
document representations containing key phrases and bag-of-words into bit
strings, allowing the computation of approximate distances instead of exact
ones. Our experiments indicate that our system yields similar results to its
non-private counterpart on standard multi-document evaluation datasets.
Comment: 4 pages, In Proceedings of 2nd ACM SIGIR Workshop on
Privacy-Preserving Information Retrieval, August 2015. arXiv admin note: text
overlap with arXiv:1407.541
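
The abstract does not give the hashing details, so the following is only a minimal sketch of how a Secure Binary Embedding might turn a document vector into a bit string. It assumes the commonly described universal-quantization form, in which a vector x is projected by a secret random Gaussian matrix A, shifted by a secret dither w, scaled by a step size delta, and reduced to the parity of the resulting quantization cell. The function names (make_sbe, sbe_hash, normalized_hamming) and all parameter values are hypothetical and are not taken from the paper.

```python
import math
import random


def make_sbe(dim, n_bits, delta, seed=0):
    """Draw the secret key: Gaussian projection rows A and uniform dither w."""
    rng = random.Random(seed)
    A = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]
    w = [rng.uniform(0.0, delta) for _ in range(n_bits)]
    return A, w, delta


def sbe_hash(x, key):
    """Map a real-valued vector to a list of bits (one bit per projection row)."""
    A, w, delta = key
    bits = []
    for row, dither in zip(A, w):
        projection = sum(a * v for a, v in zip(row, x)) + dither
        # Parity of the quantization cell index: alternating 0/1 bands of width delta.
        bits.append(math.floor(projection / delta) % 2)
    return bits


def normalized_hamming(b1, b2):
    """Fraction of positions at which two equal-length bit strings differ."""
    return sum(p != q for p, q in zip(b1, b2)) / len(b1)


if __name__ == "__main__":
    key = make_sbe(dim=8, n_bits=512, delta=1.0, seed=1)
    doc = [0.1 * i for i in range(8)]           # toy document vector
    near = [doc[0] + 0.05] + doc[1:]            # small perturbation
    far = [v + 10.0 for v in doc]               # very different vector
    h = sbe_hash(doc, key)
    print("near:", normalized_hamming(h, sbe_hash(near, key)))
    print("far: ", normalized_hamming(h, sbe_hash(far, key)))
```

Under these assumptions, the normalized Hamming distance between two hashes grows with the Euclidean distance between the original vectors only while that distance is small relative to delta; for vectors that are far apart it levels off near 0.5. This is the sense in which such a scheme allows approximate rather than exact distances: a party holding only the bit strings can tell which documents are close without recovering how far apart distant documents are.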