Analysing Compression Techniques for In-Memory Collaborative Filtering

Authors: Macdonald Craig, Ounis Iadh, Vargas Saul
Publication date: 01/01/2015

Following the recent trend of in-memory data processing, it is common practice to keep collaborative filtering data in main memory when generating recommendations in academic and industrial recommender systems. In this paper, we study the impact of integer compression techniques for in-memory collaborative filtering data in terms of space and time efficiency. Our results provide relevant observations about when and how to compress collaborative filtering data. First, we observe that, depending on the memory constraints, compression techniques may speed up or slow down the performance of state-of-the-art collaborative filtering algorithms. Second, after comparing different compression techniques, we find the Frame of Reference (FOR) technique to be the best option in terms of space and time efficiency under different memory constraints.
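
The abstract names Frame of Reference (FOR) without describing it. As a rough illustration only, the sketch below shows the general idea: split the integers into blocks, store each block's minimum as a reference, and keep only the small offsets from it in a fixed number of bits. The function names, the block size of 4, and the sample item identifiers are assumptions made for this example and are not taken from the paper.

```python
def for_encode(values, block_size=4):
    """Encode non-negative integers block by block (illustrative FOR sketch)."""
    blocks = []
    for start in range(0, len(values), block_size):
        block = values[start:start + block_size]
        reference = min(block)                    # the block's frame of reference
        offsets = [v - reference for v in block]
        # Bits needed for the largest offset (at least 1 so the width is never 0).
        width = max(max(offsets).bit_length(), 1)
        # Pack the offsets into a single integer, 'width' bits per offset.
        packed = 0
        for i, off in enumerate(offsets):
            packed |= off << (i * width)
        blocks.append((reference, width, len(block), packed))
    return blocks


def for_decode(blocks):
    """Rebuild the original integers from the encoded blocks."""
    values = []
    for reference, width, count, packed in blocks:
        mask = (1 << width) - 1
        for i in range(count):
            values.append(reference + ((packed >> (i * width)) & mask))
    return values


# Hypothetical item identifiers rated by one user.
item_ids = [1000, 1003, 1001, 1007, 52, 60, 55, 53]
encoded = for_encode(item_ids)
decoded = for_decode(encoded)
```

In this toy input the first block needs 3 bits per offset and the second needs 4, instead of a full machine word per identifier. Real implementations pack the offsets into fixed-size words rather than into a Python integer.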

Enlighten

Processing and Transmission of Information

Authors: Elias Peter, Kennedy Robert S., Shapiro J. H., Yuen Horace P. H.
Publication venue: Research Laboratory of Electronics (RLE) at the Massachusetts Institute of Technology (MIT)
Publication date: 15/01/1974

Contains research objectives and summary of research. Sponsors: U. S. Army Research Office - Durham (Contract DAHC04-69-C-0042); U. S. Army Research Office - Durham (Contract DAHC04-71-C-0039); Joint Services Electronics Program (Contract DAAB07-71-C-0300); National Science Foundation (Grant GK-37582); National Aeronautics and Space Administration (Grant NGL 22-009-013).

DSpace@MIT

Processing and Transmission of Information

Author: Elias Peter
Publication venue: Research Laboratory of Electronics (RLE) at the Massachusetts Institute of Technology (MIT)
Publication date: 15/07/1974

Contains reports on one research project. Sponsors: Joint Services Electronics Program (Contract DAAB07-71-C-0300); National Science Foundation (Grant GK-37582).

DSpace@MIT

Succinct Dictionary Matching With No Slowdown

Authors: A.V. Aho, J.I. Munro, K. Sadakane, P. Elias, R.M. Fano, S. Dori, W.-K. Hon
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2010

The problem of dictionary matching is a classical problem in string matching: given a set S of d strings of total length n characters over a (not necessarily constant) alphabet of size sigma, build a data structure so that we can find in any text T all occurrences of strings belonging to S. The classical solution for this problem is the Aho-Corasick automaton, which finds all occ occurrences in a text T in time O(|T| + occ) using a data structure that occupies O(m log m) bits of space, where m <= n + 1 is the number of states in the automaton. In this paper we show that the Aho-Corasick automaton can be represented in just m(log sigma + O(1)) + O(d log(n/d)) bits of space while still maintaining the ability to answer queries in O(|T| + occ) time. To the best of our knowledge, the currently fastest succinct data structure for the dictionary matching problem uses space O(n log sigma) while answering queries in O(|T| log log n + occ) time. In this paper we also show how the space occupancy can be reduced to m(H0 + O(1)) + O(d log(n/d)), where H0 is the empirical entropy of the characters appearing in the trie representation of the set S, provided that sigma < m^epsilon for any constant 0 < epsilon < 1. The query time remains unchanged.
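
For readers unfamiliar with the baseline the abstract refers to, the following is a minimal sketch of the classical, pointer-based Aho-Corasick automaton (a trie with failure links). It is not the succinct representation proposed in the paper. The function names and the sample dictionary are assumptions chosen for illustration.

```python
from collections import deque


def build_automaton(patterns):
    """Build the trie, failure links and output lists for a set of patterns."""
    goto = [{}]   # goto[state][char] -> next state
    fail = [0]    # failure link of each state
    out = [[]]    # patterns recognised on reaching each state
    for pat in patterns:
        state = 0
        for ch in pat:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        out[state].append(pat)
    # Breadth-first pass: depth-1 states keep failure link 0 (the root).
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            # Inherit the patterns that end at the failure target.
            out[nxt] = out[nxt] + out[fail[nxt]]
    return goto, fail, out


def find_all(text, automaton):
    """Return (start_index, pattern) for every occurrence in text."""
    goto, fail, out = automaton
    state = 0
    matches = []
    for i, ch in enumerate(text):
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        for pat in out[state]:
            matches.append((i - len(pat) + 1, pat))
    return matches


automaton = build_automaton(["he", "she", "his", "hers"])
matches = find_all("ushers", automaton)
```

Each state here holds a Python dictionary and a list, which is far from the m(log sigma + O(1)) + O(d log(n/d)) bits discussed in the abstract. The sketch only shows the matching behaviour that the succinct structure has to preserve.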

arXiv.org e-Print Archive

CiteSeerX

Crossref

Fast and Compact Set Intersection through Recursive Universe Partitioning

Author: Pibiri G. E.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2021

We present a data structure that encodes a sorted integer sequence in small space while allowing fast intersection operations. The data layout is carefully designed to exploit word-level parallelism and SIMD instructions, hence providing good practical performance. The core algorithmic idea is that of recursively partitioning the universe of representation: a markedly different paradigm from the widespread strategy of partitioning the sequence based on its length. Extensive experimentation and comparison against several competitive techniques show that the proposed solution embodies an improved space/time trade-off for the set intersection problem.
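
To make the contrast between partitioning by universe and partitioning by length more concrete, here is a deliberately simplified, single-level sketch. It is not the paper's data structure, which is recursive and SIMD-oriented. Every integer is assigned to a chunk by its high bits, each chunk is kept as a bitmap, and intersection only compares chunks that cover the same range. The chunk size, function names, and sample sequences are assumptions for this example.

```python
CHUNK_BITS = 8  # assumed chunk size: each chunk covers 2**8 = 256 consecutive integers


def build(sorted_values):
    """Map chunk index -> bitmap (a Python int) of the values in that chunk."""
    chunks = {}
    low_mask = (1 << CHUNK_BITS) - 1
    for v in sorted_values:
        key, low = v >> CHUNK_BITS, v & low_mask
        chunks[key] = chunks.get(key, 0) | (1 << low)
    return chunks


def intersect(a, b):
    """Intersect two chunked sets; only chunks present in both are examined."""
    result = []
    for key in sorted(a.keys() & b.keys()):
        common = a[key] & b[key]          # word-level AND of the two bitmaps
        while common:
            lowest = common & -common     # isolate the lowest set bit
            result.append((key << CHUNK_BITS) | (lowest.bit_length() - 1))
            common ^= lowest
    return result


first = build([3, 70, 300, 1025, 5000])
second = build([70, 299, 300, 5000, 9000])
shared = intersect(first, second)
```

Because chunk boundaries depend only on the universe, two sequences always align chunk for chunk, so a chunk missing from either side is skipped without inspecting any of its elements.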

Archivio istituzionale della ricerca - Università degli Studi di Venezia Ca' Foscari