3 research outputs found

    Optimizing Learned Bloom Filters: How Much Should Be Learned?

    The learned Bloom filter (LBF) combines a machine learning model (learner) with a traditional Bloom filter to improve the false positive rate (FPR) that can be achieved for a given memory budget. The LBF has recently been generalized by making use of the full spectrum of the learner's prediction score. However, in all those designs, the machine learning model is fixed. In this letter, for the first time, the design of LBFs is proposed and evaluated by considering the machine learning model as one of the variables in the process. In detail, for a given memory budget, several LBFs are constructed using different machine learning models and the one with the lowest FPR is selected. We demonstrate that our approach can achieve much better performance than existing LBF designs, providing reductions of the FPR of up to 90% in some settings. This work was supported by the EU H2020 Project PIMCITY under Grant H2020-871370. This manuscript was recommended for publication by A. Kumar.
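The selection step described in this abstract (build several LBFs under one memory budget, keep the one with the lowest FPR) can be illustrated with a minimal sketch. It is not the authors' procedure: it assumes the simplest LBF layout (one learner plus one backup Bloom filter holding the keys the learner misses) and estimates the FPR analytically rather than measuring it. The names `Candidate`, `bloom_fpr`, `lbf_fpr`, and `select_model` are hypothetical, as are the per-model error rates a caller would supply.

```python
import math
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical description of one learner considered for the LBF."""
    name: str
    size_bits: int    # memory occupied by the model itself
    model_fpr: float  # fraction of non-keys the model wrongly accepts
    model_fnr: float  # fraction of keys the model wrongly rejects


def bloom_fpr(bits: float, n_items: float) -> float:
    """Textbook FPR of a Bloom filter using the optimal number of hashes."""
    if n_items <= 0:
        return 0.0  # nothing to store, so the filter never answers "yes"
    if bits <= 0:
        return 1.0  # no memory left: every query is a false positive
    return 0.5 ** (math.log(2) * bits / n_items)


def lbf_fpr(c: Candidate, budget_bits: int, n_keys: int) -> float:
    """Estimated FPR of model + backup filter under a shared budget."""
    backup_bits = budget_bits - c.size_bits
    if backup_bits < 0:
        return 1.0  # the model alone exceeds the budget
    # Assumption: the backup filter stores only the keys the model rejects.
    backup = bloom_fpr(backup_bits, c.model_fnr * n_keys)
    return c.model_fpr + (1.0 - c.model_fpr) * backup


def select_model(candidates, budget_bits: int, n_keys: int) -> Candidate:
    """Return the candidate whose estimated LBF FPR is lowest."""
    return min(candidates, key=lambda c: lbf_fpr(c, budget_bits, n_keys))
```

Including a candidate with `size_bits=0`, `model_fpr=0.0`, and `model_fnr=1.0` makes the plain Bloom filter one of the options, so the sketch can also report that no learner is worth its memory at a given budget.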

    The Tandem Counting Bloom Filter: it takes two counters to Tango

    Set representation is a crucial functionality in various areas such as networking and databases. In many applications, memory and time constraints allow only an approximate representation where errors can appear for some queried elements. The Variable-Increment Counting Bloom Filter (VI-CBF) is a popular data structure for the representation of dynamically changing sets, achieving a good tradeoff between memory efficiency and query accuracy. For some applications, the required accuracy is higher than that enabled by the VI-CBF. In this paper, we present the Tandem Counting Bloom Filter (T-CBF), a new data structure that relies on the interaction among counters to describe sets with higher accuracy. We analyze its performance and show that, by a joint consideration of counters, the T-CBF always performs better than the VI-CBF, and for some configurations it can reduce the false positive probability by an order of magnitude. The overhead of this approach is that each element insertion or query reads or writes a pair of counters, rather than a single counter, in each hash location. The operations themselves also require considering a larger number of scenarios.

    Lightweight Frequency-Based Tiering for CXL Memory Systems

    Modern workloads are demanding increasingly larger memory capacity. Compute Express Link (CXL)-based memory tiering has emerged as a promising solution for addressing this trend by utilizing traditional DRAM alongside slow-tier CXL-memory devices in the same system. Unfortunately, most prior tiering systems are recency-based, which cannot accurately identify hot and cold pages, since a recently accessed page is not necessarily a hot page. On the other hand, more accurate frequency-based systems suffer from high memory and runtime overhead as a result of tracking large memories. In this paper, we propose FreqTier, a fast and accurate frequency-based tiering system for CXL memory. We observe that memory tiering systems can tolerate a small amount of tracking inaccuracy without compromising the overall application performance. Based on this observation, FreqTier probabilistically tracks the access frequency of each page, enabling accurate identification of hot and cold pages while maintaining minimal memory overhead. Finally, FreqTier intelligently adjusts the intensity of tiering operations based on the application's memory access behavior, thereby significantly reducing the amount of migration traffic and application interference. We evaluate FreqTier on two emulated CXL memory devices with different bandwidths. On the high bandwidth CXL device, FreqTier can outperform state-of-the-art tiering systems while using 4× less local DRAM memory for in-memory caching workloads. On GAP graph analytics and XGBoost workloads with 1:32 local DRAM to CXL-memory ratio, FreqTier outperforms prior works by 1.04-2.04× (1.39× on average). Even on the low bandwidth CXL device, FreqTier outperforms AutoNUMA by 1.14× on average.
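The abstract does not say which structure FreqTier uses for probabilistic frequency tracking. As an illustration of the general idea only, the sketch below assumes a count-min-style array of small saturating counters: each page hashes to a few counters, an access increments them, and the minimum is taken as the frequency estimate. Collisions can only inflate an estimate, which is the kind of small inaccuracy the abstract says tiering can tolerate. The class `ApproxPageCounter`, its parameters, and the hotness threshold are all hypothetical.

```python
import hashlib


class ApproxPageCounter:
    """Approximate per-page access counts in a fixed-size counter array.

    Illustrative assumption, not FreqTier's actual design.
    """

    def __init__(self, num_counters: int = 1 << 16, num_hashes: int = 3,
                 max_count: int = 15):
        self.counters = bytearray(num_counters)  # one byte per counter
        self.num_hashes = num_hashes
        self.max_count = max_count  # counters saturate at this value

    def _slots(self, page: int) -> set:
        """Distinct counter indices for a page number (0 <= page < 2**64)."""
        slots = set()
        for i in range(self.num_hashes):
            data = page.to_bytes(8, "little") + i.to_bytes(2, "little")
            digest = hashlib.blake2b(data, digest_size=8).digest()
            slots.add(int.from_bytes(digest, "little") % len(self.counters))
        return slots

    def record_access(self, page: int) -> None:
        """Increment every counter of the page, stopping at max_count."""
        for s in self._slots(page):
            if self.counters[s] < self.max_count:
                self.counters[s] += 1

    def estimate(self, page: int) -> int:
        """Estimated access count; never below min(true count, max_count)."""
        return min(self.counters[s] for s in self._slots(page))

    def is_hot(self, page: int, threshold: int) -> bool:
        """Classify a page as hot when its estimate reaches the threshold."""
        return self.estimate(page) >= threshold
```

Memory use is fixed by `num_counters` regardless of how many pages are tracked, which is the point of trading exact counts for approximate ones. A real system would also need to age or reset the counters over time, which this sketch omits.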