LeanContext: Cost-Efficient Domain-Specific Question Answering Using LLMs
Question-answering (QA) is a significant application of Large Language Models
(LLMs), shaping chatbot capabilities across healthcare, education, and customer
service. However, widespread LLM integration presents a challenge for small
businesses due to the high expenses of LLM API usage. Costs rise rapidly when
domain-specific data (context) is used alongside queries for accurate
domain-specific LLM responses. One option is to reduce the context by summarizing it with an LLM. However, summarization can also filter out useful information that is necessary to answer some domain-specific queries. In this
paper, we shift from human-oriented summarizers to AI model-friendly summaries.
Our approach, LeanContext, efficiently extracts the key sentences from the context that are most closely aligned with the query. The number of extracted sentences is neither static nor random; we introduce a reinforcement learning technique that dynamically determines it based on the query and context. The remaining, less important sentences are reduced using a free, open-source text reduction method.
We evaluate LeanContext against several recent query-aware and query-unaware
context reduction approaches on prominent datasets (arXiv papers and BBC news articles). Despite substantial cost reductions, LeanContext's ROUGE-1 score decreases only marginally compared to a baseline that retains the entire context (no summarization). Additionally, if free pretrained LLM-based summarizers are used to reduce the context (into human-consumable summaries), LeanContext can further modify the reduced context to enhance the accuracy (ROUGE-1 score).
Comment: The paper is under review
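A minimal sketch of the query-aware sentence extraction idea described above, assuming a sentence-embedding library such as sentence-transformers; the model name, the fixed top-k, and the function name are illustrative, and the paper's reinforcement-learning choice of the number of sentences is not reproduced here:

    # Query-aware context reduction sketch (illustrative, not the paper's exact method).
    from sentence_transformers import SentenceTransformer, util

    def reduce_context(query, sentences, k=5):
        model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed embedding model
        query_emb = model.encode(query, convert_to_tensor=True)
        sent_embs = model.encode(sentences, convert_to_tensor=True)
        scores = util.cos_sim(query_emb, sent_embs)[0]   # similarity of each sentence to the query
        top_idx = scores.topk(min(k, len(sentences))).indices.tolist()
        top_idx.sort()                                   # preserve the original sentence order
        return " ".join(sentences[i] for i in top_idx)

In LeanContext, the remaining, less important sentences would additionally be compressed by an open-source reducer rather than dropped outright.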
Differentiable JPEG: The Devil is in the Details
JPEG remains one of the most widespread lossy image coding methods. However,
the non-differentiable nature of JPEG restricts the application in deep
learning pipelines. Several differentiable approximations of JPEG have recently
been proposed to address this issue. This paper conducts a comprehensive review
of existing diff. JPEG approaches and identifies critical details that have
been missed by previous methods. To this end, we propose a novel diff. JPEG
approach, overcoming previous limitations. Our approach is differentiable
w.r.t. the input image, the JPEG quality, the quantization tables, and the
color conversion parameters. We evaluate the forward and backward performance
of our diff. JPEG approach against existing methods. Additionally, extensive
ablations are performed to evaluate crucial design choices. Our proposed diff.
JPEG resembles the (non-diff.) reference implementation most closely, significantly surpassing the previous best diff. approach in average PSNR; for strong compression rates, the PSNR improvement is even larger. Our diff. JPEG also yields strong adversarial attack results, demonstrating the effectiveness of the gradient approximation. Our code is available at https://github.com/necla-ml/Diff-JPEG.
Comment: Accepted at WACV 2024. Project page:
https://christophreich1996.github.io/differentiable_jpeg/ WACV paper:
https://openaccess.thecvf.com/content/WACV2024/html/Reich_Differentiable_JPEG_The_Devil_Is_in_the_Details_WACV_2024_paper.htm
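One detail that makes JPEG hard to differentiate is the hard rounding in quantization; a common generic workaround (not necessarily the formulation used in this paper) is a straight-through estimator, sketched below in PyTorch with illustrative function names:

    import torch

    def ste_round(x):
        # Straight-through estimator: round in the forward pass, identity gradient in the backward pass.
        return x + (torch.round(x) - x).detach()

    def quantize(dct_coeffs, q_table):
        # Differentiable stand-in for JPEG quantization of DCT coefficients.
        return ste_round(dct_coeffs / q_table) * q_table

Because the gradient passes through the rounding unchanged, the surrounding pipeline (color conversion, DCT, quantization tables) can be optimized end to end.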
Deep Video Codec Control
Lossy video compression is commonly used when transmitting and storing video
data. Unified video codecs (e.g., H.264 or H.265) remain the de facto standard,
despite the availability of advanced (neural) compression approaches.
Transmitting videos in the face of dynamic network bandwidth conditions
requires video codecs to adapt to vastly different compression strengths. Rate
control modules augment the codec's compression such that bandwidth constraints
are satisfied and video distortion is minimized. While both standard video codecs and their rate control modules are developed to minimize video distortion
w.r.t. human quality assessment, preserving the downstream performance of deep
vision models is not considered. In this paper, we present the first end-to-end
learnable deep video codec control considering both bandwidth constraints and
downstream vision performance, while not breaking existing standardization. We
demonstrate for two common vision tasks (semantic segmentation and optical flow
estimation) and on two different datasets that our deep codec control better
preserves downstream performance than using 2-pass average bit rate control
while meeting dynamic bandwidth constraints and adhering to existing standards.
Comment: 22 pages, 26 figures, 6 tables
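As a rough illustration of the kind of objective such a codec control could optimize, the sketch below combines a downstream task loss with a penalty on bitrate exceeding the bandwidth budget; the penalty form, weighting, and names are assumptions, not the paper's actual loss:

    import torch

    def codec_control_loss(task_loss, predicted_bitrate, bandwidth_limit, penalty_weight=1.0):
        # Penalize only the bitrate that exceeds the current bandwidth budget,
        # while preserving downstream vision performance via the task loss.
        overshoot = torch.clamp(predicted_bitrate - bandwidth_limit, min=0.0)
        return task_loss + penalty_weight * overshoot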
A survey and classification of storage deduplication systems
The automatic elimination of duplicate data in a storage system, commonly known as deduplication, is increasingly accepted as an effective technique to reduce storage costs. Thus, it has been applied to different storage types, including archives and backups, primary storage, within solid state disks, and even to random access memory. Although the general approach to deduplication is shared by all storage types, each poses specific challenges and leads to different trade-offs and solutions. This diversity is often misunderstood, leading to an underestimation of the relevance of new research and development.
The first contribution of this paper is a classification of deduplication systems according to six criteria that correspond to key design decisions: granularity, locality, timing, indexing, technique, and scope.
This classification identifies and describes the different approaches used for each of them. As a second contribution, we describe which combinations of these design decisions have been proposed and found more useful for challenges in each storage type. Finally, outstanding research challenges and unexplored design points are identified and discussed.
This work is funded by the European Regional Development Fund (ERDF) through the COMPETE Programme (operational programme for competitiveness) and by National Funds through the Fundação para a Ciência e a Tecnologia (FCT; Portuguese Foundation for Science and Technology) within project RED FCOMP-01-0124-FEDER-010156 and by FCT PhD scholarship SFRH-BD-71372-2010.
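To make the design axes (granularity, indexing, technique) concrete, the following toy sketch implements fixed-size chunking with an exact-match, in-memory fingerprint index; this is only one point in the design space the survey classifies, and all names are illustrative:

    import hashlib

    class DedupStore:
        # Toy fixed-size-chunk, exact-match deduplication index (in-memory, illustrative only).
        def __init__(self, chunk_size=4096):
            self.chunk_size = chunk_size
            self.chunks = {}  # fingerprint -> chunk data

        def write(self, data):
            fingerprints = []
            for i in range(0, len(data), self.chunk_size):
                chunk = data[i:i + self.chunk_size]
                fp = hashlib.sha256(chunk).hexdigest()
                self.chunks.setdefault(fp, chunk)  # store each unique chunk only once
                fingerprints.append(fp)
            return fingerprints  # the recipe needed to reconstruct the data

        def read(self, fingerprints):
            return b"".join(self.chunks[fp] for fp in fingerprints)

Real systems vary every one of these choices: content-defined instead of fixed-size chunks, delta encoding instead of exact matches, on-disk or distributed indexes, and inline versus offline timing.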
Integrating flash memory into the storage hierarchy.
University of Minnesota Ph.D. dissertation. October 2010. Major: Electrical engineering. Advisors: David J. Lilja, Mohamed F. Mokbel. 1 computer file (PDF); xii, 158 pages.
With the continually accelerating growth of data, the performance of storage systems is increasingly becoming a bottleneck to improving overall system performance. Many applications, such as transaction processing systems, weather forecasting, large-scale scientific simulations, and on-demand services, are limited by the performance of the underlying storage systems. The limited bandwidth, high power consumption, and low reliability of widely used magnetic disk-based storage systems impose a significant hurdle in scaling these applications to satisfy the increasing growth of data. These limitations and bottlenecks are especially acute for large-scale high-performance computing systems.
Flash memory is an emerging storage technology that shows tremendous promise to compensate for the limitations of current storage devices. Flash memory's relatively high cost, however, combined with its slow write performance and limited number of erase cycles, requires new and innovative solutions to integrate flash memory-based storage devices into a high-performance storage hierarchy. The first part of this thesis develops new algorithms, data structures, and storage architectures to address the fundamental issues that limit the use of flash-based storage devices in high-performance computing systems. The second part of the thesis demonstrates two innovative applications of flash-based storage.
In particular, the first part addresses a set of fundamental issues, including new write-caching techniques, a sampling-based, RAM-space-efficient garbage collection scheme, and writing strategies for improving the performance of flash memory for write-intensive applications. This effort will improve the fundamental understanding of flash memory, remedy the major limitations of using flash-based storage devices, and extend the capability of flash memory to support many critical applications. The second part, in turn, demonstrates how flash memory can be used to speed up server applications, including a Bloom filter and an online deduplication system. This effort will use flash-aware data structures and algorithms and will show innovative uses of flash-based storage.
Debnath, Biplob Kumar. (2010). Integrating flash memory into the storage hierarchy. Retrieved from the University Digital Conservancy, https://hdl.handle.net/11299/117595
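As context for the server-side applications mentioned above, a plain in-memory Bloom filter looks like the sketch below; the dissertation's contribution concerns flash-resident designs, which are considerably more involved, so this is only a generic baseline with illustrative parameters:

    import hashlib

    class BloomFilter:
        # Generic in-memory Bloom filter over byte-string keys (baseline sketch, not the flash-aware design).
        def __init__(self, num_bits=1 << 20, num_hashes=4):
            self.num_bits = num_bits
            self.num_hashes = num_hashes
            self.bits = bytearray(num_bits // 8 + 1)

        def _positions(self, key):
            for i in range(self.num_hashes):
                digest = hashlib.sha256(bytes([i]) + key).digest()
                yield int.from_bytes(digest[:8], "big") % self.num_bits

        def add(self, key):
            for pos in self._positions(key):
                self.bits[pos // 8] |= 1 << (pos % 8)

        def might_contain(self, key):
            return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(key))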
CFTL: A Convertible Flash Translation Layer with Consideration of Data Access Patterns
NAND flash memory-based storage devices are increasingly adopted as one of the main alternatives to magnetic disk drives. The flash translation layer (FTL) is a software/hardware interface inside NAND flash memory that allows existing disk-based applications to use it without any significant modifications. Since the FTL has a critical impact on the performance of NAND flash-based devices, a variety of FTL schemes have been proposed to improve their performance. However, existing FTLs perform well for either a read-intensive workload or a write-intensive workload, but not for both, due to their static address mapping schemes. To overcome this limitation, in this paper we propose a novel FTL addressing scheme named Convertible Flash Translation Layer (CFTL for short). CFTL is adaptive to data access patterns, so it can dynamically switch the mapping of a data block to either a read-optimized or a write-optimized mapping scheme in order to fully exploit the benefits of both. By judiciously taking advantage of both schemes, CFTL resolves the intrinsic problems of existing FTLs. In addition to this convertible scheme, we propose an efficient caching strategy that considerably improves CFTL's performance further with only a simple hint. Consequently, both the convertible feature and the caching strategy empower CFTL to achieve good read performance as well as good write performance. Our experimental evaluation with a variety of realistic workloads demonstrates that the proposed CFTL scheme outperforms other FTL schemes.
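A highly simplified sketch of the convertible idea is shown below: per-block read/write counters decide whether a logical block should be served by a read-optimized or a write-optimized mapping. The threshold, the counters, and the class are illustrative assumptions rather than CFTL's actual algorithm:

    from collections import defaultdict

    class ConvertibleMapper:
        # Toy access-pattern tracker in the spirit of a convertible FTL (threshold illustrative).
        def __init__(self, switch_threshold=2.0):
            self.reads = defaultdict(int)
            self.writes = defaultdict(int)
            self.mode = defaultdict(lambda: "read_optimized")  # mapping mode per logical block
            self.switch_threshold = switch_threshold

        def record_access(self, block, is_write):
            if is_write:
                self.writes[block] += 1
            else:
                self.reads[block] += 1
            # Convert the block's mapping when one access type clearly dominates.
            if self.writes[block] > self.switch_threshold * max(self.reads[block], 1):
                self.mode[block] = "write_optimized"
            elif self.reads[block] > self.switch_threshold * max(self.writes[block], 1):
                self.mode[block] = "read_optimized"

CFTL itself additionally pairs such switching with a caching strategy for the mapping information, which this sketch omits.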
