A solvable class of quadratic 0–1 programming
We show that the minimum of the pseudo-Boolean quadratic function f(x) = x^T Q x + c^T x can be found in linear time when the graph defined by Q is transformable into a combinatorial circuit of AND, OR, NAND, NOR or NOT logic gates. A novel modeling technique is used to transform the graph defined by Q into a logic circuit. A consistent labeling of the signals in the logic circuit from the set {0, 1} corresponds to the global minimum of f, and the labeling is determined through logic simulation of the circuit. Our approach establishes a direct and constructive relationship between pseudo-Boolean functions and logic circuits. In the restricted case when all the elements of Q are nonpositive, the minimum of f can be obtained in polynomial time [15]. We show that the problem of finding the minimum of f, even in the special case when all the elements of Q are positive, is NP-complete.
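To make the objective concrete, here is a minimal brute-force sketch of minimizing f(x) = x^T Q x + c^T x over binary vectors. It is exponential in n and illustrates only the function being minimized, not the paper's linear-time logic-circuit construction; the instance values below are made up.

```python
import itertools

import numpy as np

def min_pseudo_boolean_quadratic(Q, c):
    """Brute-force minimum of f(x) = x^T Q x + c^T x over x in {0, 1}^n.

    Exponential in n: a reference for small instances only, not the
    paper's linear-time logic-circuit method.
    """
    best_x, best_f = None, float("inf")
    for bits in itertools.product((0, 1), repeat=len(c)):
        x = np.array(bits, dtype=float)
        f = x @ Q @ x + c @ x
        if f < best_f:
            best_x, best_f = bits, f
    return best_x, best_f

# Hypothetical 2-variable instance; the negative off-diagonal entry
# rewards setting both variables to 1.
Q = np.array([[0.0, -3.0], [0.0, 0.0]])
c = np.array([1.0, 1.0])
print(min_pseudo_boolean_quadratic(Q, c))  # ((1, 1), -1.0)
```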
LeanContext: Cost-Efficient Domain-Specific Question Answering Using LLMs
Question-answering (QA) is a significant application of Large Language Models
(LLMs), shaping chatbot capabilities across healthcare, education, and customer
service. However, widespread LLM integration presents a challenge for small
businesses due to the high expenses of LLM API usage. Costs rise rapidly when
domain-specific data (context) is used alongside queries for accurate
domain-specific LLM responses. One option is to use an LLM to summarize the
context and thereby reduce its size. However, summarization can also filter out
information that is necessary to answer some domain-specific queries. In this
paper, we shift from human-oriented summarizers to AI model-friendly summaries.
Our approach, LeanContext, efficiently extracts k key sentences from the
context that are closely aligned with the query. The choice of k is neither
static nor random; we introduce a reinforcement learning technique that
dynamically determines k based on the query and context. The rest of the less
important sentences are reduced using a free open source text reduction method.
We evaluate LeanContext against several recent query-aware and query-unaware
context reduction approaches on prominent datasets (arxiv papers and BBC news
articles). Despite substantial cost reductions, LeanContext's ROUGE-1 score
decreases only marginally compared to a baseline that retains the entire
context (no summarization). Additionally, if free pretrained LLM-based
summarizers are used to reduce context (into human-consumable summaries),
LeanContext can further modify the reduced context to enhance the accuracy
(ROUGE-1 score). Comment: The paper is under review.
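A minimal sketch of the query-aware extraction idea, assuming TF-IDF cosine similarity as the sentence scorer and a fixed k: rank context sentences by similarity to the query and keep the k best in their original order. The paper instead learns k with reinforcement learning and applies a separate open-source reducer to the remaining sentences, so treat this purely as an illustration.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def top_k_sentences(sentences, query, k):
    """Keep the k context sentences most similar to the query,
    preserving their original order in the context."""
    vec = TfidfVectorizer().fit(sentences + [query])
    scores = cosine_similarity(vec.transform(sentences),
                               vec.transform([query])).ravel()
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    return [sentences[i] for i in sorted(ranked[:k])]

context = [
    "LLM APIs are billed per token.",
    "The weather was pleasant in March.",
    "Reducing context length lowers API cost.",
]
print(top_k_sentences(context, "How can LLM API cost be reduced?", k=2))
```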
Differentiable JPEG: The Devil is in the Details
JPEG remains one of the most widespread lossy image coding methods. However,
the non-differentiable nature of JPEG restricts the application in deep
learning pipelines. Several differentiable approximations of JPEG have recently
been proposed to address this issue. This paper conducts a comprehensive review
of existing diff. JPEG approaches and identifies critical details that have
been missed by previous methods. To this end, we propose a novel diff. JPEG
approach, overcoming previous limitations. Our approach is differentiable
w.r.t. the input image, the JPEG quality, the quantization tables, and the
color conversion parameters. We evaluate the forward and backward performance
of our diff. JPEG approach against existing methods. Additionally, extensive
ablations are performed to evaluate crucial design choices. Our proposed diff.
JPEG resembles the (non-diff.) reference implementation best, significantly
surpassing the recent-best diff. approach in average PSNR. For strong
compression rates, the PSNR improvement is even larger. Strong
adversarial attack results are yielded by our diff. JPEG, demonstrating the
effective gradient approximation. Our code is available at
https://github.com/necla-ml/Diff-JPEG. Comment: Accepted at WACV 2024. Project page:
https://christophreich1996.github.io/differentiable_jpeg
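One common way to make JPEG's rounding step differentiable is a straight-through estimator, sketched below in PyTorch. This is a generic trick, not necessarily the specific approximation the paper proposes, and the uniform quantization table is arbitrary.

```python
import torch

def ste_round(x: torch.Tensor) -> torch.Tensor:
    """Straight-through rounding: the forward pass rounds, while the
    backward pass treats rounding as the identity."""
    return x + (torch.round(x) - x).detach()

def diff_quantize(dct_block: torch.Tensor, q_table: torch.Tensor) -> torch.Tensor:
    """JPEG-style quantize/dequantize of an 8x8 DCT block, made
    differentiable via ste_round."""
    return ste_round(dct_block / q_table) * q_table

block = (torch.randn(8, 8) * 50.0).requires_grad_()  # stand-in DCT coefficients
q_table = torch.full((8, 8), 16.0)                   # arbitrary quantization table
diff_quantize(block, q_table).sum().backward()
print(block.grad is not None)  # True: gradients flow through the rounding
```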
Deep Video Codec Control
Lossy video compression is commonly used when transmitting and storing video
data. Unified video codecs (e.g., H.264 or H.265) remain the de facto standard,
despite the availability of advanced (neural) compression approaches.
Transmitting videos in the face of dynamic network bandwidth conditions
requires video codecs to adapt to vastly different compression strengths. Rate
control modules augment the codec's compression such that bandwidth constraints
are satisfied and video distortion is minimized. While both standard video
codecs and their rate control modules are developed to minimize video distortion
w.r.t. human quality assessment, preserving the downstream performance of deep
vision models is not considered. In this paper, we present the first end-to-end
learnable deep video codec control considering both bandwidth constraints and
downstream vision performance, while not breaking existing standardization. We
demonstrate for two common vision tasks (semantic segmentation and optical flow
estimation) and on two different datasets that our deep codec control better
preserves downstream performance than using 2-pass average bit rate control
while meeting dynamic bandwidth constraints and adhering to standardizations. Comment: 22 pages, 26 figures, 6 tables.
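For context, the 2-pass average bit rate baseline mentioned above corresponds to a standard two-pass encode. The sketch below drives ffmpeg with libx264 from Python; the file names are hypothetical and the null-output path is Unix-style.

```python
import subprocess

def two_pass_abr(src: str, dst: str, bitrate_kbps: int) -> None:
    """Standard 2-pass average-bitrate H.264 encode with ffmpeg."""
    target = f"{bitrate_kbps}k"
    # Pass 1: collect rate statistics, discard the encoded output.
    subprocess.run(["ffmpeg", "-y", "-i", src, "-c:v", "libx264", "-b:v", target,
                    "-pass", "1", "-an", "-f", "null", "/dev/null"], check=True)
    # Pass 2: encode for real using the pass-1 statistics.
    subprocess.run(["ffmpeg", "-y", "-i", src, "-c:v", "libx264", "-b:v", target,
                    "-pass", "2", dst], check=True)

two_pass_abr("input.mp4", "output.mp4", bitrate_kbps=1500)
```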
Semantic Multi-Resolution Communications
Deep learning based joint source-channel coding (JSCC) has demonstrated
significant advancements in data reconstruction compared to separate
source-channel coding (SSCC). This superiority arises from the suboptimality of
SSCC when dealing with finite block-length data. Moreover, SSCC falls short in
reconstructing data in a multi-user and/or multi-resolution fashion, as it only
tries to satisfy the worst channel and/or the highest quality data. To overcome
these limitations, we propose a novel deep learning multi-resolution JSCC
framework inspired by the concept of multi-task learning (MTL). This proposed
framework excels at encoding data for different resolutions through
hierarchical layers and effectively decodes it by leveraging both current and
past layers of encoded data. Moreover, this framework holds great potential for
semantic communication, where the objective extends beyond data reconstruction
to preserving specific semantic attributes throughout the communication
process. These semantic features could be crucial elements such as class
labels, essential for classification tasks, or other key attributes that
require preservation. Within this framework, each level of encoded data can be
carefully designed to retain specific data semantics. As a result, the
precision of a semantic classifier can be progressively enhanced across
successive layers, emphasizing the preservation of targeted semantics
throughout the encoding and decoding stages. We conduct experiments on the
MNIST and CIFAR10 datasets. The experiments on both datasets illustrate that our
proposed method is capable of surpassing the SSCC method in reconstructing data
with different resolutions, enabling the extraction of semantic features with
heightened confidence in successive layers. This capability is particularly
advantageous for prioritizing and preserving more crucial semantic features
within the datasets.
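A toy PyTorch sketch of the layered idea, under assumed simplifications (fully connected layers, an additive white Gaussian noise channel, MNIST-sized inputs): each encoder level emits one latent, and the level-i decoder reconstructs from latents 1 through i, so later levels refine what earlier levels carry.

```python
import torch
import torch.nn as nn

class MultiResolutionJSCC(nn.Module):
    """Hierarchical JSCC sketch: one latent per resolution level; the
    level-i decoder consumes the latents of levels 1..i."""

    def __init__(self, in_dim=784, latent_dim=32, levels=3):
        super().__init__()
        self.encoders = nn.ModuleList(
            nn.Linear(in_dim if i == 0 else latent_dim, latent_dim)
            for i in range(levels))
        self.decoders = nn.ModuleList(
            nn.Linear(latent_dim * (i + 1), in_dim) for i in range(levels))

    def forward(self, x, noise_std=0.1):
        latents, h = [], x
        for enc in self.encoders:
            h = torch.tanh(enc(h))
            latents.append(h + noise_std * torch.randn_like(h))  # AWGN channel
        return [dec(torch.cat(latents[:i + 1], dim=-1))
                for i, dec in enumerate(self.decoders)]

model = MultiResolutionJSCC()
recons = model(torch.randn(4, 784))
print([r.shape for r in recons])  # one reconstruction per resolution level
```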
Why is the video analytics accuracy fluctuating, and what can we do about it?
It is a common practice to think of a video as a sequence of images (frames),
and re-use deep neural network models that are trained only on images for
similar analytics tasks on videos. In this paper, we show that this leap of
faith that deep learning models that work well on images will also work well on
videos is actually flawed. We show that even when a video camera is viewing a
scene that is not changing in any human-perceptible way, and we control for
external factors like video compression and environment (lighting), the
accuracy of video analytics applications fluctuates noticeably. These
fluctuations occur because successive frames produced by the video camera may
look similar visually, but these frames are perceived quite differently by the
video analytics applications. We observed that the root cause for these
fluctuations is the dynamic camera parameter changes that a video camera
automatically makes in order to capture and produce a visually pleasing video.
The camera inadvertently acts as an unintentional adversary because these
slight changes in the image pixel values in consecutive frames, as we show,
have a noticeably adverse impact on the accuracy of insights from video
analytics tasks that re-use image-trained deep learning models. To address this
inadvertent adversarial effect from the camera, we explore the use of transfer
learning techniques to improve learning in video analytics tasks through the
transfer of knowledge from learning on image analytics tasks. In particular, we
show that our newly trained Yolov5 model reduces fluctuation in object
detection across frames, which leads to better tracking of objects (40% fewer
mistakes in tracking). Our paper also provides new directions and techniques to
mitigate the camera's adversarial effect on deep learning models used for video
analytics applications.
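One simple way to quantify the fluctuation described above (an assumed metric, not necessarily the paper's) is a frame-to-frame consistency score: the fraction of detections in one frame that reappear with sufficient IoU in the next frame of a static scene. The boxes below are made up.

```python
def iou(a, b):
    """IoU of two axis-aligned boxes in (x1, y1, x2, y2) format."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0

def frame_consistency(dets_t, dets_t1, thresh=0.5):
    """Fraction of frame-t detections matched (IoU >= thresh) in frame t+1.
    Near 1.0 for a stable detector on a static scene; dips flag fluctuation."""
    if not dets_t:
        return 1.0
    matched = sum(any(iou(d, e) >= thresh for e in dets_t1) for d in dets_t)
    return matched / len(dets_t)

# Consecutive frames of a static scene: one box jitters slightly, one vanishes.
frame_t = [(10, 10, 50, 50), (60, 60, 100, 100)]
frame_t1 = [(11, 9, 51, 49)]
print(frame_consistency(frame_t, frame_t1))  # 0.5
```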