93,693 research outputs found
Malleable Coding with Fixed Reuse
In cloud computing, storage area networks, remote backup storage, and similar
settings, stored data is modified with updates from new versions. Representing
information and modifying the representation are both expensive. Therefore it
is desirable for the data to not only be compressed but to also be easily
modified during updates. A malleable coding scheme considers both compression
efficiency and ease of alteration, promoting codeword reuse. We examine the
trade-off between compression efficiency and malleability cost-the difficulty
of synchronizing compressed versions-measured as the length of a reused prefix
portion. Through a coding theorem, the region of achievable rates and
malleability is expressed as a single-letter optimization. Relationships to
common information problems are also described
Boosting Backward Search Throughput for FM-Index Using a Compressed Encoding
The rapid development of DNA sequencing technologies has demanded for com-
pressed data structures supporting fast pattern matching queries. FM-index is a
widely-used compressed data structure that also supports fast pattern matching
queries. It is common for the exact matching algorithm to be memory bound, resulting
in poor performance. Searching several symbols in a single step improves data locality,
although the memory bandwidth requirements remains the same.
We propose a new data-layout of FM-index, called Split bit-vector, that compacts
all data needed to search k symbols in a single step (k-step), reducing both memory
movement and computing requirements at the cost of increasing memory footprint.Universidad de Málaga. Campus de Excelencia Internacional AndalucÃa Tech
Hybrid Compressed Hash Based Homomorphic AB Encryption Algorithm for Security of data in the Cloud Environment
Cloud computing is an emerging technology in the world of computing. It provides a convenient virtual environment for on-demand access to different type of services and computing resources such as applications, networks and storage space in an efficient way. The virtual environment is a massive compound structure in terms of accessibility that made easy in a compact way and familiar of functional components. The complexity in virtual environment generates several issues related to data storage, data security, authorization and authentication in cloud computing. With the size of the data, it becomes difficult to the cloud user to store large amounts of information in the remote cloud servers due to high computational cost, insecurity and costs high per hour proportional to the volume of information. In this paper, we propose compressed hash based encrypted model for the virtual environment. The aim of this paper is to store huge amount of data in the cloud environment in the form of compressed and encrypted data in a secure way
Malleable coding for updatable cloud caching
In software-as-a-service applications provisioned through cloud computing, locally cached data are often modified with updates from new versions. In some cases, with each edit, one may want to preserve both the original and new versions. In this paper, we focus on cases in which only the latest version must be preserved. Furthermore, it is desirable for the data to not only be compressed but to also be easily modified during updates, since representing information and modifying the representation both incur cost. We examine whether it is possible to have both compression efficiency and ease of alteration, in order to promote codeword reuse. In other words, we study the feasibility of a malleable and efficient coding scheme. The tradeoff between compression efficiency and malleability cost-the difficulty of synchronizing compressed versions-is measured as the length of a reused prefix portion. The region of achievable rates and malleability is found. Drawing from prior work on common information problems, we show that efficient data compression may not be the best engineering design principle when storing software-as-a-service data. In the general case, goals of efficiency and malleability are fundamentally in conflict.This work was supported in part by an NSF Graduate Research Fellowship (LRV), Grant CCR-0325774, and Grant CCF-0729069. This work was presented at the 2011 IEEE International Symposium on Information Theory [1] and the 2014 IEEE International Conference on Cloud Engineering [2]. The associate editor coordinating the review of this paper and approving it for publication was R. Thobaben. (CCR-0325774 - NSF Graduate Research Fellowship; CCF-0729069 - NSF Graduate Research Fellowship)Accepted manuscrip
The Case for Universal Basic Computing Power
The Universal Basic Computing Power (UBCP) initiative ensures global, free
access to a set amount of computing power specifically for AI research and
development (R&D). This initiative comprises three key elements. First, UBCP
must be cost free, with its usage limited to AI R&D and minimal additional
conditions. Second, UBCP should continually incorporate the state of the art AI
advancements, including efficiently distilled, compressed, and deployed
training data, foundational models, benchmarks, and governance tools. Lastly,
it's essential for UBCP to be universally accessible, ensuring convenience for
all users. We urge major stakeholders in AI development large platforms, open
source contributors, and policymakers to prioritize the UBCP initiative
Video Classification With CNNs: Using The Codec As A Spatio-Temporal Activity Sensor
We investigate video classification via a two-stream convolutional neural
network (CNN) design that directly ingests information extracted from
compressed video bitstreams. Our approach begins with the observation that all
modern video codecs divide the input frames into macroblocks (MBs). We
demonstrate that selective access to MB motion vector (MV) information within
compressed video bitstreams can also provide for selective, motion-adaptive, MB
pixel decoding (a.k.a., MB texture decoding). This in turn allows for the
derivation of spatio-temporal video activity regions at extremely high speed in
comparison to conventional full-frame decoding followed by optical flow
estimation. In order to evaluate the accuracy of a video classification
framework based on such activity data, we independently train two CNN
architectures on MB texture and MV correspondences and then fuse their scores
to derive the final classification of each test video. Evaluation on two
standard datasets shows that the proposed approach is competitive to the best
two-stream video classification approaches found in the literature. At the same
time: (i) a CPU-based realization of our MV extraction is over 977 times faster
than GPU-based optical flow methods; (ii) selective decoding is up to 12 times
faster than full-frame decoding; (iii) our proposed spatial and temporal CNNs
perform inference at 5 to 49 times lower cloud computing cost than the fastest
methods from the literature.Comment: Accepted in IEEE Transactions on Circuits and Systems for Video
Technology. Extension of ICIP 2017 conference pape
Efficient classification using parallel and scalable compressed model and Its application on intrusion detection
In order to achieve high efficiency of classification in intrusion detection,
a compressed model is proposed in this paper which combines horizontal
compression with vertical compression. OneR is utilized as horizontal
com-pression for attribute reduction, and affinity propagation is employed as
vertical compression to select small representative exemplars from large
training data. As to be able to computationally compress the larger volume of
training data with scalability, MapReduce based parallelization approach is
then implemented and evaluated for each step of the model compression process
abovementioned, on which common but efficient classification methods can be
directly used. Experimental application study on two publicly available
datasets of intrusion detection, KDD99 and CMDC2012, demonstrates that the
classification using the compressed model proposed can effectively speed up the
detection procedure at up to 184 times, most importantly at the cost of a
minimal accuracy difference with less than 1% on average
Fast, Small and Exact: Infinite-order Language Modelling with Compressed Suffix Trees
Efficient methods for storing and querying are critical for scaling
high-order n-gram language models to large corpora. We propose a language model
based on compressed suffix trees, a representation that is highly compact and
can be easily held in memory, while supporting queries needed in computing
language model probabilities on-the-fly. We present several optimisations which
improve query runtimes up to 2500x, despite only incurring a modest increase in
construction time and memory usage. For large corpora and high Markov orders,
our method is highly competitive with the state-of-the-art KenLM package. It
imposes much lower memory requirements, often by orders of magnitude, and has
runtimes that are either similar (for training) or comparable (for querying).Comment: 14 pages in Transactions of the Association for Computational
Linguistics (TACL) 201
- …