Comparing the Pay of Federal and Nonprofit Executives: An Update
A CBO Paper
Optimal-Time Text Indexing in BWT-runs Bounded Space
Indexing highly repetitive texts --- such as genomic databases, software
repositories and versioned text collections --- has become an important problem
since the turn of the millennium. A relevant compressibility measure for
repetitive texts is $r$, the number of runs in their Burrows-Wheeler Transform
(BWT). One of the earliest indexes for repetitive collections, the Run-Length
FM-index, used $O(r)$ space and was able to efficiently count the number of
occurrences of a pattern of length $m$ in the text (in loglogarithmic time per
pattern symbol, with current techniques). However, it was unable to locate the
positions of those occurrences efficiently within a space bounded in terms of
$r$. Since then, a number of other indexes with space bounded by other measures
of repetitiveness --- the number of phrases in the Lempel-Ziv parse, the size
of the smallest grammar generating the text, the size of the smallest automaton
recognizing the text factors --- have been proposed for efficiently locating,
but not directly counting, the occurrences of a pattern. In this paper we close
this long-standing problem, showing how to extend the Run-Length FM-index so
that it can locate the $occ$ occurrences efficiently within $O(r)$ space (in
loglogarithmic time each), and reaching optimal time $O(m+occ)$ within
$O(r\log(n/r))$ space, on a RAM machine of $w=\Omega(\log n)$ bits. Within
$O(r\log(n/r))$ space, our index can also count in optimal time $O(m)$.
Raising the space to $O(r w\log_\sigma(n/r))$, we support count and locate in
$O(m\log(\sigma)/w)$ and $O(m\log(\sigma)/w+occ)$ time, which is optimal in the
packed setting and had not been obtained before in compressed space. We also
describe a structure using $O(r\log(n/r))$ space that replaces the text and
extracts any text substring of length $\ell$ in almost-optimal time
$O(\log(n/r)+\ell\log(\sigma)/w)$. (...continues...)
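To make the measure concrete, here is a minimal Python sketch that builds the BWT of a short string and counts its runs of equal symbols. It is only an illustration of the quantity the abstract bounds space by, not the index described in the paper. The function names `bwt` and `count_runs`, the `$` sentinel, and the quadratic rotation-sorting construction are assumptions chosen for brevity.

```python
def bwt(text: str, sentinel: str = "$") -> str:
    """Naive Burrows-Wheeler Transform by sorting all rotations.

    Assumes `sentinel` does not occur in `text` and sorts before every
    other symbol (true for "$" against ASCII letters and digits).
    """
    if sentinel in text:
        raise ValueError("sentinel must not occur in the text")
    s = text + sentinel
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    # The BWT is the last column of the sorted rotation matrix.
    return "".join(rotation[-1] for rotation in rotations)


def count_runs(s: str) -> int:
    """Number of maximal runs of equal consecutive symbols in s."""
    if not s:
        return 0
    return 1 + sum(1 for a, b in zip(s, s[1:]) if a != b)


if __name__ == "__main__":
    for example in ("banana", "ab" * 50):
        transformed = bwt(example)
        print(len(example), count_runs(transformed))
```

On a highly repetitive input such as `"ab" * 50`, the run count stays far below the text length, which is the behaviour that makes run-bounded space attractive for repetitive collections.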
Bidirectional Text Compression in External Memory
Bidirectional compression algorithms work by replacing repeated substrings with references that, unlike in the well-known LZ77 scheme, can point in either direction. We present such an algorithm that is particularly suited to an external memory implementation. We evaluate it experimentally on large data sets of up to 128 GiB (using only 16 GiB of RAM) and show that it is significantly faster than all known LZ77 compressors, while producing a roughly similar number of factors. We also introduce an external memory decompressor for texts compressed with any uni- or bidirectional compression scheme.
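As a rough illustration of what decoding a bidirectional scheme involves, the following Python sketch resolves references that may point to earlier or later text positions. It runs in memory and is not the external memory decompressor from the paper. The factor encoding (a literal string, or a `(source_position, length)` tuple) and the name `decode` are assumptions made for this example.

```python
from typing import List, Optional, Sequence, Tuple, Union

# Assumed encoding: a factor is either a literal string or a
# (source_position, length) reference into the decoded text.
Factor = Union[str, Tuple[int, int]]


def decode(factors: Sequence[Factor]) -> str:
    out: List[Optional[str]] = []   # decoded symbol per position, None if unresolved
    src: List[Optional[int]] = []   # position each symbol is copied from, if any
    for factor in factors:
        if isinstance(factor, str):
            for ch in factor:
                out.append(ch)
                src.append(None)
        else:
            pos, length = factor
            for k in range(length):
                out.append(None)
                src.append(pos + k)

    n = len(out)
    for i in range(n):
        # Follow the reference chain until a resolved symbol is reached.
        path: List[int] = []
        j = i
        while out[j] is None:
            path.append(j)
            j = src[j]
            if not 0 <= j < n or len(path) > n:
                raise ValueError("reference out of range or cyclic")
        for p in path:
            out[p] = out[j]
    return "".join(out)


if __name__ == "__main__":
    # The first factor points forward, to text that is decoded later.
    print(decode([(2, 2), "a", "b"]))
```

Unlike LZ77 decoding, a single left-to-right pass is not enough here, because a reference may depend on text that has not been produced yet. The sketch therefore follows chains of references and rejects cyclic ones.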
Coverage, Continuity and Visual Cortical Architecture
The primary visual cortex of many mammals contains a continuous
representation of visual space, with a roughly repetitive aperiodic map of
orientation preferences superimposed. It was recently found that orientation
preference maps (OPMs) obey statistical laws which are apparently invariant
among species widely separated in eutherian evolution. Here, we examine whether
one of the most prominent models for the optimization of cortical maps, the
elastic net (EN) model, can reproduce this common design. The EN model
generates representations which optimally trade off stimulus space coverage and
map continuity. While this model has been used in numerous studies, no
analytical results about the precise layout of the predicted OPMs have been
obtained so far. We present a mathematical approach to analytically calculate
the cortical representations predicted by the EN model for the joint mapping of
stimulus position and orientation. We find that in all previously studied
regimes, predicted OPM layouts are perfectly periodic. An unbiased search
through the EN parameter space identifies a novel regime of aperiodic OPMs with
pinwheel densities lower than found in experiments. In an extreme limit,
aperiodic OPMs quantitatively resembling experimental observations emerge.
Stabilization of these layouts results from strong nonlocal interactions rather
than from a coverage-continuity-compromise. Our results demonstrate that
optimization models for stimulus representations dominated by nonlocal
suppressive interactions are in principle capable of correctly predicting the
common OPM design. They call into question whether visual cortical feature
representations can be explained by a coverage-continuity-compromise.
Comment: 100 pages, including an Appendix, 21 + 7 figures
Developing an Efficient Secure Query Processing Algorithm on Encrypted Databases using Data Compression
Distributed computing involves storing data with third-party storage providers and being able to access it from anywhere at any time. With the advancement of distributed computing and databases, highly critical data are placed in databases. However, when the information is stored in outsourced services such as Database as a Service (DaaS), security issues arise on both the server and the client side. In addition, query processing by different clients using time-consuming methods in a shared-resource environment may lead to inefficient data processing and retrieval. Secure and efficient data retrieval can be achieved with an efficient data processing algorithm shared among the clients. This work proposes an Efficient Secure Query Processing Algorithm (ESQPA) that processes queries efficiently by applying data compression before the encrypted results are sent from the server to the clients. Security issues are addressed by securing the data at the server side with an encrypted database using CryptDB. Encryption techniques have recently been proposed to give clients confidentiality in cloud storage. The method allows queries to be processed over encrypted data without decryption. To analyze the performance of ESQPA, it is compared with the current query processing algorithm in CryptDB. The results show that ESQPA needs less storage space, saving up to 63%.
A Cognitive Information Theory of Music: A Computational Memetics Approach
This thesis offers an account of music cognition based on information theory and memetics. My research strategy is to split the memetic modelling into four layers: Data, Information, Psychology and Application. Multiple cognitive models are proposed for the Information and Psychology layers, and the MDL best-fit models with published human data are selected. Then, for the Psychology layer only, new experiments are conducted to validate the best-fit models.
In the information chapter, an information-theoretic model of musical memory is proposed, along with two competing models. The proposed model exhibited a better fit with human data than the competing models. Higher-level psychological theories are then built on top of this information layer. In the similarity chapter, I proposed three competing models of musical similarity, and conducted a new experiment to validate the best-fit model. In the fitness chapter, I again proposed three competing models of musical fitness, and conducted a new experiment to validate the best-fit model. In both cases, the correlations with human data are statistically significant.
All in all, my research has shown that the memetic strategy is sound, and the modelling results are encouraging. Implications of this research are discussed.
Multilevel security in data compression and restricted character set translation
Multilevel military communications security can be implemented with the notion of masterkeys. Naval message traffic is transmitted with a restricted character set, and files are optionally compressed. Both character translation and data compression can be used as add-on data encryption. A masterkey is constructed from the set of service keys that the masterkey is allowed to access. This thesis presents the principles of multilevel security with restricted character translation, data compression, and masterkey implementation.
http://archive.org/details/multilevelsecuri00tsai
Lieutenant Colonel, Taiwan Republic of China Army
Approved for public release; distribution is unlimited