41 research outputs found
LeCo: Lightweight Compression via Learning Serial Correlations
Lightweight data compression is a key technique that allows column stores to
exhibit superior performance for analytical queries. Despite a comprehensive
study on dictionary-based encodings to approach Shannon's entropy, few prior
works have systematically exploited the serial correlation in a column for
compression. In this paper, we propose LeCo (i.e., Learned Compression), a
framework that uses machine learning to remove the serial redundancy in a value
sequence automatically to achieve an outstanding compression ratio and
decompression performance simultaneously. LeCo presents a general approach to
this end, making existing (ad-hoc) algorithms such as Frame-of-Reference (FOR),
Delta Encoding, and Run-Length Encoding (RLE) special cases under our
framework. Our microbenchmark with three synthetic and six real-world data sets
shows that a prototype of LeCo achieves a Pareto improvement on both
compression ratio and random access speed over the existing solutions. When
integrating LeCo into widely-used applications, we observe up to 3.9x speed up
in filter-scanning a Parquet file and a 16% increase in RocksDB's throughput.
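The idea of removing serial redundancy with a model can be illustrated with a minimal sketch. It is not LeCo's implementation: the helper names (`fit_line`, `encode`, `decode_at`, `bits_needed`) are hypothetical, and a single least-squares line stands in for whatever models the framework actually learns. The sketch fits a model to the value sequence, stores only integer residuals, and shows that a constant model reduces to a Frame-of-Reference-style encoding.

```python
def fit_line(values):
    """Least-squares line through the points (i, values[i]); returns (slope, intercept)."""
    n = len(values)
    if n < 2:
        return 0.0, (float(values[0]) if n else 0.0)
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    cov = sum((i - mean_x) * (v - mean_y) for i, v in enumerate(values))
    var = sum((i - mean_x) ** 2 for i in range(n))
    slope = cov / var
    return slope, mean_y - slope * mean_x


def encode(values, slope, intercept):
    """Keep only the integer residual between each value and the model's prediction."""
    return [v - round(slope * i + intercept) for i, v in enumerate(values)]


def decode_at(residuals, slope, intercept, i):
    """Random access: re-evaluate the model at position i and add the stored residual."""
    return round(slope * i + intercept) + residuals[i]


def bits_needed(residuals):
    """Bits per residual if residuals are stored as offsets from their minimum."""
    return (max(residuals) - min(residuals)).bit_length()


values = [100, 103, 106, 109, 112]

# Constant model (slope 0, intercept = minimum): a Frame-of-Reference-style encoding.
for_residuals = encode(values, 0.0, min(values))

# Linear model: captures the serial correlation, so the residuals shrink.
slope, intercept = fit_line(values)
line_residuals = encode(values, slope, intercept)

print(bits_needed(for_residuals), bits_needed(line_residuals))
```

On this perfectly linear toy sequence the line model leaves nothing in the residuals, while the constant model still needs a few bits per value. Real columns are noisier, so the gain depends on how well the chosen model tracks the data.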
HotRAP: Hot Record Retention and Promotion for LSM-trees with tiered storage
The multi-level design of Log-Structured Merge-trees (LSM-trees) naturally
fits the tiered storage architecture: the upper levels (recently
inserted/updated records) are kept in fast storage to guarantee performance
while the lower levels (the majority of records) are placed in slower but
cheaper storage to reduce cost. However, frequently accessed records may have
been compacted and reside in slow storage, and existing algorithms are
inefficient in promoting these "hot" records to fast storage, leading to
compromised read performance. We present HotRAP, a key-value store based on
RocksDB that can promptly promote hot records individually from slow to fast
storage and keep them in fast storage while they are hot. HotRAP uses an
on-disk data structure (a specially-made LSM-tree) to track the hotness of keys
and includes three pathways to ensure that hot records reach fast storage with
short delays. Our experiments show that HotRAP outperforms state-of-the-art
LSM-trees on tiered storage by up to 3.3x compared to the second best
for read-only and read-write-balanced workloads with common access skew
patterns.
Systematic electronic structure in the cuprate parent state from quantum many-body simulations
The quantitative description of correlated electron materials remains a
modern computational challenge. We demonstrate a numerical strategy to simulate
correlated materials at the fully ab initio level beyond the solution of
effective low-energy models, and apply it to gain a detailed microscopic
understanding across a family of cuprate superconducting materials in their
parent undoped states. We uncover microscopic trends in the electron
correlations and reveal the link between the material composition and magnetic
energy scales via a many-body picture of excitation processes involving the
buffer layers. Our work illustrates a path towards a quantitative and reliable
understanding of more complex states of correlated materials at the ab initio
many-body level.
Comment: 21 pages, 5 figures, with Supplementary Material
SRL: Scaling Distributed Reinforcement Learning to Over Ten Thousand Cores
The ever-growing complexity of reinforcement learning (RL) tasks demands a
distributed RL system to efficiently generate and process a massive amount of
data to train intelligent agents. However, existing open-source libraries
suffer from various limitations, which impede their practical use in
challenging scenarios where large-scale training is necessary. While industrial
systems from OpenAI and DeepMind have achieved successful large-scale RL
training, their system architecture and implementation details remain
undisclosed to the community. In this paper, we present a novel abstraction on
the dataflows of RL training, which unifies practical RL training across
diverse applications into a general framework and enables fine-grained
optimizations. Following this abstraction, we develop a scalable, efficient,
and extensible distributed RL system called ReaLly Scalable RL (SRL). The
system architecture of SRL separates major RL computation components and allows
massively parallelized training. Moreover, SRL offers user-friendly and
extensible interfaces for customized algorithms. Our evaluation shows that SRL
outperforms existing academic libraries in both a single machine and a
medium-sized cluster. In a large-scale cluster, the novel architecture of SRL
leads to up to 3.7x speedup compared to the design choices adopted by the
existing libraries. We also conduct a direct benchmark comparison to OpenAI's
industrial system, Rapid, in the challenging hide-and-seek environment. SRL
reproduces the same solution as reported by OpenAI with up to 5x speedup in
wall-clock time. Furthermore, we also examine the performance of SRL in a much
harder variant of the hide-and-seek environment and achieve substantial
learning speedup by scaling SRL to over 15k CPU cores and 32 A100 GPUs.
Notably, SRL is the first in the academic community to perform RL experiments
at such a large scale.
Comment: 15 pages, 12 figures, 6 tables
An Empirical Evaluation of Columnar Storage Formats
Columnar storage is one of the core components of a modern data analytics
system. Although many database management systems (DBMSs) have proprietary
storage formats, most provide extensive support to open-source storage formats
such as Parquet and ORC to facilitate cross-platform data sharing. But these
formats were developed over a decade ago, in the early 2010s, for the Hadoop
ecosystem. Since then, both the hardware and workload landscapes have changed
significantly.
In this paper, we revisit the most widely adopted open-source columnar
storage formats (Parquet and ORC) with a deep dive into their internals. We
designed a benchmark to stress-test the formats' performance and space
efficiency under different workload configurations. From our comprehensive
evaluation of Parquet and ORC, we identify design decisions advantageous with
modern hardware and real-world data distributions. These include using
dictionary encoding by default, favoring decoding speed over compression ratio
for integer encoding algorithms, making block compression optional, and
embedding finer-grained auxiliary data structures. Our analysis identifies
important considerations that may guide future formats to better fit modern
technology trends.
Research on strategy of load-side resonant soft-switching inverter based on interconnection and damping assignment-passivity based control
Soft-switching technologies can effectively reduce the switching losses caused by the increasing switching frequency of grid-connected inverters. As a branch of soft-switching technologies, load-side resonant soft-switching is an active research topic for high-frequency inverter applications because it achieves soft-switching without additional components. However, the traditional PI control strategy, which is based on a linear model, is prone to instability and poor dynamic robustness under large-signal perturbations. In this paper, a novel Passivity-Based Control (PBC) method is proposed to improve the dynamic performance of the load-side resonant soft-switching grid-connected inverter. A Port-Controlled Hamiltonian (PCH) model of the soft-switching inverter is established, and a passivity-based controller is designed on this model using interconnection and damping assignment-passivity based control (IDA-PBC). Both the steady-state and the dynamic performance of the load-side resonant soft-switching inverter can be improved over the whole operating range. Finally, a 750 W load-side resonant soft-switching inverter simulation model is built, and its output performance is compared with that of the traditional PI control strategy under steady-state and dynamic conditions. The simulation results show that the proposed control strategy reduces the harmonic distortion rate and improves the quality of the output waveforms.
SALI: A Scalable Adaptive Learned Index Framework based on Probability Models
The growth in data storage capacity and the increasing demands for high
performance have created several challenges for concurrent indexing structures.
One promising solution is learned indexes, which use a learning-based approach
to fit the distribution of stored data and predictively locate target keys,
significantly improving lookup performance. Despite their advantages,
prevailing learned indexes exhibit constraints and encounter issues of
scalability on multi-core data storage.
This paper introduces SALI, the Scalable Adaptive Learned Index framework,
which incorporates two strategies aimed at achieving high scalability,
improving efficiency, and enhancing the robustness of the learned index.
Firstly, a set of node-evolving strategies is defined to enable the learned
index to adapt to various workload skews and enhance its concurrency
performance in such scenarios. Secondly, a lightweight strategy is proposed to
maintain statistical information within the learned index, with the goal of
further improving the scalability of the index. Furthermore, to validate their
effectiveness, SALI applied the two strategies mentioned above to the learned
index structure that utilizes fine-grained write locks, known as LIPP. The
experimental results have demonstrated that SALI significantly enhances the
insertion throughput with 64 threads by an average of 2.04x compared to the
second-best learned index. Furthermore, SALI accomplishes a lookup throughput
similar to that of LIPP+.
Comment: Accepted by Conference SIGMOD 24, June 09-15, 2024, Santiago, Chile
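The general learned-index idea described in the abstract above (fit the key distribution, predict a position, then correct locally) can be sketched as follows. This is a generic single-threaded illustration, not SALI or LIPP: the class name `LinearLearnedIndex` is hypothetical, the model is a single line through the smallest and largest key, and the key set is assumed to be non-empty and static.

```python
import bisect


class LinearLearnedIndex:
    """Toy learned index: one linear model plus a bounded local search."""

    def __init__(self, keys):
        self.keys = sorted(keys)  # assumed non-empty
        n = len(self.keys)
        lo, hi = self.keys[0], self.keys[-1]
        self.base = lo
        # Line mapping the smallest key to position 0 and the largest to n - 1.
        self.slope = (n - 1) / (hi - lo) if hi != lo else 0.0
        # Worst-case prediction error over the stored keys bounds the search window.
        self.max_err = max(abs(self._predict(k) - i) for i, k in enumerate(self.keys))

    def _predict(self, key):
        pos = round((key - self.base) * self.slope)
        return min(max(pos, 0), len(self.keys) - 1)

    def lookup(self, key):
        """Return the position of key in the sorted array, or None if absent."""
        guess = self._predict(key)
        lo = max(guess - self.max_err, 0)
        hi = min(guess + self.max_err + 1, len(self.keys))
        i = bisect.bisect_left(self.keys, key, lo, hi)
        return i if i < hi and self.keys[i] == key else None


index = LinearLearnedIndex([2, 3, 5, 8, 13, 21, 34])
print(index.lookup(13), index.lookup(4))
```

The better the model fits the key distribution, the smaller `max_err` becomes and the fewer positions each lookup has to inspect. Concurrency control and workload-adaptive node changes, which are the focus of the abstract, are deliberately left out of this sketch.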
Ab initio quantum many-body description of superconducting trends in the cuprates
Using a systematic ab initio quantum many-body approach that goes beyond
low-energy models, we directly compute the superconducting pairing order of
several doped cuprate materials and structures. We find that we can correctly
capture two well-known trends: the pressure effect, where pairing order
increases with intra-layer pressure, and the layer effect, where the pairing
order varies with the number of copper-oxygen layers. From these calculations,
we observe that the strength of superexchange and the covalency at optimal
doping are the best descriptors of the maximal pairing order. Our microscopic
analysis further identifies short-range copper spin fluctuations, together with
multi-orbital charge fluctuations, as central to the pairing trends. Our work
illustrates the possibility of a quantitative computational understanding of
high-temperature superconducting materials.
Comment: 10 pages, 5 figures, with supplementary material
Crystallographic and Nuclear Magnetic Resonance Evaluation of the Impact of Peptide Binding to the Second PDZ Domain of Protein Tyrosine Phosphatase 1E
PDZ (PSD95/Discs large/ZO-1) domains are ubiquitous protein interaction motifs found in scaffolding proteins involved in signal transduction. Despite the fact that many PDZs show a limited tendency to undergo structural change, the PDZ family has been associated with long-range communication and allostery. One of the most studied PDZ domains in terms of structure and biophysical properties is the second PDZ (“PDZ2”) domain from protein tyrosine phosphatase 1E (PTP1E, also known as PTPL1). Previously we showed through NMR relaxation studies that binding of the RA-GEF2 C-terminal peptide substrate results in long-range propagation of side-chain dynamic changes in human PDZ2 [Fuentes, et al., J. Mol. Biol. (2004), 335, 1105-1115]. Here, we present the first X-ray crystal structures of PDZ2 in the absence and presence of RA-GEF2 ligand, solved to resolutions of 1.65 and 1.3 Å, respectively. These structures deviate somewhat from previously determined NMR structures, and indicate that very minor structural changes in PDZ2 accompany peptide binding. NMR residual dipolar couplings confirm the crystal structures to be accurate models of the time-averaged atomic coordinates of PDZ2. The impact on side-chain dynamics was further tested with a C-terminal peptide from APC, which showed results nearly identical to those for RA-GEF2. Thus, allosteric transmission in PDZ2 induced by peptide binding is conveyed purely and robustly by dynamics. 15N relaxation dispersion measurements did not detect appreciable populations of a kinetic structural intermediate. Collectively, for ligand binding to PDZ2, these data support a lock-and-key binding model from a structural perspective and an allosteric model from a dynamical perspective, which together suggest a complex energy landscape for functional transitions within the ensemble.