327 research outputs found
BestConfig: Tapping the Performance Potential of Systems via Automatic Configuration Tuning
An ever-increasing number of configuration parameters are provided to system
users. But many users apply one configuration setting across different
workloads, leaving untapped the performance potential of systems. A good
configuration setting can greatly improve the performance of a deployed system
under certain workloads. But with tens or hundreds of parameters, it becomes a
highly costly task to decide which configuration setting leads to the best
performance. While such a task requires strong expertise in both the system
and the application, users commonly lack such expertise.
To help users tap the performance potential of systems, we present
BestConfig, a system for automatically finding a best configuration setting
within a resource limit for a deployed system under a given application
workload. BestConfig is designed with an extensible architecture to automate
the configuration tuning for general systems. To tune system configurations
within a resource limit, we propose the divide-and-diverge sampling method and
the recursive bound-and-search algorithm. BestConfig can improve the throughput
of Tomcat by 75%, that of Cassandra by 63%, that of MySQL by 430%, and reduce
the running time of a Hive join job by about 50% and that of a Spark join job by
about 80%, solely by configuration adjustment.
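The abstract names divide-and-diverge sampling without spelling it out. The following is a minimal sketch, assuming the method resembles Latin hypercube sampling: each parameter's range is divided into k intervals, and k configurations are drawn so that every interval of every parameter is covered exactly once. The function name, parameter names, and ranges are hypothetical and are not taken from BestConfig itself.

```python
import random


def divide_and_diverge_sample(ranges, k, seed=None):
    """Draw k configurations that cover each of the k intervals of every parameter once.

    `ranges` maps a (hypothetical) parameter name to a (low, high) tuple.
    Assumption: this is a Latin-hypercube-style reading of the method,
    not the authors' implementation.
    """
    rng = random.Random(seed)
    samples = [{} for _ in range(k)]
    for name, (low, high) in ranges.items():
        width = (high - low) / k
        # Divide: one slot per interval of this parameter.
        slots = list(range(k))
        # Diverge: shuffle so samples combine different intervals across parameters.
        rng.shuffle(slots)
        for sample, slot in zip(samples, slots):
            # Pick a random point inside the assigned interval.
            sample[name] = low + (slot + rng.random()) * width
    return samples


# Hypothetical parameters, for illustration only.
example_ranges = {"max_threads": (1, 513), "cache_mb": (16, 1040)}
configs = divide_and_diverge_sample(example_ranges, k=4, seed=0)
```

Under this reading, the sample count k is the knob that ties the search to a resource limit: each sampled configuration costs one test run of the deployed system.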
Towards Automated Data Mining: Reinforcement Intelligence for Self-Optimizing Feature Engineering
Feature engineering is one of the most important components in data mining and machine learning. One of the key thrusts in data mining is to answer: How should a low-dimensional geometry structure be extracted and reconstructed from high-dimensional data? To solve this issue, researchers have proposed feature selection, PCA, sparsity regularization, factorization, embedding, and deep learning. However, existing techniques are limited in achieving full automation, global optimality, and explainable explicitness. Can I address the automation, optimality, and explainability challenges in data geometry reconstruction? A low-dimensional data geometry structure is crucial for SciML methods (e.g., GP models), and the accuracy of these methods depends on how well one can learn the data geometry structure from data or physics-based models. This dissertation will target the problem of automated identification of an optimal and explicit low-dimensional data geometry from high-dimensional data. I will propose a novel principled self-optimizing data geometry reconstruction framework by viewing feature generation and selection through the lens of Reinforcement Learning (RL). I will show that reconstructing a low-dimensional data geometry (a.k.a. feature space) can be accomplished by an interactive nested feature generation and selection framework, where feature generation generates new meaningful and explicit features, feature selection removes redundant features to reduce dimensionality, and an optimized sequential structure of generations and selections results in an optimized feature space for a downstream machine learning task. Finally, I will highlight that the search for such an optimized sequential structure can be generalized as an advanced cascading reinforcement learning system.
Rattan Spiny Morphology And Litter Collecting Structures In Association With Ant Colonies
Rattan is a common palm in Malaysian forests but is little known apart from its economic value in furniture and matting products. Many rattan species possess a great number of spines arranged in various patterns. However, few studies have looked into the different aspects of those spiny structures and their unique functions. This study focused on rattan spine structures in five species that are common in the northern part of Peninsular Malaysia: Daemonorops lewisiana, Daemonorops geniculata, Calamus castaneus, Plectocomia griffithii and Korthalsia scortechinii. Spine length, width, inclination, density, and strength were measured and compared across species to find out which rattan species possesses the greatest defensive abilities. The leaf hair characteristics on leaflets of D. geniculata, D. lewisiana, and C. castaneus were also measured. The results showed that no species has an outstanding defensive weapon, since each species has its own advantages. D. geniculata has the longest spines; D. lewisiana has the strongest spines; C. castaneus has the greatest spine density; and P. griffithii’s down-pointing spines may effectively deter small climbing mammals. K. scortechinii has nothing special in its spiny structures but was still well defended by ant partners colonizing its ocrea structures. Therefore, a rattan plant may rely on multiple defensive strategies, with spiny structures contributing only part of its defense. During the study, many ant colonies were found on certain species of rattan plants.
Beampattern-Based Tracking for Millimeter Wave Communication Systems
We present a tracking algorithm to maintain the communication link between a
base station (BS) and a mobile station (MS) in a millimeter wave (mmWave)
communication system, where antenna arrays are used for beamforming in both the
BS and MS. Downlink transmission is considered, and the tracking is performed
at the MS as it moves relative to the BS. Specifically, we consider the case
in which the MS rotates quickly due to hand movement. The algorithm estimates the
angle of arrival (AoA) by using variations in the radiation pattern of the beam
as a function of this angle. Numerical results show that the algorithm achieves
accurate beam alignment when the MS rotates over a wide range of angular speeds.
For example, the algorithm can support angular speeds up to 800 degrees per
second when tracking updates are available every 10 ms. Comment: 6 pages, to be published in Proc. IEEE GLOBECOM 2016, Washington,
D.C., US
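As a quick worked check of the figures quoted above, the rotation that accumulates between two tracking updates is the angular speed multiplied by the update interval. The helper name below is hypothetical; only the two numbers come from the abstract.

```python
def rotation_per_update(angular_speed_deg_per_s, update_interval_ms):
    """Degrees the MS rotates between two consecutive tracking updates."""
    return angular_speed_deg_per_s * update_interval_ms / 1000.0


# 800 degrees per second with an update every 10 ms.
max_step_deg = rotation_per_update(800, 10)
print(max_step_deg)  # 8.0
```

So at the quoted operating point, the tracker has to absorb up to 8 degrees of rotation between updates.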
Self-Supervised Sketch-to-Image Synthesis
Imagining a colored realistic image from an arbitrarily drawn sketch is one
of the human capabilities that we are eager for machines to mimic. Unlike previous
methods that either require sketch-image pairs or utilize low-quantity
detected edges as sketches, we study the exemplar-based sketch-to-image (s2i)
synthesis task in a self-supervised learning manner, eliminating the need
for paired sketch data. To this end, we first propose an unsupervised method
to efficiently synthesize line-sketches for general RGB-only datasets. With the
synthetic paired-data, we then present a self-supervised Auto-Encoder (AE) to
decouple the content/style features from sketches and RGB-images, and
synthesize images that are both content-faithful to the sketches and
style-consistent to the RGB-images. While prior works employ either the
cycle-consistency loss or dedicated attentional modules to enforce
content/style fidelity, we show the AE's superior performance with pure
self-supervision. To further improve the synthesis quality in high resolution,
we also leverage an adversarial network to refine the details of synthetic
images. Extensive experiments at 1024*1024 resolution demonstrate new
state-of-the-art performance of the proposed model on CelebA-HQ and Wiki-Art
datasets. Moreover, with the proposed sketch generator, the model shows
promising performance on style mixing and style transfer, which require
synthesized images to be both style-consistent and semantically meaningful. Our
code is available at
https://github.com/odegeasslbc/Self-Supervised-Sketch-to-Image-Synthesis-PyTorch,
and please visit https://create.playform.io/my-projects?mode=sketch for an
online demo of our model. Comment: AAAI-202
T-SaS: Toward Shift-aware Dynamic Adaptation for Streaming Data
In many real-world scenarios, distribution shifts exist in the streaming data
across time steps. Many complex sequential data can be effectively divided into
distinct regimes that exhibit persistent dynamics. Discovering the shifted
behaviors and the evolving patterns underlying the streaming data is important
for understanding the dynamic system. Existing methods typically train one robust
model to work for the evolving data of distinct distributions or sequentially
adapt the model utilizing explicitly given regime boundaries. However, there
are two challenges: (1) shifts in data streams can happen drastically and
abruptly without precursors, and the boundaries of distribution shifts are usually
unavailable; and (2) training a shared model for all domains can fail to
capture varying patterns. This paper aims to solve the problem of sequential
data modeling in the presence of sudden distribution shifts that occur without
any precursors. Specifically, we design a Bayesian framework, dubbed T-SaS,
with a discrete distribution-modeling variable to capture abrupt shifts of
data. Then, we design a model that enables adaptation with dynamic network
selection conditioned on that discrete variable. The proposed method learns
specific model parameters for each distribution by learning which neurons
should be activated in the full network. A dynamic masking strategy is adopted
here to support inter-distribution transfer through the overlapping of a set of
sparse networks. Extensive experiments show that our proposed method is
superior in both accurately detecting shift boundaries to get segments of
varying distributions and effectively adapting to downstream forecast or
classification tasks. Comment: CIKM 202
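The idea of selecting a sparse sub-network per distribution can be illustrated with a small sketch. This is not the authors' architecture: the masks here are random rather than learned, the network is a single layer of ReLU units, and all names (`make_masks`, `masked_forward`, `regime`) are hypothetical. It only shows how a discrete regime variable can pick which neurons of one shared network are active, and how masks that overlap share weights across regimes.

```python
import random


def make_masks(num_regimes, num_units, keep_ratio, seed=0):
    """Build one binary mask per regime over the units of a shared layer.

    Assumption: masks are drawn at random for illustration; in the paper
    they are learned.
    """
    rng = random.Random(seed)
    keep = max(1, int(num_units * keep_ratio))
    masks = []
    for _ in range(num_regimes):
        active = set(rng.sample(range(num_units), keep))
        masks.append([1.0 if i in active else 0.0 for i in range(num_units)])
    return masks


def masked_forward(x, weights, mask):
    """Sum of ReLU unit outputs, with masked-out units contributing zero."""
    total = 0.0
    for row, m in zip(weights, mask):
        pre_activation = sum(w * xi for w, xi in zip(row, x))
        total += m * max(0.0, pre_activation)
    return total


# Shared weights: three units, two inputs (toy values).
shared_weights = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
# Two hand-written masks that overlap on unit 0, so both regimes reuse its weights.
regime_masks = [[1.0, 0.0, 1.0], [1.0, 1.0, 0.0]]

regime = 0  # value of the discrete regime variable for the current segment
output = masked_forward([1.0, 2.0], shared_weights, regime_masks[regime])
```

When the regime variable switches to 1, the same weights are reused but unit 1 becomes active, so the output for the same input changes without any retraining of the shared parameters.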