    BestConfig: Tapping the Performance Potential of Systems via Automatic Configuration Tuning

    An ever-increasing number of configuration parameters are provided to system users. Yet many users apply one configuration setting across different workloads, leaving the performance potential of systems untapped. A good configuration setting can greatly improve the performance of a deployed system under certain workloads. But with tens or hundreds of parameters, deciding which configuration setting leads to the best performance becomes a highly costly task. While such a task requires strong expertise in both the system and the application, users commonly lack such expertise. To help users tap the performance potential of systems, we present BestConfig, a system for automatically finding a best configuration setting within a resource limit for a deployed system under a given application workload. BestConfig is designed with an extensible architecture to automate configuration tuning for general systems. To tune system configurations within a resource limit, we propose the divide-and-diverge sampling method and the recursive bound-and-search algorithm. BestConfig can improve the throughput of Tomcat by 75%, that of Cassandra by 63%, and that of MySQL by 430%, and reduce the running time of a Hive join job by about 50% and that of a Spark join job by about 80%, solely by configuration adjustment.
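    The abstract names divide-and-diverge sampling without describing it. As a rough illustration only, the sketch below takes one plausible reading of the name: divide each parameter range into k intervals, then spread k samples so that every interval of every parameter is covered exactly once (a Latin-hypercube-style scheme). The function name, parameter names, and ranges are hypothetical, and this is not BestConfig's actual implementation.

```python
import random


def divide_and_diverge_sample(param_ranges, k, seed=0):
    """Draw k configurations so that each of the k intervals of every
    parameter is covered exactly once (Latin-hypercube-style assumption)."""
    rng = random.Random(seed)
    names = list(param_ranges)

    # For each parameter, shuffle the interval indices 0..k-1 independently.
    interval_order = {}
    for name in names:
        order = list(range(k))
        rng.shuffle(order)
        interval_order[name] = order

    samples = []
    for i in range(k):
        config = {}
        for name in names:
            low, high = param_ranges[name]
            width = (high - low) / k
            idx = interval_order[name][i]
            # Pick a random point inside the chosen interval.
            config[name] = low + (idx + rng.random()) * width
        samples.append(config)
    return samples


# Hypothetical parameter names and ranges, for illustration only.
ranges = {"thread_pool_size": (1.0, 200.0), "cache_mb": (16.0, 4096.0)}
samples = divide_and_diverge_sample(ranges, k=5)
```

    Each sampled configuration would then be benchmarked on the deployed system, and a search step (such as the recursive bound-and-search algorithm the abstract mentions) would narrow the space around the best one.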

    Towards Automated Data Mining: Reinforcement Intelligence for Self-Optimizing Feature Engineering

    Feature engineering is one of the most important components in data mining and machine learning. One of the key thrusts in data mining is to answer: How should a low-dimensional geometry structure be extracted and reconstructed from high-dimensional data? To solve this issue, researchers have proposed feature selection, PCA, sparsity regularization, factorization, embedding, and deep learning. However, existing techniques are limited in achieving full automation, global optimality, and explainable explicitness. Can I address the automation, optimality, and explainability challenges in data geometry reconstruction? A low-dimensional data geometry structure is crucial for SciML methods (e.g., GP models), and the accuracy of these methods depends on how well one can learn the data geometry structure from data or physics-based models. This dissertation will target the problem of automated identification of an optimal and explicit low-dimensional data geometry from high-dimensional data. I will propose a novel, principled, self-optimizing data geometry reconstruction framework by viewing feature generation and selection through the lens of Reinforcement Learning (RL). I will show that reconstructing a low-dimensional data geometry (a.k.a. feature space) can be accomplished by an interactive nested feature generation and selection framework, where feature generation produces new meaningful and explicit features, feature selection removes redundant features to reduce dimensionality, and an optimized sequential structure of generations and selections results in an optimized feature space for a downstream machine learning task. Finally, I will highlight that the search for such an optimized sequential structure can be generalized as an advanced cascading reinforcement learning system.

    Rattan Spiny Morphology And Litter Collecting Structures In Association With Ant Colonies

    Rattan is a common palm in Malaysian forests but is rarely known except for its economic value in furniture or matting products. Many rattan species possess a great number of spines arranged in various patterns. However, few studies have looked into the different aspects of those spiny structures and their unique functions. This study focused on rattan spine structures in five species that are common in the northern part of Peninsular Malaysia: Daemonorops lewisiana, Daemonorops geniculata, Calamus castaneus, Plectocomia griffithii and Korthalsia scortechinii. Spine length, width, inclination, density, and strength were measured and compared to find out which rattan species possesses the greatest defensive abilities. The characteristics of the leaf hairs on leaflets of D. geniculata, D. lewisiana, and C. castaneus were also measured. The results showed that none of the species has an outstanding defensive weapon, since every species has its own advantages. D. geniculata has the longest spines; D. lewisiana has the strongest spines; C. castaneus has the greatest spine density; and P. griffithii's down-pointing spines may effectively deter small climbing mammals. K. scortechinii has nothing special in its spiny structures but was still well defended by ant partners colonizing its ocrea structures. Therefore, a rattan plant may rely on multiple defensive strategies, with spiny structures contributing only part of its defense. During the study, many ant colonies were found on certain species of rattan plants.

    Beampattern-Based Tracking for Millimeter Wave Communication Systems

    We present a tracking algorithm to maintain the communication link between a base station (BS) and a mobile station (MS) in a millimeter wave (mmWave) communication system, where antenna arrays are used for beamforming at both the BS and the MS. Downlink transmission is considered, and the tracking is performed at the MS as it moves relative to the BS. Specifically, we consider the case in which the MS rotates quickly due to hand movement. The algorithm estimates the angle of arrival (AoA) by using variations in the radiation pattern of the beam as a function of this angle. Numerical results show that the algorithm achieves accurate beam alignment when the MS rotates over a wide range of angular speeds. For example, the algorithm can support angular speeds up to 800 degrees per second when tracking updates are available every 10 ms. Comment: 6 pages, to be published in Proc. IEEE GLOBECOM 2016, Washington, D.C., US
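    To put the closing figures in perspective, the small calculation below works out how far the MS can rotate between two consecutive tracking updates at the quoted limits. The function name is made up for this illustration; only the 800 degrees per second and 10 ms values come from the abstract.

```python
def rotation_per_update(angular_speed_deg_per_s, update_interval_ms):
    """Degrees the MS rotates between two consecutive tracking updates."""
    return angular_speed_deg_per_s * (update_interval_ms / 1000.0)


# Values quoted in the abstract: 800 degrees per second, updates every 10 ms.
drift = rotation_per_update(800.0, 10.0)
print(f"{drift:.1f} degrees between updates")
```

    At those limits the AoA can change by about 8 degrees from one update to the next, which is the amount of misalignment the tracker would have to correct each time.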

    Self-Supervised Sketch-to-Image Synthesis

    Imagining a colored realistic image from an arbitrarily drawn sketch is one of the human capabilities that we are eager for machines to mimic. Unlike previous methods that either require sketch-image pairs or utilize low-quantity detected edges as sketches, we study the exemplar-based sketch-to-image (s2i) synthesis task in a self-supervised learning manner, eliminating the need for paired sketch data. To this end, we first propose an unsupervised method to efficiently synthesize line-sketches for general RGB-only datasets. With the synthetic paired data, we then present a self-supervised Auto-Encoder (AE) to decouple the content/style features from sketches and RGB-images, and synthesize images that are both content-faithful to the sketches and style-consistent with the RGB-images. While prior works employ either the cycle-consistency loss or dedicated attentional modules to enforce content/style fidelity, we show the AE's superior performance with pure self-supervision. To further improve the synthesis quality at high resolution, we also leverage an adversarial network to refine the details of synthetic images. Extensive experiments at 1024*1024 resolution demonstrate new state-of-the-art performance of the proposed model on the CelebA-HQ and Wiki-Art datasets. Moreover, with the proposed sketch generator, the model shows promising performance on style mixing and style transfer, which require synthesized images to be both style-consistent and semantically meaningful. Our code is available at https://github.com/odegeasslbc/Self-Supervised-Sketch-to-Image-Synthesis-PyTorch; please visit https://create.playform.io/my-projects?mode=sketch for an online demo of our model. Comment: AAAI-202

    T-SaS: Toward Shift-aware Dynamic Adaptation for Streaming Data

    In many real-world scenarios, distribution shifts exist in streaming data across time steps. Many complex sequential data can be effectively divided into distinct regimes that exhibit persistent dynamics. Discovering the shifted behaviors and the evolving patterns underlying the streaming data is important for understanding the dynamic system. Existing methods typically train one robust model to work for the evolving data of distinct distributions, or sequentially adapt the model using explicitly given regime boundaries. However, there are two challenges: (1) shifts in data streams can happen drastically and abruptly without precursors, and the boundaries of distribution shifts are usually unavailable; and (2) training a shared model for all domains can fail to capture varying patterns. This paper aims to solve the problem of sequential data modeling in the presence of sudden distribution shifts that occur without any precursors. Specifically, we design a Bayesian framework, dubbed T-SaS, with a discrete distribution-modeling variable to capture abrupt shifts in the data. Then, we design a model that enables adaptation with dynamic network selection conditioned on that discrete variable. The proposed method learns specific model parameters for each distribution by learning which neurons should be activated in the full network. A dynamic masking strategy is adopted to support inter-distribution transfer through the overlap of a set of sparse networks. Extensive experiments show that our proposed method is superior both in accurately detecting shift boundaries to obtain segments of varying distributions and in effectively adapting to downstream forecasting or classification tasks. Comment: CIKM 202
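    The idea of selecting a sparse sub-network per distribution can be pictured with a toy example. The sketch below is not the T-SaS model: it assumes the regime index is already known (the paper infers it with a Bayesian discrete variable), uses a single linear layer, and hard-codes the masks instead of learning them. All names and values are hypothetical.

```python
def masked_forward(weights, mask, x):
    """Linear layer that uses only the weights whose mask entry is 1."""
    return [
        sum(w * m * xi for w, m, xi in zip(w_row, m_row, x))
        for w_row, m_row in zip(weights, mask)
    ]


# One shared 2x3 weight matrix (arbitrary values).
weights = [[0.5, -1.0, 2.0],
           [1.5, 0.25, -0.5]]

# One sparse binary mask per hypothetical regime.
# Both masks keep weights[0][0], so that weight is shared across regimes.
masks = {
    0: [[1, 1, 0], [0, 0, 1]],
    1: [[1, 0, 1], [1, 0, 0]],
}


def predict(regime, x):
    """Run the shared layer through the sub-network chosen for this regime."""
    return masked_forward(weights, masks[regime], x)


x = [1.0, 2.0, 3.0]
out0 = predict(0, x)  # sub-network for regime 0
out1 = predict(1, x)  # sub-network for regime 1
```

    Because the masks overlap, the shared weight is used under both regimes, which is one simple way to picture transfer between distributions through overlapping sparse networks.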