
    Python Wrapper for Simulating Multi-Fidelity Optimization on HPO Benchmarks without Any Wait

    Hyperparameter (HP) optimization of deep learning (DL) is essential for high performance. As DL training often requires several hours to days, HP optimization (HPO) of DL is often prohibitively expensive. This has spurred the emergence of tabular or surrogate benchmarks, which enable querying the (predictive) performance of DL with a specific HP configuration in a fraction of the time. However, since the actual runtimes of DL training differ significantly from query response times, a naive simulator of asynchronous HPO, e.g. multi-fidelity optimization, must wait for the actual runtimes at each iteration; otherwise, the evaluation order in the simulator does not match that of the real experiment. To ease this issue, we develop a Python wrapper that forces each worker to wait so that the evaluation order matches the real experiment, and we describe its usage. Our implementation reduces the waiting time to 0.01 seconds and is available at https://github.com/nabenabe0928/mfhpo-simulator/
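The ordering problem described in this abstract can be illustrated with a minimal sketch. It is not the mfhpo-simulator API: the function name `simulate_async_order` and its input format are hypothetical, and it assumes the runtimes returned by the benchmark are known per worker in advance. Each worker's simulated clock advances by the benchmark-reported runtime, and a priority queue yields results in the order a real asynchronous run would produce them, without sleeping.

```python
import heapq
from typing import List, Tuple


def simulate_async_order(
    runtimes_per_worker: List[List[float]],
) -> List[Tuple[float, int, int]]:
    """Return (finish_time, worker_id, job_index) in simulated completion order.

    Hypothetical helper: runtimes_per_worker[w][j] is the runtime that the
    benchmark reports for the j-th job evaluated by worker w.
    """
    # Heap entries: (simulated finish time, worker id, job index on that worker).
    heap: List[Tuple[float, int, int]] = []
    for worker_id, runtimes in enumerate(runtimes_per_worker):
        if runtimes:
            heapq.heappush(heap, (runtimes[0], worker_id, 0))

    order: List[Tuple[float, int, int]] = []
    while heap:
        # The worker with the smallest simulated finish time reports next.
        finish, worker_id, job = heapq.heappop(heap)
        order.append((finish, worker_id, job))
        nxt = job + 1
        if nxt < len(runtimes_per_worker[worker_id]):
            # The freed worker starts its next job at its own finish time.
            next_finish = finish + runtimes_per_worker[worker_id][nxt]
            heapq.heappush(heap, (next_finish, worker_id, nxt))
    return order


if __name__ == "__main__":
    # Worker 0 runs one slow job then a fast one; worker 1 runs three fast jobs.
    for finish, worker_id, job in simulate_async_order([[3.0, 1.0], [1.0, 1.0, 1.0]]):
        print(f"t={finish:.1f}: worker {worker_id} finished job {job}")
```

In a real optimizer the next configuration depends on the results observed so far, so the runtimes would be produced one at a time; the bookkeeping of simulated clocks stays the same.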

    Tree-structured Parzen estimator: Understanding its algorithm components and their roles for better empirical performance

    Recent advances in many domains require more and more complicated experiment design. Such complicated experiments often have many parameters, which necessitate parameter tuning. Tree-structured Parzen estimator (TPE), a Bayesian optimization method, is widely used in recent parameter tuning frameworks. Despite its popularity, the roles of each control parameter and the algorithm intuition have not been discussed so far. In this tutorial, we will identify the roles of each control parameter and their impacts on hyperparameter optimization using a diverse set of benchmarks. We compare our recommended setting drawn from the ablation study with baseline methods and demonstrate that our recommended setting improves the performance of TPE. Our TPE implementation is available at https://github.com/nabenabe0928/tpe/tree/single-opt

    c-TPE: Tree-structured Parzen Estimator with Inequality Constraints for Expensive Hyperparameter Optimization

    Hyperparameter optimization (HPO) is crucial for strong performance of deep learning algorithms, and real-world applications often impose constraints, such as memory usage or latency, on top of the performance requirement. In this work, we propose constrained TPE (c-TPE), an extension of the widely used, versatile Bayesian optimization method, tree-structured Parzen estimator (TPE), to handle these constraints. Our proposed extension goes beyond a simple combination of an existing acquisition function and the original TPE, and instead includes modifications that address issues that cause poor performance. We thoroughly analyze these modifications both empirically and theoretically, providing insights into how they effectively overcome these challenges. In the experiments, we demonstrate that c-TPE exhibits the best average rank performance among existing methods with statistical significance on 81 expensive HPO settings. Comment: Accepted to IJCAI 202

    Multi-objective Tree-structured Parzen Estimator Meets Meta-learning

    Hyperparameter optimization (HPO) is essential for the better performance of deep learning, and practitioners often need to consider the trade-off between multiple metrics, such as error rate, latency, memory requirements, robustness, and algorithmic fairness. Due to this demand and the heavy computation of deep learning, the acceleration of multi-objective (MO) optimization becomes ever more important. Although meta-learning has been extensively studied to speed up HPO, existing methods are not applicable to the MO tree-structured Parzen estimator (MO-TPE), a simple yet powerful MO-HPO algorithm. In this paper, we extend TPE's acquisition function to the meta-learning setting, using a task similarity defined by the overlap in promising domains of each task. In a comprehensive set of experiments, we demonstrate that our method accelerates MO-TPE on tabular HPO benchmarks and yields state-of-the-art performance. Our method was also validated externally by winning the AutoML 2022 competition on "Multiobjective Hyperparameter Optimization for Transformers". Comment: Meta-learning workshop on NeurIPS 202

    Features of ice sheet flow in East Dronning Maud Land, East Antarctica

    The Japanese Antarctic Research Expeditions (JAREs) have carried out glaciological studies on ice sheet dynamics and surface mass balance in East Dronning Maud Land, mainly around the Shirase Glacier drainage basin, for more than 30 years. The surface mass balance, obtained mainly by the snow stake method, was more than 250 mm/a in the coastal region, less than 50 mm/a in the inland region higher than 3500 m in altitude, and about 100 mm/a on average in the five drainage basins in East Dronning Maud Land. The ice flow velocity was observed around East Dronning Maud Land in three observation periods: on a route transverse to the Shirase Glacier flow in 1969 to 1974, along a route longitudinal to Shirase Glacier and a transverse route from Mizuho Station (70°42'S, 44°17'E, 2250 m a.s.l.) to the Sør Rondane Mountains area in 1982 to 1987, and along a route from S16 (69°02'S, 40°03'E, 554 m a.s.l.) near the coast to Dome Fuji Station (77°19'S, 39°42'E, 3810 m a.s.l.) in 1992 to 1995. Assuming steady ice flow, the balance velocity is calculated by integrating the surface mass balance in the upstream area from a specific point to the flow origin between adjacent stream lines. From the relation between balance velocity and basal shear stress, the basal sliding area was specified.
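The balance-velocity calculation mentioned in this abstract can be written out as a short formula. This is the standard textbook form, not necessarily the exact expression used by the authors, and all symbols are assumptions introduced here: $b$ is the surface mass balance expressed as ice-equivalent thickness per year, $A$ is the upstream area bounded by two adjacent stream lines between the point and the flow origin, $W$ is the distance between those stream lines at the point, and $H$ is the ice thickness there.

```latex
% Steady state: the flux through the gate of width W and thickness H
% equals the mass accumulated over the upstream area A.
\[
  \bar{v}_b \, W H = \int_A b \, \mathrm{d}A
  \qquad\Longrightarrow\qquad
  \bar{v}_b = \frac{1}{W H} \int_A b \, \mathrm{d}A
\]
```

Here $\bar{v}_b$ is the depth-averaged balance velocity. If $b$ is given in mm/a of water equivalent, it would first need converting to ice equivalent using the density ratio.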

    Combinatorial perturbation analysis reveals divergent regulations of mesenchymal genes during epithelial-to-mesenchymal transition

    Epithelial-to-mesenchymal transition (EMT), a fundamental transdifferentiation process in development, produces diverse phenotypes in different physiological or pathological conditions. Many genes involved in EMT have been identified to date, but mechanisms contributing to the phenotypic diversity and those governing the coupling between the dynamics of epithelial (E) genes and that of the mesenchymal (M) genes are unclear. In this study, we applied combinatorial perturbations to mammary epithelial cells to induce a series of EMT phenotypes by manipulating two essential EMT-inducing elements, namely TGF-β and ZEB1. By measuring transcriptional changes in more than 700 E-genes and M-genes, we discovered that the M-genes exhibit a significant diversity in their dependence on these regulatory elements and identified three groups of M-genes that are controlled by different regulatory circuits. Notably, functional differences were detected among the M-gene clusters in motility regulation and in survival of breast cancer patients. We computationally predicted and experimentally confirmed that the reciprocity and reversibility of EMT are jointly regulated by ZEB1. Our integrative analysis reveals the key roles of ZEB1 in coordinating the dynamics of a large number of genes during EMT, and it provides new insights into the mechanisms for the diversity of EMT phenotypes.

    ZZ-Interaction-Free Single-Qubit-Gate Optimization in Superconducting Qubits

    Overcoming the issue of qubit-frequency fluctuations is essential to realize stable and practical quantum computing with solid-state qubits. Static ZZ interaction, which causes a frequency shift of a qubit depending on the state of neighboring qubits, is one of the major obstacles to integrating fixed-frequency transmon qubits. Here we propose and experimentally demonstrate ZZ-interaction-free single-qubit-gate operations on a superconducting transmon qubit by utilizing a semi-analytically optimized pulse based on a perturbative analysis. The gate is designed to be robust against slow qubit-frequency fluctuations. The robustness of the optimized gate spans a few MHz, which is sufficient for suppressing the adverse effects of the ZZ interaction. Our result paves the way for an efficient approach to overcoming the issue of ZZ interaction without any additional hardware overhead. Comment: 6 pages, 2 figures plus Supplementary Information (4 pages, 2 figures)