64 research outputs found
Basic Enhancement Strategies When Using Bayesian Optimization for Hyperparameter Tuning of Deep Neural Networks
Compared to traditional machine learning models, deep neural networks (DNN) are known to be highly sensitive to the choice of hyperparameters. While the time and effort required for manual tuning have been rapidly decreasing for well-developed and commonly used DNN architectures, DNN hyperparameter optimization will undoubtedly continue to be a major burden whenever a new DNN architecture needs to be designed, a new task needs to be solved, a new dataset needs to be addressed, or an existing DNN needs to be improved further. For hyperparameter optimization of general machine learning problems, numerous automated solutions have been developed, and some of the most popular are based on Bayesian Optimization (BO). In this work, we analyze four fundamental strategies for enhancing BO when it is used for DNN hyperparameter optimization: diversification, early termination, parallelization, and cost function transformation. Based on the analysis, we provide a simple yet robust algorithm for DNN hyperparameter optimization, DEEP-BO (Diversified, Early-termination-Enabled, and Parallel Bayesian Optimization). When evaluated over six DNN benchmarks, DEEP-BO mostly outperformed well-known solutions including GP-Hedge, BOHB, and the speed-up variants that use Median Stopping Rule or Learning Curve Extrapolation. In fact, DEEP-BO consistently provided the top, or at least close to the top, performance over all the benchmark types that we have tested. This indicates that DEEP-BO is a robust solution compared to the existing solutions. The DEEP-BO code is publicly available at https://github.com/snu-adsl/DEEP-BO
Evaluating Feature Attribution Methods for Electrocardiogram
The performance of cardiac arrhythmia detection with electrocardiograms (ECGs)
has been considerably improved since the introduction of deep learning models.
In practice, the high performance alone is not sufficient and a proper
explanation is also required. Recently, researchers have started adopting
feature attribution methods to address this requirement, but it has been
unclear which of the methods are appropriate for ECG. In this work, we identify
and customize three evaluation metrics for feature attribution methods based on
the characteristics of ECG: localization score, pointing game, and degradation
score. Using the three evaluation metrics, we evaluate and analyze eleven
widely-used feature attribution methods. We find that some of the feature
attribution methods are much more adequate for explaining ECG, where Grad-CAM
outperforms the second-best method by a large margin. Comment: 5 pages, 3 figures. Code is available at
https://github.com/SNU-DRL/Attribution-EC
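As a rough illustration of one of the three metrics named above, the pointing game in its generic form counts a hit when the sample with the highest attribution falls inside an annotated region. The sketch below is a minimal version under that assumption; the function names, the toy attribution maps, and the half-open region convention are all hypothetical, and the abstract does not say how the authors customized the metric for ECG.

```python
def pointing_game_hit(attribution, region):
    """Return True if the highest-attribution sample lies inside the region.

    `region` is a hypothetical (start, end) pair of sample indices,
    with start inclusive and end exclusive.
    """
    start, end = region
    peak = max(range(len(attribution)), key=lambda i: attribution[i])
    return start <= peak < end


def pointing_game_accuracy(attributions, regions):
    """Fraction of signals whose attribution peak lands in its region."""
    hits = sum(pointing_game_hit(a, r) for a, r in zip(attributions, regions))
    return hits / len(attributions)


# Toy 1-D attribution maps (illustrative values, not data from the paper).
attributions = [
    [0.1, 0.2, 0.9, 0.3, 0.1],  # peak at index 2
    [0.8, 0.1, 0.1, 0.2, 0.1],  # peak at index 0
]
regions = [(1, 4), (2, 5)]  # annotated regions, assumed for illustration

accuracy = pointing_game_accuracy(attributions, regions)
```

Here the first map peaks inside its region and the second does not, so one of the two signals counts as a hit.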
Isotropic Representation Can Improve Dense Retrieval
The recent advancement in language representation modeling has broadly
affected the design of dense retrieval models. In particular, many of the
high-performing dense retrieval models evaluate representations of query and
document using BERT, and subsequently apply a cosine-similarity based scoring
to determine the relevance. BERT representations, however, are known to follow
an anisotropic distribution of a narrow cone shape and such an anisotropic
distribution can be undesirable for the cosine-similarity based scoring. In
this work, we first show that BERT-based DR also follows an anisotropic
distribution. To cope with the problem, we introduce unsupervised
post-processing methods of Normalizing Flow and whitening, and develop
a token-wise method in addition to the sequence-wise method for applying the
post-processing methods to the representations of dense retrieval models. We
show that the proposed methods can effectively enhance the representations to
be isotropic, then we perform experiments with ColBERT and RepBERT to show that
the performance (NDCG at 10) of document re-ranking can be improved by
5.17% to 8.09% for ColBERT and 6.88% to 22.81% for RepBERT. To examine
the potential of isotropic representation for improving the robustness of DR
models, we investigate out-of-distribution tasks where the test dataset differs
from the training dataset. The results show that isotropic representation can
achieve a generally improved performance. For instance, when the training dataset
is MS-MARCO and the test dataset is Robust04, isotropy post-processing can improve
the baseline performance by up to 24.98%. Furthermore, we show that an
isotropic model trained with an out-of-distribution dataset can even outperform
a baseline model trained with the in-distribution dataset. Comment: 9 pages, 4 figure
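To make the whitening idea concrete, here is a minimal NumPy sketch of generic whitening applied to a set of vectors. It is not the authors' pipeline: the synthetic "embeddings", the helper names, and the `eps` term are assumptions for illustration, and the sequence-wise and token-wise variants described in the abstract are not reproduced.

```python
import numpy as np


def fit_whitening(embeddings, eps=1e-8):
    """Estimate a mean and a linear map that give unit covariance."""
    mean = embeddings.mean(axis=0, keepdims=True)
    cov = np.cov(embeddings - mean, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)  # cov is symmetric
    eigvals = np.maximum(eigvals, 0.0)      # guard against tiny negatives
    transform = eigvecs @ np.diag(1.0 / np.sqrt(eigvals + eps))
    return mean, transform


def apply_whitening(embeddings, mean, transform):
    """Center the vectors and rotate/scale them with the fitted map."""
    return (embeddings - mean) @ transform


def mean_pairwise_cosine(x):
    """Average cosine similarity over all distinct pairs of rows."""
    unit = x / np.linalg.norm(x, axis=1, keepdims=True)
    n = unit.shape[0]
    total = np.sum(unit.sum(axis=0) ** 2)   # includes the n self-pairs
    return (total - n) / (n * (n - 1))


# Synthetic stand-in for anisotropic representations: a large shared
# offset plus unevenly scaled noise (assumed data, not from the paper).
rng = np.random.default_rng(0)
scales = np.array([3.0, 1.0, 1.0, 1.0, 0.5, 0.5, 0.2, 0.2])
embeddings = 5.0 * np.ones((1, 8)) + rng.normal(size=(2000, 8)) * scales

mean, transform = fit_whitening(embeddings)
whitened = apply_whitening(embeddings, mean, transform)

before = mean_pairwise_cosine(embeddings)
after = mean_pairwise_cosine(whitened)
```

On this toy data the average pairwise cosine similarity is high before whitening, because every vector shares the same offset, and close to zero afterwards, which is the behaviour a cosine-based scorer benefits from.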
DR.CPO: Diversified and Realistic 3D Augmentation via Iterative Construction, Random Placement, and HPR Occlusion
In autonomous driving, data augmentation is commonly used for improving 3D
object detection. The most basic methods include insertion of copied objects
and rotation and scaling of the entire training frame. Numerous variants have
been developed as well. The existing methods, however, are considerably limited
when compared to the variety of real-world possibilities. In this work, we
develop a diversified and realistic augmentation method that can flexibly
construct a whole-body object, freely locate and rotate the object, and apply
self-occlusion and external-occlusion accordingly. To improve the diversity of
the whole-body object construction, we develop an iterative method that
stochastically combines multiple objects observed from the real world into a
single object. Unlike the existing augmentation methods, the constructed
objects can be randomly located and rotated in the training frame because
proper occlusions can be reflected to the whole-body objects in the final step.
Finally, proper self-occlusion at each local object level and
external-occlusion at the global frame level are applied using the Hidden Point
Removal (HPR) algorithm that is computationally efficient. HPR is also used for
adaptively controlling the point density of each object according to the
object's distance from the LiDAR. Experimental results show that the proposed
DR.CPO algorithm is data-efficient and model-agnostic without incurring any
computational overhead. Also, DR.CPO can improve mAP performance by 2.08% when
compared to the best 3D detection result known for the KITTI dataset. The code is
available at https://github.com/SNU-DRL/DRCPO.gi
Meta-Learning with a Geometry-Adaptive Preconditioner
Model-agnostic meta-learning (MAML) is one of the most successful
meta-learning algorithms. It has a bi-level optimization structure where the
outer-loop process learns a shared initialization and the inner-loop process
optimizes task-specific weights. Although MAML relies on the standard gradient
descent in the inner-loop, recent studies have shown that controlling the
inner-loop's gradient descent with a meta-learned preconditioner can be
beneficial. Existing preconditioners, however, cannot simultaneously adapt in a
task-specific and path-dependent way. Additionally, they do not satisfy the
Riemannian metric condition, which can enable the steepest descent learning
with preconditioned gradient. In this study, we propose Geometry-Adaptive
Preconditioned gradient descent (GAP) that can overcome the limitations in
MAML; GAP can efficiently meta-learn a preconditioner that is dependent on
task-specific parameters, and its preconditioner can be shown to be a
Riemannian metric. Thanks to the two properties, the geometry-adaptive
preconditioner is effective for improving the inner-loop optimization.
Experimental results show that GAP outperforms the state-of-the-art MAML family
and preconditioned gradient descent-MAML (PGD-MAML) family in a variety of
few-shot learning tasks. Code is available at:
https://github.com/Suhyun777/CVPR23-GAP. Comment: Accepted at CVPR 2023. This is an extended version of our
previous CVPR23 wor
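The role of a preconditioner in an inner-loop update can be sketched with a toy quadratic loss. In the example below the preconditioner is a fixed, hand-picked positive diagonal matrix; this is an assumption made purely for illustration, whereas GAP meta-learns a preconditioner that depends on task-specific parameters. The loss, the function names, and the step sizes are likewise hypothetical.

```python
def grad_quadratic(theta, curvature):
    """Gradient of L(theta) = 0.5 * sum(c_i * theta_i ** 2)."""
    return [c * t for c, t in zip(curvature, theta)]


def preconditioned_step(theta, grad, precond_diag, lr):
    """One update theta <- theta - lr * P * grad, with P diagonal.

    All entries of precond_diag must be positive so that P is
    positive definite and the step remains a descent direction.
    """
    return [t - lr * p * g for t, p, g in zip(theta, precond_diag, grad)]


curvature = [10.0, 1.0]  # ill-conditioned toy loss (assumed)
theta = [1.0, 1.0]
grad = grad_quadratic(theta, curvature)

# Plain gradient descent is the special case P = identity.
plain = preconditioned_step(theta, grad, [1.0, 1.0], lr=0.1)

# A preconditioner matched to the curvature equalizes the step per axis.
matched = preconditioned_step(
    theta, grad, [1.0 / c for c in curvature], lr=1.0
)
```

With the identity preconditioner the step size is capped by the steepest axis, so the flat axis barely moves; with the curvature-matched preconditioner a single step reaches the minimum of this toy loss on both axes.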
Time-resolved open-circuit conductive atomic force microscopy for direct electromechanical characterisation.
Studying nanomaterial piezoelectricity and triboelectricity is attractive for energy and sensing applications. However, quantitative characterisation of electromechanical effects in nanomaterials is challenging due to practical limitations and possible combination of effects, resulting in contradicting reports at times. When it comes to piezoelectricity at the nanoscale, piezoresponse force microscopy (PFM) is the default characterisation tool. In PFM the converse piezoelectric effect is measured: the conversion from electrical signal to mechanical response. However, there is an underlying desire to measure the direct piezoelectric effect: the conversion of mechanical deformation to an electrical signal. This corresponds to energy harvesting and sensing. Here we present time-resolved open-circuit conductive atomic force microscopy (cAFM) as a new methodology to carry out direct electromechanical characterisation. We show, both theoretically and experimentally, that the standard short-circuit cAFM mode is inadequate for piezoelectric characterisation, and that resulting measurements are governed by competing mechanisms. We apply the new methodology to nanowires of GaAs, an important semiconductor, with relatively low piezoelectric coefficients. The results suggest that time-resolved operation distinguishes between triboelectric and piezoelectric signals, and that by measuring the open-circuit voltage rather than short-circuit current, the new methodology allows quantitative characterisation of the vertical piezoelectric coefficient. The result for GaAs nanowires, ∼1-3 pm V⁻¹, is in good agreement with existing knowledge and theory. This method represents a significant advance in understanding the coexistence of different electromechanical effects, and in quantitative piezoelectric nanoscale characterisation. The easy implementation will enable better understanding of electromechanics at the nanoscale.
A systemic review and meta-analysis on the antihypertensive effect of aromatherapy essential oils
Purpose: This study is a systematic review of experimental results on the effects of aromatherapy on blood pressure.
Materials and Methods: Journal articles published up to December 2017 were retrieved from twelve databases. Randomized controlled trials that evaluated changes in blood pressure following aromatherapy were selected. Risks of bias were assessed using the risk-of-bias (ROB) tool of the Cochrane Collaboration. Meta-analyses were done using RevMan.
Results: Of the 2545 articles retrieved from the electronic databases, 580 duplicate articles and 1891 articles that were unrelated to the PICO (patient/problem, intervention, comparison, outcome) elements or did not satisfy the inclusion criteria were excluded. Of the remaining 74 articles, 15 were found to satisfy the inclusion criteria after full-text review and were therefore selected for analysis. The meta-analysis of 11 of these 15 articles revealed that essential oil inhalation and massage effectively decreased both systolic (n = 379; mean difference [MD], -4.72; 95% confidence interval [CI], -8.38 to -1.07) and diastolic (n = 379; MD, -2.42; 95% CI, -4.46 to -0.38) pressure.
Conclusions: Essential oil inhalation and massage therapy can effectively decrease systolic and diastolic pressure in healthy adults as well as in patients with hypertension.
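The screening counts and confidence intervals reported in the Results can be checked with a few lines of arithmetic. The sketch below uses only the figures quoted in the abstract; the variable and function names are hypothetical, and the check that an interval excludes zero is a generic reading of a 95% CI rather than a reproduction of the RevMan analysis.

```python
# Article counts quoted in the Results section.
retrieved = 2545
duplicates = 580
excluded_at_screening = 1891
included = 15

full_text_reviewed = retrieved - duplicates - excluded_at_screening
excluded_after_full_text = full_text_reviewed - included


def ci_summary(md, lower, upper):
    """Return (gap between MD and CI midpoint, whether the CI excludes 0)."""
    midpoint = (lower + upper) / 2
    excludes_zero = lower > 0 or upper < 0
    return abs(midpoint - md), excludes_zero


# Mean differences and 95% CIs quoted in the Results section.
systolic = ci_summary(-4.72, -8.38, -1.07)
diastolic = ci_summary(-2.42, -4.46, -0.38)
```

The subtraction reproduces the 74 articles taken to full-text review, and both intervals lie entirely below zero, which is consistent with the reported decrease in systolic and diastolic pressure.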
Engineering the Size Distributions of Ordered GaAs Nanowires on Silicon
Reproducible integration of III-V semiconductors on silicon can open a new path toward CMOS-compatible optoelectronics and novel design schemes in next-generation solar cells. Ordered arrays of nanowires could accomplish this task, provided they are obtained in high yield and uniformity. In this work, we provide an understanding of the physical factors affecting size uniformity in ordered GaAs arrays grown on silicon. We show that the length and diameter distributions in the initial stage of growth are not much influenced by the Poissonian fluctuation-induced broadening, but rather are determined by the long incubation stage. We also show that the size distributions are consistent with the double exponential shapes typical for macroscopic nucleation with a large critical length after which the nanowires grow irreversibly. The size uniformity is dramatically improved by increasing the As4 flux, suggesting a new path for obtaining highly uniform arrays of GaAs nanowires on silicon.
Optimizing the yield of A-polar GaAs nanowires to achieve defect-free zinc blende structure and enhanced optical functionality
Compound semiconductors exhibit an intrinsic polarity, as a consequence of the ionicity of their bonds. Nanowires grow mostly along the (111) direction for energetic reasons. Arsenide and phosphide nanowires grow along (111)B, implying a group V termination of the (111) bilayers. Polarity engineering provides an additional pathway to modulate the structural and optical properties of semiconductor nanowires. In this work, we demonstrate for the first time the growth of Ga-assisted GaAs nanowires with (111)A-polarity, with a yield of up to ∼50%. This goal is achieved by employing highly Ga-rich conditions which enable proper engineering of the energies of A and B-polar surfaces. We also show that A-polarity growth suppresses the stacking disorder along the growth axis. This results in improved optical properties, including the formation of AlGaAs quantum dots with two orders of magnitude higher brightness. Overall, this work provides new grounds for the engineering of nanowire growth directions, crystal quality and optical functionality.