69 research outputs found
Basic Enhancement Strategies When Using Bayesian Optimization for Hyperparameter Tuning of Deep Neural Networks
Compared to the traditional machine learning models, deep neural networks (DNN) are known to be highly sensitive to the choice of hyperparameters. While the required time and effort for manual tuning has been rapidly decreasing for the well developed and commonly used DNN architectures, undoubtedly DNN hyperparameter optimization will continue to be a major burden whenever a new DNN architecture needs to be designed, a new task needs to be solved, a new dataset needs to be addressed, or an existing DNN needs to be improved further. For hyperparameter optimization of general machine learning problems, numerous automated solutions have been developed where some of the most popular solutions are based on Bayesian Optimization (BO). In this work, we analyze four fundamental strategies for enhancing BO when it is used for DNN hyperparameter optimization. Specifically, diversification, early termination, parallelization, and cost function transformation are investigated. Based on the analysis, we provide a simple yet robust algorithm for DNN hyperparameter optimization - DEEP-BO (Diversified, Early-termination-Enabled, and Parallel Bayesian Optimization). When evaluated over six DNN benchmarks, DEEP-BO mostly outperformed well-known solutions including GP-Hedge, BOHB, and the speed-up variants that use Median Stopping Rule or Learning Curve Extrapolation. In fact, DEEP-BO consistently provided the top, or at least close to the top, performance over all the benchmark types that we have tested. This indicates that DEEP-BO is a robust solution compared to the existing solutions. The DEEP-BO code is publicly available at https://github.com/snu-adsl/DEEP-BO
An Image Grid Can Be Worth a Video: Zero-shot Video Question Answering Using a VLM
Stimulated by the sophisticated reasoning capabilities of recent Large
Language Models (LLMs), a variety of strategies for bridging video modality
have been devised. A prominent strategy involves Video Language Models
(VideoLMs), which train a learnable interface with video data to connect
advanced vision encoders with LLMs. Recently, an alternative strategy has
surfaced, employing readily available foundation models, such as VideoLMs and
LLMs, across multiple stages for modality bridging. In this study, we introduce
a simple yet novel strategy where only a single Vision Language Model (VLM) is
utilized. Our starting point is the plain insight that a video comprises a
series of images, or frames, interwoven with temporal information. The essence
of video comprehension lies in adeptly managing the temporal aspects along with
the spatial details of each frame. Initially, we transform a video into a
single composite image by arranging multiple frames in a grid layout. The
resulting single image is termed an image grid. This format, while
maintaining the appearance of a solitary image, effectively retains temporal
information within the grid structure. Therefore, the image grid approach
enables direct application of a single high-performance VLM without
necessitating any video-data training. Our extensive experimental analysis
across ten zero-shot video question answering benchmarks, including five
open-ended and five multiple-choice benchmarks, reveals that the proposed Image
Grid Vision Language Model (IG-VLM) surpasses the existing methods in nine out
of ten benchmarks. Comment: Our code is available at https://github.com/imagegridworth/IG-VL
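As a rough illustration of the image grid idea described in this abstract, the sketch below tiles frames row by row into one composite image. It assumes equally sized grayscale frames stored as nested lists, and the helper name make_image_grid and the padding behaviour are hypothetical; it is not the IG-VLM implementation.

```python
from typing import List

Frame = List[List[int]]  # one frame as rows of pixel values (assumed grayscale)


def make_image_grid(frames: List[Frame], cols: int, pad_value: int = 0) -> Frame:
    """Tile equally sized frames row by row into a single composite image."""
    if not frames or cols <= 0:
        raise ValueError("need at least one frame and a positive column count")
    height, width = len(frames[0]), len(frames[0][0])
    if any(len(f) != height or any(len(r) != width for r in f) for f in frames):
        raise ValueError("all frames must share the same size")

    rows = -(-len(frames) // cols)  # ceiling division
    # Pad the last grid row with blank frames so every row has `cols` tiles.
    blank = [[pad_value] * width for _ in range(height)]
    padded = frames + [blank] * (rows * cols - len(frames))

    grid: Frame = []
    for r in range(rows):
        row_frames = padded[r * cols:(r + 1) * cols]
        for y in range(height):
            line: List[int] = []
            for f in row_frames:
                line.extend(f[y])  # concatenate scanline y of each tile
            grid.append(line)
    return grid


# Six 2x2 frames, each filled with its index, arranged in a 2x3 grid.
frames = [[[i, i], [i, i]] for i in range(1, 7)]
grid = make_image_grid(frames, cols=3)
```

Reading the grid left to right and top to bottom follows the frame order, which is how a single still image can carry the temporal sequence.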
Evaluating Feature Attribution Methods for Electrocardiogram
The performance of cardiac arrhythmia detection with electrocardiograms (ECGs)
has been considerably improved since the introduction of deep learning models.
In practice, the high performance alone is not sufficient and a proper
explanation is also required. Recently, researchers have started adopting
feature attribution methods to address this requirement, but it has been
unclear which of the methods are appropriate for ECG. In this work, we identify
and customize three evaluation metrics for feature attribution methods based on
the characteristics of ECG: localization score, pointing game, and degradation
score. Using the three evaluation metrics, we evaluate and analyze eleven
widely-used feature attribution methods. We find that some of the feature
attribution methods are much more adequate for explaining ECG, where Grad-CAM
outperforms the second-best method by a large margin. Comment: 5 pages, 3 figures. Code is available at
https://github.com/SNU-DRL/Attribution-EC
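For readers unfamiliar with the pointing game named in this abstract, the sketch below shows the commonly used form of the metric on a 1D signal: a sample counts as a hit when the highest-attribution position falls inside the annotated region. The function names and the half-open (start, end) region format are assumptions, and the ECG-specific customization in the paper may differ.

```python
from typing import List, Sequence, Tuple


def pointing_game_hit(attribution: Sequence[float], region: Tuple[int, int]) -> bool:
    """Return True if the peak attribution index lies in [start, end)."""
    start, end = region
    peak = max(range(len(attribution)), key=lambda i: attribution[i])
    return start <= peak < end


def pointing_game_accuracy(
    attributions: List[Sequence[float]], regions: List[Tuple[int, int]]
) -> float:
    """Fraction of samples whose attribution peak hits the annotated region."""
    hits = [pointing_game_hit(a, r) for a, r in zip(attributions, regions)]
    return sum(hits) / len(hits)


# Toy example: the first peak (index 1) is inside its region, the second (index 0) is not.
attributions = [[0.1, 0.9, 0.2, 0.0], [0.5, 0.1, 0.1, 0.4]]
regions = [(1, 3), (2, 4)]
accuracy = pointing_game_accuracy(attributions, regions)
```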
Isotropic Representation Can Improve Dense Retrieval
The recent advancement in language representation modeling has broadly
affected the design of dense retrieval models. In particular, many of the
high-performing dense retrieval models evaluate representations of query and
document using BERT, and subsequently apply a cosine-similarity based scoring
to determine the relevance. BERT representations, however, are known to follow
an anisotropic distribution of a narrow cone shape and such an anisotropic
distribution can be undesirable for the cosine-similarity based scoring. In
this work, we first show that BERT-based DR also follows an anisotropic
distribution. To cope with the problem, we introduce unsupervised
post-processing methods of Normalizing Flow and whitening, and develop
token-wise method in addition to the sequence-wise method for applying the
post-processing methods to the representations of dense retrieval models. We
show that the proposed methods can effectively enhance the representations to
be isotropic, then we perform experiments with ColBERT and RepBERT to show that
the performance (NDCG at 10) of document re-ranking can be improved by
5.17% to 8.09% for ColBERT and 6.88% to 22.81% for RepBERT. To examine
the potential of isotropic representation for improving the robustness of DR
models, we investigate out-of-distribution tasks where the test dataset differs
from the training dataset. The results show that isotropic representation can
achieve a generally improved performance. For instance, when training dataset
is MS-MARCO and test dataset is Robust04, isotropy post-processing can improve
the baseline performance by up to 24.98%. Furthermore, we show that an
isotropic model trained with an out-of-distribution dataset can even outperform
a baseline model trained with the in-distribution dataset. Comment: 9 pages, 4 figure
DR.CPO: Diversified and Realistic 3D Augmentation via Iterative Construction, Random Placement, and HPR Occlusion
In autonomous driving, data augmentation is commonly used for improving 3D
object detection. The most basic methods include insertion of copied objects
and rotation and scaling of the entire training frame. Numerous variants have
been developed as well. The existing methods, however, are considerably limited
when compared to the variety of the real world possibilities. In this work, we
develop a diversified and realistic augmentation method that can flexibly
construct a whole-body object, freely locate and rotate the object, and apply
self-occlusion and external-occlusion accordingly. To improve the diversity of
the whole-body object construction, we develop an iterative method that
stochastically combines multiple objects observed from the real world into a
single object. Unlike the existing augmentation methods, the constructed
objects can be randomly located and rotated in the training frame because
proper occlusions can be reflected to the whole-body objects in the final step.
Finally, proper self-occlusion at each local object level and
external-occlusion at the global frame level are applied using the Hidden Point
Removal (HPR) algorithm that is computationally efficient. HPR is also used for
adaptively controlling the point density of each object according to the
object's distance from the LiDAR. Experiment results show that the proposed
DR.CPO algorithm is data-efficient and model-agnostic without incurring any
computational overhead. Also, DR.CPO can improve mAP performance by 2.08% when
compared to the best 3D detection result known for KITTI dataset. The code is
available at https://github.com/SNU-DRL/DRCPO.gi
Meta-Learning with a Geometry-Adaptive Preconditioner
Model-agnostic meta-learning (MAML) is one of the most successful
meta-learning algorithms. It has a bi-level optimization structure where the
outer-loop process learns a shared initialization and the inner-loop process
optimizes task-specific weights. Although MAML relies on the standard gradient
descent in the inner-loop, recent studies have shown that controlling the
inner-loop's gradient descent with a meta-learned preconditioner can be
beneficial. Existing preconditioners, however, cannot simultaneously adapt in a
task-specific and path-dependent way. Additionally, they do not satisfy the
Riemannian metric condition, which can enable the steepest descent learning
with preconditioned gradient. In this study, we propose Geometry-Adaptive
Preconditioned gradient descent (GAP) that can overcome the limitations in
MAML; GAP can efficiently meta-learn a preconditioner that is dependent on
task-specific parameters, and its preconditioner can be shown to be a
Riemannian metric. Thanks to the two properties, the geometry-adaptive
preconditioner is effective for improving the inner-loop optimization.
Experiment results show that GAP outperforms the state-of-the-art MAML family
and preconditioned gradient descent-MAML (PGD-MAML) family in a variety of
few-shot learning tasks. Code is available at:
https://github.com/Suhyun777/CVPR23-GAP. Comment: Accepted at CVPR 2023. Code is available at:
https://github.com/Suhyun777/CVPR23-GAP; This is an extended version of our
previous CVPR23 wor
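To make the notion of a preconditioned inner-loop step concrete, the sketch below applies one update of the form theta - lr * P * grad with a fixed diagonal, positive preconditioner P on a toy quadratic loss. The diagonal form, the function names, and the toy loss are assumptions for illustration only; GAP meta-learns its preconditioner, and this is not its construction.

```python
from typing import List


def preconditioned_step(
    theta: List[float], grad: List[float], precond_diag: List[float], lr: float
) -> List[float]:
    """One gradient step scaled elementwise by a positive diagonal preconditioner."""
    if any(p <= 0 for p in precond_diag):
        raise ValueError("a diagonal preconditioner must be strictly positive")
    return [t - lr * p * g for t, g, p in zip(theta, grad, precond_diag)]


def quadratic_grad(theta: List[float], curvature: List[float]) -> List[float]:
    """Gradient of the toy loss 0.5 * sum(c_i * theta_i ** 2)."""
    return [c * t for c, t in zip(curvature, theta)]


curvature = [1.0, 100.0]                  # badly conditioned toy problem
theta = [1.0, 1.0]
precond = [1.0 / c for c in curvature]    # inverse curvature as an idealized choice
grad = quadratic_grad(theta, curvature)

plain = preconditioned_step(theta, grad, [1.0, 1.0], lr=0.01)   # ordinary gradient descent
scaled = preconditioned_step(theta, grad, precond, lr=1.0)      # preconditioned step
```

On this toy loss the preconditioned step reaches the minimum in one update, while plain gradient descent with a step size small enough to stay stable barely moves the low-curvature coordinate.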
Enhancing Contrastive Learning with Efficient Combinatorial Positive Pairing
In the past few years, contrastive learning has played a central role for the
success of visual unsupervised representation learning. Around the same time,
high-performance non-contrastive learning methods have been developed as well.
While most of the works utilize only two views, we carefully review the
existing multi-view methods and propose a general multi-view strategy that can
improve learning speed and performance of any contrastive or non-contrastive
method. We first analyze CMC's full-graph paradigm and empirically show that
the learning speed of K-views can be increased by C(K, 2) times
for small learning rate and early training. Then, we upgrade CMC's full-graph
by mixing views created by a crop-only augmentation, adopting small-size views
as in SwAV multi-crop, and modifying the negative sampling. The resulting
multi-view strategy is called ECPP (Efficient Combinatorial Positive Pairing).
We investigate the effectiveness of ECPP by applying it to SimCLR and assessing
the linear evaluation performance for CIFAR-10 and ImageNet-100. For each
benchmark, we achieve a state-of-the-art performance. In case of ImageNet-100,
ECPP-boosted SimCLR outperforms supervised learning
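In a full-graph multi-view setup, every pair of views of the same image can act as a positive pair, so K views yield K(K-1)/2 pairs. The sketch below only enumerates those pairs; the view labels and the function name are hypothetical, and it does not reproduce ECPP's augmentation or negative sampling.

```python
from itertools import combinations
from math import comb
from typing import List, Sequence, Tuple


def positive_pairs(views: Sequence[str]) -> List[Tuple[str, str]]:
    """Enumerate every unordered pair of views (full-graph pairing)."""
    return list(combinations(views, 2))


views = ["v1", "v2", "v3", "v4"]
pairs = positive_pairs(views)
num_pairs = len(pairs)  # equals comb(len(views), 2)
```

With two views there is a single positive pair, while four views already give six, which is why the number of pairing terms grows quadratically with the number of views.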
Time-resolved open-circuit conductive atomic force microscopy for direct electromechanical characterisation.
Studying nanomaterial piezoelectricity and triboelectricity is attractive for energy and sensing applications. However, quantitative characterisation of electromechanical effects in nanomaterials is challenging due to practical limitations and possible combination of effects, resulting in contradicting reports at times. When it comes to piezoelectricity at the nanoscale, piezoresponse force microscopy (PFM) is the default characterisation tool. In PFM the converse piezoelectric effect is measured - the conversion from electrical signal to mechanical response. However, there is an underlying desire to measure the direct piezoelectric effect - conversion of mechanical deformation to an electrical signal. This corresponds to energy harvesting and sensing. Here we present time-resolved open-circuit conductive atomic force microscopy (cAFM) as a new methodology to carry out direct electromechanical characterisation. We show, both theoretically and experimentally, that the standard short-circuit cAFM mode is inadequate for piezoelectric characterisation, and that resulting measurements are governed by competing mechanisms. We apply the new methodology to nanowires of GaAs, an important semiconductor, with relatively low piezoelectric coefficients. The results suggest that time-resolved operation distinguishes between triboelectric and piezoelectric signals, and that by measuring the open-circuit voltage rather than short-circuit current, the new methodology allows quantitative characterisation of the vertical piezoelectric coefficient. The result for GaAs nanowires, ∼1-3 pm V⁻¹, is in good agreement with existing knowledge and theory. This method represents a significant advance in understanding the coexistence of different electromechanical effects, and in quantitative piezoelectric nanoscale characterisation. The easy implementation will enable better understanding of electromechanics at the nanoscale
A systemic review and meta-analysis on the antihypertensive effect of aromatherapy essential oils
Purpose: This study is a systemic review of experimental results on the effects of aromatherapy on blood pressure.
Materials and Methods: Journal articles published through December 2017 were retrieved from twelve databases. Randomized controlled trials that evaluated changes in blood pressure following aromatherapy were selected. Risks of bias were assessed using the risk-of-bias (ROB) tool of the Cochrane Collaboration. Meta-analyses were performed using RevMan.
Results: Of the 2545 articles retrieved from the electronic databases, 580 duplicate articles and 1891 articles that were unrelated to the PICO (patient/problem, intervention, comparison, outcome) elements or did not satisfy the inclusion criteria were excluded. Of the remaining 74 articles, 15 were found to satisfy the inclusion criteria after full-text review and were therefore selected for analysis. The findings of the meta-analysis of 11 of these 15 articles revealed that essential oil inhalation and massage effectively decreased both systolic (n = 379; mean difference [MD], -4.72; 95% confidence interval [CI], -8.38 to -1.07) and diastolic (n = 379; MD, -2.42; 95% CI, -4.46 to -0.38) pressure.
Conclusions: Essential oil inhalation and massage therapy can effectively decrease systolic and diastolic pressure in healthy adults as well as in patients with hypertension
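The screening counts and intervals reported in the Results can be checked with a few lines of arithmetic. The sketch below uses only the numbers stated in the abstract; the variable names and the helper excludes_zero are illustrative.

```python
# Article screening flow, using the counts reported in the abstract.
retrieved = 2545
duplicates = 580
excluded_on_criteria = 1891
remaining_for_full_text = retrieved - duplicates - excluded_on_criteria


def excludes_zero(ci_low: float, ci_high: float) -> bool:
    """A 95% CI that does not contain 0 indicates a significant mean difference."""
    return not (ci_low <= 0.0 <= ci_high)


# Reported mean differences with their 95% confidence intervals.
systolic = {"md": -4.72, "ci": (-8.38, -1.07)}
diastolic = {"md": -2.42, "ci": (-4.46, -0.38)}

systolic_significant = excludes_zero(*systolic["ci"])
diastolic_significant = excludes_zero(*diastolic["ci"])
```

The subtraction reproduces the 74 articles that went to full-text review, and both reported intervals lie entirely below zero, consistent with the stated decrease in pressure.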