Strong convexity-guided hyper-parameter optimization for flatter losses
We propose a novel white-box approach to hyper-parameter optimization.
Motivated by recent work establishing a relationship between flat minima and
generalization, we first establish a relationship between the strong convexity
of the loss and its flatness. Based on this, we seek to find hyper-parameter
configurations that improve flatness by minimizing the strong convexity of the
loss. By using the structure of the underlying neural network, we derive
closed-form equations to approximate the strong convexity parameter, and
attempt to find hyper-parameters that minimize it in a randomized fashion.
Through experiments on 14 classification datasets, we show that our method
achieves strong performance at a fraction of the runtime.
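The paper's closed-form approximation is tied to the network architecture, but the core idea can be sketched numerically: treat the strong convexity parameter as the smallest eigenvalue of the loss Hessian, then randomly search for the hyper-parameter value that minimizes it. A minimal sketch on a toy quadratic loss (the matrix `A`, the hyper-parameter `lam`, and the search range are illustrative assumptions, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

def strong_convexity(hessian):
    """Strong convexity parameter = smallest eigenvalue of the Hessian."""
    return float(np.linalg.eigvalsh(hessian)[0])  # eigenvalues sorted ascending

# Toy quadratic loss L(w) = 0.5 * w^T (A + lam*I) w, standing in for a
# network loss whose curvature depends on a regularization strength lam.
A = rng.standard_normal((5, 5))
A = A @ A.T  # positive semi-definite base curvature

def loss_hessian(lam):
    return A + lam * np.eye(5)

# Randomized search over the hyper-parameter lam, keeping the
# configuration with the smallest strong convexity (flattest loss).
candidates = rng.uniform(0.01, 1.0, size=20)
best = min(candidates, key=lambda lam: strong_convexity(loss_hessian(lam)))
```

Since the smallest eigenvalue of `A + lam*I` grows monotonically with `lam`, the search here simply recovers the smallest candidate; with a real network loss the dependence on the hyper-parameters is what the paper's closed-form equations approximate.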
Why neural networks find simple solutions: the many regularizers of geometric complexity
In many contexts, simpler models are preferable to more complex models, and
controlling this model complexity is the goal of many methods in machine
learning, such as regularization, hyperparameter tuning, and architecture design.
In deep learning, it has been difficult to understand the underlying mechanisms
of complexity control, since many traditional measures are not naturally
suitable for deep neural networks. Here we develop the notion of geometric
complexity, which is a measure of the variability of the model function,
computed using a discrete Dirichlet energy. Using a combination of theoretical
arguments and empirical results, we show that many common training heuristics
such as parameter norm regularization, spectral norm regularization, flatness
regularization, implicit gradient regularization, noise regularization and the
choice of parameter initialization all act to control geometric complexity,
providing a unifying framework in which to characterize the behavior of deep
learning models. Comment: Accepted as a NeurIPS 2022 paper.
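As a rough illustration of the measure (not the paper's exact estimator), geometric complexity can be approximated as a discrete Dirichlet energy: the mean squared norm of the model's input gradient over a data sample. The toy tanh model and the finite-difference gradient below are assumptions made for the sketch:

```python
import numpy as np

rng = np.random.default_rng(0)

def geometric_complexity(f, X, eps=1e-5):
    """Discrete Dirichlet energy: mean squared norm of the input gradient
    of f over a sample of points X, via central finite differences."""
    total = 0.0
    for x in X:
        grad = np.array([
            (f(x + eps * e) - f(x - eps * e)) / (2 * eps)
            for e in np.eye(len(x))
        ])
        total += float(np.sum(grad ** 2))
    return total / len(X)

# Toy one-output model: larger weights make the function more variable.
W_small, W_big = 0.1 * np.ones(3), 2.0 * np.ones(3)
X = rng.standard_normal((32, 3))

gc_small = geometric_complexity(lambda x: np.tanh(W_small @ x), X)
gc_big = geometric_complexity(lambda x: np.tanh(W_big @ x), X)
# The heavier-weighted model has higher geometric complexity, which is
# what parameter norm regularization implicitly penalizes.
```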
An explanatory machine learning framework for studying pandemics: The case of COVID-19 emergency department readmissions
One of the major challenges confronting medical experts during a pandemic is the time required to identify and validate the risk factors of the novel disease and to develop an effective treatment protocol. Traditionally, this process involves numerous clinical trials that may take up to several years, during which strict preventive measures must be in place to control the outbreak and reduce deaths. Advanced data analytics techniques, however, can be leveraged to guide and speed up this process. In this study, we combine evolutionary search algorithms, deep learning, and advanced model interpretation methods to develop a holistic exploratory-predictive-explanatory machine learning framework that can assist clinical decision-makers in reacting to the challenges of a pandemic in a timely manner. The proposed framework is showcased in studying emergency department (ED) readmissions of COVID-19 patients using ED visits from a real-world electronic health records database. After an exploratory feature-selection phase using a genetic algorithm, we develop and train a deep artificial neural network to predict early (i.e., 7-day) readmissions (AUC = 0.883). Lastly, a SHAP model is formulated to estimate additive Shapley values (i.e., importance scores) of the features and to interpret the magnitude and direction of their effects. The findings are mostly in line with those reported by lengthy and expensive clinical trial studies.
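The exploratory phase of such a framework can be sketched as a small genetic algorithm for feature selection. Everything below is illustrative: the synthetic data stands in for the EHR records, and a least-squares classifier is a cheap proxy for the deep network and SHAP analysis used in the later phases.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the EHR data: only features 0 and 3 drive the label.
n, d = 200, 10
X = rng.standard_normal((n, d))
y = (X[:, 0] + X[:, 3] > 0).astype(int)

def fitness(mask):
    """Score a feature subset by the accuracy of a least-squares fit on the
    selected columns (proxy for the deep model of the predictive phase)."""
    if not mask.any():
        return 0.0
    Xs = X[:, mask]
    w, *_ = np.linalg.lstsq(Xs, y - 0.5, rcond=None)
    return float(np.mean(((Xs @ w) > 0) == (y == 1)))

# Minimal genetic algorithm: truncation selection, uniform crossover,
# bit-flip mutation over boolean feature masks.
pop = rng.random((30, d)) < 0.5
for _ in range(40):
    scores = np.array([fitness(m) for m in pop])
    parents = pop[np.argsort(scores)[-10:]]          # keep the fittest
    kids = []
    for _ in range(len(pop)):
        a, b = parents[rng.integers(10, size=2)]
        child = np.where(rng.random(d) < 0.5, a, b)  # uniform crossover
        child ^= rng.random(d) < 0.05                # bit-flip mutation
        kids.append(child)
    pop = np.array(kids)

best_mask = max(pop, key=fitness)
```

In the paper's pipeline, the surviving feature subset would then feed a deep artificial neural network, with SHAP values computed afterward to explain each feature's contribution.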