SNIP: Bridging Mathematical Symbolic and Numeric Realms with Unified Pre-training
In an era where symbolic mathematical equations are indispensable for
modeling complex natural phenomena, scientific inquiry often involves
collecting observations and translating them into mathematical expressions.
Recently, deep learning has emerged as a powerful tool for extracting insights
from data. However, existing models typically specialize in either numeric or
symbolic domains, and are usually trained in a supervised manner tailored to
specific tasks. This approach neglects the substantial benefits that could
arise from a task-agnostic unified understanding between symbolic equations and
their numeric counterparts. To bridge this gap, we introduce SNIP, a
Symbolic-Numeric Integrated Pre-training framework, which employs joint contrastive
learning between symbolic and numeric domains, enhancing their mutual
similarities in the pre-trained embeddings. By performing latent space
analysis, we observe that SNIP provides cross-domain insights into the
representations, revealing that symbolic supervision enhances the embeddings of
numeric data and vice versa. We evaluate SNIP across diverse tasks, including
symbolic-to-numeric mathematical property prediction and numeric-to-symbolic
equation discovery, commonly known as symbolic regression. Results show that
SNIP effectively transfers to various tasks, consistently outperforming fully
supervised baselines and competing strongly with established task-specific
methods, especially in few-shot learning scenarios where available data is
limited.
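The joint contrastive objective is described only at a high level above. As a rough illustration, a symmetric InfoNCE-style loss over matched symbolic/numeric embedding pairs could look as follows; the cosine-similarity logits, the temperature value, and the function names are assumptions made for this sketch, not SNIP's exact formulation:

```python
import numpy as np

def symmetric_contrastive_loss(sym_emb, num_emb, temperature=0.1):
    """InfoNCE-style loss pulling matched symbolic/numeric pairs together.

    sym_emb, num_emb: (batch, dim) arrays; row i of each is a matched pair.
    """
    # L2-normalise so dot products are cosine similarities
    s = sym_emb / np.linalg.norm(sym_emb, axis=1, keepdims=True)
    n = num_emb / np.linalg.norm(num_emb, axis=1, keepdims=True)
    logits = s @ n.T / temperature  # (batch, batch) similarity matrix

    def xent(l):
        # cross-entropy with the diagonal (matched pair) as the positive class
        l = l - l.max(axis=1, keepdims=True)  # numerical stability
        logp = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -np.mean(np.diag(logp))

    # symmetrise: symbolic-to-numeric and numeric-to-symbolic directions
    return 0.5 * (xent(logits) + xent(logits.T))
```

Minimising this loss drives each symbolic embedding toward its paired numeric embedding and away from the other rows in the batch, which is the "mutual similarity" enhancement the abstract describes.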
The computational asymptotics of Gaussian variational inference
Variational inference is a popular alternative to Markov chain Monte Carlo
methods that constructs a Bayesian posterior approximation by minimizing a
discrepancy to the true posterior within a pre-specified family. This converts
Bayesian inference into an optimization problem, enabling the use of simple and
scalable stochastic optimization algorithms. However, a key limitation of
variational inference is that the optimal approximation is typically not
tractable to compute; even in simple settings the problem is nonconvex. Thus,
recently developed statistical guarantees -- which all involve the (data)
asymptotic properties of the optimal variational distribution -- are not
reliably obtained in practice. In this work, we provide two major
contributions: a theoretical analysis of the asymptotic convexity properties of
variational inference in the popular setting with a Gaussian family; and
consistent stochastic variational inference (CSVI), an algorithm that exploits
these properties to find the optimal approximation in the asymptotic regime.
CSVI consists of a tractable initialization procedure that finds the local
basin of the optimal solution, and a scaled gradient descent algorithm that
stays locally confined to that basin. Experiments on nonconvex synthetic and
real-data examples show that compared with standard stochastic gradient
descent, CSVI improves the likelihood of obtaining the globally optimal
posterior approximation.
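The setting of the abstract, stochastic optimization over a Gaussian variational family, can be illustrated with a plain reparameterised-gradient loop. The sketch below is generic one-dimensional stochastic variational inference; CSVI's actual contributions, the smoothed initialisation and the scaled gradient steps, are omitted, and all names and hyperparameters are illustrative:

```python
import numpy as np

def gaussian_vi(logp_grad, steps=3000, lr=0.02, n_samples=32, seed=0):
    """Stochastic gradient ascent on the ELBO for q = N(mu, sigma^2).

    logp_grad: gradient of the (unnormalised) target log density.
    Returns the fitted (mu, sigma).
    """
    rng = np.random.default_rng(seed)
    mu, rho = 0.0, 0.0  # parameterise sigma = exp(rho) to keep it positive
    for _ in range(steps):
        sigma = np.exp(rho)
        eps = rng.standard_normal(n_samples)
        x = mu + sigma * eps          # reparameterisation trick
        g = logp_grad(x)              # d log p / dx at the samples
        mu += lr * g.mean()
        # the Gaussian entropy term contributes +1 to the rho-gradient
        rho += lr * (np.mean(g * sigma * eps) + 1.0)
    return mu, np.exp(rho)
```

For a Gaussian target such as `logp_grad = lambda x: -(x - 3.0)`, the loop recovers roughly `mu = 3, sigma = 1`; on multimodal targets this basic scheme can stall in a poor local optimum, which is exactly the failure mode CSVI's initialisation is designed to avoid.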
Multi-Period Trading via Convex Optimization
We consider a basic model of multi-period trading, which can be used to
evaluate the performance of a trading strategy. We describe a framework for
single-period optimization, where the trades in each period are found by
solving a convex optimization problem that trades off expected return, risk,
transaction cost and holding cost such as the borrowing cost for shorting
assets. We then describe a multi-period version of the trading method, where
optimization is used to plan a sequence of trades, with only the first one
executed, using estimates of future quantities that are unknown when the trades
are chosen. The single-period method traces back to Markowitz; the multi-period
methods trace back to model predictive control. Our contribution is to describe
the single-period and multi-period methods in one simple framework, giving a
clear description of the development and the approximations made. In this paper
we do not address a critical component in a trading algorithm, the predictions
or forecasts of future quantities. The methods we describe in this paper can be
thought of as good ways to exploit predictions, no matter how they are made. We
have also developed a companion open-source software library that implements
many of the ideas and methods described in the paper.
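Dropping the constraints, a single-period trade that balances expected return, risk, and a quadratic transaction cost admits a closed form, which makes the trade-off concrete. This is a minimal unconstrained sketch under assumed parameter names (`gamma`, `lam`); the framework in the paper and its companion library handle general convex constraints and holding costs:

```python
import numpy as np

def single_period_trade(mu, Sigma, w_prev, gamma=1.0, lam=0.1):
    """Maximise  mu @ w - gamma * w @ Sigma @ w - lam * ||w - w_prev||^2.

    mu: expected returns, Sigma: return covariance, w_prev: current weights.
    Setting the gradient to zero gives a linear system for the new weights.
    """
    n = len(mu)
    A = 2.0 * gamma * Sigma + 2.0 * lam * np.eye(n)
    b = mu + 2.0 * lam * w_prev
    w = np.linalg.solve(A, b)
    return w, w - w_prev  # new weights and the trade vector
```

Raising `lam` penalises turnover and keeps the trade close to the current portfolio; raising `gamma` shrinks the solution toward low-variance holdings.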
Modeling, optimization, and sensitivity analysis of a continuous multi-segment crystallizer for production of active pharmaceutical ingredients
We have investigated the simulation-based, steady-state optimization of a new type of crystallizer for the production of pharmaceuticals. The multi-segment, multi-addition plug-flow crystallizer (MSMA-PFC) offers better control over supersaturation in one dimension compared to a batch or stirred-tank crystallizer. Using a population balance framework, we have written the governing population balance and mass balance equations on the crystallizer segments. These equations were solved with either the method of moments or the finite volume method. The goal was to optimize the performance of the crystallizer with respect to certain quantities, such as maximizing the mean crystal size, minimizing the coefficient of variation, or minimizing the sum of squared errors when attempting to hit a target distribution. Such optimizations are all highly nonconvex, necessitating the use of a genetic algorithm. Our results for the optimization of a process for crystallizing flufenamic acid showed improvement in crystal size over prior literature results. Through the use of a novel simultaneous design and control (SDC) methodology, we have further optimized the flowrates and crystallizer geometry in tandem.
We have further investigated the robustness of this process and observe significant sensitivity to error in antisolvent flowrate, as well as in the kinetic parameters of crystallization. We have lastly performed a parametric study on the use of the MSMA-PFC for in-situ dissolution of fine crystals back into solution. Fine crystals are a known processing difficulty in drug manufacture, motivating the development of a process that can eliminate them efficiently. Prior results for cooling crystallization indicated this to be possible.
However, our results show that little to no dissolution occurs after optimizing the crystallizer, indicating that the negative impact of adding pure solvent to the process (reduced concentration via dilution and decreased residence time) outweighs the positive benefit of dissolving fines. The prior results for cooling crystallization did not possess this coupling between flowrate, residence time, and concentration, making fines dissolution significantly more beneficial for that process. We conclude that the success observed in hitting the target distribution has more to do with using multiple segments and having finer control over supersaturation than with the ability to go below solubility. Our results also showed that nucleation still overwhelms the MSMA-PFC for in-situ fines dissolution when nucleation rates are too high.
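The method of moments mentioned above reduces the population balance equation to a small set of ODEs for the leading moments of the crystal size distribution. A minimal sketch, assuming size-independent growth rate G and constant nucleation rate B (illustrative kinetics, not the flufenamic acid model used in the study):

```python
import numpy as np

def pbe_moments(B, G, mu_init, t_end, dt=1e-3):
    """Euler integration of the first four moment equations of a PBE.

    mu[j] is the j-th moment of the crystal size distribution:
        d mu0/dt = B            (nucleation creates particles at size ~ 0)
        d mu_j/dt = j * G * mu_{j-1}   for j >= 1 (growth)
    """
    mu = np.array(mu_init, dtype=float)  # [mu0, mu1, mu2, mu3]
    t = 0.0
    while t < t_end:
        dmu = np.array([B, G * mu[0], 2.0 * G * mu[1], 3.0 * G * mu[2]])
        mu += dt * dmu
        t += dt
    return mu
```

Quantities like the number-mean size (`mu[1] / mu[0]`) and the coefficient of variation follow directly from these moments, which is what makes moment-based objectives convenient for the optimizations described above.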
DOLPHIn - Dictionary Learning for Phase Retrieval
We propose a new algorithm to learn a dictionary for reconstructing and
sparsely encoding signals from measurements without phase. Specifically, we
consider the task of estimating a two-dimensional image from squared-magnitude
measurements of a complex-valued linear transformation of the original image.
Several recent phase retrieval algorithms exploit underlying sparsity of the
unknown signal in order to improve recovery performance. In this work, we
consider such a sparse signal prior in the context of phase retrieval, when the
sparsifying dictionary is not known in advance. Our algorithm jointly
reconstructs the unknown signal - possibly corrupted by noise - and learns a
dictionary such that each patch of the estimated image can be sparsely
represented. Numerical experiments demonstrate that our approach can obtain
significantly better reconstructions for phase retrieval problems with noise
than methods that cannot exploit such "hidden" sparsity. Moreover, on the
theoretical side, we provide a convergence result for our method.
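The squared-magnitude measurement model underlying this problem can be made concrete with the gradient of the standard intensity data-fit term. This is a generic Wirtinger-flow-style sketch of the measurement model only; DOLPHIn's joint dictionary and sparse-patch updates are layered on top of such a term:

```python
import numpy as np

def intensity_loss_grad(A, y, x):
    """Gradient of 0.25 * sum((|A x|^2 - y)^2) with respect to x.

    A: complex measurement matrix, y: observed squared magnitudes,
    x: current signal estimate (complex vector).
    """
    Ax = A @ x
    r = np.abs(Ax) ** 2 - y          # intensity residuals
    return A.conj().T @ (r * Ax)     # vanishes when |A x|^2 matches y
```

At any signal consistent with the measurements the residuals, and hence the gradient, are exactly zero; the difficulty of phase retrieval is that this loss is nonconvex, which is where sparsity priors such as a learned dictionary help.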
Approximations of Semicontinuous Functions with Applications to Stochastic Optimization and Statistical Estimation
Upper semicontinuous (usc) functions arise in the analysis of maximization
problems, distributionally robust optimization, and function identification,
which includes many problems of nonparametric statistics. We establish that
every usc function is the limit of a hypo-converging sequence of piecewise
affine functions of the difference-of-max type and illustrate resulting
algorithmic possibilities in the context of approximate solution of
infinite-dimensional optimization problems. In an effort to quantify the ease
with which classes of usc functions can be approximated by finite collections,
we provide upper and lower bounds on covering numbers for bounded sets of usc
functions under the Attouch-Wets distance. The result is applied in the context
of stochastic optimization problems defined over spaces of usc functions. We
establish confidence regions for optimal solutions based on sample average
approximations and examine the accompanying rates of convergence. Examples from
nonparametric statistics illustrate the results.
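The "difference-of-max type" approximants mentioned above can be written out explicitly; a sketch, with illustrative index sets and coefficients:

\[
f_k(x) \;=\; \max_{i=1,\dots,m_k}\bigl(\langle a_i, x\rangle + \alpha_i\bigr)
\;-\; \max_{j=1,\dots,n_k}\bigl(\langle b_j, x\rangle + \beta_j\bigr),
\]

and hypo-convergence \(f_k \to f\) means that for every sequence \(x_k \to x\) one has \(\limsup_k f_k(x_k) \le f(x)\), while for each \(x\) some sequence \(x_k \to x\) achieves \(\liminf_k f_k(x_k) \ge f(x)\). This is the mode of convergence that preserves maximizers, which is why it is the natural notion for the maximization problems discussed above.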
Multi-objective optimisation: algorithms and application to computer-aided molecular and process design
Computer-Aided Molecular Design (CAMD) has been put forward as a powerful and systematic technique that can accelerate the identification of new candidate molecules. Given the benefits of CAMD, the concept has been extended to integrated molecular and process design, usually referred to as Computer-Aided Molecular and Process Design (CAMPD). In CAMPD approaches, not only is the interdependence between the properties of the molecules and the process performance captured, but it is also possible to assess the optimal overall performance of a given fluid using an objective function that may be based on process economics, energy efficiency, or environmental criteria. Despite the significant advances made in the field of CAM(P)D, there are remaining challenges in handling the complexities arising from the large mixed-integer nonlinear structure-property and process models and the presence of conflicting performance criteria that cannot be easily merged into a single metric. Many of the algorithms proposed to date, however, resort to single-objective decomposition-based approaches.
To overcome these challenges, a novel CAMPD optimisation framework is proposed in the first part of the thesis, in the context of identifying optimal amine solvents for carbon dioxide (CO2) chemical absorption. This requires the development and validation of a model that enables the prediction of process performance metrics for a
wide range of solvents for which no experimental data exist. An equilibrium-stage model that incorporates the SAFT-γ Mie group contribution approach is proposed to provide an appropriate balance between accuracy and predictive capability with varying molecular design spaces. In order to facilitate the convergence behaviour of the process-molecular model, a tailored initialisation strategy is established based on the inside-out algorithm. Novel feasibility tests that are capable of recognising infeasible regions of molecular and process domains are developed and incorporated into an outer-approximation framework to increase solution robustness. The efficiency of the proposed algorithm is demonstrated by applying it to the design of CO2 chemical absorption processes. The algorithm is found to converge successfully in all 150 runs carried out.
To derive greater insights into the interplay between solvent and process performance, it is desirable to consider multiple objectives. In the second part of the thesis, we thus explore the relative performance of five multi-objective optimisation (MOO) solution techniques, modified from the literature to address nonconvex MINLPs, on CAM(P)D problems to gain a better understanding of the performance of different algorithms in identifying the Pareto front efficiently. The combination of the sandwich algorithm with a multi-level single-linkage algorithm to solve nonconvex subproblems is found to perform best on average. Next, a robust algorithm for bi-objective optimisation (BOO), the SDNBI algorithm, is designed to address the theoretical and numerical challenges associated with the solution of general nonconvex and discrete BOO problems. The main improvements in the development of the algorithm are focused on the effective exploration of the nonconvex regions of the Pareto front and the early identification of regions where no additional Pareto solutions exist. The performance of the algorithm is compared to that of the sandwich algorithm and the modified normal boundary intersection method (mNBI) over a set of literature benchmark problems and molecular design problems. The SDNBI algorithm is found to provide the most evenly distributed approximation of the Pareto front as well as useful information on regions of the objective space that
do not contain a nondominated point. The advances in this thesis can accelerate the discovery of novel solvents for CO2 capture that can achieve improved process performance. More broadly, the modelling and algorithmic developments presented extend the applicability of CAMPD and MOO-based CAMD/CAMPD to a wider range of applications.
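The central object in all of these comparisons is the Pareto front, the set of nondominated objective vectors. A generic nondominance filter (a utility sketch, not the SDNBI or sandwich algorithms described above) makes the concept concrete for minimisation problems:

```python
import numpy as np

def pareto_filter(points):
    """Return the nondominated subset of a set of objective vectors.

    A point p is dominated if some other point q is no worse in every
    objective and strictly better in at least one (minimisation).
    """
    pts = np.asarray(points, dtype=float)
    keep = []
    for i, p in enumerate(pts):
        dominated = any(
            np.all(q <= p) and np.any(q < p)
            for j, q in enumerate(pts) if j != i
        )
        if not dominated:
            keep.append(i)
    return pts[keep]
```

Algorithms like SDNBI aim to generate evaluation points whose images survive this filter and are evenly spread along the front, rather than clustering in easy convex regions.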