Multi-Objective Optimization via Equivariant Deep Hypervolume Approximation
Optimizing multiple competing objectives is a common problem across science and industry. The inherent trade-off between these objectives leads to the task of exploring their Pareto front. A meaningful quantity for this purpose is the hypervolume indicator, which is used in Bayesian Optimization (BO) and Evolutionary Algorithms (EAs). However, the computational complexity of calculating the hypervolume scales unfavorably with an increasing number of objectives and data points, which restricts its use in these common multi-objective optimization frameworks. To overcome these restrictions, we propose to approximate the hypervolume function with a deep neural network, which we call DeepHV. For better sample efficiency and generalization, we exploit the fact that the hypervolume is scale-equivariant in each of the objectives and permutation-invariant with respect to both the objectives and the samples, by using a deep neural network that is equivariant with respect to the combined group of scalings and permutations. We evaluate our method against exact and approximate hypervolume methods in terms of accuracy, computation time, and generalization. We also apply and compare our methods to state-of-the-art multi-objective BO methods and EAs on a range of synthetic benchmark test cases. The results show that our methods are promising for such multi-objective optimization tasks.
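As a concrete illustration of the symmetries the abstract refers to, the following sketch (not the authors' DeepHV code) computes the exact hypervolume for two maximized objectives with a simple sweep and checks scale-equivariance and permutation invariance numerically.

```python
import numpy as np

def hypervolume_2d(points, ref):
    """Exact hypervolume for two maximized objectives: the area dominated
    by `points` and bounded from below by the reference point `ref`."""
    pts = np.asarray(points, dtype=float)
    pts = pts[(pts > ref).all(axis=1)]   # discard points that do not dominate ref
    if pts.size == 0:
        return 0.0
    pts = pts[np.argsort(-pts[:, 0])]    # sweep from largest first objective down
    hv, y_best = 0.0, ref[1]
    for x, y in pts:
        if y > y_best:                   # each such point adds a new strip
            hv += (x - ref[0]) * (y - y_best)
            y_best = y
    return hv

pts = np.array([[3.0, 1.0], [2.0, 2.0], [1.0, 3.0]])
ref = np.zeros(2)
hv = hypervolume_2d(pts, ref)  # 6.0

# Permutation invariance w.r.t. the samples and w.r.t. the objectives:
assert np.isclose(hypervolume_2d(pts[::-1], ref), hv)
assert np.isclose(hypervolume_2d(pts[:, ::-1], ref[::-1]), hv)

# Scale equivariance: scaling objective i by c_i scales the hypervolume
# by the product of the c_i.
c = np.array([2.0, 0.5])
assert np.isclose(hypervolume_2d(pts * c, ref * c), hv * c.prod())
```

These are exactly the constraints DeepHV bakes into its architecture instead of learning them from data; the exact sweep above is cheap in two dimensions but becomes the bottleneck the paper targets as the number of objectives grows.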
Closed-loop automatic gradient design for liquid chromatography using Bayesian optimization
Contemporary complex samples require sophisticated methods for full analysis. This work describes the development of a Bayesian optimization algorithm for automated and unsupervised development of gradient programs. The algorithm was tailored to LC using a Gaussian process model with a novel covariance kernel. To facilitate unsupervised learning, the algorithm was designed to interface directly with the chromatographic system. Single-objective and multi-objective Bayesian optimization strategies were investigated for the separation of two complex (n > 18 and n > 80) dye mixtures. Both approaches found satisfactory optima in under 35 measurements. The multi-objective strategy was found to be powerful and flexible in terms of exploring the Pareto front. The performance difference between the single-objective and multi-objective strategies was further investigated using a retention-modeling example. One additional advantage of the multi-objective approach was that it allowed a trade-off to be made between multiple objectives without prior knowledge. In general, the Bayesian optimization strategy was found to be particularly suitable for, but not limited to, cases where retention modeling is not possible, although its scalability may be limited in terms of the number of parameters that can be optimized simultaneously.
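A minimal sketch of one closed-loop iteration scheme of this kind is shown below, assuming a generic Matérn kernel in place of the paper's tailored covariance kernel; `run_gradient_and_score` is a hypothetical stand-in for programming the gradient, running the separation, and scoring the resulting chromatogram.

```python
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

def run_gradient_and_score(x):
    # Hypothetical stand-in: program the gradient described by x, run the
    # separation, and score the chromatogram (e.g. a resolution-based score).
    return -np.sum((x - 0.3) ** 2)  # toy response surface for illustration

rng = np.random.default_rng(0)
dim = 2                                    # e.g. (gradient time, final %B)
X = rng.uniform(size=(5, dim))             # small initial design
y = np.array([run_gradient_and_score(x) for x in X])

gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True)
for _ in range(30):                        # ~35 measurements, as in the paper
    gp.fit(X, y)
    cand = rng.uniform(size=(1000, dim))   # random candidate gradient programs
    mu, sd = gp.predict(cand, return_std=True)
    imp = mu - y.max()
    z = imp / np.maximum(sd, 1e-12)
    ei = imp * norm.cdf(z) + sd * norm.pdf(z)   # expected improvement
    x_next = cand[np.argmax(ei)]
    X = np.vstack([X, x_next])
    y = np.append(y, run_gradient_and_score(x_next))
```

In the multi-objective variant described in the abstract, the scalar score would be replaced by a vector of objectives and the acquisition function by a Pareto-aware criterion such as expected hypervolume improvement.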
Predicting RP-LC retention indices of structurally unknown chemicals from mass spectrometry data
Non-target analysis combined with liquid chromatography–high-resolution mass spectrometry is considered one of the most comprehensive strategies for the detection and identification of known and unknown chemicals in complex samples. However, many compounds remain unidentified due to data complexity and the limited number of structures in chemical databases. In this work, we developed and validated a novel machine learning algorithm to predict retention index (RI) values for structurally (un)known chemicals based on their measured fragmentation pattern. The developed model, for the first time, enabled the prediction of RI values without the need for the exact structure of the chemicals, with an R2 of 0.91 and 0.77 and a root mean squared error (RMSE) of 47 and 67 RI units for the NORMAN (n = 3131) and amide (n = 604) test sets, respectively. This fragment-based model showed accuracy in RI prediction comparable to that of conventional descriptor-based models that rely on a known chemical structure, which obtained an R2 of 0.85 with an RMSE of 67 RI units.
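The abstract does not specify the model, so the sketch below is only a schematic of the fragment-based idea: encode each MS/MS spectrum as a fixed-length binned m/z vector and regress the retention index on it. The unit-m/z binning, the random-forest regressor, and the toy data are all illustrative assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score, mean_squared_error

def spectrum_to_vector(mz, intensity, mz_max=500.0, bin_width=1.0):
    """Bin fragment m/z values into a fixed-length, base-peak-normalized vector."""
    vec = np.zeros(int(mz_max / bin_width))
    idx = np.minimum((np.asarray(mz) / bin_width).astype(int), len(vec) - 1)
    np.add.at(vec, idx, intensity)   # accumulate intensity per m/z bin
    return vec / max(vec.max(), 1e-12)

# Toy stand-in data: random spectra with an RI that depends on the spectrum.
rng = np.random.default_rng(1)
spectra = [(rng.uniform(50, 500, 20), rng.uniform(0, 1, 20)) for _ in range(200)]
X = np.array([spectrum_to_vector(mz, it) for mz, it in spectra])
y = X @ rng.uniform(0, 10, X.shape[1]) + rng.normal(0, 5, len(X))

model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(X[:150], y[:150])
pred = model.predict(X[150:])
print(f"R2 = {r2_score(y[150:], pred):.2f}, "
      f"RMSE = {mean_squared_error(y[150:], pred) ** 0.5:.1f} RI units")
```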
Computer-driven optimization of complex gradients in comprehensive two-dimensional liquid chromatography
Method development in comprehensive two-dimensional liquid chromatography (LC × LC) is a complicated endeavor. The dependency between the two dimensions and the possibility of incorporating complex gradient profiles, such as multi-segmented or shifting gradients, renders method development by “trial-and-error” time-consuming and highly dependent on user experience. In this work, an open-source algorithm for the automated and interpretive method development of complex gradients in LC × LC-mass spectrometry (MS) was developed. A workflow was designed to operate within a closed loop that allowed direct interaction between the LC × LC-MS system and a data-processing computer, running in an unsupervised and automated fashion. Obtaining accurate retention models in LC × LC is difficult due to the challenges associated with the exact determination of retention times, with curve fitting under gradient elution, and with gradient deformation. Thus, retention models were compared in terms of the repeatability of their determination. Additionally, the design of shifting gradients in the second dimension and the prediction of peak widths were investigated. The algorithm was tested on separations of a tryptic digest of a monoclonal antibody, using an objective function that included the sum of resolutions and the analysis time as quality descriptors. The algorithm was able to improve the separation relative to a generic starting method using these complex gradient profiles after only four method-development iterations (i.e., sets of chromatographic conditions). Further iterations improved retention-time and peak-width predictions and thus the accuracy of the separations predicted by the algorithm.
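The abstract describes the objective only at a high level (sum of resolutions plus analysis time), so the sketch below shows one common way such an objective can be assembled for a single chromatographic dimension; the resolution cap and the time weight are assumptions, not the paper's settings.

```python
import numpy as np

def resolution(t1, w1, t2, w2):
    """Chromatographic resolution between two peaks, given retention
    times t and 4-sigma base widths w."""
    return 2.0 * abs(t2 - t1) / (w1 + w2)

def objective(ret_times, widths, time_weight=0.1, rs_cap=1.5):
    """Score a separation: reward resolved neighboring peaks, penalize long
    runs. Rs is capped so fully resolved pairs do not dominate the sum."""
    order = np.argsort(ret_times)
    t = np.asarray(ret_times, dtype=float)[order]
    w = np.asarray(widths, dtype=float)[order]
    rs = [min(resolution(t[i], w[i], t[i + 1], w[i + 1]), rs_cap)
          for i in range(len(t) - 1)]
    return sum(rs) - time_weight * t[-1]  # last retention time ~ analysis time

# Example: three peaks with 0.2-min base widths eluting over five minutes.
print(objective([1.0, 2.5, 5.0], [0.2, 0.2, 0.2]))  # 1.5 + 1.5 - 0.5 = 2.5
```

Capping the per-pair resolution is a common design choice in chromatographic objective functions: once a pair is baseline-separated, further separating it should not outweigh improving poorly resolved pairs or shortening the run.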
Ruthenium polypyridyl complexes and their modes of interaction with DNA: is there a correlation between these interactions and the antitumor activity of the compounds?
Various interaction modes between a group of six ruthenium polypyridyl complexes and DNA have been studied using a number of spectroscopic techniques. Five mononuclear species were selected with the formula [Ru(tpy)L1L2](2−n)+, together with one closely related dinuclear cation of formula [{Ru(apy)(tpy)}2{μ-H2N(CH2)6NH2}]4+. The ligand tpy is 2,2′:6′,2″-terpyridine and the ligand L1 is a bidentate ligand, namely apy (2,2′-azobispyridine), 2-phenylazopyridine, or 2-phenylpyridinylmethylene amine. The ligand L2 is a labile monodentate ligand, being Cl−, H2O, or CH3CN. All five species containing a labile L2 were found to be able to coordinate to the DNA model base 9-ethylguanine by 1H NMR and mass spectrometry. The dinuclear cationic species, which has no positions available for coordination to a DNA base, was studied for comparison purposes. The interactions between a selection of four representative complexes and calf-thymus DNA were studied by circular and linear dichroism. To explore a possible relation between DNA-binding ability and toxicity, all compounds were screened for anticancer activity in a variety of cancer cell lines, in some cases showing an activity comparable to that of cisplatin. Comparison of the details of the compound structures, their DNA binding, and their toxicity allows the exploration of structure–activity relationships that might be used to guide optimization of the activity of agents of this class of compounds.