11 research outputs found

Cluster Exploration using Informative Manifold Projections

Author: Evangelopoulos Xenophon
Gerolymatos Stavros
Goulermas John Y.
Gusev Vladimir
Publication date: 26/09/2023

Dimensionality reduction (DR) is one of the key tools for the visual exploration of high-dimensional data and uncovering its cluster structure in two- or three-dimensional spaces. The vast majority of DR methods in the literature do not take into account any prior knowledge a practitioner may have regarding the dataset under consideration. We propose a novel method to generate informative embeddings which not only factor out the structure associated with different kinds of prior knowledge but also aim to reveal any remaining underlying structure. To achieve this, we employ a linear combination of two objectives: firstly, contrastive PCA that discounts the structure associated with the prior information, and secondly, kurtosis projection pursuit which ensures meaningful data separation in the obtained embeddings. We formulate this task as a manifold optimization problem and validate it empirically across a variety of datasets considering three distinct types of prior knowledge. Lastly, we provide an automated framework to perform iterative visual exploration of high-dimensional data.

arXiv.org e-Print Archive

Circular Object Arrangement using Spherical Embeddings

Author: Brockmeier Austin J.
Evangelopoulos Xenophon
Goulermas John Y.
Mu Tingting
Publication venue: Elsevier BV
Publication date: 01/01/2020

University of Liverpool Repository

The University of Manchester - Institutional Repository

HypBO: Expert-Guided Chemist-in-the-Loop Bayesian Search for New Materials

Author: Carruthers Sam
Cisse Abdoulatif
Cooper Andrew I.
Evangelopoulos Xenophon
Gusev Vladimir V.
Publication date: 24/08/2023

Robotics and automation offer massive accelerations for solving intractable, multivariate scientific problems such as materials discovery, but the available search spaces can be dauntingly large. Bayesian optimization (BO) has emerged as a popular sample-efficient optimization engine, thriving in tasks where no analytic form of the target function/property is known. Here we exploit expert human knowledge in the form of hypotheses to direct Bayesian searches more quickly to promising regions of chemical space. Previous methods have used underlying distributions derived from existing experimental measurements, which is infeasible for new, unexplored scientific tasks. Also, such distributions cannot capture intricate hypotheses. Our proposed method, which we call HypBO, uses expert human hypotheses to generate an improved seed of samples. Unpromising seeds are automatically discounted, while promising seeds are used to augment the surrogate model data, thus achieving better-informed sampling. This process continues in a global versus local search fashion, organized in a bilevel optimization framework. We validate the performance of our method on a range of synthetic functions and demonstrate its practical utility on a real chemical design task where the use of expert hypotheses accelerates the search performance significantly.

arXiv.org e-Print Archive

Domain Knowledge Injection in Bayesian Search for New Materials.

Author: Cooper Andrew I
Evangelopoulos Xenophon
Thacker Joseph CR
Xie Zikai
Publication date: 01/01/2023

University of Liverpool Repository

Cluster Exploration using Informative Manifold Projections.

Author: Evangelopoulos Xenophon
Gerolymatos Stavros
Goulermas John Yannis
Gusev Vladimir V
Publication date: 01/01/2023

University of Liverpool Repository

Continuation Methods for Approximate Large Scale Object Sequencing

Author: Brockmeier Austin
Evangelopoulos Xenophon
Goulermas John
Mu Tingting
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2018

We propose a set of highly scalable algorithms for the combinatorial data analysis problem of seriating similarity matrices. Seriation consists of finding a permutation of data instances, such that similar instances are nearby in the ordering. Applications of the seriation problem can be found in various disciplines such as in bioinformatics for genome sequencing, data visualization and exploratory data analysis. Our algorithms attempt to minimize certain p-SUM objectives, which also arise in the problem of envelope reduction of sparse matrices. In particular, we present a set of graduated non-convexity algorithms for vector-based relaxations of the general p-SUM problem for p ∈ {2,1,½} that can scale to very large problem sizes. Different choices of p emphasize global versus local similarity pattern structure. We conduct a number of experiments to compare our algorithms to various state-of-the-art combinatorial optimization methods on real and synthetic datasets. The experimental results demonstrate that compared to other approaches, the proposed algorithms are very competitive and scale well with large problem sizes.

University of Liverpool Repository

The University of Manchester - Institutional Repository

Fine-tuning GPT-3 for machine learning electronic and functional properties of organic molecules

Author: Chen Linjiang
Cooper Andrew I.
Evangelopoulos Xenophon
Omar Ömer
Troisi Alessandro
Xie Zikai
Publication venue: ChemRxiv
Publication date: 23/08/2023

We evaluate the effectiveness of fine-tuning GPT-3 for the prediction of electronic and functional properties of organic molecules. Our findings show that fine-tuned GPT-3 can successfully identify and distinguish between chemically meaningful patterns, and discern subtle differences among them, exhibiting robust predictive performance for the prediction of molecular properties. We focus on assessing the fine-tuned models' resilience to information loss, resulting from the absence of atoms or chemical groups, and to noise that we introduce via random alterations in atomic identities. We discuss the challenges and limitations inherent to the use of GPT-3 in molecular machine-learning tasks and suggest potential directions for future research and improvements to address these issues.

University of Birmingham Research Portal

Domain Knowledge Injection in Bayesian Search for New Materials

Author: Cooper Andrew I
Evangelopoulos Xenophon
Thacker Joseph CR
Xie Zikai
Publication venue: IOS Press
Publication date: 28/09/2023

In this paper we propose DKIBO, a Bayesian optimization (BO) algorithm that accommodates domain knowledge to tune exploration in the search space. Bayesian optimization has recently emerged as a sample-efficient optimizer for many intractable scientific problems. While various existing BO frameworks allow the input of prior beliefs to accelerate the search by narrowing down the space, incorporating such knowledge is not always straightforward and can often introduce bias and lead to poor performance. Here we propose a simple approach to incorporate structural knowledge in the acquisition function by utilizing an additional deterministic surrogate model to enrich the approximation power of the Gaussian process. This is suitably chosen according to structural information of the problem at hand and acts as a corrective term towards a better-informed sampling. We empirically demonstrate the practical utility of the proposed method by successfully injecting domain knowledge in a materials design task. We further validate our method’s performance on different experimental settings and ablation analyses.

University of Liverpool Repository

Fine-tuning GPT-3 for machine learning electronic and functional properties of organic molecules

Author: Alessandro Troisi
Andrew I. Cooper
Linjiang Chen
Xenophon Evangelopoulos
Zikai Xie
Ömer Omar
Publication date: 23/08/2023

We evaluate the effectiveness of fine-tuning GPT-3 for the prediction of electronic and functional properties of organic molecules. Our findings show that fine-tuned GPT-3 can successfully identify and distinguish between chemically meaningful patterns, and discern subtle differences among them, exhibiting robust predictive performance for the prediction of molecular properties. We focus on assessing the fine-tuned models' resilience to information loss, resulting from the absence of atoms or chemical groups, and to noise that we introduce via random alterations in atomic identities. We discuss the challenges and limitations inherent to the use of GPT-3 in molecular machine-learning tasks and suggest potential directions for future research and improvements to address these issues.

ChemRxiv