Machine Learning in Discrete Molecular Spaces

Author: Mater Adam
Publication date: 01/01/2022

The past decade has seen an explosion of machine learning in chemistry. Whether it is in property prediction, synthesis, molecular design, or any other subdivision, machine learning seems poised to become an integral, if not a dominant, component of future research efforts. This extraordinary capacity rests on the interaction between machine learning models and the underlying chemical data landscape commonly referred to as chemical space. Chemical space has multiple incarnations, but is generally considered the space of all possible molecules. In this sense, it is one example of a molecular set: an arbitrary collection of molecules. This thesis is devoted to precisely these objects, and particularly how they interact with machine learning models. This work is predicated on the idea that by better understanding the relationship between molecular sets and the models trained on them we can improve models, achieve greater interpretability, and further break down the walls between data-driven and human-centric chemistry. The hope is that this enables the full predictive power of machine learning to be leveraged while continuing to build our understanding of chemistry. The first three chapters of this thesis introduce and review the necessary machine learning theory, particularly the tools that have been specially designed for chemical problems. This is followed by an extensive literature review in which the contributions of machine learning to multiple facets of chemistry over the last two decades are explored. Chapters 4-7 explore the research conducted throughout this PhD.
Here we explore how we can meaningfully describe the properties of an arbitrary set of molecules through information theory; how we can determine the most informative data points in a set of molecules; how graph signal processing can be used to understand the relationship between the chosen molecular representation, the property, and the machine learning model; and finally how this approach can be brought to bear on protein space. Each of these sub-projects briefly explores the necessary mathematical theory before leveraging it to provide approaches that resolve the posed problems. We conclude with a summary of the contributions of this work and outline fruitful avenues for further exploration.

The Australian National University

Black-, grey-, and white-box side-channel programming for software integrity checking

Author: Liu Hong
Publication venue: Kansas State University

Doctor of Philosophy. Department of Computing and Information Sciences. Eugene Vasserman.
Checking software integrity is a fundamental problem of system security. Many approaches have been proposed to ensure that a device runs the original code. Software-based methods such as hypervisors, separation kernels, and control flow integrity checking often rely on processors to provide some form of separation such as operation modes and memory protection. Hardware-based methods such as remote attestation, secure boot, and watchdog coprocessors rely on trusted hardware to execute attestation code such as verifying memory content and examining signatures appearing on buses. However, many embedded systems do not possess such sophisticated capabilities due to prohibitive hardware costs, unacceptably high power consumption, or the inability to update fielded components. Further, security assumptions may become invalid as time goes by. For Systems-on-Chip (SoCs), in particular, internal activities cannot be observed directly, while in non-SoCs, sniffing bus traffic between constituent components may suffice for integrity checking. A promising approach to check software integrity for resource-constrained SoCs is through side-channels. Side-channels have been used mostly for attacks, such as eavesdropping from vibration of glass or plant leaves, fingerprinting machines from traffic patterns, or extracting secret key materials of cryptographic routines using power consumption measurements. In this work, side-channels are used to enhance rather than undercut security. First, we study the relationships between the internal states of a target device and side-channel information. We use the uncovered relationships to monitor the internal state of a running device and determine whether the internal state is an expected one. An unexpected state may be a sign of incorrect execution or malicious activity.
To further explore the possibilities inherent in side-channel-based software integrity checking, we investigate various hardware platforms, representative of different degrees of knowledge of the hardware from the side-channel profiling point of view. In other words, side-channel information is extracted by black-, grey-, and white-box analysis. Each one involves unique challenges requiring different techniques to successfully derive “side-channel profiles”. We can use these profiles to detect unexpected states with extremely high probability, even when an adversary knows that their code may be subject to side-channel analysis, i.e., the methodology is robust to side-channel-aware adversaries. The research includes: (1) Constructing systematic approaches for black- and grey-box profiling of side channels (and comparing them to white-box analysis); (2) Designing custom measurement instrumentation; and (3) Developing techniques for monitoring and enforcing software integrity utilizing side-channel profiles. We introduce the term “side-channel programming” to refer to techniques we design in which developers explicitly utilize side-channel characteristics of existing hardware to optimize run-time software integrity checking, creating executable code which is more conducive to side-channel-based monitoring. Compared with other software integrity checking techniques, our approach has numerous benefits. Among them are that the measurement process is non-invasive, non-interruptive, and backward-compatible in that it does not require any hardware modification, meaning our approach works with processors that do not include security features. Our method can even be used to augment existing protection mechanisms, as it works even when all security mechanisms internal to the device fail.

K-State Research Exchange

Urban Informatics

Publication venue: Springer Nature
Publication date: 06/04/2021

This open access book is the first to systematically introduce the principles of urban informatics and its application to every aspect of the city that involves its functioning, control, management, and future planning. It introduces new models and tools being developed to understand and implement these technologies that enable cities to function more efficiently – to become ‘smart’ and ‘sustainable’. The smart city has quickly emerged as computers have become ever smaller to the point where they can be embedded into the very fabric of the city, as well as being central to new ways in which the population can communicate and act. When cities are wired in this way, they have the potential to become sentient and responsive, generating massive streams of ‘big’ data in real time as well as providing immense opportunities for extracting new forms of urban data through crowdsourcing. This book offers a comprehensive review of the methods that form the core of urban informatics from various kinds of urban remote sensing to new approaches to machine learning and statistical modelling. It provides a detailed technical introduction to the wide array of tools information scientists need to develop the key urban analytics that are fundamental to learning about the smart city, and it outlines ways in which these tools can be used to inform design and policy so that cities can become more efficient with a greater concern for environment and equity.

UCL Discovery

Gaussian Process Regression for Materials and Molecules.

Author: Bartók AP, Bernstein N, Ceriotti M, Csányi G, Deringer VL, Wilkins DM
Publication venue: Chemical reviews
Publication date: 25/08/2021

We provide an introduction to Gaussian process regression (GPR) machine-learning methods in computational materials science and chemistry. The focus of the present review is on the regression of atomistic properties: in particular, on the construction of interatomic potentials, or force fields, in the Gaussian Approximation Potential (GAP) framework; beyond this, we also discuss the fitting of arbitrary scalar, vectorial, and tensorial quantities. Methodological aspects of reference data generation, representation, and regression, as well as the question of how a data-driven model may be validated, are reviewed and critically discussed. A survey of applications to a variety of research questions in chemistry and materials science illustrates the rapid growth in the field. A vision is outlined for the development of the methodology in the years to come.

Infoscience - École polytechnique fédérale de Lausanne

Queen's University Belfast Research Portal

PubMed Central

Warwick Research Archives Portal Repository

Apollo (Cambridge)

CUED - Cambridge University Engineering Department

Urban Informatics

Publication venue: Springer Science and Business Media LLC
Publication date: 20/04/2021

Directory of Open Access Books (DOAB)

Model design for algorithmic efficiency in electromagnetic sensing

Author: Krueger Kyle R.
Publication venue: Georgia Institute of Technology
Publication date: 13/01/2014

The objective of the proposed research is to develop structural changes to the design and application of electromagnetic (EM) sensing models to more efficiently and accurately invert EM measurements to extract parameters for applications such as landmine detection. Two different acquisition modalities are addressed in this research: ground-penetrating radar (GPR) and electromagnetic induction (EMI) sensors. The models needed for practical three-dimensional (3D) spatial imaging typically become impractically large, with up to seven dimensions of parameters that need to be extracted. These parameters include, but are not limited to, target type, 3D location, and 3D orientation. The new special structures for these models exploit properties such as shift invariance and tensor representation, which can be combined with strategic inversion techniques, including the Fast Fourier Transform and semidefinite programming. The structures dramatically reduce the amount of computation and can eliminate the need to store up to five dimensions of parameters while still accurately estimating them. Ph.D.

Scholarly Materials And Research @ Georgia Tech

Urban Informatics

Publication venue: Springer Science and Business Media LLC

OAPEN Library

Proceedings full papers ISG*ISARC2012 : joint conference of the 8th World Conference of the International Society for Gerontechnology (ISG) and the 29th International Symposium on Automation and Robotics in Construction (ISARC), June 26-29, Eindhoven, The Netherlands

Publication venue: Technische Universiteit Eindhoven
Publication date: 01/01/2012

Pure OAI Repository

On the use of context information for an improved application of data-based algorithms in condition monitoring

Author: López de Calle Etxabe Kerman
Publication date: 04/11/2020

xi, 124 p. In the field of condition monitoring, data-based algorithms have a long track record, from quality control charts, which have been in use for almost a century, to more complex techniques such as neural networks and support vector machines, which are used for detection, diagnosis, and estimation of the remaining useful life of equipment. However, putting monitoring algorithms into production requires an exhaustive study of a factor that is often overlooked in other works in the literature: context. Context, which in this work is considered to be the set of factors that influence the monitoring of an asset, has a great impact on monitoring algorithms and their final application. For this reason, it is the object of study of this thesis, in which three use cases have been analyzed. Their respective contexts have been examined in depth, with the aim of generalizing to the problems commonly found in industrial machinery monitoring, and those monitoring problems have been addressed in a way that solves the context rather than each individual use case. Thus, the knowledge acquired while developing the solutions can be transferred to other use cases with similar contexts.

Archivo Digital para la Docencia y la Investigación