2,127 research outputs found

    Biometric Systems

    Because of the accelerating progress in biometrics research and the latest nation-state threats to security, this book's publication is not only timely but also much needed. This volume contains seventeen peer-reviewed chapters reporting the state of the art in biometrics research: security issues, signature verification, fingerprint identification, wrist vascular biometrics, ear detection, face detection and identification (including a new survey of face recognition), person re-identification, electrocardiogram (ECG) recognition, and several multi-modal systems. This book will be a valuable resource for graduate students, engineers, and researchers interested in understanding and investigating this important field of study.

    Exploring Machine Learning for Untargeted Metabolomics Using Molecular Fingerprints

    Background: Metabolomics, the study of substrates and products of cellular metabolism, offers valuable insights into an organism's state under specific conditions and has the potential to revolutionise preventive healthcare and pharmaceutical research. However, analysing large metabolomics datasets remains challenging, with available methods relying on limited and incompletely annotated metabolic pathways. Methods: This study, inspired by well-established methods in drug discovery, employs machine learning on metabolite fingerprints to explore how metabolite structure relates to responses under experimental conditions beyond known pathways, shedding light on metabolic processes. It evaluates how effectively fingerprints represent metabolites, addressing challenges such as class imbalance, data sparsity, high dimensionality, duplicate structural encoding, and feature interpretability. Feature importance analysis is then applied to reveal key chemical configurations affecting classification, identifying related metabolite groups. Results: The approach is tested on two datasets: one on Ataxia Telangiectasia and another on endothelial cells under low oxygen. Machine learning on molecular fingerprints predicts metabolite responses effectively, and feature importance analysis aligns with known metabolic pathways, unveiling new affected metabolite groups for further study. Conclusion: The presented approach leverages the strengths of drug discovery to address critical issues in metabolomics research and aims to bridge the gap between these two disciplines. This work lays the foundation for future research in this direction, possibly exploring alternative structural encodings and machine learning models.

    SE(3)-Invariant Multiparameter Persistent Homology for Chiral-Sensitive Molecular Property Prediction

    In this study, we present a novel computational method for generating molecular fingerprints using multiparameter persistent homology (MPPH). This technique holds considerable significance for drug discovery and materials science, where precise molecular property prediction is vital. By integrating SE(3)-invariance with Vietoris-Rips persistent homology, we effectively capture the three-dimensional representations of molecular chirality. This non-superimposable mirror image property directly influences molecular interactions, serving as an essential factor in molecular property prediction. We explore the underlying topologies and patterns in molecular structures by applying Vietoris-Rips persistent homology across varying scales and parameters such as atomic weight, partial charge, bond type, and chirality. Our method's efficacy can be improved by incorporating additional parameters such as aromaticity, orbital hybridization, bond polarity, conjugated systems, and bond and torsion angles. Additionally, we leverage Stochastic Gradient Langevin Boosting in a Bayesian ensemble of GBDTs to obtain aleatoric and epistemic uncertainty estimates for gradient boosting models. With these uncertainty estimates, we prioritize high-uncertainty samples for active learning and model fine-tuning, benefiting scenarios where data labeling is costly or time-consuming. Compared to conventional GNNs, which usually suffer from oversmoothing and oversquashing, MPPH provides a more comprehensive and interpretable characterization of molecular data topology. We substantiate our approach with theoretical stability guarantees and demonstrate its superior performance over existing state-of-the-art methods in predicting molecular properties through extensive evaluations on the MoleculeNet benchmark datasets. Comment: NeurIPS 2023 AI for Science Workshop

    Network-driven strategies to integrate and exploit biomedical data

    In the quest to understand complex biological systems, the scientific community has been delving into protein, chemical and disease biology, populating biomedical databases with a wealth of data and knowledge. Biomedicine has now entered a Big Data era, in which computationally driven research can benefit greatly from existing knowledge to better understand and characterize biological and chemical entities. Yet the heterogeneity and complexity of biomedical data call for proper integration and representation of this knowledge, so that it can be exploited effectively and efficiently. In this thesis, we aim to develop new strategies to leverage current biomedical knowledge, so that meaningful information can be extracted and fused into downstream applications. To this end, we have capitalized on network analysis algorithms to integrate and exploit biomedical data in a wide variety of scenarios, providing a better understanding of pharmaco-omics experiments while helping accelerate the drug discovery process. More specifically, we have (i) devised an approach to identify functional gene sets associated with drug response mechanisms of action, (ii) created a resource of biomedical descriptors able to anticipate cellular drug response and identify new drug repurposing opportunities, (iii) designed a tool to annotate biomedical support for a given set of experimental observations, and (iv) reviewed different chemical and biological descriptors relevant for drug discovery, illustrating how they can be used to provide solutions to current challenges in biomedicine.

    Knowledge Expansion of a Statistical Machine Translation System using Morphological Resources

    The translation capability of a Phrase-Based Statistical Machine Translation (PBSMT) system depends mostly on parallel data, and phrases that are not present in the training data are not translated correctly. This paper describes a method that efficiently expands the existing knowledge of a PBSMT system without adding more parallel data, using external morphological resources instead. A set of new phrase associations is added to the translation and reordering models; each corresponds to a morphological variation of the source phrase, the target phrase, or both phrases of an existing association. New associations are generated using a string similarity score based on morphosyntactic information. We tested our approach on En-Fr and Fr-En translations, and the results showed improved performance in terms of automatic scores (BLEU and Meteor) and a reduction in out-of-vocabulary (OOV) words. We believe that our knowledge expansion framework is generic and could be used to add different types of information to the model. JRC.G.2-Global security and crisis management

    Computational Approaches to Drug Profiling and Drug-Protein Interactions

    Despite substantial increases in R&D spending within the pharmaceutical industry, de novo drug design has become a time-consuming endeavour. High attrition rates led to a long period of stagnation in drug approvals. Due to the extreme costs associated with introducing a drug to the market, locating and understanding the reasons for clinical failure is key to future productivity. As part of this PhD, three main contributions were made in this respect. First, the web platform LigNFam enables users to interactively explore similarity relationships between ‘drug like’ molecules and the proteins they bind. Second, two deep-learning-based binding site comparison tools were developed that are competitive with the state of the art on benchmark datasets. The models can predict off-target interactions and potential candidates for target-based drug repurposing. Finally, the open-source ScaffoldGraph software was presented for the analysis of hierarchical scaffold relationships and has already been used in multiple projects, including integration into a virtual screening pipeline to increase the tractability of ultra-large screening experiments. Together, and with existing tools, these contributions will aid in the understanding of drug-protein relationships, particularly in the fields of off-target prediction and drug repurposing, helping to design better drugs faster.

    Signal and data processing for machine olfaction and chemical sensing: A review

    Signal and data processing are essential elements in electronic noses as well as in most chemical sensing instruments. The multivariate responses obtained by chemical sensor arrays require signal and data processing to carry out the fundamental tasks of odor identification (classification), concentration estimation (regression), and grouping of similar odors (clustering). In the last decade, important advances have shown that proper processing can improve the robustness of the instruments against diverse perturbations such as environmental variables, background changes, and drift. This article reviews the advances made in recent years in signal and data processing for machine olfaction and chemical sensing.

    Introduction to Graph Polynomials

    Graph polynomials are a fairly new but intricate realm of graph theory. I begin with a brief historical background and then describe each polynomial’s unique characteristics and mathematical underpinnings. Through illustrative examples, the paper shows the practical applications of these graph polynomials and their efficacy in real-world scenarios. My research contributes to the broader understanding of graph polynomials and aims to inspire further research at the intersection of mathematics and technology.

    Robust recognition and exploratory analysis of crystal structures using machine learning

    In materials science, artificial-intelligence tools are driving a paradigm shift towards big-data-centric research. Large computational databases with millions of entries and high-resolution experiments such as electron microscopy contain a large and growing amount of information. To leverage this under-utilized yet very valuable data, automatic analytical methods need to be developed. The classification of the crystal structure of a material is essential for its characterization. The available data is structurally diverse but often defective and incomplete. A suitable method should therefore be robust with respect to sources of inaccuracy while being able to treat multiple systems. Available methods do not fulfill both criteria at the same time. In this work, we introduce ARISE, a Bayesian-deep-learning-based framework that can treat more than 100 structural classes robustly, without any predefined threshold. The selection of structural classes, which can easily be extended on demand, encompasses a wide range of materials, including not only bulk but also two- and one-dimensional systems. For the local study of large, polycrystalline samples, we extend ARISE by introducing strided pattern matching. Although trained on ideal structures only, ARISE correctly characterizes strongly perturbed single- and polycrystalline systems from both synthetic and experimental sources. The probabilistic nature of the Bayesian-deep-learning model yields principled uncertainty estimates, which are found to correlate with the crystalline order of metallic nanoparticles in electron-tomography experiments. Applying unsupervised learning to the internal neural-network representations reveals grain boundaries and non-apparent structural regions that share easily interpretable geometrical properties. This work enables the previously hindered analysis of noisy atomic structural data.