5,663 research outputs found

    UMSL Bulletin 2023-2024

    The 2023-2024 Bulletin and Course Catalog for the University of Missouri-St. Louis.

    Graduate Catalog of Studies, 2023-2024

    Safe passage for attachment systems: Can attachment security at international schools be measured, and is it at risk?

    Relocations challenge attachment networks. Regardless of whether a person moves or is moved away from, relocation produces separation and loss. When such losses are repeatedly experienced without being adequately processed, a defensive shutting down of the attachment system could result, particularly when such experiences occur during or across the developmental years. At schools with substantial turnover, this possibility could be shaping youth in ways that compromise attachment security and young people’s willingness or ability to develop and maintain deep long-term relationships. Given the well-documented associations between attachment security, social support, and long-term physical and mental health, the hypothesis that mobility could erode attachment and relational health warrants exploration. International schools are logical settings to test such a hypothesis, given their frequently high turnover without confounding factors (e.g. war trauma or refugee experiences). In addition, repeated experiences of separation and loss in international school settings would seem likely to create mental associations for the young people involved regarding how they and others tend to respond to such situations in such settings, raising the possibility that people at such schools, or even the school itself, could collectively be represented as an attachment figure. Questions like these have received scant attention in the literature. They warrant consideration because of their potential to shape young people’s most general convictions regarding attachment, which could, in turn, have implications for young people’s ability to experience meaning in their lives

    Machine Learning Approaches for the Prioritisation of Cardiovascular Disease Genes Following Genome-wide Association Study

    Genome-wide association studies (GWAS) have revealed thousands of genetic loci and have established themselves as a valuable method for unravelling the complex biology of many diseases. Although GWAS have grown in size and improved in study design to detect effects, identifying real causal signals and disentangling them from other highly correlated markers associated by linkage disequilibrium (LD) remains challenging. This has severely limited GWAS findings and brought the method’s value into question. Although thousands of disease susceptibility loci have been reported, causal variants and genes at these loci remain elusive. Post-GWAS analysis aims to dissect the heterogeneity of variant and gene signals. In recent years, machine learning (ML) models have been developed for post-GWAS prioritisation. ML models have ranged from logistic regression to more complex ensemble models such as random forests and gradient boosting, as well as deep learning models (i.e., neural networks). When combined with functional validation, these methods have yielded important translational insights, providing a strong evidence-based approach to direct post-GWAS research. However, ML approaches are in their infancy across biological applications, and as they continue to evolve, an evaluation of their robustness for GWAS prioritisation is needed. Here, I investigate the landscape of ML across selected models, input features, bias risk, and output model performance, with a focus on building a prioritisation framework that is applied to blood pressure GWAS results and tested on re-application to blood lipid traits.

    Using machine learning to predict pathogenicity of genomic variants throughout the human genome

    More than 6,000 diseases are estimated to be caused by genomic variants. This can happen in many ways: a variant may stop the translation of a protein, interfere with gene regulation, or alter splicing of the transcribed mRNA into an unwanted isoform. All of these processes must be investigated in order to evaluate which variant may be causal for the deleterious phenotype. Variant effect scores are a great help in this regard. Implemented as machine learning classifiers, they integrate annotations from different resources to rank genomic variants in terms of pathogenicity. Developing a variant effect score requires multiple steps: annotation of the training data, feature selection, model training, benchmarking, and finally deployment of the model for application. Here, I present a generalized workflow of this process. It makes it simple to configure how information is converted into model features, enabling the rapid exploration of different annotations. The workflow further implements hyperparameter optimization, model validation and ultimately deployment of a selected model via genome-wide scoring of genomic variants. The workflow is applied to train Combined Annotation Dependent Depletion (CADD), a variant effect model that scores SNVs and InDels genome-wide. I show that the workflow can be quickly adapted to novel annotations by porting CADD to the genome reference GRCh38. Further, I demonstrate the integration of deep neural network scores as features into a new CADD model, improving the annotation of RNA splicing events. Finally, I apply the workflow to train multiple variant effect models from training data that is based on variants selected by allele frequency. In conclusion, the developed workflow presents a flexible and scalable method to train variant effect scores. All software and developed scores are freely available from cadd.gs.washington.edu and cadd.bihealth.org.

    Disease and Illness Trajectories of Pancreatic Cancer

    Rational development of stabilized cyclic disulfide redox probes and bioreductive prodrugs to target dithiol oxidoreductases

    Countless biological processes allow cells to develop, survive, and proliferate. Among these, tightly balanced regulatory enzymatic pathways that can respond rapidly to external impacts maintain dynamic physiological homeostasis. More specifically, redox homeostasis broadly affects cellular metabolism and proliferation, with major contributions by thiol/disulfide oxidoreductase systems, in particular the Thioredoxin Reductase-Thioredoxin (TrxR/Trx) and the Glutathione Reductase-Glutathione-Glutaredoxin (GR/GSH/Grx) systems. These cascades drive vital cellular functions in many ways: through signaling, by regulating other proteins' activity via redox switches, and by stoichiometric reductant transfers in metabolism and antioxidant systems. Increasing evidence argues that there is a persistent alteration of the redox environment in certain pathological states, such as cancer, that heavily involves the Trx system: upregulation and/or overactivity of the Trx system may support or drive cancer progression, making both TrxR and Trx promising targets for anti-cancer drug development. Understanding the biochemical mechanisms and connections between certain redox cascades requires research tools that interact with them. The state-of-the-art genetic tools are mostly ratiometric reporters that measure reduced:oxidized ratios of selected redox pairs or the general thiol pool. However, the precise cellular roles of the central oxidoreductase systems, including TrxR and Trx, remain inaccessible due to the lack of probes to selectively measure turnover by either of these proteins. Such probes would allow measuring their effective reductive activity apart from expression levels in native systems, including in cells, animals, or patient samples. 
Such probes are also of high interest for identifying chemical inhibitors of TrxR/Trx in cells and for validating their potential use as anti-cancer agents (to date, there is no selective cellular Trx inhibitor, and most known TrxR inhibitors have not been comprehensively evaluated for selectivity and potential off-targets). However, small molecule redox imaging tools are underdeveloped: their protein specificity, spectral properties, and applicability remain poorly precedented. This work aimed to address this opportunity gap and develop novel, small molecule diagnostic and therapeutic tools to selectively target the Trx system based on a modular trigger-cargo design: artificial cyclic disulfide substrates (trigger) for oxidoreductases are tethered to molecular agents (cargo) such that the cargo’s activity is masked and is re-established only through reduction by a target protein. The rational design of these novel reduction sensors to target the cell's strongest disulfide-reducing enzymes was driven by the following principles: (i) cyclic disulfide triggers with stabilized ring systems were used to gain low reduction potentials that should resist reduction except by the strongest cellular reductases, such as Trx; and (ii) the cyclic topology also offers the potential for kinetic reversibility that should select for dithiol-type redox proteins over the cellular monothiol background. Creating imaging agents based on such two-component designs to selectively measure redox protein activity in native cells required combining the correct trigger reducibility, probe activation kinetics, and imaging modalities, and considering the overall molecular architecture. The major prior art in this field has applied cyclic 5-membered disulfides (1,2-dithiolanes) as substrates for TrxR in a similar way to create such tools. However, this motif was described elsewhere as thermodynamically unstable and, because of this property, was widely used for dynamic covalent cascade reactions. 
By comparing a novel 1,2-dithiolane-based probe to the state-of-the-art probes, including commercial TrxR sensors, by screening a conclusive assay panel of cellular TrxR modulations, I clarified that 1,2-dithiolanes are not selective substrates for TrxR in biological settings (Nat Commun 2022). Instead, aiming for more stable ring systems and thus more robust redox probes, I developed bicyclic 6-membered disulfides (piperidine-fused 1,2-dithianes) with remarkably low reduction potentials. I showed that molecular probes using them as reduction sensors are processed mostly by thioredoxins while being stable against reduction by GSH. The thermodynamically stabilized decalin-like topology of the cis-annelated 1,2-dithianes requires particularly strong reductants to be cleaved. They also select for dithiol-type redox proteins, like Trx, based on kinetic reversibility and offer fast cyclization due to the preorganization by annelation (JACS 2021). This work further expanded the system’s modularity with structural cores based on piperazine-fused 1,2-dithianes, with the two amines allowing independent derivatization. Diagnostic tools using them as reduction sensors proved equally robust but with highly improved activation kinetics and were thus cellularly activated. Cellular studies revealed that they are substrates for both Trxs and their protein cousins Grxs, and so measure the cellular dithiol protein pool rather than solely Trx activity (preprint 2023). Finally, a trigger based on a slightly adapted reduction sensor, a desymmetrized 1,2-thiaselenane, was designed for selective reduction by TrxR’s selenol/thiol active site, then combined with a precipitating large Stokes’ shift fluorophore and a solubilizing group, to yield RX1, the first selective probe to measure cellular TrxR activity, which even allowed high-throughput inhibitor screening (Chem 2022). 
The central principle of this work was further extended to therapeutics: the duocarmycin-type cargo (CBI), with tunable potency (JACS Au 2022), can be used to create off-to-on therapeutic prodrugs. Such CBI prodrugs employing stabilized 1,2-dichalcogenide triggers proved to be cytotoxins that depend on Trx system activity in cells. They could further be exploited for cell-line dependent reductase activity profiling by screening their redox activation indices, the reduction-dependent part of total prodrug activation, in 177 cell lines. Beyond that, these prodrugs were well-tolerated in animals and showed anti-cancer efficacy in vivo in two distinct mouse tumor models (preprint 2022). Taken together, I introduced unique monothiol-resistant reducible motifs to target the cellular Trx system, with chemocompatible units for each of TrxR and Trx/Grx, where the cyclic nature of the dichalcogenides avoids activation by GSH. By using them with distinct molecular cargos, I developed novel selective fluorescent reporter probes and introduced a new class of bioreductive therapeutic constructs based on a common modular design. These were applied either to selectively measure cellular reductase activity or to deliver cytotoxic anti-cancer agents in vivo. Ongoing work aims to differentiate between the two major redox effector proteins Trx and Grx, requiring additional layers of selectivity that may be addressed by tuned molecular recognition. The flexible use of various molecular cargos allows harnessing the same cellular redox machinery by either probes or prodrugs. This allows predictive conclusions from diagnostics to be directly translated into therapy and offers great potential for future adaptation to other enzyme classes and therapeutic applications.

    Robustness and Interpretability of Neural Networks’ Predictions under Adversarial Attacks

    Deep Neural Networks (DNNs) are powerful predictive models, exceeding human capabilities in a variety of tasks. They learn complex and flexible decision systems from the available data and achieve exceptional performance in multiple machine learning fields, spanning from applications in artificial intelligence, such as image, speech and text recognition, to the more traditional sciences, including medicine, physics and biology. Despite the outstanding achievements, high performance and high predictive accuracy are not sufficient for real-world applications, especially in safety-critical settings, where the usage of DNNs is severely limited by their black-box nature. There is an increasing need to understand how predictions are performed, to provide uncertainty estimates, to guarantee robustness to malicious attacks and to prevent unwanted behaviours. State-of-the-art DNNs are vulnerable to small perturbations in the input data, known as adversarial attacks: maliciously crafted manipulations of the inputs that are perceptually indistinguishable from the original samples but are capable of fooling the model into incorrect predictions. In this work, we prove that such brittleness is related to the geometry of the data manifold and is therefore likely to be an intrinsic feature of DNNs’ predictions. This negative condition suggests a possible direction to overcome this limitation: we study the geometry of adversarial attacks in the large-data, overparameterized limit for Bayesian Neural Networks and prove that, in this limit, they are immune to gradient-based adversarial attacks. Furthermore, we propose some training techniques to improve the adversarial robustness of deterministic architectures. 
In particular, we experimentally observe that ensembles of NNs trained on random projections of the original inputs into lower-dimensional spaces are more resilient to the attacks. Next, we focus on the problem of interpretability of NNs’ predictions in the setting of saliency-based explanations. We analyze the stability of the explanations under adversarial attacks on the inputs and we prove that, in the large-data and overparameterized limit, Bayesian interpretations are more stable than those provided by deterministic networks. We validate this behaviour in multiple experimental settings in the finite data regime. Finally, we introduce the concept of adversarial perturbations of amino acid sequences for protein Language Models (LMs). Deep Learning models for protein structure prediction, such as AlphaFold2, leverage Transformer architectures and their attention mechanism to capture structural and functional properties of amino acid sequences. Despite the high accuracy of predictions, biologically small perturbations of the input sequences, or even single point mutations, can lead to substantially different 3D structures. On the other hand, protein language models are insensitive to mutations that induce misfolding or dysfunction (e.g. missense mutations). Specifically, predictions of the 3D coordinates do not reveal the structure-disruptive effect of these mutations. Therefore, there is an evident inconsistency between the biological importance of mutations and the resulting change in structural prediction. Inspired by this problem, we introduce the concept of adversarial perturbation of protein sequences in continuous embedding spaces of protein language models. Our method relies on attention scores to detect the most vulnerable amino acid positions in the input sequences. Adversarial mutations are biologically distinct from their references and are able to significantly alter the resulting 3D structures.