8,317 research outputs found
Modular lifelong machine learning
Deep learning has drastically improved the state-of-the-art in many important fields, including computer vision and natural language processing (LeCun et al., 2015). However, it is expensive to train a deep neural network on a machine learning problem. The overall training cost further increases when one wants to solve additional problems. Lifelong machine learning (LML) develops algorithms that aim to efficiently learn to solve a sequence of problems, which become available one at a time. New problems are solved with fewer resources by transferring previously learned knowledge. At the same time, an LML algorithm needs to retain good performance on all encountered problems, thus avoiding catastrophic forgetting. Current approaches do not possess all the desired properties of an LML algorithm. First, they primarily focus on preventing catastrophic forgetting (Diaz-Rodriguez et al., 2018; Delange et al., 2021). As a result, they neglect some knowledge transfer properties. Furthermore, they assume that all problems in a sequence share the same input space. Finally, scaling these methods to a large sequence of problems remains a challenge.
Modular approaches to deep learning decompose a deep neural network into sub-networks, referred to as modules. Each module can then be trained to perform an atomic transformation, specialised in processing a distinct subset of inputs. This modular approach to storing knowledge makes it easy to only reuse the subset of modules which are useful for the task at hand.
This thesis introduces a line of research which demonstrates the merits of a modular approach to lifelong machine learning, and its ability to address the aforementioned shortcomings of other methods. Compared to previous work, we show that a modular approach can be used to achieve more LML properties than previously demonstrated. Furthermore, we develop tools which allow modular LML algorithms to scale in order to retain said properties on longer sequences of problems.
First, we introduce HOUDINI, a neurosymbolic framework for modular LML. HOUDINI represents modular deep neural networks as functional programs and accumulates a library of pre-trained modules over a sequence of problems. Given a new problem, we use program synthesis to select a suitable neural architecture, as well as a high-performing combination of pre-trained and new modules. We show that our approach has most of the properties desired from an LML algorithm. Notably, it can perform forward transfer, avoid negative transfer and prevent catastrophic forgetting, even across problems with disparate input domains and problems which require different neural architectures.
Second, we produce a modular LML algorithm which retains the properties of HOUDINI but can also scale to longer sequences of problems. To this end, we fix the choice of a neural architecture and introduce a probabilistic search framework, PICLE, for searching through different module combinations. To apply PICLE, we introduce two probabilistic models over neural modules which allow us to efficiently identify promising module combinations.
Third, we phrase the search over module combinations in modular LML as black-box optimisation, which allows one to make use of methods from the setting of hyperparameter optimisation (HPO). We then develop a new HPO method which marries a multi-fidelity approach with model-based optimisation. We demonstrate that this leads to improvement in anytime performance in the HPO setting and discuss how this can in turn be used to augment modular LML methods.
Overall, this thesis identifies a number of important LML properties, which have not all been attained in past methods, and presents an LML algorithm which can achieve all of them, apart from backward transfer
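The multi-fidelity idea mentioned in the abstract above can be illustrated with successive halving, a standard multi-fidelity baseline from the HPO literature. This is a minimal sketch and not the method developed in the thesis: the function names, the toy `toy_loss` objective, and the parameters `min_budget` and `eta` are all assumptions made for illustration.

```python
def successive_halving(configs, evaluate, min_budget=1, eta=3):
    """Generic successive halving: score all candidates at a small budget,
    keep the best 1/eta fraction, and repeat with an eta-times larger budget."""
    survivors = list(configs)
    budget = min_budget
    while len(survivors) > 1:
        # Lower score is better; evaluate(config, budget) is user-supplied.
        ranked = sorted(survivors, key=lambda c: evaluate(c, budget))
        keep = max(1, len(survivors) // eta)
        survivors = ranked[:keep]
        budget *= eta
    return survivors[0]


def toy_loss(config, budget):
    # Hypothetical objective: the true optimum is 0.3, and the budget-dependent
    # term mimics a loss that shrinks as more training is allowed.
    return (config - 0.3) ** 2 + 1.0 / budget


candidates = [i / 10 for i in range(10)]
best = successive_halving(candidates, toy_loss)
print(best)
```

In a modular LML setting, each candidate could stand for one combination of pre-trained and new modules, with the budget corresponding to training epochs; that mapping is an interpretation, not something the abstract specifies.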
A multifluid model with chemically reacting components -- construction of weak solutions
We investigate the existence of weak solutions to the multi-component system,
consisting of compressible chemically reacting components, coupled with the
compressible Stokes equation for the velocity. Specifically, we consider the
case of the irreversible chemical reaction and assume the nonlinear relation
between the pressure and the particular densities. These assumptions cause
additional difficulties in the mathematical analysis, due to the possible
presence of vacuum.
It is shown that there exists a global weak solution, satisfying the
bounds for all the components. Moreover, despite the lack of
regularity on the gradients, we obtain strong compactness of densities in
spaces. The applied method captures the properties of the models of high
generality, which admit an arbitrary number of components. Furthermore, the
framework we develop can handle models that contain both diffusing and
non-diffusing elements
Blind Restoration of Real-World Audio by 1D Operational GANs
Objective: Despite numerous studies proposed for audio restoration in the
literature, most of them focus on an isolated restoration problem such as
denoising or dereverberation, ignoring other artifacts. Moreover, assuming a
noisy or reverberant environment with a limited number of fixed
signal-to-distortion ratio (SDR) levels is a common practice. However,
real-world audio is often corrupted by a blend of artifacts such as
reverberation, sensor noise, and background audio mixture with varying types,
severities, and duration. In this study, we propose a novel approach for blind
restoration of real-world audio signals by Operational Generative Adversarial
Networks (Op-GANs) with temporal and spectral objective metrics to enhance the
quality of restored audio signal regardless of the type and severity of each
artifact corrupting it. Methods: 1D Operational-GANs are used with a generative
neuron model optimized for blind restoration of any corrupted audio signal.
Results: The proposed approach has been evaluated extensively over the
benchmark TIMIT-RAR (speech) and GTZAN-RAR (non-speech) datasets corrupted with
a random blend of artifacts each with a random severity to mimic real-world
audio signals. Average SDR improvements of over 7.2 dB and 4.9 dB are achieved,
respectively, which are substantial when compared with the baseline methods.
Significance: This is a pioneering study in blind audio restoration with the
unique capability of direct (time-domain) restoration of real-world audio
whilst achieving an unprecedented level of performance for a wide SDR range and
artifact types. Conclusion: 1D Op-GANs can achieve robust and computationally
effective real-world audio restoration with significantly improved performance.
The source codes and the generated real-world audio datasets are shared
publicly with the research community in a dedicated GitHub repository.
Assembling Single RbCs Molecules with Optical Tweezers
Optical tweezer arrays are useful tools for manipulating single atoms and molecules.
An exciting avenue for research with optical tweezers is using the interactions between polar molecules for quantum computation or quantum simulation.
Molecules can be assembled in an optical tweezer array starting from pairs of atoms.
The atoms must be initialised in the relative motional ground state of a common trap.
This work outlines the design of a Raman sideband cooling protocol which is implemented to prepare an 87-Rubidium atom in the motional ground state of an 817 nm tweezer, and a 133-Caesium atom in the motional ground state of a 938 nm tweezer.
The protocol circumvents strong heating and dephasing associated with the trap by operating at lower trap depths and cooling from outside the Lamb-Dicke regime.
By analysing several sources of heating, we design and implement a merging sequence that transfers the Rb atom and the Cs atom to a common trap with minimal motional excitation.
Subsequently, we perform a detailed characterisation of AC Stark shifts caused by the tweezer light, and identify several situations in which the confinement of the atom pair influences their interactions.
Then, we demonstrate the preparation of a molecular bound state after an adiabatic ramp across a magnetic Feshbach resonance.
Measurements of molecular loss rates provide evidence that the atoms are in fact associated during the merging sequence, before the magnetic field ramp.
By preparing a weakly-bound molecule in an optical tweezer, we carry out important steps towards assembling an array of ultracold RbCs molecules in their rovibrational ground states
Rational development of stabilized cyclic disulfide redox probes and bioreductive prodrugs to target dithiol oxidoreductases
Countless biological processes allow cells to develop, survive, and proliferate. Among these, tightly balanced regulatory enzymatic pathways that can respond rapidly to external impacts maintain dynamic physiological homeostasis. More specifically, redox homeostasis broadly affects cellular metabolism and proliferation, with major contributions by thiol/disulfide oxidoreductase systems, in particular, the Thioredoxin Reductase-Thioredoxin (TrxR/Trx) and the Glutathione Reductase-Glutathione-Glutaredoxin (GR/GSH/Grx) systems.
These cascades drive vital cellular functions in many ways through signaling, regulating other proteins' activity by redox switches, and by stoichiometric reductant transfers in metabolism and antioxidant systems. Increasing evidence argues that there is a persistent alteration of the redox environment in certain pathological states, such as cancer, that heavily involve the Trx system: upregulation and/or overactivity of the Trx system may support or drive cancer progression, making both TrxR and Trx promising targets for anti-cancer drug development.
Understanding the biochemical mechanisms and connections between certain redox cascades requires research tools that interact with them. The state-of-the-art genetic tools are mostly ratiometric reporters that measure reduced:oxidized ratios of selected redox pairs or the general thiol pool. However, the precise cellular roles of the central oxidoreductase systems, including TrxR and Trx, remain inaccessible due to the lack of probes to selectively measure turnover by either of these proteins. Such probes would allow measuring their effective reductive activity apart from expression levels in native systems, including in cells, animals, or patient samples. They are also of high interest to identify chemical inhibitors for TrxR/Trx in cells and to validate their potential use as anti-cancer agents (to date, there is no selective cellular Trx inhibitor, and most known TrxR inhibitors were not comprehensively evaluated considering selectivity and potential off-targets). Small molecule redox imaging tools nevertheless remain underdeveloped: their protein specificity, spectral properties, and applicability have little precedent.
This work aimed to address this opportunity gap and develop novel, small molecule diagnostic and therapeutic tools to selectively target the Trx system based on a modular trigger-cargo design: artificial cyclic disulfide substrates (trigger) for oxidoreductases are tethered to molecular agents (cargo) such that the cargo's activity is masked and is re-established only through reduction by a target protein.
The rational design of these novel reduction sensors to target the cell's strongest disulfide-reducing enzymes was driven by the following principles: (i) cyclic disulfide triggers with stabilized ring systems were used to gain low reduction potentials that should resist reduction except by the strongest cellular reductases, such as Trx; and (ii) the cyclic topology also offers the potential for kinetic reversibility that should select for dithiol-type redox proteins over the cellular monothiol background. Creating imaging agents based on such two-component designs to selectively measure redox protein activity in native cells required combining the correct trigger reducibility, probe activation kinetics, and imaging modalities, and considering the overall molecular architecture.
The major prior art in this field has applied cyclic 5-membered disulfides (1,2-dithiolanes) as substrates for TrxR in a similar way to create such tools. However, this motif was described elsewhere as thermodynamically unstable and was, for that reason, widely used for dynamic covalent cascade reactions. By comparing a novel 1,2-dithiolane-based probe to the state-of-the-art probes, including commercial TrxR sensors, and by screening a conclusive assay panel of cellular TrxR modulations, I clarified that 1,2-dithiolanes are not selective substrates for TrxR in biological settings (Nat Commun 2022).
Instead, aiming for more stable ring systems and thus more robust redox probes, I developed bicyclic 6-membered disulfides (piperidine-fused 1,2-dithianes) with remarkably low reduction potentials. I showed that molecular probes using them as reduction sensors are processed mostly by thioredoxins while being stable against reduction by GSH. The thermodynamically stabilized, decalin-like topology of the cis-annelated 1,2-dithianes requires particularly strong reductants to be cleaved. They also select for dithiol-type redox proteins, like Trx, based on kinetic reversibility and offer fast cyclization due to the preorganization by annelation (JACS 2021).
This work further expanded the system's modularity with structural cores based on piperazine-fused 1,2-dithianes, whose two amines allow independent derivatization. Diagnostic tools using them as reduction sensors proved equally robust but with highly improved activation kinetics and were thus cellularly activated. Cellular studies revealed that they are substrates for both Trxs and their protein cousins, Grxs, so they measure the cellular dithiol protein pool rather than solely Trx activity (preprint 2023).
Finally, a trigger based on a slightly adapted reduction sensor, a desymmetrized 1,2-thiaselenane, was designed for selective reduction by TrxR's selenol/thiol active site, then combined with a precipitating large-Stokes-shift fluorophore and a solubilizing group, to yield the first selective probe, RX1, to measure cellular TrxR activity, which even allowed high-throughput inhibitor screening (Chem 2022).
The central principle of this work was further advanced to therapeutic prodrugs based on the duocarmycin cargo (CBI) with tunable potency (JACS Au 2022), which can be used to create off-to-on therapeutic prodrugs. Such CBI prodrugs employing stabilized 1,2-dichalcogenide triggers proved to be cytotoxins that depend on Trx system activity in cells. They could further be exploited for cell-line-dependent reductase activity profiling by screening their redox activation indices, the reduction-dependent part of total prodrug activation, in 177 cell lines. Beyond that, these prodrugs were well tolerated in animals and showed anti-cancer efficacy in vivo in two distinct mouse tumor models (preprint 2022).
Taken together, I introduced unique monothiol-resistant reducible motifs to target the cellular Trx system, with chemocompatible units for each of TrxR and Trx/Grx, where the cyclic nature of the dichalcogenides avoids activation by GSH. By using them with distinct molecular cargos, I developed novel selective fluorescent reporter probes and introduced a new class of bioreductive therapeutic constructs based on a common modular design. These were applied either to selectively measure cellular reductase activity or to deliver cytotoxic anti-cancer agents in vivo. Ongoing work aims to differentiate between the two major redox effector proteins Trx and Grx, requiring additional layers of selectivity that may be addressed by tuned molecular recognition. The flexible use of various molecular cargos allows harnessing the same cellular redox machinery by either probes or prodrugs. This allows predictive conclusions from diagnostics to be directly translated into therapy and offers great potential for future adaptation to other enzyme classes and therapeutic venues.
Cellular redox homeostasis depends on thiol/disulfide oxidoreductases, which influence the metabolism, proliferation, and antioxidant response of cells. The most important networks are the thioredoxin reductase-thioredoxin (TrxR/Trx) and glutathione reductase-glutathione-glutaredoxin (GR/GSH/Grx) systems, which control vital cellular functions via redox switches in substrate proteins and thereby take part in redox regulation and signaling. Persistent changes of the redox environment in pathological states, such as cancer, are closely linked to the Trx system. Upregulation and/or overactivity of the Trx system, which occur in many cancers, also support the progression of tumor growth, making TrxR/Trx promising target proteins for the development of new anti-cancer drugs.
To investigate the underlying biochemical processes, special techniques for visualizing and measuring enzymatic activity are needed. The sensors suited to this, which are mostly genetic, ratiometrically measure the ratio of reduced to oxidized species in the cellular environment or in specifically selected redox pairs. Further investigation of the exact function of TrxR/Trx and their substrates is, however, limited by a lack of detection methods. Such methods are also of great interest for validating chemical inhibitors of TrxR/Trx in cells and their potential use as anti-cancer agents. To date, there is no selective cellular Trx inhibitor, and potential off-target effects of the known TrxR inhibitors have not been conclusively evaluated.
The aim of this work is to develop small-molecule diagnostic and therapeutic tools that selectively target the Trx system and are based on a modular trigger-cargo design. To this end, cyclic disulfide substrates (triggers) for oxidoreductases are linked to molecular agents (cargos) such that the agent's activity is masked and is restored only after reduction by a target protein. These novel synthetic reduction sensors rest on the following basic principles: (i) cyclic disulfides are thermodynamically stabilized and can be cleaved only by the strongest reductases; and (ii) the cyclic topology enables kinetic reversibility of the two thiol-disulfide exchange reactions, which immediately reverses an initial reaction with monothiols such as GSH and thus prevents complete reduction.
Most earlier work in this field used a cyclic five-membered disulfide (1,2-dithiolane) as a substrate for TrxR. The same structural motif, however, was described elsewhere as thermodynamically unstable and, because of this property, was used explicitly for dynamic cascade reactions. This work therefore begins by comparing a new 1,2-dithiolane-based fluorogenic indicator with existing, in part commercial, redox probes for TrxR in a series of cell culture experiments under modulation of cellular TrxR activity, and thereby resolves a contradiction in the literature: 1,2-dithiolanes are not suitable as selective substrates for TrxR, because they are labile both to reduction by other redox proteins and to the monothiol background in cells (Nat. Commun. 2022).
As an alternative structural motif, this work establishes a bicyclic six-membered disulfide (annelated 1,2-dithiane). Owing to its low reduction potential, that is, its high resistance to reduction, molecular probes based on this 1,2-dithiane as a reduction sensor are activated almost exclusively by Trx, but not by TrxR or GSH (JACS 2021). This core motif determines the reducibility, and thus the enzyme specificity, through its cyclic nature and the annelation, even when different dyes or drugs are attached. On this basis, the molecular structure could be extended with a further modification site for the flexible use of additional functional units. Although cellular studies showed that these novel 1,2-dithiane units address both Trx and the structurally related Grx in cells, the resulting diagnostic molecules are valuable for selectively reporting the catalytic turnover of cellular dithiol reductases, the so-called Trx superfamily (Preprint 2023).
Aided by the modular molecular design, this work also presents the first reporter system, RX1, for the selective detection of TrxR activity in cells. It is based on a cyclic, unsymmetrical selenenyl sulfide sensor (1,2-thiaselenane), which is selectively attacked by the unique selenolate of TrxR and can therefore ultimately be reduced only by TrxR. RX1 was also suitable for high-throughput validation of existing TrxR inhibitors and thus underlines the commercial utility of such diagnostics (Chem 2022).
The central trigger-cargo concept of this work was further developed for therapeutic purposes, using the unique mechanism of action of the duocarmycin natural product class (CBI) (JACS Au 2022) to develop reductively activatable therapeutics. CBI prodrugs based on stabilized redox switches (1,2-dithianes for Trx; 1,2-thiaselenane for TrxR) responded significantly to TrxR modulation in cells. In addition, by referencing their activity against non-reducible control molecules, they were used to generate cell-line-dependent profiles of reductase activity in 177 cell lines. Finally, these new anti-cancer agents were well tolerated in animal models and showed anti-cancer activity in two different mouse models (Preprint 2022b).
In summary, this dissertation presents monothiol-resistant reducible trigger units for the cellular Trx system for developing novel, selective reporter probes, as well as a new class of reductively activatable anti-cancer agents based on an adaptable trigger-cargo design. These were used either to selectively measure cellular protein activity or as anti-cancer agents. Chemocompatible motifs were identified for both TrxR and Trx/Grx, whose cyclic nature prevents activation by GSH. Further differentiation between the two redox proteins Trx and Grx and other proteins of the Trx superfamily requires an additional level of selection, for example through molecular recognition, and is the subject of ongoing work.
The flexible use of different molecular agents enables the "pipeline development" of diagnostics and therapeutics that are processed analogously by the cellular redox machinery, so that conclusions from diagnostics can be transferred directly to therapy. This holds great potential for future developments in which the modular concept may be transferred to other enzyme classes and therapeutic applications
Discriminative Multimodal Learning via Conditional Priors in Generative Models
Deep generative models with latent variables have been used lately to learn
joint representations and generative processes from multi-modal data. These two
learning mechanisms can, however, conflict with each other and representations
can fail to embed information on the data modalities. This research studies the
realistic scenario in which all modalities and class labels are available for
model training, but where some modalities and labels required for downstream
tasks are missing. We show, in this scenario, that the variational lower bound
limits mutual information between joint representations and missing modalities.
To counteract these problems, we introduce a novel conditional multi-modal
discriminative model that uses an informative prior distribution and optimizes
a likelihood-free objective function that maximizes mutual information between
joint representations and missing modalities. Extensive experimentation
demonstrates the benefits of our proposed model: empirical results show that
our model achieves state-of-the-art results in representative problems such as
downstream classification, acoustic inversion, and image and annotation
generation
An Efficient Quadrature Sequence and Sparsifying Methodology for Mean-Field Variational Inference
This work proposes a quasirandom sequence of quadratures for high-dimensional
mean-field variational inference and a related sparsifying methodology. Each
iterate of the sequence contains two evaluation points that combine to
correctly integrate all univariate quadratic functions, as well as univariate
cubics if the mean-field factors are symmetric. More importantly, averaging
results over short subsequences achieves periodic exactness on a much larger
space of multivariate polynomials of quadratic total degree. This framework is
devised by first considering stochastic blocked mean-field quadratures, which
may be useful in other contexts. By replacing pseudorandom sequences with
quasirandom sequences, over half of all multivariate quadratic basis functions
integrate exactly with only 4 function evaluations, and the exactness dimension
increases for longer subsequences. Analysis shows how these efficient integrals
characterize the dominant log-posterior contributions to mean-field variational
approximations, including diagonal Hessian approximations, to support a robust
sparsifying methodology in deep learning algorithms. A numerical demonstration
of this approach on a simple Convolutional Neural Network for MNIST retains
high test accuracy, 96.9%, while training over 98.9% of parameters to zero in
only 10 epochs, bearing potential to reduce both storage and energy
requirements for deep learning models
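The two-point exactness property described above can be checked with a small sketch. It assumes one construction with that property, an antithetic pair at mean ± one standard deviation with a random sign per coordinate; the quasirandom sequence proposed in the paper may differ, and all names here (`antithetic_pair`, `two_point_estimate`, the test function `f`) are hypothetical.

```python
import random


def antithetic_pair(mean, std, rng):
    """Return two points mean + s*std and mean - s*std with random signs s."""
    signs = [rng.choice((-1.0, 1.0)) for _ in mean]
    plus = [m + s * sd for m, s, sd in zip(mean, signs, std)]
    minus = [m - s * sd for m, s, sd in zip(mean, signs, std)]
    return plus, minus


def two_point_estimate(f, mean, std, rng):
    """Average f over the antithetic pair (two function evaluations)."""
    plus, minus = antithetic_pair(mean, std, rng)
    return 0.5 * (f(plus) + f(minus))


def f(x):
    # Sum of a univariate quadratic in x[0] and a univariate cubic in x[1].
    return 3 * x[0] ** 2 - 2 * x[0] + x[1] ** 3 + 1


mean, std = [1.0, -2.0], [0.5, 2.0]
# Closed-form Gaussian expectation: E[x^2] = m^2 + s^2, E[x^3] = m^3 + 3*m*s^2.
exact = 3 * (1.0 + 0.25) - 2 * 1.0 + ((-2.0) ** 3 + 3 * (-2.0) * 4.0) + 1
estimate = two_point_estimate(f, mean, std, random.Random(0))
print(estimate, exact)
```

A single pair does not integrate cross terms such as `x[0] * x[1]` exactly; in this sketch they only cancel on average over the random signs, which is the kind of gap the averaging over subsequences in the abstract addresses.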
Large deviation theory-based adaptive importance sampling for rare events in high dimensions
We propose a method for the accurate estimation of rare event or failure
probabilities for expensive-to-evaluate numerical models in high dimensions.
The proposed approach combines ideas from large deviation theory and adaptive
importance sampling. The importance sampler uses a cross-entropy method to find
an optimal Gaussian biasing distribution, and reuses all samples made
throughout the process for both the target probability estimation and
updating the biasing distributions. Large deviation theory is used to find a
good initial biasing distribution through the solution of an optimization
problem. Additionally, it is used to identify a low-dimensional subspace that
is most informative of the rare event probability. This subspace is used for
the cross-entropy method, which is known to lose efficiency in higher
dimensions. The proposed method does not require smoothing of indicator
functions nor does it involve numerical tuning parameters. We compare the
method with a state-of-the-art cross-entropy-based importance sampling scheme
using three examples: a high-dimensional failure probability estimation
benchmark, a problem governed by a diffusion equation, and a tsunami problem
governed by the time-dependent shallow water system in one spatial dimension
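The cross-entropy step in the abstract above can be sketched on a toy problem. This is a plain textbook cross-entropy importance sampler with a mean-shifted, unit-covariance Gaussian biasing distribution; it does not include the large-deviation initialization, the subspace reduction, or the sample reuse described by the authors. The limit-state function and all parameter names (`rho`, `n`, `max_iter`) are assumptions chosen so the exact answer is known.

```python
import math
import random


def limit_state(x):
    # Hypothetical test function: for x ~ N(0, I), sum(x)/sqrt(d) is N(0, 1).
    return sum(x) / math.sqrt(len(x))


def log_weight(x, mu):
    # log of the density ratio N(x; 0, I) / N(x; mu, I).
    return sum(-m * xi + 0.5 * m * m for xi, m in zip(x, mu))


def cross_entropy_is(threshold, dim=5, n=2000, rho=0.1, max_iter=20, seed=0):
    """Estimate P(limit_state(x) >= threshold) for x ~ N(0, I)."""
    rng = random.Random(seed)
    mu = [0.0] * dim
    for _ in range(max_iter):
        xs = [[rng.gauss(m, 1.0) for m in mu] for _ in range(n)]
        gs = [limit_state(x) for x in xs]
        # Intermediate level: the (1 - rho) sample quantile, capped at the target.
        level = min(threshold, sorted(gs)[int((1 - rho) * n)])
        elite = [(x, math.exp(log_weight(x, mu)))
                 for x, g in zip(xs, gs) if g >= level]
        total = sum(w for _, w in elite)
        # Cross-entropy update for a Gaussian mean: weighted mean of elite samples.
        mu = [sum(w * x[i] for x, w in elite) / total for i in range(dim)]
        if level >= threshold:
            break
    # Final importance sampling estimate with the adapted biasing distribution.
    xs = [[rng.gauss(m, 1.0) for m in mu] for _ in range(n)]
    return sum(math.exp(log_weight(x, mu))
               for x in xs if limit_state(x) >= threshold) / n


exact = 0.5 * math.erfc(4.0 / math.sqrt(2.0))  # P(Z >= 4) for standard normal Z
print(cross_entropy_is(4.0), exact)
```

Plain Monte Carlo with the same 2000 samples would almost never observe an event of probability near 3e-5, which is why the biasing distribution is adapted first.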
Efficient Multimodal Fusion via Interactive Prompting
Large-scale pre-training has brought unimodal fields such as computer vision
and natural language processing to a new era. Following this trend, the size of
multi-modal learning models constantly increases, leading to an urgent need to
reduce the massive computational cost of finetuning these models for downstream
tasks. In this paper, we propose an efficient and flexible multimodal fusion
method, namely PMF, tailored for fusing unimodally pre-trained transformers.
Specifically, we first present a modular multimodal fusion framework that
exhibits high flexibility and facilitates mutual interactions among different
modalities. In addition, we disentangle vanilla prompts into three types in
order to learn different optimizing objectives for multimodal learning. It is
also worth noting that we propose to add prompt vectors only on the deep layers
of the unimodal transformers, thus significantly reducing the training memory
usage. Experiment results show that our proposed method achieves comparable
performance to several other multimodal finetuning methods with less than 3%
trainable parameters and up to 66% saving of training memory usage.
Comment: Camera-ready version for CVPR202
The e-value and the Full Bayesian Significance Test: Logical Properties and Philosophical Consequences
This article gives a conceptual review of the e-value, ev(H|X), the epistemic value of hypothesis H given observations X. This statistical significance measure was developed in order to allow logically coherent and consistent tests of hypotheses, including sharp or precise hypotheses, via the Full Bayesian Significance Test (FBST).
Arguments of analysis allow a full characterization of this statistical test by its logical or compositional properties, showing a mutual complementarity between results of mathematical statistics and the logical desiderata lying at the foundations of this theory
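As a rough illustration of how an e-value can be computed for a sharp hypothesis, the sketch below uses the usual FBST construction: the e-value is one minus the posterior probability of the tangential set, the region where the posterior density exceeds its supremum over H. It assumes a flat reference density and a hypothetical normal posterior for a scalar parameter; the function names and the numbers are illustrative only.

```python
import math
import random


def normal_pdf(theta, mean, sd):
    z = (theta - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def e_value_mc(theta0, mean, sd, n=20000, seed=0):
    """Monte Carlo e-value of the sharp hypothesis H: theta = theta0,
    assuming a Normal(mean, sd) posterior and a flat reference density."""
    rng = random.Random(seed)
    f_star = normal_pdf(theta0, mean, sd)  # supremum of the posterior over H
    in_tangential_set = sum(
        1 for _ in range(n)
        if normal_pdf(rng.gauss(mean, sd), mean, sd) > f_star
    )
    return 1.0 - in_tangential_set / n


def e_value_exact(theta0, mean, sd):
    # For a normal posterior the tangential set is |theta - mean| < |theta0 - mean|.
    return math.erfc(abs(theta0 - mean) / (sd * math.sqrt(2.0)))


print(e_value_mc(0.0, 1.0, 0.5), e_value_exact(0.0, 1.0, 0.5))
```

The Monte Carlo version carries over to posteriors without a closed form, provided one can sample from the posterior and maximize its density over H.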