13,762 research outputs found

    Diffusion Schrödinger Bridge Matching

    Solving transport problems, i.e. finding a map transporting one given distribution to another, has numerous applications in machine learning. Novel mass transport methods motivated by generative modeling have recently been proposed: for example, Denoising Diffusion Models (DDMs) and Flow Matching Models (FMMs) implement such a transport through a Stochastic Differential Equation (SDE) or an Ordinary Differential Equation (ODE). However, while it is desirable in many applications to approximate the deterministic dynamic Optimal Transport (OT) map, which admits attractive properties, DDMs and FMMs are not guaranteed to provide transports close to the OT map. In contrast, Schrödinger bridges (SBs) compute stochastic dynamic mappings which recover entropy-regularized versions of OT. Unfortunately, existing numerical methods approximating SBs either scale poorly with dimension or accumulate errors across iterations. In this work, we introduce Iterative Markovian Fitting (IMF), a new methodology for solving SB problems, and Diffusion Schrödinger Bridge Matching (DSBM), a novel numerical algorithm for computing IMF iterates. DSBM significantly improves over previous SB numerics and recovers as special/limiting cases various recent transport methods. We demonstrate the performance of DSBM on a variety of problems.

    Model Diagnostics meets Forecast Evaluation: Goodness-of-Fit, Calibration, and Related Topics

    Principled forecast evaluation and model diagnostics are vital in fitting probabilistic models and forecasting outcomes of interest. A common principle is that fitted or predicted distributions ought to be calibrated, ideally in the sense that the outcome is indistinguishable from a random draw from the posited distribution. Much of this thesis is centered on calibration properties of various types of forecasts. In the first part of the thesis, a simple algorithm for exact multinomial goodness-of-fit tests is proposed. The algorithm computes exact p-values based on various test statistics, such as the log-likelihood ratio and Pearson's chi-square. A thorough analysis shows improvement on extant methods. However, the runtime of the algorithm grows exponentially in the number of categories and hence its use is limited. In the second part, a framework rooted in probability theory is developed, which gives rise to hierarchies of calibration, and applies to both predictive distributions and stand-alone point forecasts. Based on a general notion of conditional T-calibration, the thesis introduces population versions of T-reliability diagrams and revisits a score decomposition into measures of miscalibration, discrimination, and uncertainty. Stable and efficient estimators of T-reliability diagrams and score components arise via nonparametric isotonic regression and the pool-adjacent-violators algorithm. For in-sample model diagnostics, a universal coefficient of determination is introduced that nests and reinterprets the classical R² in least squares regression. In the third part, probabilistic top lists are proposed as a novel type of prediction in classification, which bridges the gap between single-class predictions and predictive distributions. The probabilistic top list functional is elicited by strictly consistent evaluation metrics, based on symmetric proper scoring rules, which admit comparison of various types of predictions.

    A study of uncertainty quantification in overparametrized high-dimensional models

    Uncertainty quantification is a central challenge in reliable and trustworthy machine learning. Naive measures such as last-layer scores are well known to yield overconfident estimates in the context of overparametrized neural networks. Several methods, ranging from temperature scaling to different Bayesian treatments of neural networks, have been proposed to mitigate overconfidence, most often supported by the numerical observation that they yield better calibrated uncertainty measures. In this work, we provide a sharp comparison between popular uncertainty measures for binary classification in a mathematically tractable model for overparametrized neural networks: the random features model. We discuss a trade-off between classification accuracy and calibration, unveiling a double-descent-like behavior in the calibration curve of optimally regularized estimators as a function of overparametrization. This is in contrast with the empirical Bayes method, which we show to be well calibrated in our setting despite the higher generalization error and overparametrization.

    A variational Bayesian inference technique for model updating of structural systems with unknown noise statistics

    Dynamic models of structural and mechanical systems can be updated to match the measured data through a Bayesian inference process. However, the performance of classical (non-adaptive) Bayesian model updating approaches decreases significantly when the pre-assumed statistical characteristics of the model prediction error are violated. To overcome this issue, this paper presents an adaptive recursive variational Bayesian approach to estimate the statistical characteristics of the prediction error jointly with the unknown model parameters. This approach improves the accuracy and robustness of model updating by including the estimation of model prediction error. The performance of this approach is demonstrated using numerically simulated data obtained from a structural frame with material non-linearity under earthquake excitation. Results show that in the presence of non-stationary noise/error, the non-adaptive approach fails to estimate unknown model parameters, whereas the proposed approach can accurately estimate them

    Modeling Uncertainty for Reliable Probabilistic Modeling in Deep Learning and Beyond

    This thesis is framed at the intersection between modern Machine Learning techniques, such as Deep Neural Networks, and reliable probabilistic modeling. In many machine learning applications, we care not only about the prediction made by a model (e.g. this lung image presents cancer) but also about how confident the model is in making this prediction (e.g. this lung image presents cancer with 67% probability). In such applications, the model assists the decision-maker (in this case a doctor) in making the final decision. As a consequence, the probabilities provided by a model must reflect the true proportions of outcomes in the set to which those probabilities are assigned; otherwise the model is useless in practice. When they do, we say that the model is perfectly calibrated. This thesis explores three ways to provide better calibrated models. First, it shows how to implicitly calibrate models that have been decalibrated by data augmentation techniques. A cost function is introduced that resolves this decalibration, taking as a starting point ideas derived from decision making with Bayes' rule. Second, it shows how to calibrate models using a post-calibration stage implemented with a Bayesian neural network. Finally, based on the limitations observed in the Bayesian neural network, which we hypothesize stem from a misspecified prior, a new stochastic process is introduced that serves as a prior distribution in a Bayesian inference problem. Maroñas Molano, J. (2022). Modeling Uncertainty for Reliable Probabilistic Modeling in Deep Learning and Beyond [Doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/181582

    A Comparative Study on Students’ Learning Expectations of Entrepreneurship Education in the UK and China

    Entrepreneurship education has become a critical subject in academic research and educational policy design, occupying a central role in contemporary education globally. However, a review of the literature indicates that research on entrepreneurship education is still in a relatively early stage. To date, little is known about how learning in entrepreneurship education is affected by the environmental context. Therefore, taking the institutional context into account and focusing on students’ learning expectations as a novel perspective, this thesis aims to address the knowledge gap by developing an original conceptual framework, grounded in self-determination theory, that advances understanding of the dynamic learning process in entrepreneurship education. The author adopted a positivist epistemology and a deductive approach. This study gathered 247 valid questionnaires from the UK (84) and China (163). Students were asked to recall their learning expectations before attending their entrepreneurship courses and to assess their perceptions of learning outcomes after taking the courses. It was found that entrepreneurship education policy is an antecedent that influences students' learning expectations, which is reflected in differences in student autonomy. British students in active learning under a voluntary education policy have higher autonomy than Chinese students in passive learning under a compulsory education policy, and thus have higher learning expectations, which leads to higher satisfaction. The positive relationship between autonomy and learning expectations is established, which adds a new dimension to self-determination theory. 
Furthermore, the change in students’ entrepreneurial intentions before and after their entrepreneurship courses is explained by understanding the process of a business start-up (positive), hands-on business start-up opportunities (positive), students’ actual input (positive) and tutors’ academic qualification (negative). The thesis makes contributions to both theory and practice. The findings have far-reaching implications for different parties, including policymakers, educators, practitioners and researchers. Understanding and shaping students' learning expectations is a critical first step in optimising entrepreneurship education teaching and learning. On the one hand, understanding students' learning expectations of entrepreneurship and entrepreneurship education can help the government with educational interventions and policy reform, as well as improving the quality and delivery of university-based entrepreneurship education. On the other hand, entrepreneurship education can assist students in establishing correct and realistic learning expectations and entrepreneurial conceptions, which will benefit their future entrepreneurial activities and/or employment. An important implication is that this study connects multiple stakeholders by bridging the national-level institutional context, organisational-level university entrepreneurship education, and individual-level entrepreneurial learning to promote student autonomy based on an understanding of students' learning expectations. This can help graduates develop the ability for autonomous learning and autonomous entrepreneurial behaviour. The results of this study help to remind students that it is they, the learners, with their expectations and input, who can make the difference between success and failure in their studies. This would apply not only to entrepreneurship education but also to other fields of study. 
One key message from this study is that education can be encouraged and supported but cannot be “forced”. Mandatory entrepreneurship education is not a quick fix for the lack of innovation and entrepreneurship among university students. More resources must be invested in enhancing the enterprise culture, thus making entrepreneurship education desirable for students.

    Full stack development toward a trapped ion logical qubit

    Quantum error correction is a key step toward the construction of a large-scale quantum computer, by preventing small infidelities in quantum gates from accumulating over the course of an algorithm. Detecting and correcting errors is achieved by using multiple physical qubits to form a smaller number of robust logical qubits. The physical implementation of a logical qubit requires multiple qubits, on which high-fidelity gates can be performed. The project aims to realize a logical qubit based on ions confined on a microfabricated surface trap. Each physical qubit will be a microwave dressed state qubit based on ¹⁷¹Yb⁺ ions. Gates are intended to be realized through RF and microwave radiation in combination with magnetic field gradients. The project vertically integrates software down to hardware compilation layers in order to deliver, in the near future, a fully functional small device demonstrator. This thesis presents novel results on multiple layers of a full stack quantum computer model. On the hardware level a robust quantum gate is studied and ion displacement over the X-junction geometry is demonstrated. The experimental organization is optimized through automation and compressed waveform data transmission. A new quantum assembly language purely dedicated to trapped ion quantum computers is introduced. The demonstrator is aimed at testing implementation of quantum error correction codes while preparing for larger scale iterations.

    Optimizing transcriptomics to study the evolutionary effect of FOXP2

    The field of genomics was established with the sequencing of the human genome, a pivotal achievement that has allowed us to address various questions in biology from a unique perspective. One question in particular, that of the evolution of human speech, has gripped philosophers, evolutionary biologists, and now genomicists. However, little is known of the genetic basis that allowed humans to evolve the ability to speak. Of the few genes implicated in human speech, one of the most studied is FOXP2, which encodes the transcription factor Forkhead box protein P2 (FOXP2). FOXP2 is essential for proper speech development, and two mutations in the human lineage are believed to have contributed to the evolution of human speech. To address the effect of FOXP2 and investigate its evolutionary contribution to human speech, one can utilize the power of genomics, more specifically gene expression analysis via ribonucleic acid sequencing (RNA-seq). To this end, I first contributed to developing mcSCRB-seq, a highly sensitive, powerful, and efficient single cell RNA-seq (scRNA-seq) protocol. Although scRNA-seq had emerged as a central method for studying cellular heterogeneity and identifying cellular processes, it lacked the sensitivity and cost-efficiency of more established protocols. By systematically evaluating each step of the process, I helped find that the addition of polyethylene glycol increased sensitivity by enhancing the cDNA synthesis reaction. This, along with other optimizations, resulted in a sensitive and flexible protocol that is cost-efficient and ideal in many research settings. A primary motivation driving the extensive optimizations surrounding single cell transcriptomics has been the generation of cellular atlases, which aim to identify and characterize all of the cells in an organism. 
As such efforts are carried out in a variety of research groups using a number of different RNA-seq protocols, I contributed to an effort to benchmark and standardize scRNA-seq methods. This not only identified methods which may be ideal for the purpose of cell atlas creation, but also highlighted optimizations that could be integrated into existing protocols. Using mcSCRB-seq as a foundation as well as the findings from the scRNA-seq benchmarking, I helped develop prime-seq, a sensitive, robust, and most importantly, affordable bulk RNA-seq protocol. Bulk RNA-seq was frequently overlooked during the efforts to optimize and establish single-cell techniques, even though the method is still extensively used in analyzing gene expression. Introducing early barcoding and reducing library generation costs kept prime-seq cost-efficient, while basing it on single-cell methods ensured that it would be a sensitive and powerful technique. I helped verify this by benchmarking it against TruSeq-generated data and then helped test the robustness by generating prime-seq libraries from over seventeen species. These optimizations resulted in a final protocol that is well suited for investigating gene expression in comprehensive and high-throughput studies. Finally, I utilized prime-seq in order to develop a comprehensive gene expression atlas to study the function of FOXP2 and its role in speech evolution. I used previously generated mouse models: a knockout model containing one non-functional Foxp2 allele and a humanized model, which has a variant Foxp2 allele with two human-specific mutations. To study the effect globally across the mouse, I helped harvest eighteen tissues which were previously identified to express FOXP2. By then comparing the mouse models to wild-type mice, I helped highlight the importance of FOXP2 within lung development and the importance of the human variant allele in the brain. 
Both mcSCRB-seq and prime-seq have already been used and published in numerous studies to address a variety of biological and biomedical questions. Additionally, my work on FOXP2 not only provides a thorough expression atlas, but also provides a detailed and cost-efficient plan for undertaking a similar study on other genes of interest. Lastly, the studies on FOXP2 done within this work lay the foundation for future studies investigating the role of FOXP2 in modulating learning behavior, and thereby affecting human speech.