77 research outputs found

    Variational cross-validation of slow dynamical modes in molecular kinetics

    Markov state models (MSMs) are a widely used method for approximating the eigenspectrum of the molecular dynamics propagator, yielding insight into the long-timescale statistical kinetics and slow dynamical modes of biomolecular systems. However, the lack of a unified theoretical framework for choosing between alternative models has hampered progress, especially for non-experts applying these methods to novel biological systems. Here, we consider cross-validation with a new objective function for estimators of these slow dynamical modes, a generalized matrix Rayleigh quotient (GMRQ), which measures the ability of a rank-m projection operator to capture the slow subspace of the system. It is shown that a variational theorem bounds the GMRQ from above by the sum of the first m eigenvalues of the system's propagator, but that this bound can be violated when the requisite matrix elements are estimated subject to statistical uncertainty. This overfitting can be detected and avoided through cross-validation. These results make it possible to construct Markov state models for protein dynamics in a way that appropriately captures the tradeoff between systematic and statistical errors.
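    The variational bound described in this abstract can be illustrated with a small sketch. It assumes the GMRQ of a trial basis A takes the form Tr[(AᵀCA)(AᵀSA)⁻¹], with C the time-lagged correlation matrix and S the overlap matrix; that form, the 3-state transition matrix T, and all function names below are assumptions for illustration and are not taken from the listing. The matrices here are exact, so the bound holds; the overfitting the abstract mentions arises only when C and S are estimated from finite data.

```python
# Illustrative sketch only: toy 3-state reversible chain, rank m = 2.
# The GMRQ form Tr[(A^T C A)(A^T S A)^-1] is an assumption stated in the lead-in.

def transpose(X):
    return [list(col) for col in zip(*X)]

def matmul(X, Y):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def inverse_2x2(M):
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def gmrq(A, C, S):
    """Trace of (A^T C A)(A^T S A)^-1 for an n x 2 trial basis A."""
    At = transpose(A)
    num = matmul(matmul(At, C), A)
    den = matmul(matmul(At, S), A)
    P = matmul(num, inverse_2x2(den))
    return P[0][0] + P[1][1]

# Hypothetical symmetric transition matrix; its eigenvalues are 1.0, 0.8, 0.4.
T = [[0.8, 0.2, 0.0],
     [0.2, 0.6, 0.2],
     [0.0, 0.2, 0.8]]
pi = 1.0 / 3.0  # uniform stationary probability of a symmetric chain
C = [[pi * t for t in row] for row in T]                           # diag(pi) T
S = [[pi if i == j else 0.0 for j in range(3)] for i in range(3)]  # diag(pi)

exact = [[1.0, 1.0], [1.0, 0.0], [1.0, -1.0]]   # columns: the two slowest eigenvectors
lumped = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]   # crude 2-state lumping of states {0,1} and {2}
bound = 1.0 + 0.8                               # sum of the two largest eigenvalues

print(round(gmrq(exact, C, S), 6))   # 1.8, the bound is attained
print(round(gmrq(lumped, C, S), 6))  # 1.7, a poorer basis scores lower
```

    In a cross-validation setting, the basis would be fitted on one part of the trajectory data and scored with C and S estimated from a held-out part, so that a score inflated by statistical noise is not rewarded.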

    Multi-Epoch Machine Learning 2: Identifying physical drivers of galaxy properties in simulations

    Using a novel machine learning method, we investigate the buildup of galaxy properties in different simulations, and in various environments within a single simulation. The aim of this work is to show the power of this approach at identifying the physical drivers of galaxy properties within simulations. We compare how the stellar mass depends on the values of other galaxy and halo properties at different points in time by examining the feature importance values of a machine learning model. By training the model on IllustrisTNG we show that stars are produced at earlier times in higher density regions of the universe than they are in low density regions. We also apply the technique to the Illustris, EAGLE, and CAMELS simulations. We find that stellar mass is built up in a similar way in EAGLE and IllustrisTNG, but significantly differently in the original Illustris, suggesting that subgrid model physics is more important than the choice of hydrodynamics method. These differences are driven by the efficiency of supernova feedback. Applying principal component analysis to the CAMELS simulations allows us to identify a component associated with the importance of a halo's gravitational potential and another component representing the time at which galaxies form. We discover that the speed of galactic winds is a more critical subgrid parameter than the total energy per unit star formation. Finally we find that the Simba black hole feedback model has a larger effect on galaxy formation than the IllustrisTNG black hole feedback model. Comment: 16 pages, 12 figures, accepted to MNRA

    Multi-Epoch Machine Learning 1: Unravelling Nature vs Nurture for Galaxy Formation

    We present a novel machine learning method for predicting the baryonic properties of dark matter only subhalos from N-body simulations. Our model is built using the extremely randomized tree (ERT) algorithm and takes subhalo properties over a wide range of redshifts as its input features. We train our model using the IllustrisTNG simulations to predict black hole mass, gas mass, magnitudes, star formation rate, stellar mass, and metallicity. We compare the results of our method with a baseline model from previous works, and against a model that only considers the mass history of the subhalo. We find that our new model significantly outperforms both of the other models. We then investigate the predictive power of each input by looking at feature importance scores from the ERT algorithm. We produce feature importance plots for each baryonic property, and find that they differ significantly. We identify low redshifts as being most important for predicting star formation rate and gas mass, with high redshifts being most important for predicting stellar mass and metallicity, and consider what this implies for nature vs nurture. We find that the physical properties of galaxies investigated in this study are all driven by nurture and not nature. The only property showing a somewhat stronger impact of nature is the present-day star formation rate of galaxies. Finally we verify that the feature importance plots are discovering physical patterns, and that the trends shown are not an artefact of the ERT algorithm. Comment: Accepted to MNRAS, 15 pages, 8 figures, 2 tables, Main Figure is Fig

    Stepping stability: effects of sensory perturbation

    BACKGROUND: Few tools exist for quantifying locomotor stability in balance impaired populations. The objective of this study was to develop and evaluate a technique for quantifying stability of stepping in healthy people and people with peripheral (vestibular hypofunction, VH) and central (cerebellar pathology, CB) balance dysfunction by means of a sensory (auditory) perturbation test. METHODS: Balance impaired and healthy subjects performed a repeated bench stepping task. The perturbation was applied by suddenly changing the cadence of the metronome (100 beats/min to 80 beats/min) at a predetermined time (but unpredictable by the subject) during the trial. Perturbation response was quantified by computing the Euclidean distance, expressed as a fractional error, between the anterior-posterior center of gravity attractor trajectory before and after the perturbation was applied. The error immediately after the perturbation (Emax), error after recovery (Emin) and the recovery response (Edif) were documented for each participant, and groups were compared with ANOVA. RESULTS: Both balance impaired groups exhibited significantly higher Emax (p = .019) and Emin (p = .028) fractional errors compared to the healthy (HE) subjects, but there were no significant differences between CB and VH groups. Although response recovery was slower for CB and VH groups compared to the HE group, the difference was not significant (p = .051). CONCLUSION: The findings suggest that individuals with balance impairment have reduced ability to stabilize locomotor patterns following perturbation, revealing the fragility of their impairment adaptations and compensations. These data suggest that auditory perturbations applied during a challenging stepping task may be useful for measuring rehabilitation outcomes.
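    One way to read the fractional-error measure in this abstract is sketched below. The abstract does not give the exact formulas, so two assumptions are made here: the Euclidean distance is normalized by the norm of the pre-perturbation trajectory, and Edif is taken as Emax minus Emin. The function name and the sample trajectories are hypothetical and are not data from the study.

```python
import math

def fractional_error(reference, perturbed):
    """Euclidean distance between two equal-length trajectories,
    divided by the Euclidean norm of the reference (assumed normalization)."""
    if len(reference) != len(perturbed):
        raise ValueError("trajectories must have the same number of samples")
    distance = math.sqrt(sum((p - r) ** 2 for r, p in zip(reference, perturbed)))
    norm = math.sqrt(sum(r ** 2 for r in reference))
    return distance / norm

# Made-up two-sample trajectories, chosen so the arithmetic is easy to follow.
before = [3.0, 4.0]        # pre-perturbation trajectory (norm 5)
just_after = [3.0, 7.0]    # immediately after the cadence change (distance 3)
recovered = [3.0, 5.0]     # after recovery (distance 1)

e_max = fractional_error(before, just_after)  # 3 / 5
e_min = fractional_error(before, recovered)   # 1 / 5
e_dif = e_max - e_min                         # assumed definition of the recovery response

print(round(e_max, 3), round(e_min, 3), round(e_dif, 3))  # 0.6 0.2 0.4
```

    Under these assumptions, a larger Emin means the stepping pattern settles further from its original trajectory, which matches the direction of the group differences the abstract reports.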

    Multi-epoch machine learning for galaxy formation

    In this thesis I utilise a range of machine learning techniques in conjunction with hydrodynamical cosmological simulations. In Chapter 2 I present a novel machine learning method for predicting the baryonic properties of dark matter only subhalos taken from N-body simulations. The model is built using a tree-based algorithm and incorporates subhalo properties over a wide range of redshifts as its input features. I train the model using a hydrodynamical simulation which enables it to predict black hole mass, gas mass, magnitudes, star formation rate, stellar mass, and metallicity. This new model surpasses the performance of previous models. Furthermore, I explore the predictive power of each input property by looking at feature importance scores from the tree-based model. By applying the method to the LEGACY N-body simulation I generate a large volume mock catalog of the quasar population at z=3. By comparing this mock catalog with observations, I demonstrate that the IllustrisTNG subgrid model for black holes is not accurately capturing the growth of the most massive objects. In Chapter 3 I apply my method to investigate the evolution of galaxy properties in different simulations, and in various environments within a single simulation. By comparing the Illustris, EAGLE, and TNG simulations I show that subgrid model physics plays a more significant role than the choice of hydrodynamics method. Using the CAMELS simulation suite I consider the impact of cosmological and astrophysical parameters on the buildup of stellar mass within the TNG and SIMBA models. In the final chapter I apply a combination of neural networks and symbolic regression methods to construct a semi-analytic model which reproduces the galaxy population from a cosmological simulation. The neural network based approach is capable of producing a more accurate population than a previous method of binning based on halo mass. The equations resulting from symbolic regression are found to be a good approximation of the neural network.

    On stable homotopy equivalences
