2,579 research outputs found

    Bayesian spline-based hidden Markov models with applications to actimetry data and sleep analysis

    B-spline-based hidden Markov models employ B-splines to specify the emission distributions, offering a more flexible modeling approach to data than conventional parametric HMMs. We introduce a Bayesian framework for inference, enabling the simultaneous estimation of all unknown model parameters including the number of states. A parsimonious knot configuration of the B-splines is identified by the use of a trans-dimensional Markov chain sampling algorithm, while model selection regarding the number of states can be performed based on the marginal likelihood within a parallel sampling framework. Using extensive simulation studies, we demonstrate the superiority of our methodology over alternative approaches as well as its robustness and scalability. We illustrate the explorative use of our methods on animal activity data, namely from whitetip sharks. The flexibility of our Bayesian approach also facilitates the incorporation of more realistic assumptions, and we demonstrate this by developing a novel hierarchical conditional HMM to analyse human activity for circadian and sleep modeling. Supplementary materials for this article are available online.

    Meta-learning algorithms and applications

    Meta-learning in the broader context concerns how an agent learns about their own learning, allowing them to improve their learning process. Learning how to learn is not only beneficial for humans, but it has also shown vast benefits for improving how machines learn. In the context of machine learning, meta-learning enables models to improve their learning process by selecting suitable meta-parameters that influence the learning. For deep learning specifically, the meta-parameters typically describe details of the training of the model but can also include description of the model itself - the architecture. Meta-learning is usually done with specific goals in mind, for example trying to improve ability to generalize or learn new concepts from only a few examples. Meta-learning can be powerful, but it comes with a key downside: it is often computationally costly. If the costs would be alleviated, meta-learning could be more accessible to developers of new artificial intelligence models, allowing them to achieve greater goals or save resources. As a result, one key focus of our research is on significantly improving the efficiency of meta-learning. We develop two approaches: EvoGrad and PASHA, both of which significantly improve meta-learning efficiency in two common scenarios. EvoGrad allows us to efficiently optimize the value of a large number of differentiable meta-parameters, while PASHA enables us to efficiently optimize any type of meta-parameters but fewer in number. Meta-learning is a tool that can be applied to solve various problems. Most commonly it is applied for learning new concepts from only a small number of examples (few-shot learning), but other applications exist too. To showcase the practical impact that meta-learning can make in the context of neural networks, we use meta-learning as a novel solution for two selected problems: more accurate uncertainty quantification (calibration) and general-purpose few-shot learning. 
Both are practically important problems, and using meta-learning approaches we can obtain better solutions than those obtained using existing approaches. Calibration is important for safety-critical applications of neural networks, while general-purpose few-shot learning tests a model's ability to generalize few-shot learning abilities across diverse tasks such as recognition, segmentation and keypoint estimation. More efficient algorithms as well as novel applications enable the field of meta-learning to make a more significant impact on the broader area of deep learning and potentially solve problems that were too challenging before. Ultimately, both of them allow us to better utilize the opportunities that artificial intelligence presents.

    On the robustness of Bayesian phylogenetic gene tree estimation


    Automated identification and behaviour classification for modelling social dynamics in group-housed mice

    Mice are often used in biology as exploratory models of human conditions, due to their similar genetics and physiology. Unfortunately, research on behaviour has traditionally been limited to studying individuals in isolated environments and over short periods of time. This can miss critical time-effects, and, since mice are social creatures, bias results. This work addresses this gap in research by developing tools to analyse the individual behaviour of group-housed mice in the home-cage over several days and with minimal disruption. Using data provided by the Mary Lyon Centre at MRC Harwell we designed an end-to-end system that (a) tracks and identifies mice in a cage, (b) infers their behaviour, and subsequently (c) models the group dynamics as functions of individual activities. In support of the above, we also curated and made available a large dataset of mouse localisation and behaviour classifications (IMADGE), as well as two smaller annotated datasets for training/evaluating the identification (TIDe) and behaviour inference (ABODe) systems. This research constitutes the first of its kind in terms of the scale and challenges addressed. The data source (side-view single-channel video with clutter and no identification markers for mice) presents challenging conditions for analysis, but has the potential to give richer information while using industry standard housing. A Tracking and Identification module was developed to automatically detect, track and identify the (visually similar) mice in the cluttered home-cage using only single-channel IR video and coarse position from RFID readings. Existing detectors and trackers were combined with a novel Integer Linear Programming formulation to assign anonymous tracks to mouse identities. This utilised a probabilistic weight model of affinity between detections and RFID pickups. 
The next task necessitated the implementation of the Activity Labelling module that classifies the behaviour of each mouse, handling occlusion to avoid giving unreliable classifications when the mice cannot be observed. Two key aspects of this were (a) careful feature-selection, and (b) judicious balancing of the errors of the system in line with the repercussions for our setup. Given these sequences of individual behaviours, we analysed the interaction dynamics between mice in the same cage by collapsing the group behaviour into a sequence of interpretable latent regimes using both static and temporal (Markov) models. Using a permutation matrix, we were able to automatically assign mice to roles in the HMM, fit a global model to a group of cages and analyse abnormalities in data from a different demographic

    2023-2024 Catalog

    The 2023-2024 Governors State University Undergraduate and Graduate Catalog is a comprehensive listing of current information regarding degree requirements, course offerings, and undergraduate and graduate rules and regulations.

    Talking about personal recovery in bipolar disorder: Integrating health research, natural language processing, and corpus linguistics to analyse peer online support forum posts

    Background: Personal recovery, ‘living a satisfying, hopeful and contributing life even with the limitations caused by the illness’ (Anthony, 1993), is of particular value in bipolar disorder, where symptoms often persist despite treatment. So far, personal recovery has only been studied in researcher-constructed environments (interviews, focus groups). Support forum posts can serve as a complementary naturalistic data source. Objective: The overarching aim of this thesis was to study personal recovery experiences that people living with bipolar disorder have shared in online support forums through integrating health research, NLP, and corpus linguistics in a mixed methods approach within a pragmatic research paradigm, while considering ethical issues and involving people with lived experience. Methods: This mixed-methods study analysed: 1) previous qualitative evidence on personal recovery in bipolar disorder from interviews and focus groups; 2) who self-reports a bipolar disorder diagnosis on the online discussion platform Reddit; 3) the relationship of mood and posting in mental health-specific Reddit forums (subreddits); 4) discussions of personal recovery in bipolar disorder subreddits. Results: A systematic review of qualitative evidence resulted in the first framework for personal recovery in bipolar disorder, POETIC (Purpose & meaning, Optimism & hope, Empowerment, Tensions, Identity, Connectedness). Mainly young or middle-aged US-based adults self-report a bipolar disorder diagnosis on Reddit. Of these, those experiencing more intense emotions appear to be more likely to post in mental health support subreddits. Their personal recovery-related discussions in bipolar disorder subreddits primarily focussed on three domains: Purpose & meaning (particularly reproductive decisions, work), Connectedness (romantic relationships, social support), Empowerment (self-management, personal responsibility). 
Support forum data highlighted personal recovery issues that exclusively or more frequently came up online compared to previous evidence from interviews and focus groups. Conclusion: This project is the first to analyse non-reactive data on personal recovery in bipolar disorder. Indicating the key areas that people focus on in personal recovery when posting freely and the language they use provides a helpful starting point for formal and informal carers to understand the concerns of people diagnosed with bipolar disorder and to consider how best to offer support

    Asymptotics of stochastic learning in structured networks


    Deep learning for computer vision constrained by limited supervision

    This thesis presents the research work conducted on developing algorithms capable of training neural networks for image classification and regression in low supervision settings. The research was conducted on publicly available benchmark image datasets as well as real world data with applications to herbage quality estimation in an agri-tech scope at the VistaMilk SFI centre. Topics include label noise and web-crawled datasets where some images have an incorrect classification label, semi-supervised learning where only a small part of the available images have been annotated by humans, and unsupervised learning where the images are not annotated. The principal contributions are summarized as follows. Label noise: a study highlighting the dual in- and out-of-distribution nature of web-noise; a noise detection metric that can independently retrieve each noise type; an observation of the linear separability of in- and out-of-distribution images in unsupervised contrastive feature spaces; two noise-robust algorithms, DSOS and SNCF, that iteratively improve the state-of-the-art accuracy on the mini-Webvision dataset. Semi-supervised learning: we use unsupervised features to propagate labels from a few labeled examples to the entire dataset; ReLaB, an algorithm that can decrease the classification error up to 8% with one labeled representative image on CIFAR-10. Biomass composition estimation from images: two semi-supervised approaches that utilize unlabeled images either through an approximate annotator or by adapting semi-supervised algorithms from the image classification literature. To scale the biomass to drone images, we use super-resolution paired with semi-supervised learning. Early results on grass biomass estimation show the feasibility of automating the process with accuracies on par with or better than human experts. 
The conclusion of the thesis will summarize the research contributions and discuss thoughts on future research that I believe should be tackled in the field of low supervision computer vision

    Model-based deep autoencoders for clustering single-cell RNA sequencing data with side information

    Clustering analysis has been conducted extensively in single-cell RNA sequencing (scRNA-seq) studies. scRNA-seq can profile tens of thousands of genes' activities within a single cell. Thousands or tens of thousands of cells can be captured simultaneously in a typical scRNA-seq experiment. Biologists would like to cluster these cells for exploring and elucidating cell types or subtypes. Numerous methods have been designed for clustering scRNA-seq data. Yet single-cell technologies have developed so fast in the past few years that existing methods have not kept up with these rapid changes and fail to fully fulfil their potential. For instance, besides profiling transcription expression levels of genes, recent single-cell technologies can capture other auxiliary information at the single-cell level, such as protein expression (multi-omics scRNA-seq) and cells' spatial location information (spatial-resolved scRNA-seq). Most existing clustering methods for scRNA-seq are performed in an unsupervised manner and fail to exploit available side information for optimizing clustering performance. This dissertation focuses on developing novel computational methods for clustering scRNA-seq data. The basic models are built on a deep autoencoder (AE) framework, which is coupled with a ZINB (zero-inflated negative binomial) loss to characterize the zero-inflated and over-dispersed scRNA-seq count data. To integrate multi-omics scRNA-seq data, a multimodal autoencoder (MAE) is employed. It applies one encoder for the multimodal inputs and two decoders for reconstructing each omics data type. This model is named scMDC (Single-Cell Multi-omics Deep Clustering). In addition, cells in spatial proximity are expected to be of the same cell types. To exploit cellular spatial information available for spatial-resolved scRNA-seq (sp-scRNA-seq) data, a novel model, DSSC (Deep Spatial-constrained Single-cell Clustering), is developed. 
DSSC integrates the spatial information of cells into the clustering process in two steps: 1) the spatial information is encoded by using a graphical neural network model; 2) cell-to-cell constraints are built based on the spatial expression pattern of the marker genes and added to the model to guide the clustering process. DSSC is the first model which can utilize the information from both the spatial coordinates and the marker genes to guide the cell/spot clustering. For both scMDC and DSSC, a clustering loss is optimized on the bottleneck layer of the autoencoder along with the learning of feature representation. Extensive experiments on both simulated and real datasets demonstrate that scMDC and DSSC boost clustering performance significantly while costing no extra time and space during the training process. These models hold great promise as valuable tools for harnessing the full potential of state-of-the-art single-cell data.
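
As a rough illustration of the ZINB loss mentioned in the abstract above, the following minimal Python sketch evaluates a zero-inflated negative binomial negative log-likelihood for a single count. The parameterization (mean `mu`, dispersion `theta`, zero-inflation probability `pi`) and the function names are assumptions chosen for illustration; they are not taken from scMDC or DSSC.

```python
import math


def nb_log_pmf(x, mu, theta):
    """Log-probability of count x under a negative binomial with
    mean mu > 0 and dispersion theta > 0 (assumed parameterization)."""
    return (
        math.lgamma(x + theta) - math.lgamma(theta) - math.lgamma(x + 1)
        + theta * math.log(theta / (theta + mu))
        + x * math.log(mu / (theta + mu))
    )


def zinb_nll(x, mu, theta, pi):
    """Negative log-likelihood of count x under a zero-inflated NB.

    With probability pi (0 <= pi < 1) the count is a structural zero;
    otherwise it is drawn from the negative binomial above.
    """
    log_nb = nb_log_pmf(x, mu, theta)
    if x == 0:
        # A zero can come from either the inflation component or the NB.
        return -math.log(pi + (1.0 - pi) * math.exp(log_nb))
    # A positive count can only come from the NB component.
    return -(math.log(1.0 - pi) + log_nb)
```

In an autoencoder setting, a loss of this kind would typically be summed over all genes and cells, with `mu`, `theta` and `pi` produced by the decoder; that wiring is omitted here.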

    Singularity Formation in the High-Dimensional Euler Equations and Sampling of High-Dimensional Distributions by Deep Generative Networks

    High dimensionality brings both opportunities and challenges to the study of applied mathematics. This thesis consists of two parts. The first part explores the singularity formation of the axisymmetric incompressible Euler equations with no swirl in ℝⁿ, which is closely related to the Millennium Prize Problem on the global singularity of the Navier-Stokes equations. In this part, the high dimensionality contributes to the singularity formation in finite time by enhancing the strength of the vortex stretching term. The second part focuses on sampling from a high-dimensional distribution using deep generative networks, which has wide applications in the Bayesian inverse problem and the image synthesis task. The high dimensionality in this part becomes a significant challenge to the numerical algorithms, known as the curse of dimensionality. In the first part of this thesis, we consider the singularity formation in two scenarios. In the first scenario, for the axisymmetric Euler equations with no swirl, we consider the case when the initial condition for the angular vorticity is C^α Hölder continuous. We provide convincing numerical examples where the solutions develop potential self-similar blow-up in finite time when the Hölder exponent α < α*, and this upper bound α* can asymptotically approach 1 - 2/n. This result supports a conjecture from Drivas and Elgindi [37], and generalizes it to the high-dimensional case. This potential blow-up is insensitive to the perturbation of initial data. Based on assumptions summarized from numerical experiments, we study a limiting case of the Euler equations, and obtain α* = 1 - 2/n which agrees with the numerical result. For the general case, we propose a relatively simple one-dimensional model and numerically verify its approximation to the Euler equations. This one-dimensional model might suggest a possible way to show this finite-time blow-up scenario analytically. 
Compared to the first proved blow-up result of the 3D axisymmetric Euler equations with no swirl and Hölder continuous initial data by Elgindi in [40], our potential blow-up scenario has completely different scaling behavior and regularity of the initial condition. In the second scenario, we consider using smooth initial data, but modify the Euler equations by adding a factor Δ as the coefficient of the convection terms to weaken the convection effect. The new model is called the weak convection model. We provide convincing numerical examples of the weak convection model where the solutions develop potential self-similar blow-up in finite time when the convection strength Δ < Δ*, and this upper bound Δ* should be close to 1 - 2/n. This result is closely related to the infinite-dimensional case of an open question [37] stated by Drivas and Elgindi. Our numerical observations also inspire us to approximate the weak convection model with a one-dimensional model. We give a rigorous proof that the one-dimensional model will develop finite-time blow-up if Δ < 1 - 2/n, and study the approximation quality of the one-dimensional model to the weak convection model numerically, which could be beneficial to a rigorous proof of the potential finite-time blow-up. In the second part of the thesis, we propose the Multiscale Invertible Generative Network (MsIGN) to sample from high-dimensional distributions by exploring the low-dimensional structure in the target distribution. The MsIGN models a transport map from a known reference distribution to the target distribution, and thus is very efficient in generating uncorrelated samples compared to MCMC-type methods. The MsIGN captures multiple modes in the target distribution by generating new samples hierarchically from a coarse scale to a fine scale with the help of a novel prior conditioning layer. The hierarchical structure of the MsIGN also allows training in a coarse-to-fine scale manner. 
The Jeffreys divergence is used as the objective function in training to avoid mode collapse. Importance sampling based on the prior conditioning layer is leveraged to estimate the Jeffreys divergence, which is intractable in previous deep generative networks. Numerically, when applied to two Bayesian inverse problems, the MsIGN clearly captures multiple modes in the high-dimensional posterior and approximates the posterior accurately, demonstrating its superior performance compared with previous methods. We also provide an ablation study to show the necessity of our proposed network architecture and training algorithm for the good numerical performance. Moreover, we also apply the MsIGN to the image synthesis task, where it achieves superior performance in terms of bits-per-dimension value over other flow-based generative models and yields very good interpretability of its neurons in intermediate layers.
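
For readers unfamiliar with the Jeffreys divergence named in the abstract above, the short Python sketch below computes it for two discrete distributions. It assumes the convention in which the Jeffreys divergence is the sum of the two directed Kullback-Leibler divergences (some texts use half that sum), and it does not reproduce the importance-sampling estimator described in the thesis; the function names are illustrative.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) for discrete distributions given as equal-length lists.

    Assumes q[i] > 0 wherever p[i] > 0; terms with p[i] == 0 contribute 0.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def jeffreys_divergence(p, q):
    """Symmetrized KL divergence: KL(p || q) + KL(q || p)."""
    return kl_divergence(p, q) + kl_divergence(q, p)
```

Because it combines the forward KL term (which penalizes missing modes of the target) with the reverse KL term (which penalizes mass placed where the target has little), a symmetric objective of this kind is a natural choice when mode collapse is a concern.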