544 research outputs found

    Unsupervised Machine Learning Algorithms to Characterize Single-Cell Heterogeneity and Perturbation Response

    Recent advances in microfluidic technologies facilitate the measurement of gene expression, DNA accessibility, protein content, or genomic mutations at unprecedented scale. The challenges imposed by the scale of these datasets are further exacerbated by non-linearity in molecular effects, complex interdependencies between features, and a lack of understanding of both data generating processes and sources of technical and biological noise. As a result, analysis of modern single-cell data requires the development of specialized computational tools. One solution to these problems is the use of manifold learning, a sub-field of unsupervised machine learning that seeks to model data geometry using a simplifying assumption that the underlying system is continuous and locally Euclidean. In this dissertation, I show how manifold learning is naturally suited for single-cell analysis and introduce three related algorithms for characterization of single-cell heterogeneity and perturbation response. I first describe Vertex Frequency Clustering, an algorithm that identifies groups of cells with similar responses to an experimental perturbation by analyzing the spectral representation of condition labels expressed as signals over a cell similarity graph. Next, I introduce MELD, an algorithm that expands on these ideas to estimate the density of each experimental sample over the graph to quantify the effect of an experimental perturbation at single-cell resolution. Finally, I describe a neural network for archetypal analysis that represents the data as continuously distributed between a set of extrema. Each of these algorithms is demonstrated on a combination of real and synthetic datasets and benchmarked against state-of-the-art algorithms.

    Ion Temperatures in Earth's Inner Magnetosphere: Ring Current Dynamics, Transient Effects, and Data-Model Comparisons

    Earth's magnetosphere is an inherently complex, strongly nonlinear system with intrinsic coupling between internal and external drivers. In general, magnetospheric systems can be understood as a balance between multiple regions which all exhibit unique plasma properties. The feedback processes between each region depend on geomagnetic activity levels and the preceding states of the solar wind and the respective magnetospheric regions. Of particular interest is understanding how ion temperatures evolve during geomagnetically active periods, and also understanding the space weather impacts of hot ion populations injected during such periods. Dynamic, spatiotemporally resolved ion temperature boundary conditions have been implemented into the Comprehensive Ring Current Model (CRCM); the temperatures are based on 2-D equatorial maps derived from remotely imaged energetic neutral atom (ENA) measurements. The high-speed-stream-driven event on 22 July 2009 and the coronal mass ejection-driven event on 30-31 October 2013 are simulated and compared against identical simulations using a statistically derived boundary condition model. This new method for establishing boundary conditions allows users to include event-specific observations associated with a dynamic plasma sheet. It is found that spatial and energy distributions in the storm-time ring current exhibit sensitive dependence on boundary conditions during these events. The coupling of boundary conditions to the time history of the convection electric field strength is found to play an important role in throttling the influence of the boundary plasma on the inner magnetosphere. Storm-time dusk-dawn asymmetries consistent with observational data are reproduced well when CRCM is provided with the event-specific boundary condition model.
    The dependence of average, global magnetospheric ion temperatures derived from ENA maps is also investigated as a function of various combinations of solar wind parameters, IMF parameters, and geomagnetic indices. Covering a 31-month interval of time near solar maximum, the parametric study reveals average storm-time features consistent with various in situ observations, ionospheric observations, and ground-based measurements.

    Unlocking the potential of deep learning for marine ecology: overview, applications, and outlook

    The deep learning (DL) revolution is touching all scientific disciplines and corners of our lives as a means of harnessing the power of big data. Marine ecology is no exception. New methods provide analysis of data from sensors, cameras, and acoustic recorders, even in real time, in ways that are reproducible and rapid. Off-the-shelf algorithms find, count, and classify species from digital images or video and detect cryptic patterns in noisy data. These endeavours require collaboration across ecological and data science disciplines, which can be challenging to initiate. To promote the use of DL towards ecosystem-based management of the sea, this paper aims to bridge the gap between marine ecologists and computer scientists. We provide insight into popular DL approaches for ecological data analysis, focusing on supervised learning techniques with deep neural networks, and illustrate challenges and opportunities through established and emerging applications of DL to marine ecology. We present case studies on plankton, fish, marine mammals, pollution, and nutrient cycling that involve object detection, classification, tracking, and segmentation of visualized data. We conclude with a broad outlook of the field’s opportunities and challenges, including potential technological advances and issues with managing complex data sets.