39 research outputs found

    Manifolds.jl: An Extensible Julia Framework for Data Analysis on Manifolds

    Get PDF
    For data given on a nonlinear space, like angles, symmetric positive matrices, the sphere, or the hyperbolic space, there is often enough structure to form a Riemannian manifold. We present the Julia package Manifolds.jl, providing a fast and easy to use library of Riemannian manifolds and Lie groups. We introduce a common interface, available in ManifoldsBase.jl, with which new manifolds, applications, and algorithms can be implemented. We demonstrate the utility of Manifolds.jl using B\'ezier splines, an optimization task on manifolds, and a principal component analysis on nonlinear data. In a benchmark, Manifolds.jl outperforms existing packages in Matlab or Python by several orders of magnitude and is about twice as fast as a comparable package implemented in C++

    Comparative analysis of carboxysome shell proteins

    Get PDF
    Carboxysomes are metabolic modules for CO2 fixation that are found in all cyanobacteria and some chemoautotrophic bacteria. They comprise a semi-permeable proteinaceous shell that encapsulates ribulose-1,5-bisphosphate carboxylase/oxygenase (RuBisCO) and carbonic anhydrase. Structural studies are revealing the integral role of the shell protein paralogs to carboxysome form and function. The shell proteins are composed of two domain classes: those with the bacterial microcompartment (BMC; Pfam00936) domain, which oligomerize to form (pseudo)hexamers, and those with the CcmL/EutN (Pfam03319) domain which form pentamers in carboxysomes. These two shell protein types are proposed to be the basis for the carboxysome’s icosahedral geometry. The shell proteins are also thought to allow the flux of metabolites across the shell through the presence of the small pore formed by their hexameric/pentameric symmetry axes. In this review, we describe bioinformatic and structural analyses that highlight the important primary, tertiary, and quaternary structural features of these conserved shell subunits. In the future, further understanding of these molecular building blocks may provide the basis for enhancing CO2 fixation in other organisms or creating novel biological nanostructures

    Incorporating Genomics and Bioinformatics across the Life Sciences Curriculum

    Get PDF
    Undergraduate life sciences education needs an overhaul, as clearly described in the National Research Council of the National Academies’ publication BIO 2010: Transforming Undergraduate Education for Future Research Biologists. Among BIO 2010’s top recommendations is the need to involve students in working with real data and tools that reflect the nature of life sciences research in the 21st century [1]. Education research studies support the importance of utilizing primary literature, designing and implementing experiments, and analyzing results in the context of a bona fide scientific question [1–12] in cultivating the analytical skills necessary to become a scientist. Incorporating these basic scientific methodologies in undergraduate education leads to increased undergraduate and post-graduate retention in the sciences [13–16]. Toward this end, many undergraduate teaching organizations offer training and suggestions for faculty to update and improve their teaching approaches to help students learn as scientists, through design and discovery (e.g., Council of Undergraduate Research [www.cur.org] and Project Kaleidoscope [ www.pkal.org])

    Manifolds.jl: An Extensible Julia Framework for Data Analysis on Manifolds

    No full text

    A Simple Representation of Three-Dimensional Molecular Structure.

    No full text

    Frequency effects in linear discriminative learning

    No full text
    Word frequency is a strong predictor in most lexical processing tasks. Thus, any model of word recognition needs to account for how word frequency effects arise. The Discriminative Lexicon Model (DLM; Baayen et al., 2018a, 2019) models lexical processing with linear mappings between words' forms and their meanings. So far, the mappings can either be obtained incrementally via error-driven learning, a computationally expensive process able to capture frequency effects, or in an efficient, but frequency-agnostic solution modelling the theoretical endstate of learning (EL) where all words are learned optimally. In this study we show how an efficient, yet frequency-informed mapping between form and meaning can be obtained (Frequency-informed learning; FIL). We find that FIL well approximates an incremental solution while being computationally much cheaper. FIL shows a relatively low type- and high token-accuracy, demonstrating that the model is able to process most word tokens encountered by speakers in daily life correctly. We use FIL to model reaction times in the Dutch Lexicon Project (Keuleers et al., 2010) and find that FIL predicts well the S-shaped relationship between frequency and the mean of reaction times but underestimates the variance of reaction times for low frequency words. FIL is also better able to account for priming effects in an auditory lexical decision task in Mandarin Chinese (Lee, 2007), compared to EL. Finally, we used ordered data from CHILDES (Brown, 1973; Demuth et al., 2006) to compare mappings obtained with FIL and incremental learning. The mappings are highly correlated, but with FIL some nuances based on word ordering effects are lost. Our results show how frequency effects in a learning model can be simulated efficiently, and raise questions about how to best account for low-frequency words in cognitive models.Comment: 32 pages, 12 figures, 3 tables; revised versio

    Bacterial phyla tree with distribution of BMC locus types.

    No full text
    <p>The classified BMC locus types, excluding satellite and satellite-like loci, denoted as colored shapes are adjacent to the phyla in which they appear. For a given phylum, the shape of the triangular wedge represents sequence diversity; the nearest edge represents the shortest branch length from the phylum node to a leaf, while the farthest edge represents the longest branch length from the phylum node to a leaf. Phyla marked with an asterisk (*) are not in NR but contain BMC loci; the data were retrieved from IMG (<a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1003898#s2" target="_blank">Materials and Methods</a>). Phylum tree based on <a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1003898#pcbi.1003898-Rinke1" target="_blank">[52]</a> with expansion by Christian Rinke.</p

    A Taxonomy of Bacterial Microcompartment Loci Constructed by a Novel Scoring Method

    No full text
    <div><p>Bacterial microcompartments (BMCs) are proteinaceous organelles involved in both autotrophic and heterotrophic metabolism. All BMCs share homologous shell proteins but differ in their complement of enzymes; these are typically encoded adjacent to shell protein genes in genetic loci, or operons. To enable the identification and prediction of functional (sub)types of BMCs, we developed LoClass, an algorithm that finds putative BMC loci and inventories, weights, and compares their constituent pfam domains to construct a locus similarity network and predict locus (sub)types. In addition to using LoClass to analyze sequences in the Non-redundant Protein Database, we compared predicted BMC loci found in seven candidate bacterial phyla (six from single-cell genomic studies) to the LoClass taxonomy. Together, these analyses resulted in the identification of 23 different types of BMCs encoded in 30 distinct locus (sub)types found in 23 bacterial phyla. These include the two carboxysome types and a divergent set of metabolosomes, BMCs that share a common catalytic core and process distinct substrates via specific signature enzymes. Furthermore, many Candidate BMCs were found that lack one or more core metabolosome components, including one that is predicted to represent an entirely new paradigm for BMC-associated metabolism, joining the carboxysome and metabolosome. By placing these results in a phylogenetic context, we provide a framework for understanding the horizontal transfer of these loci, a starting point for studies aimed at understanding the evolution of BMCs. This comprehensive taxonomy of BMC loci, based on their constituent protein domains, foregrounds the functional diversity of BMCs and provides a reference for interpreting the role of BMC gene clusters encoded in isolate, single cell, and metagenomic data. Many loci encode ancillary functions such as transporters or genes for cofactor assembly; this expanded vocabulary of BMC-related functions should be useful for design of genetic modules for introducing BMCs in bioengineering applications.</p></div
    corecore