Unsupervised Detection of Cell-Assembly Sequences by Similarity-Based Clustering
Neurons that fire in a fixed temporal pattern (i.e., "cell assemblies") are hypothesized to be a fundamental unit of neural information processing. Several methods are available for the detection of cell assemblies without a time structure. However, the systematic detection of cell assemblies with time structure has been challenging, especially in large datasets, due to the lack of efficient methods for handling the time structure. Here, we show a method to detect a variety of cell-assembly activity patterns, recurring in noisy neural population activities at multiple timescales. The key innovation is the use of a computer science method for comparing strings ("edit similarity") to group spikes into assemblies. We validated the method using artificial data and experimental data, which were previously recorded from the hippocampus of male Long-Evans rats and the prefrontal cortex of male Brown Norway/Fisher hybrid rats. From the hippocampus, we could simultaneously extract place-cell sequences occurring on different timescales during navigation and awake replay. From the prefrontal cortex, we could discover multiple spike sequences of neurons encoding different segments of a goal-directed task. Unlike conventional event-driven statistical approaches, our method detects cell assemblies without creating event-locked averages. Thus, the method offers a novel analytical tool for deciphering the neural code during arbitrary behavioral and mental processes.
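As a rough illustration of string comparison by edit similarity, the sketch below computes the standard Levenshtein distance between two sequences of neuron labels and normalizes it to a score between 0 and 1. The abstract does not give the paper's exact definition of edit similarity, so the normalization, the function names, and the toy sequences are all assumptions made for this example.

```python
def edit_distance(a, b):
    """Levenshtein distance: insertions, deletions and substitutions each cost 1."""
    prev = list(range(len(b) + 1))          # distances from the empty prefix of a
    for i, x in enumerate(a, start=1):
        curr = [i]                          # distance from a[:i] to the empty prefix of b
        for j, y in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,               # delete x
                            curr[j - 1] + 1,           # insert y
                            prev[j - 1] + (x != y)))   # substitute (free if equal)
        prev = curr
    return prev[-1]


def edit_similarity(a, b):
    """Assumed normalization: 1 for identical sequences, 0 for maximally different ones."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - edit_distance(a, b) / longest


# Hypothetical firing orders; each letter stands for one neuron.
seq_a = "ABCDE"
seq_b = "ABDE"    # same order with one neuron missing
seq_c = "EDCBA"   # reversed order

print(edit_similarity(seq_a, seq_b))
print(edit_similarity(seq_a, seq_c))
```

A similarity of this kind tolerates missing or extra spikes, which is one reason a string-comparison measure suits noisy population activity.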
The Parallelism Motifs of Genomic Data Analysis
Genomic data sets are growing dramatically as the cost of sequencing
continues to decline and small sequencing devices become available. Enormous
community databases store and share this data with the research community, but
some of these genomic data analysis problems require large scale computational
platforms to meet both the memory and computational requirements. These
applications differ from scientific simulations that dominate the workload on
high end parallel systems today and place different requirements on programming
support, software libraries, and parallel architectural design. For example,
they involve irregular communication patterns such as asynchronous updates to
shared data structures. We consider several problems in high performance
genomics analysis, including alignment, profiling, clustering, and assembly for
both single genomes and metagenomes. We identify some of the common
computational patterns or motifs that help inform parallelization strategies
and compare our motifs to some of the established lists, arguing that at least
two key patterns, sorting and hashing, are missing.
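The two motifs named in the abstract, hashing and sorting, can be sketched on a common genomics task, k-mer counting. This serial toy example only shows where each motif appears; the reads, the value of k, and the function names are hypothetical, and it makes no attempt at the distributed, asynchronous updates the abstract describes.

```python
from collections import Counter


def count_kmers(reads, k):
    """Hashing motif: tally every length-k substring in a hash table."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def top_kmers(counts, n):
    """Sorting motif: order k-mers by descending count, ties broken alphabetically."""
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:n]


# Hypothetical reads for illustration only.
reads = ["ACGTAC", "CGTACG"]
kmer_counts = count_kmers(reads, 3)
print(top_kmers(kmer_counts, 2))
```

In a parallel setting the hash table would typically be partitioned across processes, which is where the irregular communication mentioned in the abstract comes from.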
Genome-wide diversity and gene expression profiling of Babesia microti isolates identify polymorphic genes that mediate host-pathogen interactions
Babesia microti, a tick-transmitted, intraerythrocytic protozoan parasite circulating mainly among small mammals, is the primary cause of human babesiosis. While most cases are transmitted by Ixodes ticks, the disease may also be transmitted through blood transfusion and perinatally. A comprehensive analysis of genome composition, genetic diversity, and gene expression profiling of seven B. microti isolates revealed that genetic variation in isolates from the Northeast United States is almost exclusively associated with genes encoding the surface proteome and secretome of the parasite. Furthermore, we found that polymorphism is restricted to a small number of genes, which are highly expressed during infection. In order to identify pathogen-encoded factors involved in host-parasite interactions, we screened a proteome array comprised of 174 B. microti proteins, including several predicted members of the parasite secretome. Using this immuno-proteomic approach we identified several novel antigens that trigger strong host immune responses during the onset of infection. The genomic and immunological data presented herein provide the first insights into the determinants of B. microti interaction with its mammalian hosts and their relevance for understanding the selective pressures acting on parasite evolution
Methods for protein complex prediction and their contributions towards understanding the organization, function and dynamics of complexes
Complexes of physically interacting proteins constitute fundamental
functional units responsible for driving biological processes within cells. A
faithful reconstruction of the entire set of complexes is therefore essential
to understand the functional organization of cells. In this review, we discuss
the key contributions of computational methods developed to date
(approximately between 2003 and 2015) for identifying complexes from the
network of interacting proteins (PPI network). We evaluate in depth the
performance of these methods on PPI datasets from yeast, and highlight
challenges faced by these methods, in particular detection of sparse and small
or sub-complexes and discerning of overlapping complexes. We describe methods
for integrating diverse information including expression profiles and 3D
structures of proteins with PPI networks to understand the dynamics of complex
formation, for instance, of time-based assembly of complex subunits and
formation of fuzzy complexes from intrinsically disordered proteins. Finally,
we discuss methods for identifying dysfunctional complexes in human diseases,
an application that is proving invaluable to understand disease mechanisms and
to discover novel therapeutic targets. We hope this review aptly commemorates a
decade of research on computational prediction of complexes and constitutes a
valuable reference for further advancements in this exciting area. Comment: 1 Tabl
Unsupervised discovery of temporal sequences in high-dimensional datasets, with applications to neuroscience.
Identifying low-dimensional features that describe large-scale neural recordings is a major challenge in neuroscience. Repeated temporal patterns (sequences) are thought to be a salient feature of neural dynamics, but are not succinctly captured by traditional dimensionality reduction techniques. Here, we describe a software toolbox, called seqNMF, with new methods for extracting informative, non-redundant sequences from high-dimensional neural data, testing the significance of these extracted patterns, and assessing the prevalence of sequential structure in data. We test these methods on simulated data under multiple noise conditions, and on several real neural and behavioral datasets. In hippocampal data, seqNMF identifies neural sequences that match those calculated manually by reference to behavioral events. In songbird data, seqNMF discovers neural sequences in untutored birds that lack stereotyped songs. Thus, by identifying temporal structure directly from neural data, seqNMF enables dissection of complex neural circuits without relying on temporal references from stimuli or behavioral outputs.
The Computational Diet: A Review of Computational Methods Across Diet, Microbiome, and Health.
Food and human health are inextricably linked. As such, revolutionary impacts on health have been derived from advances in the production and distribution of food relating to food safety and fortification with micronutrients. During the past two decades, it has become apparent that the human microbiome has the potential to modulate health, including in ways that may be related to diet and the composition of specific foods. Despite the excitement and potential surrounding this area, fully understanding the complexity of the gut microbiome, the chemical composition of food, and their interplay in situ remains a daunting task. However, recent advances in high-throughput sequencing, metabolomics profiling, compositional analysis of food, and the emergence of electronic health records provide new sources of data that can contribute to addressing this challenge. Computational science will play an essential role in this effort as it will provide the foundation to integrate these data layers and derive insights capable of revealing and understanding the complex interactions between diet, gut microbiome, and health. Here, we review the current knowledge on diet-health-gut microbiota, relevant data sources, bioinformatics tools, machine learning capabilities, as well as the intellectual property and legislative regulatory landscape. We provide guidance on employing machine learning and data analytics, identify gaps in current methods, and describe new scenarios to be unlocked in the next few years in the context of current knowledge.
Transcriptional landscape of neuronal and cancer stem cells
A tumor mass is composed of a heterogeneous cell population that includes a subset of “cancer stem cells” (CSC).
Oncogenic signals foster CSC by transforming tissue stem cells or by reprogramming progenitor/differentiated
cells towards stemness. Thus, CSC share features with cancer and stem cells (e.g. self-renewal, hierarchical
developmental program leading to differentiated cells, epithelial/mesenchymal transition) and these latter are
maintained by the constitutive activation of stemness-promoting signals. CSC could trigger tumor formation,
drive resistance to conventional therapeutics and underlie patients’ relapse. Indeed, stem cell signatures
have been associated with poor prognosis in various.
This background makes the identification of CSC molecular features mandatory to highlight the survival inner
working and to design novel CSC specific therapeutic strategies.
Medulloblastoma (MB) is the most common childhood malignant brain tumor and a leading cause of cancer-related
morbidity and mortality. Current multimodal therapies are effective in about 50% of patients but often
cause long-term side effects, i.e. developmental, neurological, neuroendocrine and psychosocial deficits
(Northcott PA Nature Rev cancer 2012). For many years, MB was treated as a single tumor entity despite its
divergent tumor histology, patient outcome and drug sensitivity, and the diversity of the stem cell of
origin. Very recently the scenario of human MB has dramatically changed since its heterogeneous biology has
been addressed by high-throughput gene expression analysis (oligonucleotide microarrays) or by the powerful
genomic next-generation sequencing. These led to the identification of four tumor subgroups (WNT, SHH,
Group 3 and Group 4) uncovering the existence of highly diverse mutational spectra and gene expression.
However, a quantitative approach has not yet been applied to the transcriptional landscape of Medulloblastoma
stem cells (MbSC) through RNA Next Generation Sequencing (RNA-Seq) technology. This is a relevant issue,
since RNA-Seq is able to interrogate the genome wide global transcriptome including new transcripts,
alternative spliced isoforms and non-coding RNAs.
Lower rhombic lip progenitors of the dorsal brainstem are considered the trigger cells in WNT tumors; in SHH
subgroup initiation cells are Prominin1+ CD15+ stem cells from the subventricular zone requiring the
commitment to Math1+ granule cell progenitors [GCP] of the external granule cell layer [EGL]; while Math1+ or
Math1- EGL-GCP or Prominin1+/lineage-negative stem cells sustain the MYC driven Group 3.
MbSC derived from SHH tumors and postnatal normal cerebellar stem cells (NcSC) have been reported to
share several features. A key signal for both of them is Hedgehog. Furthermore, both NcSC and MbSC display
up-regulation of stemness genes (e.g. Sox2, Nestin, Nanog, Prom1). Finally, constitutive activation of the Shh
pathway by conditional deletion of the Ptch1 inhibitory receptor in NcSC promotes medulloblastoma in vivo,
producing a mouse model of the human SHH tumor. Acquisition of stemness features may therefore represent
the first step of oncogenic conversion. Cooperation with additional oncogenic signals is however needed to
enhance MbSC tumorigenicity.
In order to understand the MbSC transcriptional programs, we analyze, by RNA-Seq, MbSC derived from
Ptch1+/- tumors (Ptch1+/- MbSC). The choice of a genetically determined model of MB has allowed us to
work with Ptch1+/- MbSC together with the appropriate NcSC counterpart, and to analyze biological replicates
for statistical analysis.
We identify a number of transcripts, including annotated ones, novel isoforms, and long non-coding RNAs,
characterizing MbSC and/or NcSC. Some of these genes control stemness or are cancer related and
conserved in human medulloblastomas. Interestingly, a subset of them, belonging to the cell stress response, are
of prognostic relevance, being significantly related to clinical outcome. Correlation of the gene expression
characterizing MbSC with survival information from our human medulloblastomas database further
demonstrates the significance of these findings. Our data suggest that the modulation of normal and cancer
stem cell functions observed in vitro is effective in dissecting the transcriptional programs underlying the in
vivo behavior of human medulloblastomas
Recovering complete and draft population genomes from metagenome datasets.
Assembly of metagenomic sequence data into microbial genomes is of fundamental value to improving our understanding of microbial ecology and metabolism by elucidating the functional potential of hard-to-culture microorganisms. Here, we provide a synthesis of available methods to bin metagenomic contigs into species-level groups and highlight how genetic diversity, sequencing depth, and coverage influence binning success. Despite the computational cost on application to deeply sequenced complex metagenomes (e.g., soil), covarying patterns of contig coverage across multiple datasets significantly improve the binning process. We also discuss and compare current genome validation methods and reveal how these methods tackle the problem of chimeric genome bins, i.e., bins containing sequences from multiple species. Finally, we explore how population genome assembly can be used to uncover biogeographic trends and to characterize the effect of in situ functional constraints on the genome-wide evolution.
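The idea that contigs from the same genome show covarying coverage across datasets can be sketched as follows. This is a toy greedy grouping by Pearson correlation, not any published binning tool: the contig names, coverage values, the 0.95 threshold, and the choice to compare against the first member of each bin are all assumptions made for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length coverage profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return 0.0 if sd_x == 0 or sd_y == 0 else cov / (sd_x * sd_y)


def bin_by_coverage(profiles, threshold=0.95):
    """Greedily place each contig in the first bin whose representative it covaries with."""
    bins = []
    for name, coverage in profiles.items():
        for members in bins:
            representative = profiles[members[0]]   # first contig placed in the bin
            if pearson(coverage, representative) >= threshold:
                members.append(name)
                break
        else:
            bins.append([name])                     # no match: start a new bin
    return bins


# Hypothetical mean coverage of four contigs in four samples.
profiles = {
    "contig_1": [10, 50, 5, 30],
    "contig_2": [12, 55, 6, 33],
    "contig_3": [40, 5, 60, 8],
    "contig_4": [38, 6, 58, 9],
}
bins = bin_by_coverage(profiles)
print(bins)
```

Coverage covariation is usually only one of several signals used for binning, and more samples generally make the coverage profiles more discriminating.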