Computational strategies for dissecting the high-dimensional complexity of adaptive immune repertoires
The adaptive immune system recognizes antigens via an immense array of
antigen-binding antibodies and T-cell receptors, the immune repertoire. The
interrogation of immune repertoires is of high relevance for understanding the
adaptive immune response in disease and infection (e.g., autoimmunity, cancer,
HIV). Adaptive immune receptor repertoire sequencing (AIRR-seq) has driven the
quantitative and molecular-level profiling of immune repertoires, thereby
revealing the high-dimensional complexity of the immune receptor sequence
landscape. Several methods for the computational and statistical analysis of
large-scale AIRR-seq data have been developed to resolve immune repertoire
complexity in order to understand the dynamics of adaptive immunity. Here, we
review the current research on (i) diversity, (ii) clustering and network,
(iii) phylogenetic and (iv) machine learning methods applied to dissect,
quantify and compare the architecture, evolution, and specificity of immune
repertoires. We summarize outstanding questions in computational immunology and
propose future directions for systems immunology towards coupling AIRR-seq with
the computational discovery of immunotherapeutics, vaccines, and
immunodiagnostics.
Comment: 27 pages, 2 figures
The International Virus Bioinformatics Meeting 2020.
The International Virus Bioinformatics Meeting 2020 was originally planned to take place in Bern, Switzerland, in March 2020. However, the COVID-19 pandemic put a spoke in the wheel of almost all conferences to be held in 2020. After moving the conference to 8-9 October 2020, we were hit by the second wave and finally decided at short notice to go fully online. On the other hand, the pandemic has made us even more aware of the importance of accelerating research in viral bioinformatics. Advances in bioinformatics have led to improved approaches to investigate viral infections and outbreaks. The International Virus Bioinformatics Meeting 2020 attracted approximately 120 experts in virology and bioinformatics from all over the world to the two-day virtual meeting. Despite concerns that virtual meetings lack possibilities for face-to-face discussion, the participants from this small community created a highly interactive scientific environment, engaging in lively and inspiring discussions and suggesting new research directions and questions. The meeting featured five invited and twelve contributed talks on four main topics: (1) proteome and RNAome of RNA viruses, (2) viral metagenomics and ecology, (3) virus evolution and classification and (4) viral infections and immunology. Further, the meeting featured 20 oral poster presentations, all of which focused on specific areas of virus bioinformatics. This report summarizes the main research findings and highlights presented at the meeting.
An Annotated Corpus for Machine Reading of Instructions in Wet Lab Protocols
We describe an effort to annotate a corpus of natural language instructions
consisting of 622 wet lab protocols to facilitate automatic or semi-automatic
conversion of protocols into a machine-readable format and benefit biological
research. Experimental results demonstrate the utility of our corpus for
developing machine learning approaches to shallow semantic parsing of
instructional texts. We make our annotated Wet Lab Protocol Corpus available to
the research community.
Using DNA microarrays to study host-microbe interactions.
Complete genomic sequences of microbial pathogens and hosts offer sophisticated new strategies for studying host-pathogen interactions. DNA microarrays exploit primary sequence data to measure transcript levels and detect sequence polymorphisms for every gene simultaneously. The design and construction of a DNA microarray for any given microbial genome are straightforward. By monitoring microbial gene expression, one can predict the functions of uncharacterized genes, probe the physiologic adaptations made under various environmental conditions, identify virulence-associated genes, and test the effects of drugs. Similarly, by using host gene microarrays, one can explore host response at the level of gene expression and provide a molecular description of the events that follow infection. Host profiling might also identify gene expression signatures unique to each pathogen, thus providing a novel tool for diagnosis, prognosis, and clinical management of infectious disease.
The Promises and Pitfalls of Machine Learning for Detecting Viruses in Aquatic Metagenomes
Tools for identifying viral sequences in host-associated and environmental metagenomes allow for a better understanding of the genetics and ecology of viruses and their hosts. Recently, new machine learning approaches that distinguish viral from bacterial signal using k-mer sequence signatures were published for identifying viral contigs in metagenomes. The promise of these content-based approaches is the ability to discover new viruses with few or no known relatives. In this perspective paper, we examine the use of the content-based machine learning tool VirFinder for the identification of viral sequences in aquatic metagenomes and explore the possibility of using ecosystem-focused models targeted to marine metagenomes. We discuss the impact of training set composition on tool performance and the current limitations in retrieving low-abundance viral sequences from metagenomes. We identify potential biases that could arise from machine learning approaches for viral hunting in real-world datasets and suggest possible avenues to overcome them. Funding: National Science Foundation [1640775]. Open access journal.
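To make the idea of a k-mer sequence signature concrete, the following is a minimal sketch of how a contig could be turned into a normalized k-mer frequency vector, the kind of feature a content-based classifier might consume. It is an illustration only and does not reproduce VirFinder; the function name `kmer_profile`, the default `k=4`, and the choice to skip windows containing ambiguous bases are all assumptions made for this example.

```python
from collections import Counter
from itertools import product


def kmer_profile(sequence, k=4):
    """Return normalized k-mer frequencies for a DNA sequence.

    Hypothetical helper for illustration: windows containing characters
    other than A, C, G, T (e.g. N) are skipped, which is an assumption.
    """
    seq = sequence.upper()
    valid = set("ACGT")
    # Count every overlapping window of length k made only of valid bases.
    counts = Counter(
        seq[i:i + k]
        for i in range(len(seq) - k + 1)
        if set(seq[i:i + k]) <= valid
    )
    total = sum(counts.values())
    # Emit all 4**k possible k-mers so every contig yields a vector
    # with the same keys, in the same order.
    return {
        "".join(p): (counts["".join(p)] / total if total else 0.0)
        for p in product("ACGT", repeat=k)
    }


if __name__ == "__main__":
    profile = kmer_profile("ACGTACGTNNACGT", k=2)
    # Show only the k-mers that actually occur in the toy sequence.
    print({kmer: round(freq, 3) for kmer, freq in profile.items() if freq > 0})
```

Fixed-length vectors like this can be fed to any standard classifier; which classifier and which training contigs are used is exactly where the training-set biases discussed in the abstract would enter.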
GSAE: an autoencoder with embedded gene-set nodes for genomics functional characterization
Bioinformatics tools have been developed to interpret gene expression data at
the gene set level, and these gene-set-based analyses improve biologists'
capability to discover the functional relevance of their experimental design.
While gene sets are elucidated individually, associations between gene sets are
rarely taken into consideration. Deep learning, an emerging machine learning
technique in computational biology, can be used to generate an unbiased
combination of gene sets, and to determine the biological relevance and
analysis consistency of these combined gene sets by leveraging large genomic
data sets. In this study,
we proposed a gene superset autoencoder (GSAE), a multi-layer autoencoder model
with the incorporation of a priori defined gene sets that retain the crucial
biological features in the latent layer. We introduced the concept of the gene
superset, an unbiased combination of gene sets with weights trained by the
autoencoder, where each node in the latent layer is a superset. Trained with
genomic data from TCGA and evaluated with their accompanying clinical
parameters, we showed gene supersets' ability to discriminate tumor subtypes
and their prognostic capability. We further demonstrated the biological
relevance of the top component gene sets in the significant supersets. Using
an autoencoder model with gene supersets at its latent layer, we demonstrated
that gene supersets retain sufficient biological information with respect to
tumor subtypes and clinical prognostic significance. Supersets also provide
high reproducibility in survival analysis and accurate prediction of cancer
subtypes.
Comment: Presented at the International Conference on Intelligent Biology and
Medicine (ICIBM 2018) in Los Angeles, CA, USA and published in BMC Systems
Biology 2018, 12(Suppl 8):14
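The encoder idea described in this abstract, gene-set nodes feeding a latent layer whose nodes are weighted combinations of gene sets, can be sketched as a forward pass in plain Python. This is not the authors' implementation: the abstract does not say how gene sets are incorporated, so the binary membership mask, the tanh activation, the function names, and the placeholder gene and set names below are all assumptions for illustration.

```python
import math


def build_mask(genes, gene_sets):
    """Binary membership matrix: mask[s][g] is 1.0 if gene g is in set s."""
    return [
        [1.0 if g in members else 0.0 for g in genes]
        for members in gene_sets.values()
    ]


def gene_set_layer(x, weights, mask):
    """Masked dense layer (assumed design): each gene-set node only
    receives input from its own member genes."""
    return [
        math.tanh(sum(w * m * xi for w, m, xi in zip(w_row, m_row, x)))
        for w_row, m_row in zip(weights, mask)
    ]


def superset_layer(set_activations, weights):
    """Dense layer: each latent node is a weighted combination of all
    gene-set nodes, i.e. a 'superset' in the abstract's terminology."""
    return [
        math.tanh(sum(w * a for w, a in zip(w_row, set_activations)))
        for w_row in weights
    ]


if __name__ == "__main__":
    # Placeholder genes and gene sets, purely hypothetical.
    genes = ["GENE_1", "GENE_2", "GENE_3", "GENE_4"]
    gene_sets = {"SET_A": {"GENE_1", "GENE_2"}, "SET_B": {"GENE_3", "GENE_4"}}
    mask = build_mask(genes, gene_sets)

    expression = [0.2, 0.4, 1.5, 0.1]   # toy expression values
    w_sets = [[1.0] * len(genes) for _ in gene_sets]  # untrained weights
    w_super = [[0.5, 0.5]]               # one latent superset node

    set_act = gene_set_layer(expression, w_sets, mask)
    print(superset_layer(set_act, w_super))
```

In a trained model the weights in `w_super` would be learned, and reading them off per latent node is what would indicate which gene sets contribute most to each superset.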