HPIDB - a unified resource for host-pathogen interactions
Background
Protein-protein interactions (PPIs) play a crucial role in initiating infection in a host-pathogen system. Identification of these PPIs is important for understanding the underlying biological mechanism of infection and identifying putative drug targets. Database resources for studying host-pathogen systems are scarce and are either host specific or dedicated to specific pathogens.
Results
Here we describe "HPIDB", a host-pathogen PPI database, which will serve as a unified resource for host-pathogen interactions. Specifically, HPIDB integrates experimental PPIs from several public databases into a single, non-redundant, web-accessible resource. The database can be searched with a variety of options such as sequence identifiers, symbol, taxonomy, publication, author, or interaction type. The output is provided in a tab-delimited text file format that is compatible with Cytoscape, an open source resource for PPI visualization. HPIDB allows the user to search protein sequences using BLASTP to retrieve homologous host/pathogen sequences. For high-throughput analysis, the user can search multiple protein sequences at a time using BLASTP and obtain results in tabular and sequence alignment formats. The taxonomic categorization of proteins (bacterial, viral, fungal, etc.) involved in PPIs enables the user to perform category-specific BLASTP searches. In addition, a new tool is introduced, which allows searching for homologous host-pathogen interactions in the HPIDB database.
Conclusions
HPIDB is a unified, comprehensive resource for host-pathogen PPIs. The user interface provides new features and tools helpful for studying host-pathogen interactions. HPIDB can be accessed at http://agbase.msstate.edu/hpi/main.html.
An automated proteomic data analysis workflow for mass spectrometry
Background
Mass spectrometry-based protein identification methods are fundamental to proteomics. Biological experiments are usually performed in replicates, and proteomic analyses generate huge datasets which need to be integrated and quantitatively analyzed. The Sequest™ search algorithm is commonly used for identifying peptides and proteins from two-dimensional liquid chromatography electrospray ionization tandem mass spectrometry (2-D LC ESI MS²) data. A number of proteomic pipelines that facilitate high-throughput 'post data acquisition analysis' are described in the literature. However, these pipelines need to be updated to accommodate the rapidly evolving data analysis methods. Here, we describe a proteomic data analysis pipeline that specifically addresses two main issues pertinent to protein identification and differential expression analysis: 1) estimation of the probability of peptide and protein identifications and 2) non-parametric statistics for protein differential expression analysis. Our proteomic analysis workflow analyzes replicate datasets from a single experimental paradigm to generate a list of identified proteins with their probabilities and significant changes in protein expression using parametric and non-parametric statistics.
Results
The input for our workflow is Bioworks™ 3.2 Sequest (or a later version, including cluster) output in XML format. We use a decoy database approach to assign probability to peptide identifications. The user has the option to select "quality thresholds" on peptide identifications based on the P value. We also estimate probability for protein identification. Proteins identified with peptides at a user-specified threshold value from biological experiments are grouped as either control or treatment for further analysis in ProtQuant. ProtQuant utilizes a parametric (ANOVA) method for calculating differences in protein expression based on the quantitative measure ΣXcorr. Alternatively, ProtQuant output can be further processed using non-parametric Monte-Carlo resampling statistics to calculate P values for differential expression. Correction for multiple testing of ANOVA and resampling P values is done using Benjamini and Hochberg's method. The results of these statistical analyses are then combined into a single output file containing a comprehensive protein list with probabilities and differential expression analysis, associated P values, and resampling statistics.
Conclusion
For biologists carrying out proteomics by mass spectrometry, our workflow facilitates automated, easy-to-use analyses of Bioworks (3.2 or later versions) data. All the methods used in the workflow are peer-reviewed and as such the results of our workflow are compliant with proteomic data submission guidelines to public proteomic data repositories including PRIDE. Our workflow is a necessary intermediate step that is required to link proteomics data to biological knowledge for generating testable hypotheses.
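The multiple-testing correction named in this abstract can be illustrated with a minimal sketch of the standard Benjamini-Hochberg step-up adjustment. This is not the workflow's own code: the function name `benjamini_hochberg` and the example P values are assumptions made for illustration only.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Hypothetical helper for illustration; not taken from the workflow.
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so each adjusted value is
    # no larger than the adjusted value of the next-larger p-value.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    # Made-up P values for four proteins, purely illustrative.
    example = [0.01, 0.04, 0.03, 0.005]
    print(benjamini_hochberg(example))
```

A protein would then be called differentially expressed when its adjusted value falls below the chosen false discovery rate threshold.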
Understanding the Pathogenesis of Cytopathic and Noncytopathic Bovine Viral Diarrhea Virus Infection Using Proteomics
Transcriptomic Analysis of Peritoneal Cells in a Mouse Model of Sepsis: Confirmatory and Novel Results in Early and Late Sepsis.
Background
The events leading to sepsis start with an invasive infection of a primary organ of the body followed by an overwhelming systemic response. Intra-abdominal infections are the second most common cause of sepsis. Peritoneal fluid is the primary site of infection in these cases. A microarray-based approach was used to study the temporal changes in cells from the peritoneal cavity of septic mice and to identify potential biomarkers and therapeutic targets for this subset of sepsis patients.
Results
We conducted microarray analysis of the peritoneal cells of mice infected with a non-pathogenic strain of Escherichia coli. Differentially expressed genes were identified at two early time points (1 h, 2 h) and one late time point (18 h). A multiplexed bead array analysis was used to confirm protein expression for several cytokines which showed differential expression at different time points based on the microarray data. Gene Ontology based hypothesis testing identified a positive bias of differentially expressed genes associated with cellular development and cell death at 2 h and 18 h respectively. Most differentially expressed genes common to all 3 time points had an immune response related function, consistent with the observation that a few bacteria are still present at 18 h.
Conclusions
Transcriptional regulators like PLAGL2, EBF1, TCF7, KLF10 and SBNO2, previously not described in sepsis, are differentially expressed at early and late time points. The expression pattern for key biomarkers in this study is similar to that reported in human sepsis, indicating the suitability of this model for future studies of sepsis, and the observed differences in gene expression suggest species differences or differences in the response of blood leukocytes and peritoneal leukocytes.
Identification of novel non-coding small RNAs from Streptococcus pneumoniae TIGR4 using high-resolution genome tiling arrays
Background
The identification of non-coding transcripts in human, mouse, and Escherichia coli has revealed their widespread occurrence and functional importance in both eukaryotic and prokaryotic life. In prokaryotes, studies have shown that non-coding transcripts participate in a broad range of cellular functions like gene regulation, stress and virulence. However, very little is known about non-coding transcripts in Streptococcus pneumoniae (pneumococcus), an obligate human respiratory pathogen responsible for significant worldwide morbidity and mortality. Tiling microarrays enable genome-wide mRNA profiling as well as identification of novel transcripts at high resolution.
Results
Here, we describe a high-resolution transcription map of the S. pneumoniae clinical isolate TIGR4 using genomic tiling arrays. Our results indicate that approximately 66% of the genome is expressed under our experimental conditions. We identified a total of 50 non-coding small RNAs (sRNAs) from the intergenic regions, of which 36 had no predicted function. Half of the identified sRNA sequences were found to be unique to the S. pneumoniae genome. We identified eight overrepresented sequence motifs among sRNA sequences that correspond to sRNAs in different functional categories. Tiling arrays also identified approximately 202 operon structures in the genome.
Conclusions
In summary, the pneumococcal operon structures and novel sRNAs identified in this study enhance our understanding of the complexity and extent of the pneumococcal 'expressed' genome. Furthermore, the results of this study open up new avenues of research for understanding the complex RNA regulatory network governing S. pneumoniae physiology and virulence.
Deep Sensitivity Analysis for Objective-Oriented Combinatorial Optimization
Pathogen control is a critical aspect of modern poultry farming, providing
important benefits for both public health and productivity. Effective poultry
management measures to reduce pathogen levels in poultry flocks promote food
safety by lowering risks of food-borne illnesses. They also support animal
health and welfare by preventing infectious diseases that can rapidly spread
and impact flock growth, egg production, and overall health. This study frames
the search for optimal management practices that minimize the presence of
multiple pathogens as a combinatorial optimization problem. Specifically, we
model the various possible combinations of management settings as a solution
space that can be efficiently explored to identify configurations that
optimally reduce pathogen levels. This design incorporates a neural network
feedback-based method that combines feature explanations with global
sensitivity analysis to ensure combinatorial optimization in multiobjective
settings. Our preliminary experiments show promising results when applied to
two real-world agricultural datasets. While further validation is still needed,
these early experimental findings demonstrate the potential of the model to
derive targeted feature interactions that adaptively optimize pathogen control
under varying real-world constraints.
Comment: The 2023 International Conference on Computational Science &
Computational Intelligence (CSCI'23)
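To make the idea of a combinatorial solution space concrete, the sketch below enumerates every combination of a few settings and keeps the one with the lowest score. It is a plain brute-force baseline, not the neural network feedback-based method the abstract describes. The setting names, their allowed values, and the `toy_score` objective are all hypothetical placeholders.

```python
from itertools import product


def best_configuration(options, score):
    """Exhaustively search all combinations and return (config, value)
    for the combination with the lowest score."""
    names = list(options)
    best, best_value = None, float("inf")
    for values in product(*(options[name] for name in names)):
        config = dict(zip(names, values))
        value = score(config)
        if value < best_value:
            best, best_value = config, value
    return best, best_value


# Hypothetical management settings, each with a small set of levels.
OPTIONS = {
    "setting_a": [0, 1, 2],
    "setting_b": [0, 1],
    "setting_c": [0, 1, 2],
}


def toy_score(config):
    """Made-up objective standing in for a predicted pathogen level."""
    return (
        (config["setting_a"] - 1) ** 2
        + (config["setting_b"] - 1) ** 2
        + (config["setting_c"] - 2) ** 2
    )


if __name__ == "__main__":
    print(best_configuration(OPTIONS, toy_score))
```

Exhaustive search is only practical for small spaces, since the number of combinations grows multiplicatively with each added setting. That growth is the usual motivation for guided search methods such as the one the abstract proposes.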
EndToEndML: An Open-Source End-to-End Pipeline for Machine Learning Applications
Artificial intelligence (AI) techniques are widely applied in the life
sciences. However, applying innovative AI techniques to understand and
deconvolute biological complexity is hindered by the learning curve life
scientists face in understanding and using computing languages. An open-source,
user-friendly interface for AI models that does not require programming skills
to analyze complex biological data would be extremely valuable to the
bioinformatics community. With easy access to different sequencing technologies
and increased interest in different 'omics' studies, the number of biological
datasets being generated has increased and analyzing these high-throughput
datasets is computationally demanding. The majority of AI libraries today
require advanced programming skills as well as machine learning, data
preprocessing, and visualization skills. In this research, we propose a
web-based end-to-end pipeline that is capable of preprocessing, training,
evaluating, and visualizing machine learning (ML) models without manual
intervention or coding expertise. By integrating traditional machine learning
and deep neural network models with visualizations, our library assists in
recognizing, classifying, clustering, and predicting a wide range of
multi-modal, multi-sensor datasets, including images, languages, and
one-dimensional numerical data, for drug discovery, pathogen classification,
and medical diagnostics.
Comment: 2024 7th International Conference on Information and Computer
Technologies (ICICT)
Comprehensive proteomic analysis of bovine spermatozoa of varying fertility rates and identification of biomarkers associated with fertility
Background
Male infertility is a major problem for mammalian reproduction. However, molecular details including the underlying mechanisms of male fertility are still not known. A thorough understanding of these mechanisms is essential for obtaining consistently high reproductive efficiency and for reducing cost and time loss for breeders.
Results
Using high and low fertility bull spermatozoa, we employed differential detergent fractionation multidimensional protein identification technology (DDF-MudPIT) and identified 125 putative biomarkers of fertility. We next used quantitative Systems Biology modeling and canonical protein interaction pathways and networks to show that high fertility spermatozoa differ from low fertility spermatozoa in four main ways. Compared to sperm from low fertility bulls, sperm from high fertility bulls have higher expression of proteins involved in energy metabolism, cell communication, spermatogenesis, and cell motility. Our data also suggest a hypothesis that low fertility sperm DNA integrity may be compromised, because cell cycle: G2/M DNA damage checkpoint regulation was the most significant signaling pathway identified in low fertility spermatozoa.
Conclusion
This is the first comprehensive description of the bovine spermatozoa proteome. Comparative proteomic analysis of high fertility and low fertility bulls, in the context of protein interaction networks, identified putative molecular markers associated with the high fertility phenotype.
