51 research outputs found

    HuCoPIA: An Atlas of Human vs. SARS-CoV-2 Interactome and the Comparative Analysis with Other Coronaviridae Family Viruses

    SARS-CoV-2, a novel betacoronavirus strain, has caused a pandemic that has claimed the lives of nearly 6.7M people worldwide. Vaccines and medicines are being developed around the world to reduce disease spread and fatality rates and to control new variants. Understanding the protein-protein interaction mechanisms of SARS-CoV-2 in humans, and how they compare with those of the earlier SARS-CoV and MERS strains, is crucial for these efforts. These interactions might be used to assess vaccination effectiveness, diagnose exposure, and produce effective biotherapeutics. Here, we present the HuCoPIA database, which contains approximately 100,000 protein-protein interactions between humans and three betacoronavirus strains (SARS-CoV-2, SARS-CoV, and MERS). The interactions in the database are divided into those common to all three strains and those unique to each strain. The database contains SARS-CoV-2 (41,173), SARS-CoV (31,997), and MERS (26,862) interactions, together with functional annotation of the human proteins, such as subcellular localization, tissue expression, KEGG pathways, and Gene Ontology information. We believe HuCoPIA will serve as an invaluable resource for experimental biologists and will help advance research into the mechanisms of betacoronaviruses.
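
The split into interactions shared by all three strains and interactions unique to one strain can be illustrated with plain set operations. This is only a minimal sketch, not the HuCoPIA pipeline: it assumes each interaction is keyed by a human protein ID and a strain-independent viral protein name, and the IDs below (`P1`, `S`, and so on) are hypothetical.

```python
from typing import Dict, Set, Tuple

# Assumed key: (human protein ID, strain-independent viral protein name).
Interaction = Tuple[str, str]


def split_common_unique(
    by_strain: Dict[str, Set[Interaction]]
) -> Tuple[Set[Interaction], Dict[str, Set[Interaction]]]:
    """Return interactions shared by every strain and those found in only one strain."""
    common = set.intersection(*by_strain.values())
    unique = {}
    for strain, pairs in by_strain.items():
        # Everything reported for any other strain.
        others = set().union(*(p for s, p in by_strain.items() if s != strain))
        unique[strain] = pairs - others
    return common, unique


# Hypothetical toy data, not taken from the database.
toy = {
    "SARS-CoV-2": {("P1", "S"), ("P2", "N"), ("P3", "M")},
    "SARS-CoV": {("P1", "S"), ("P2", "N"), ("P4", "E")},
    "MERS": {("P1", "S"), ("P5", "N")},
}
common, unique = split_common_unique(toy)
```

Interactions found in exactly two strains, such as `("P2", "N")` here, fall into neither group under this definition.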

    alfaNET: A Database of Alfalfa-Bacterial Stem Blight Protein–Protein Interactions Revealing the Molecular Features of the Disease-Causing Bacteria

    Alfalfa has emerged as one of the most important forage crops, owing to its wide adaptation and high biomass production worldwide. In the last decade, the emergence of bacterial stem blight (caused by Pseudomonas syringae pv. syringae ALF3) in alfalfa has caused around 50% yield losses in the United States. Studies are being conducted to decipher the roles of the key genes and pathways regulating the disease, but due to the sparse knowledge about the infection mechanisms of Pseudomonas, the development of resistant cultivars is hampered. The database alfaNET is an attempt to assist researchers by providing comprehensive Pseudomonas proteome annotations, as well as a host–pathogen interactome tool, which predicts the interactions between host and pathogen based on orthology. alfaNET is a user-friendly and efficient tool and includes other features such as subcellular localization annotations of pathogen proteins, gene ontology (GO) annotations, network visualization, and effector protein prediction. Users can also browse and search the database using particular keywords or proteins with a specific length. Additionally, the BLAST search tool enables the user to perform a homology sequence search against the alfalfa and Pseudomonas proteomes. With the successful implementation of these attributes, alfaNET will be a beneficial resource to the research community engaged in implementing molecular strategies to mitigate the disease. alfaNET is freely available for public use at http://bioinfo.usu.edu/alfanet/

    Deciphering the complete human-monkeypox virus interactome: Identifying immune responses and potential drug targets

    Monkeypox virus (MPXV) is a dsDNA virus belonging to the Poxviridae family. The outbreak of monkeypox disease in humans is of critical concern in European and Western countries, given the disease's origin in African regions. The highest number of cases was reported in the United States, followed by Spain and Brazil. Understanding the complete infection mechanism of diverse MPXV strains and their interaction with humans is important for therapeutic drug development and for avoiding future epidemics. Using computational systems biology, we deciphered the genome-wide protein-protein interactions (PPIs) between 22 MPXV strains and the human proteome. Based on phylogenomics and disease severity, three MPXV strains (Zaire-96-I-16, MPXV-UK_P2, and MPXV_USA_2022_MA001) were selected for comparative functional analysis of the proteins involved in the interactions. On average, we predicted around 92,880 non-redundant PPIs between human and MPXV proteomes, involving 8014 host and 116 pathogen proteins from the 3 strains. The gene ontology (GO) enrichment analysis revealed 10,624 common GO terms in which the host proteins of the 3 strains were highly enriched. These include significant GO terms such as platelet activation (GO:0030168), GABA-A receptor complex (GO:1902711), and metalloendopeptidase activity (GO:0004222). The host proteins were also significantly enriched in the calcium signaling pathway (hsa04020), the MAPK signaling pathway (hsa04010), and inflammatory mediator regulation of TRP channels (hsa04750). These significantly enriched GO terms and KEGG pathways are known to play immunomodulatory and therapeutic roles in humans during viral infection. Protein hub analysis revealed that most of the MPXV proteins form hubs with the protein kinases and AGC kinase C-terminal domains. Furthermore, subcellular localization revealed that most of the human proteins were localized in the cytoplasm (29.22%) and nucleus (26.79%). A few drugs, including Fostamatinib and Tamoxifen, were identified as potential drug candidates against monkeypox virus disease. This study reports the genome-scale elucidation of PPIs in the human-monkeypox virus pathosystem, providing the research community with functional insights into the monkeypox infection mechanism and supporting drug development.

    ranchSATdb: A Genome-Wide Simple Sequence Repeat (SSR) Markers Database of Livestock Species for Mutant Germplasm Characterization and Improving Farm Animal Health

    Microsatellites, also known as simple sequence repeats (SSRs), are polymorphic loci that play an important role in genome research, animal breeding, and disease control. Ranch animals are important components of the agricultural landscape. The ranch animal SSR database, ranchSATdb, is a web resource that contains 15,520,263 putative SSR markers. This database provides a comprehensive tool for performing end-to-end marker selection, from SSR prediction to generating marker primers and assessing their cross-species feasibility, visualizing the resulting markers, and finding similarities between the genomic repeat sequences, all in one place without the need to switch between other resources. The user-friendly online interface allows users to browse SSRs by genomic coordinates, repeat motif sequence, chromosome, motif type, motif frequency, and functional annotation. Users may enter their preferred flanking area around the repeat to retrieve the nucleotide sequence, investigate SSRs present in genic regions or the genes lying between SSRs, generate custom primers, and execute in silico validation of primers using electronic PCR. For customized sequences, an SSR prediction pipeline called miSATminer is also provided. New species will be added to the database on a regular basis. We hope that ranchSATdb will be a useful tool for mapping quantitative trait loci (QTLs) and for marker-assisted selection, helping to improve animal health via genomic selection. The web resource is freely accessible at https://bioinfo.usu.edu/ranchSATdb/
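
To make the idea of SSR prediction concrete, here is a minimal regex-based sketch of finding perfect tandem repeats in a DNA sequence. It is not miSATminer or the ranchSATdb pipeline; the function names and the minimum-repeat thresholds are assumptions chosen only for illustration.

```python
import re
from typing import Dict, List, Optional, Tuple

# Assumed thresholds: motif length -> minimum number of tandem copies.
DEFAULT_MIN_REPEATS: Dict[int, int] = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def is_primitive(motif: str) -> bool:
    """False when the motif is itself a shorter unit repeated (e.g. 'ATAT' = 'AT' x 2)."""
    n = len(motif)
    return not any(
        n % d == 0 and motif == motif[:d] * (n // d) for d in range(1, n)
    )


def find_ssrs(
    seq: str, min_repeats: Optional[Dict[int, int]] = None
) -> List[Tuple[int, int, str, int]]:
    """Return (start, end, motif, copies) for each perfect SSR, 0-based, end exclusive."""
    thresholds = DEFAULT_MIN_REPEATS if min_repeats is None else min_repeats
    seq = seq.upper()
    hits = []
    for k, n in sorted(thresholds.items()):
        # One motif of length k followed by at least n-1 further copies of it.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            if is_primitive(motif):
                hits.append((m.start(), m.end(), motif, (m.end() - m.start()) // k))
    return sorted(hits)


# Toy sequence: an (AT)7 dinucleotide repeat and an (A)12 mononucleotide repeat.
example = "GGC" + "AT" * 7 + "CCG" + "A" * 12 + "TTG"
ssrs = find_ssrs(example)
```

The primitive-motif check keeps a run such as `AAAAAAAAAAAA` from being reported a second time as an `AA` dinucleotide repeat. Imperfect and compound repeats are outside the scope of this sketch.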

    RSLpred: an integrative system for predicting subcellular localization of rice proteins combining compositional and evolutionary information

    The attainment of complete map-based sequence for rice (Oryza sativa) is clearly a major milestone for the research community. Identifying the localization of encoded proteins is the key to understanding their functional characteristics and facilitating their purification. Our proposed method, RSLpred, is an effort in this direction for genome-scale subcellular prediction of encoded rice proteins. First, the support vector machine (SVM)-based modules have been developed using traditional amino acid-, dipeptide- (i+1) and four parts-amino acid composition and achieved an overall accuracy of 81.43, 80.88 and 81.10%, respectively. Secondly, a similarity search-based module has been developed using position-specific iterated-basic local alignment search tool and achieved 68.35% accuracy. Another module developed using evolutionary information of a protein sequence extracted from position-specific scoring matrix achieved an accuracy of 87.10%. In this study, a large number of modules have been developed using various encoding schemes like higher-order dipeptide composition, N- and C-terminal, splitted amino acid composition and the hybrid information. In order to benchmark RSLpred, it was tested on an independent set of rice proteins where it outperformed widely used prediction methods such as TargetP, Wolf-PSORT, PA-SUB, Plant-Ploc and ESLpred. To assist the plant research community, an online web tool 'RSLpred' has been developed for subcellular prediction of query rice proteins, which is freely accessible at http://www.imtech.res.in/raghava/rslpred

    Comparative Genome-Wide Analysis of MicroRNAs and Their Target Genes in Roots of Contrasting Indica Rice Cultivars under Reproductive-Stage Drought

    Recurrent drought stress of varying intensity has become common in the present era of global climate change; it not only causes severe yield losses but also challenges the cultivation of rice. This raises serious concerns for sustainable food production and global food security. The root of a plant is primarily responsible for perceiving drought stress and acquiring sufficient water for the survival and optimal growth of the plant under extreme climatic conditions. Earlier studies reported important roles of microRNAs (miRNAs) in plants' responses to environmental/abiotic stresses. A number (738) of miRNAs are known to be expressed in different tissues under varying environmental conditions in rice, but the roles, modes of action, and target genes of these miRNAs remain elusive. Using contrasting rice cultivars [IR-64 (reproductive-stage drought sensitive) and N-22 (drought-tolerant)] subjected to terminal (reproductive-stage) drought stress, we demonstrate differential expression of 270 known and 91 novel miRNAs in the roots of the two cultivars in response to the stress. Among the known miRNAs, osamiR812, osamiR166, osamiR156, osamiR167, and osamiR396 were the most differentially expressed between the rice cultivars. In the roots of N-22, 18 known and 12 novel miRNAs were observed to be exclusively expressed, while only two known (and no novel) miRNAs were exclusively expressed in the roots of IR-64. The majority of the target genes of the miRNAs were drought-responsive transcription factors playing important roles in flower and grain development, auxin signaling, root development, and phytohormone crosstalk. The novel miRNAs identified in this study may serve as good candidates for the genetic improvement of rice for terminal drought stress, towards developing climate-smart rice for sustainable food production.

    Predicting genome-scale Arabidopsis-Pseudomonas syringae interactome using domain and interolog-based approaches

    Background: Every year pathogenic organisms cause billions of dollars' worth of damage to crops and livestock. In agriculture, the study of plant-microbe interactions demands special attention in order to develop management strategies for the destructive pathogen-induced diseases that cause huge crop losses every year worldwide. Pseudomonas syringae is a major bacterial leaf pathogen that causes diseases in a wide range of plant species. Among its various strains, pathovar tomato strain DC3000 (PstDC3000) is asserted to infect the plant host Arabidopsis thaliana and has thus been accepted as a model system for experimental characterization of the molecular dynamics of plant-pathogen interactions. Protein-protein interactions (PPIs) play a critical role in initiating pathogenesis and maintaining infection. Understanding the PPI network between a host and pathogen is a critical step for studying the molecular basis of pathogenesis. Large-scale experimental studies of PPIs are very scarce, and high-throughput experimental results show a high false positive rate. Hence, there is a need to develop efficient computational models that predict the interactions between host and pathogen at genome scale and find novel candidate effectors and/or their targets. Results: In this study, we used two computational approaches, the interolog-based and the domain-based methods, to predict the interactions between Arabidopsis and PstDC3000 at genome scale. The interolog method relies on protein sequence similarity to conduct the PPI prediction. A Pseudomonas protein and an Arabidopsis protein are predicted to interact with each other if an experimentally verified interaction exists between their respective homologous proteins in another organism. The domain-based method uses domain interaction information, which is derived from known protein 3D structures, to infer the potential PPIs. If a Pseudomonas and an Arabidopsis protein contain an interacting domain pair, one can expect the two proteins to interact with each other. The interolog-based method predicts ~0.79M PPIs involving around 7700 Arabidopsis and 1068 Pseudomonas proteins in the full genome. The domain-based method predicts 85650 PPIs comprising 11432 Arabidopsis and 887 Pseudomonas proteins. Further, around 11000 PPIs have been identified as interacting by both methods, forming a consensus set. Conclusion: The present work predicts the protein-protein interaction network between Arabidopsis thaliana and Pseudomonas syringae pv. tomato DC3000 at genome-wide scale with high confidence. Although the predicted PPIs may contain some false positives, the computational methods provide a reasonable number of interactions that can be further validated by high-throughput experiments. This can be a useful resource for the plant community to characterize the host-pathogen interaction in the Arabidopsis-Pseudomonas system. Further, these prediction models can be applied to agriculturally relevant crops. Peer reviewed. National Institute for Microbial Forensics and Food and Agricultural Biosecurity. Biochemistry and Molecular Biolog
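
The two prediction rules described in this abstract share one shape: a host protein and a pathogen protein are paired when some feature of each (a homolog in a template organism, or a domain) forms a known interacting pair. The sketch below illustrates that logic and the consensus step under simplifying assumptions. The function name, the dictionary inputs, and all protein and domain IDs are hypothetical, and the homology search and domain annotation that would produce these inputs are not shown.

```python
from itertools import product
from typing import Dict, FrozenSet, Set, Tuple


def transfer_interactions(
    host_features: Dict[str, Set[str]],
    pathogen_features: Dict[str, Set[str]],
    known_pairs: Set[FrozenSet[str]],
) -> Set[Tuple[str, str]]:
    """Predict (host, pathogen) pairs whose features include a known interacting pair.

    For the interolog rule, features are homologs in a template organism and
    known_pairs are experimentally verified template PPIs. For the domain rule,
    features are domains and known_pairs are interacting domain pairs.
    """
    predicted = set()
    for host, h_feats in host_features.items():
        for pathogen, p_feats in pathogen_features.items():
            if any(
                frozenset((a, b)) in known_pairs
                for a, b in product(h_feats, p_feats)
            ):
                predicted.add((host, pathogen))
    return predicted


# Hypothetical toy inputs.
host_homologs = {"AT1": {"T1"}, "AT2": {"T3"}}
pathogen_homologs = {"PS1": {"T2"}, "PS2": {"T4"}}
template_ppis = {frozenset(("T1", "T2")), frozenset(("T3", "T4"))}

host_domains = {"AT1": {"D1"}, "AT2": {"D5"}}
pathogen_domains = {"PS1": {"D2"}, "PS2": {"D9"}}
domain_pairs = {frozenset(("D1", "D2"))}

interolog_ppis = transfer_interactions(host_homologs, pathogen_homologs, template_ppis)
domain_ppis = transfer_interactions(host_domains, pathogen_domains, domain_pairs)
consensus_ppis = interolog_ppis & domain_ppis  # predicted by both rules
```

The nested loop is quadratic in the number of proteins, which is fine for a toy example; at genome scale one would index the known pairs by feature instead.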

    Identification and characterization of plastid-type proteins from sequence-attributed features using machine learning

    Background: Plastids are an important component of plant cells, being the site of manufacture and storage of chemical compounds used by the cell; they contain pigments such as those used in photosynthesis and are involved in starch synthesis/storage, cell color, etc. They are essential organelles of the plant cell and are also present in algae. Recent advances in genomic technology and sequencing efforts are generating a huge amount of DNA sequence data every day. The predicted proteomes of these genomes need annotation at a faster pace. One such annotation need is an automated system that can accurately distinguish between plastid and non-plastid proteins and further classify plastid types based on their functionality. We compared the amino acid compositions of plastid proteins with those of non-plastid ones and found significant differences, which were used as a basis to develop various feature-based prediction models using similarity search and machine learning. Results: In this study, we developed separate Support Vector Machine (SVM) trained classifiers for characterizing the plastids in two steps: first distinguishing plastid from non-plastid proteins, and then classifying the identified plastid proteins into their various types based on their function (chloroplast, chromoplast, etioplast, and amyloplast). Five diverse protein features are used to develop the SVM models: amino acid composition, dipeptide composition, pseudo amino acid composition, Nterminal-Center-Cterminal composition, and protein physicochemical properties. Overall, the dipeptide composition-based module shows the best performance, with an accuracy of 86.80% and a Matthews Correlation Coefficient (MCC) of 0.74 in phase-I, and 78.60% with an MCC of 0.44 in phase-II. On independent test data, this model also performs better, with an overall accuracy of 76.58% and 74.97% in phase-I and phase-II, respectively. The similarity-based PSI-BLAST module shows very low performance, with about 50% prediction accuracy for distinguishing plastid from non-plastid proteins and only 20% in classifying the various plastid types, indicating the need for and importance of machine learning algorithms. Conclusion: The current work is a first attempt to develop a methodology for classifying various plastid-type proteins. The prediction modules have also been made available as a web tool, PLpred, at http://bioinfo.okstate.edu/PLpred/ for real-time identification/characterization. We believe this tool will be very useful in the functional annotation of various genomes. Peer reviewed. National Institute for Microbial Forensics and Food and Agricultural Biosecurity. Biochemistry and Molecular Biolog
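
Two of the sequence features named above, amino acid composition and dipeptide composition, are simple to compute. The following is a minimal sketch of both as fixed-length vectors (20 and 400 values) that could be fed to a classifier such as an SVM. It is not the PLpred code; the function names are hypothetical, and the choice to drop non-standard residues before counting is an assumption made for brevity.

```python
from collections import Counter
from itertools import product
from typing import List

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def _clean(seq: str) -> str:
    # Simplification: non-standard residues are dropped, which can create
    # artificial neighbours in the dipeptide counts.
    return "".join(aa for aa in seq.upper() if aa in AMINO_ACIDS)


def amino_acid_composition(seq: str) -> List[float]:
    """Fraction of each standard residue, in AMINO_ACIDS order (20 values)."""
    clean = _clean(seq)
    counts = Counter(clean)
    n = len(clean)
    return [counts[aa] / n if n else 0.0 for aa in AMINO_ACIDS]


def dipeptide_composition(seq: str) -> List[float]:
    """Fraction of each adjacent residue pair (i, i+1), 400 values."""
    clean = _clean(seq)
    pairs = Counter(clean[i:i + 2] for i in range(len(clean) - 1))
    total = max(len(clean) - 1, 0)
    return [
        pairs[a + b] / total if total else 0.0
        for a, b in product(AMINO_ACIDS, repeat=2)
    ]


aac = amino_acid_composition("ACAC")
dpc = dipeptide_composition("ACAC")
```

For the toy sequence `ACAC` there are three overlapping dipeptides (`AC`, `CA`, `AC`), so `AC` gets a value of 2/3 and `CA` a value of 1/3.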

    LacSubPred: Predicting subtypes of Laccases, an important lignin metabolism-related enzyme class, using in silico approaches

    Background: Laccases (E.C. 1.10.3.2) are multi-copper oxidases that have gained importance in many industries such as biofuels, pulp production, textile dye bleaching, bioremediation, and food production. Their usefulness stems from the ability to act on a diverse range of phenolic compounds such as o-/p-quinols, aminophenols, polyphenols, polyamines, aryl diamines, and aromatic thiols. Despite acting on a wide range of compounds as a family, individual Laccases often exhibit distinctive and varied substrate ranges. This is likely due to Laccases' involvement in many metabolic roles across diverse taxa. Classification systems for multi-copper oxidases have been developed using multiple sequence alignments; however, these systems seem to largely follow species taxonomy rather than substrate ranges, enzyme properties, or specific function. It has been suggested that the roles and substrates of various Laccases are related to their optimal pH. This is consistent with the observation that fungal Laccases usually prefer acidic conditions, whereas plant and bacterial Laccases prefer basic conditions. Based on these observations, we hypothesize that a descriptor-based unsupervised learning system could generate a homology-independent classification system that better describes the functional properties of Laccases. Results: In this study, we first used an unsupervised learning approach to develop a novel homology-independent Laccase classification system. Of the descriptors considered, physicochemical properties showed the best performance. Physicochemical properties divided the Laccases into twelve subtypes. Analysis of the clusters using a t-test revealed that the majority of the physicochemical descriptors had statistically significant differences between the classes. Feature selection identified the most important features as negatively charged residues, the peptide isoelectric point, and acidic or amidic residues. Secondly, to allow for classification of new Laccases, a supervised learning system was developed from the clusters. The models showed high performance with an overall accuracy of 99.03%, error of 0.49%, MCC of 0.9367, precision of 94.20%, sensitivity of 94.20%, and specificity of 99.47% in a 5-fold cross-validation test. In an independent test, our models still provide a high accuracy of 97.98%, error rate of 1.02%, MCC of 0.8678, precision of 87.88%, sensitivity of 87.88%, and specificity of 98.90%. Conclusion: This study provides a useful classification system for better understanding Laccases from the perspective of their physicochemical properties. We also developed a publicly available web tool for the characterization of Laccase protein sequences (http://lacsubpred.bioinfo.ucr.edu/). Finally, the programs used in the study are made available for researchers interested in applying the system to other enzyme classes (https://github.com/tweirick/SubClPred). Peer reviewed. National Institute for Microbial Forensics and Food and Agricultural Biosecurity. Biochemistry and Molecular Biolog
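
The performance measures quoted above (accuracy, precision, sensitivity, specificity, MCC) all derive from the four confusion-matrix counts. The sketch below computes them for a single binary, one-vs-rest decision. The abstract does not state how the figures were aggregated over the twelve subtypes, so the function name, the zero-division fallbacks, and the counts used here are assumptions for illustration only.

```python
import math
from typing import Dict


def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Standard metrics from true/false positive and negative counts."""
    total = tp + fp + tn + fn
    # Denominator of the Matthews Correlation Coefficient.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
        "specificity": tn / (tn + fp) if (tn + fp) else 0.0,
        # Assumed convention: MCC is reported as 0 when it is undefined.
        "mcc": (tp * tn - fp * fn) / denom if denom else 0.0,
    }


# Hypothetical counts, not taken from the study.
metrics = binary_metrics(tp=40, fp=10, tn=40, fn=10)
```

With these balanced toy counts every rate is 0.8 while the MCC is 0.6, which shows why MCC is often reported alongside accuracy: it is a correlation on a scale from -1 to 1 rather than a fraction of correct calls.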