
    Data Mining Approach for Amino Acid Sequence Classification

    Because computerized applications are used all around the world, an enormous amount of data is being collected. The essential information contained in these data is attracting scholars from a variety of disciplines to examine how to extract the knowledge hidden inside them. The process of mining usable and valuable knowledge from enormous amounts of data is known as data mining. Text mining, image mining, sequential pattern mining, and web mining are all examples of data mining fields. Sequence mining is one of the most important technologies in this field, as it aids in the discovery of sequential relationships in data. It is used in a variety of applications, including analysis of customers' buying trends, analysis of web access trends, atmospheric observation, amino acid sequences, and gene sequencing. In protein and DNA analysis, sequence mining techniques are used for sequence alignment, pattern searching, and pattern classification. Within amino acid sequence analysis, researchers are showing interest in amino acid sequence classification, which can find recurrent patterns in homologous proteins. This study describes the methods that previous studies have used to classify proteins and gives an overview of the most important sequence classification techniques.
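As one illustration of the kind of sequence classification technique such an overview covers, the sketch below turns amino acid sequences into k-mer count vectors and assigns a query to the class whose reference sequence has the most similar profile. The abstract does not name a specific method: the k-mer representation, the cosine similarity, and the function names `kmer_counts`, `cosine`, and `classify` are assumptions chosen for illustration, and the toy sequences are made up.

```python
from collections import Counter
from math import sqrt


def kmer_counts(seq, k=2):
    """Count overlapping k-mers in an amino acid sequence."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def cosine(a, b):
    """Cosine similarity between two sparse k-mer count vectors."""
    dot = sum(a[key] * b[key] for key in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def classify(query, references, k=2):
    """Return the label of the reference whose k-mer profile is closest to the query."""
    q = kmer_counts(query, k)
    return max(references, key=lambda label: cosine(q, kmer_counts(references[label], k)))


# Made-up toy sequences, for illustration only (not real protein families).
references = {
    "family_A": "MKKLLKKLLMKKLL",
    "family_B": "GSGSGPGSGSGPGS",
}
print(classify("KKLLMKKL", references))
```

A real classifier would be trained on many labeled sequences per class; a single reference per class is used here only to keep the sketch short.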

    Adaptive evolution in the toxicity of a spider’s venom enzymes


    Estimating selection pressures on HIV-1 using phylogenetic likelihood models

    Human immunodeficiency virus (HIV-1) can evolve rapidly in response to selection pressures exerted by HIV-specific immune responses and antiviral agents, and in ways that allow the virus to establish infection in different compartments of the body. Statistical models applied to HIV-1 sequence data can help to elucidate the nature of these selection pressures through comparisons of non-synonymous (amino acid changing) and synonymous (amino acid preserving) substitution rates. These models also need to take into account the non-independence of sequences due to their shared evolutionary history. We review how we have developed these methods and applied them to characterize the evolution of HIV-1 in vivo. To illustrate our methods, we present an analysis of compartment-specific evolution of HIV-1 env in blood and cerebrospinal fluid and of site-to-site variation in the gag gene of subtype C HIV-1.
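The distinction between synonymous and non-synonymous changes can be made concrete with a minimal sketch that compares two aligned coding sequences codon by codon under the standard genetic code. This is only a naive pairwise count, not the phylogenetic likelihood models the abstract refers to: it does not normalize by the number of synonymous and non-synonymous sites, and it ignores the shared evolutionary history of the sequences. The function name `count_differences` and the example sequences are hypothetical.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, with codons enumerated in TCAG order ("*" marks a stop codon).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def count_differences(seq1, seq2):
    """Return (synonymous, non_synonymous) codon differences between two aligned coding sequences."""
    if len(seq1) != len(seq2) or len(seq1) % 3:
        raise ValueError("sequences must be aligned, equal in length, and a multiple of 3")
    syn = nonsyn = 0
    for i in range(0, len(seq1), 3):
        c1, c2 = seq1[i:i + 3].upper(), seq2[i:i + 3].upper()
        if c1 == c2:
            continue
        if c1 not in CODON_TABLE or c2 not in CODON_TABLE:
            continue  # skip codons containing gaps or ambiguous bases
        if CODON_TABLE[c1] == CODON_TABLE[c2]:
            syn += 1      # codon changed, amino acid preserved
        else:
            nonsyn += 1   # codon changed, amino acid changed
    return syn, nonsyn


# Hypothetical sequences: AAA->AAG keeps lysine, CTG->CCG changes leucine to proline.
print(count_differences("ATGAAACTG", "ATGAAGCCG"))
```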

    Identification and analysis of seven effector protein families with different adaptive and evolutionary histories in plant-associated members of the Xanthomonadaceae.

    The Xanthomonadaceae family consists of species of non-pathogenic and pathogenic γ-proteobacteria that infect different hosts, including humans and plants. In this study, we performed a comparative analysis using 69 fully sequenced genomes belonging to this family, with a focus on identifying proteins enriched in phytopathogens that could explain their lifestyle and their ability to infect plants. Using a computational approach, we identified seven phytopathogen-enriched protein families putatively secreted by the type II secretion system: PheA (CM-sec), LipA/LesA, VirK, and four families involved in N-glycan degradation, NixE, NixF, NixL, and FucA1. In silico and phylogenetic analyses of these protein families revealed that they all have orthologs in other phytopathogenic or symbiotic bacteria and are involved in the modulation and evasion of the immune system. As a proof of concept, we performed a biochemical characterization of LipA from Xac306 and verified that the mutant strain lost most of its lipase and esterase activities and displayed reduced virulence in citrus. Since this study includes closely related organisms with distinct lifestyles and highlights proteins directly related to adaptation inside plant tissues, novel approaches might use these proteins as biotechnological targets for disease control and contribute to our understanding of the coevolution of plant-associated bacteria.

    Protein Tertiary Model Assessment Using Granular Machine Learning Techniques

    The automatic prediction of a protein's three-dimensional structure from its amino acid sequence has become one of the most important and most researched fields in bioinformatics. Because models are predictions rather than experimental structures determined with known accuracy, it is vital to estimate model quality. We attempt to solve this problem using machine learning techniques and information from both the sequence and the structure of the protein. The goal is to build a machine that learns from structures in the Protein Data Bank (PDB) and, when given a new model, predicts whether it belongs to the same class as the PDB structures (correct or incorrect protein models). Different subsets of the PDB are considered for evaluating the prediction potential of the machine learning methods. Here we show two such machines, one using support vector machines (SVM) and another using fuzzy decision trees (FDT). First, using a preliminary encoding style, the SVM reached around 70% accuracy in protein model quality assessment, and an improved fuzzy decision tree (IFDT) reached above 80% accuracy. To reduce computational overhead, a multiprocessor environment and a basic feature selection method are used in the SVM-based machine learning algorithm. Next, an enhanced scheme is introduced using a new encoding style. In the new style, information such as the amino acid substitution matrix, polarity, secondary structure, and relative distance between alpha carbon atoms is collected through spatial traversal of the 3D structure to form training vectors. This ensures that the properties of alpha carbon atoms that are close together in 3D space, and thus interacting, are used in vector formation. With the fuzzy decision tree, we obtained a training accuracy of around 90%. Prediction accuracy and execution time improve significantly compared with the previous encoding technique. This outcome motivates continued exploration of effective machine learning algorithms for accurate protein model quality assessment. Finally, these machines are tested using CASP8 and CASP9 templates and compared with other CASP competitors, with promising results. We further discuss the importance of model quality assessment and other information from proteins that could be considered for the same purpose.

    Response to dietary tannin challenges in view of the browser/grazer dichotomy in an Ethiopian setting : Bonga sheep versus Kaffa goats

    It has been suggested that goats (typical browsers) are better adapted to digest tannin-rich diets than sheep (typical grazers). To evaluate this, Bonga sheep and Kaffa goats were used in a 2×3 randomized crossover design with two species, three diets, and three periods (15-day adaptation + 7-day collection). The dietary treatments consisted of grass-based hay only (tannin-free diet = FT), a high-tannin diet (HT: 36% Albizia schimperiana (AS) + 9% Ficus elastica (FE) + 55% FT), and HT + polyethylene glycol 6000 (PEG). Animals were individually fed at 50 g dry matter (DM)/kg body weight (BW) and had free access to clean drinking water and mineralized salt licks. Nutrient intake, apparent nutrient digestibility, nutrient conversion ratios, and live weight changes were determined. Condensed tannin concentrations in AS and FE were 110 and 191 g/kg DM, respectively. Both sheep and goats ate 47% more of HT than FT, and dry matter intake further increased by 9% when PEG was added, with a clear difference in effect size between goats and sheep (P<0.001). The effects of the tannin-rich diet and PEG addition on DM digestibility were similarly positive in sheep and goats, but crude protein (CP) digestibility was higher in HT+PEG-fed goats than in sheep fed the same diet. However, PEG addition induced a larger improvement in growth performance and feed efficiency ratio in sheep than in goats (P<0.001). The addition of PEG as a tannin binder improved digestion and performance in both species, but with the highest effect size in sheep.
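The condensed tannin load of the HT diet is not stated in the abstract, but it can be estimated from the figures given. The short calculation below assumes that the inclusion percentages are on a dry matter basis and that the hay (FT) contributes no condensed tannin; both are assumptions, so the result is an estimate rather than a reported value.

```python
# Diet proportions and condensed tannin (CT) concentrations as given in the abstract.
ht_diet = {"AS": 0.36, "FE": 0.09, "FT": 0.55}           # fraction of the HT diet
ct_g_per_kg_dm = {"AS": 110.0, "FE": 191.0, "FT": 0.0}   # FT assumed to contribute no CT

# Weighted sum: 0.36 * 110 + 0.09 * 191 + 0.55 * 0
ht_ct = sum(ht_diet[k] * ct_g_per_kg_dm[k] for k in ht_diet)
print(f"Estimated CT in HT diet: {ht_ct:.1f} g/kg DM")
```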

    Molecular model of the outward facing state of the human P-glycoprotein (ABCB1), and comparison to a model of the human MRP5 (ABCC5)

    Background: Multidrug resistance is a particular limitation to cancer chemotherapy, antibiotic treatment and HIV medication. The ABC (ATP binding cassette) transporters human P-glycoprotein (ABCB1) and the human MRP5 (ABCC5) are involved in multidrug resistance. Results: In order to elucidate structural and molecular concepts of multidrug resistance, we have constructed a molecular model of the ATP-bound outward facing conformation of the human multidrug resistance protein ABCB1 using the Sav1866 crystal structure as a template, and compared the ABCB1 model with a previous ABCC5 model. The electrostatic potential surface (EPS) of the ABCB1 substrate translocation chamber, which transports cationic amphiphilic and lipophilic substrates, was neutral with negative and weakly positive areas. In contrast, EPS of the ABCC5 substrate translocation chamber, which transports organic anions, was generally positive. Positive-negative ratios of amino acids in the TMDs of ABCB1 and ABCC5 were also analyzed, and the positive-negative ratio of charged amino acids was higher in the ABCC5 TMDs than in the ABCB1 TMDs. In the ABCB1 model residues Leu65 (transmembrane helix 1 (TMH1)), Ile306 (TMH5), Ile340 (TMH6) and Phe343 (TMH6) may form a binding site, and this is in accordance with previous site directed mutagenesis studies. Conclusion: The Sav1866 X-ray structure may serve as a suitable template for the ABCB1 model, as it did with ABCC5. The EPS in the substrate translocation chambers and the positive-negative ratio of charged amino acids were in accordance with the transport of cationic amphiphilic and lipophilic substrates by ABCB1, and the transport of organic anions by ABCC5.
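A positive-negative ratio of charged amino acids, as mentioned in the Results, could be computed along the following lines. The abstract does not say which residues were counted, so this sketch assumes lysine and arginine as positive and aspartate and glutamate as negative, with histidine excluded; the function name `charge_ratio` and the example sequences are hypothetical and are not the ABCB1 or ABCC5 transmembrane domains.

```python
POSITIVE = set("KR")  # lysine, arginine (histidine excluded: an assumption)
NEGATIVE = set("DE")  # aspartate, glutamate


def charge_ratio(seq):
    """Return the ratio of positively to negatively charged residues, or None if there are no negative residues."""
    seq = seq.upper()
    pos = sum(residue in POSITIVE for residue in seq)
    neg = sum(residue in NEGATIVE for residue in seq)
    return pos / neg if neg else None


# Made-up sequences, for illustration only.
print(charge_ratio("AKKRLD"))
print(charge_ratio("LDEAKV"))
```

A ratio above 1 indicates an excess of positively charged residues, which is the direction the abstract reports for the ABCC5 TMDs relative to the ABCB1 TMDs.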