25 research outputs found

    Protein Buffering in Model Systems and in Whole Human Saliva

    The aim of this study was to quantify the buffer attributes (value, power, range and optimum) of two model systems for whole human resting saliva, of the purified proteins from whole human resting saliva, and of single proteins. Two model systems, the first containing amyloglucosidase and lysozyme and the second containing amyloglucosidase and α-amylase, were shown to provide, in combination with hydrogencarbonate and dihydrogenphosphate, almost the same buffer attributes as whole human resting saliva. It was further demonstrated that changes in protein concentration as small as 0.1% may change the buffer value of a buffer solution by a factor of up to 15. Additionally, a protein concentration change in the same range (0.16%) was found between saliva samples collected at 13:00 and those collected at 9:00 and 17:00. The way protein expression changed between these samples corresponded to the change in basic buffer power and the change in the buffer value at pH 6.7. Finally, SDS-PAGE with ruthenium(II) tris(bathophenanthroline disulfonate) staining revealed constant protein expression in all samples except for one 50 kDa protein band. As the change in the expression pattern of that 50 kDa band corresponded to the change in basic buffer power and the buffer value at pH 6.7, it was reasonable to conclude that this band may contain the protein(s) belonging to the protein buffer system of human saliva.

    A multivariate prediction model for microarray cross-hybridization

    BACKGROUND: Expression microarray analysis is one of the most popular molecular diagnostic techniques in the post-genomic era. However, this technique faces the fundamental problem of potential cross-hybridization. The problem is pervasive for both oligonucleotide and cDNA microarrays, and it is considered particularly serious for the latter. No comprehensive multivariate predictive modeling has been performed to understand how multiple variables contribute to (cross-)hybridization. RESULTS: We propose a systematic search strategy using multiple multivariate models [multiple linear regressions, regression trees, and artificial neural network analyses (ANNs)] to select an effective set of predictors for hybridization. We validate this approach on a set of DNA microarrays with cytochrome P450 family genes. The performance of our multiple multivariate models is compared with that of a recently proposed third-order polynomial regression method that uses percent identity as the sole predictor. All multivariate models agree that the 'most contiguous base pairs between probe and target sequences,' rather than percent identity, is the best univariate predictor. The predictive power is improved by inclusion of additional nonlinear effects, in particular target GC content, when regression trees or ANNs are used. CONCLUSION: A systematic multivariate approach is provided to assess the importance of multiple sequence features for hybridization and of relationships among these features. This approach can easily be applied to larger datasets. This will allow future development of generalized hybridization models that can correct for false-positive cross-hybridization signals in expression experiments.
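To make the two univariate predictors named in this abstract concrete, the following is a minimal sketch of how percent identity and the longest contiguous run of matching bases might be computed. It assumes the probe and target have already been aligned to equal length without gaps; the function names and the example sequences are illustrative and are not taken from the paper.

```python
def percent_identity(probe: str, target: str) -> float:
    """Percentage of aligned positions where probe and target carry the same base."""
    if len(probe) != len(target):
        raise ValueError("sequences must be pre-aligned to equal length")
    if not probe:
        return 0.0
    matches = sum(p == t for p, t in zip(probe, target))
    return 100.0 * matches / len(probe)


def longest_contiguous_match(probe: str, target: str) -> int:
    """Length of the longest uninterrupted run of identical aligned positions."""
    if len(probe) != len(target):
        raise ValueError("sequences must be pre-aligned to equal length")
    best = run = 0
    for p, t in zip(probe, target):
        run = run + 1 if p == t else 0  # a mismatch resets the current run
        best = max(best, run)
    return best


# Hypothetical example: both targets differ from the probe at one position,
# so percent identity is the same, but the contiguous runs differ.
probe = "ACGTACGT"
target_mid = "ACGTTCGT"   # mismatch in the middle
target_end = "TCGTACGT"   # mismatch at the first position
```

In this hypothetical example both targets have the same percent identity to the probe, yet the second keeps a longer uninterrupted match, which is the kind of difference the contiguous-match predictor captures and percent identity does not.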

    Methods for Comparing a DNA Sequence with a Protein Sequence

    We describe two methods for comparing a DNA sequence with a protein sequence: one constructs an optimal global alignment, the other an optimal local alignment. The alignment model of the methods addresses the problems of frameshifts and introns in the DNA sequence. The methods require computer memory proportional to the sequence lengths, so they can rigorously process very long sequences. Simplified versions of the methods were implemented as computer programs named NAP and LAP. The experimental results demonstrate that the programs are sensitive and powerful tools for finding genes by DNA-protein sequence homology.

    Adaptive GDDA-BLAST: Fast and Efficient Algorithm for Protein Sequence Embedding

    A major computational challenge in the genomic era is annotating structure/function to the vast quantities of sequence information now available. This problem is illustrated by the fact that most proteins lack comprehensive annotations, even when experimental evidence exists. We previously theorized that embedded-alignment profiles (simply “alignment profiles” hereafter) provide a quantitative method that is capable of relating the structural and functional properties of proteins, as well as their evolutionary relationships. A key feature of alignment profiles lies in the interoperability of data format (e.g., alignment information, physicochemical information, genomic information, etc.). Indeed, we have demonstrated that the Position Specific Scoring Matrices (PSSMs) are an informative M-dimension that is scored by quantitatively measuring the embedded or unmodified sequence alignments. Moreover, these alignments remain informative even in the “twilight zone” of sequence similarity (<25% identity) [1]–[5]. Although our previous embedding strategy was powerful, it suffered from contaminating alignments (embedded and unmodified) and high computational costs. Herein, we describe the logic and algorithmic process for a heuristic embedding strategy named “Adaptive GDDA-BLAST.” Adaptive GDDA-BLAST is, on average, up to 19 times faster than our previous method, with similar sensitivity. Further, data are provided to demonstrate the benefits of embedded-alignment measurements in terms of detecting structural homology in highly divergent protein sequences and isolating secondary structural elements of transmembrane and ankyrin-repeat domains. Together, these advances allow further exploration of the embedded alignment data space within sufficiently large data sets to eventually induce relevant statistical inferences. We show that sequence embedding could serve as one of the vehicles for measurement of low-identity alignments and for incorporation thereof into high-performance PSSM-based alignment profiles.

    Human genetic polymorphisms in T1R1 and T1R3 taste receptor subunits affect their function.

    Umami is the typical taste induced by monosodium glutamate (MSG), which is thought to be detected by the heterodimeric G protein-coupled receptor, T1R1 and T1R3. Previously, we showed that MSG detection thresholds differ substantially between individuals and we further showed that nontaster and hypotaster subjects are associated with nonsynonymous single polymorphisms occurring in the T1R1 and T1R3 genes. Here, we show using functional expression that the two amino acid substitutions (A110V and R507Q) in the N-terminal ligand-binding domain of T1R1 and the two others (F749S and R757C), located in the transmembrane domain of T1R3, severely impair the in vitro T1R1/T1R3 response to MSG. A molecular model of the ligand-binding region of T1R1/T1R3 provides a mechanistic explanation supporting the functional expression data. The data presented here support causal relations between the genotype and previous in vivo psychophysical studies in humans evaluating sensitivity to MSG.

    The accuracy of several multiple sequence alignment programs for proteins

    BACKGROUND: Many algorithms and software programs have been implemented for the inference of multiple sequence alignments of protein and DNA sequences. The "true" alignment is usually unknown due to incomplete knowledge of the evolutionary history of the sequences, making it difficult to gauge the relative accuracy of the programs. RESULTS: We tested nine of the most often used protein alignment programs and compared their results using sequences generated with the simulation software Simprot, which creates known alignments under realistic and controlled evolutionary scenarios. We simulated more than 30,000 alignment sets using various evolutionary histories in order to define the strengths and weaknesses of each program tested. We found that alignment accuracy is extremely dependent on the number of insertions and deletions in the sequences, and that indel size has a weaker effect. We also considered benchmark alignments from the latest version of BAliBASE, and the results for BAliBASE- and Simprot-generated data sets were consistent in most cases. CONCLUSION: Our results indicate that employing Simprot's simulated sequences allows the creation of a more flexible and broader range of alignment classes than the usual methods for alignment accuracy assessment. Simprot also allows for a quick and efficient analysis of a wider range of possible evolutionary histories that might not be present in currently available alignment sets. Among the nine programs tested, the iterative approach available in Mafft (L-INS-i) and ProbCons were consistently the most accurate, with Mafft being the faster of the two.

    USE OF A DISTRIBUTED PROCESSING PLATFORM FOR DETECTING POTENTIAL PLAGIARISM, WITH CERTAINTY-PROBABILITY INDICATORS, IN ASSIGNMENTS SUBMITTED TO A COURSE MANAGEMENT SYSTEM

    This paper presents the analysis, design, implementation and testing of a module for detecting potential plagiarism in homework submitted to a Course Management System, using the distributed processing platform Hadoop. The paper analyzes the problem of plagiarism in homework that students produce digitally and that is received by Course Management Systems. It also gives a conceptual analysis showing how the need to compare two sequences arises in other branches of science and how solutions have been proposed using computational tools. The technologies used to develop the module are presented, and the paper details how the problem was addressed by dividing the process into two parts: pre-processing the documents to generate plain text files, and implementing the Smith-Waterman algorithm with the improvements proposed by Robert W. Irving. Finally, the paper summarizes the results of tests run on the distributed processing platform.
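The Smith-Waterman algorithm named in the abstract above scores the best-matching local region shared by two sequences. The following is a minimal sketch of the basic algorithm only: it does not include Irving's improvements or any Hadoop integration, and the function name and scoring parameters (`match`, `mismatch`, `gap`) are assumptions chosen for illustration rather than values from the paper.

```python
def smith_waterman_score(a: str, b: str,
                         match: int = 2, mismatch: int = -1, gap: int = -1) -> int:
    """Return the best local alignment score between a and b.

    Uses a linear gap penalty and keeps only two rows of the
    dynamic-programming table, so memory grows with len(b) only.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            # Score for aligning a[i-1] with b[j-1]
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Flooring at 0 lets a local alignment start anywhere
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


# Hypothetical example: a shared fragment embedded in otherwise different text
shared_score = smith_waterman_score("xxabcxx", "yyabcyy")
```

A high score relative to document length suggests a shared passage. How a score would be turned into a certainty-probability indicator is not specified in the abstract and is left out here.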