66 research outputs found

    A new pooling strategy for high-throughput screening: the Shifted Transversal Design

    BACKGROUND: In binary high-throughput screening projects where the goal is the identification of low-frequency events, beyond the obvious issue of efficiency, false positives and false negatives are a major concern. Pooling constitutes a natural solution: it reduces the number of tests, while providing critical duplication of the individual experiments, thereby correcting for experimental noise. The main difficulty consists in designing the pools in a manner that is both efficient and robust: few pools should be necessary to correct the errors and identify the positives, yet the experiment should not be too vulnerable to biological shakiness. For example, some information should still be obtained even if there are slightly more positives or errors than expected. This is known as the group testing problem, or pooling problem. RESULTS: In this paper, we present a new non-adaptive combinatorial pooling design: the "shifted transversal design" (STD). It relies on arithmetic, and rests on two intuitive ideas: minimizing the co-occurrence of objects, and constructing pools of constant-sized intersections. We prove that it allows unambiguous decoding of noisy experimental observations. This design is highly flexible, and can be tailored to function robustly in a wide range of experimental settings (i.e., numbers of objects, fractions of positives, and expected error rates). Furthermore, we show that our design compares favorably, in terms of efficiency, to the previously described non-adaptive combinatorial pooling designs. CONCLUSION: This method is currently being validated by field-testing in the context of yeast two-hybrid interactome mapping, in collaboration with Marc Vidal's lab at the Dana Farber Cancer Institute. Many similar projects could benefit from using the Shifted Transversal Design.
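To make the idea of "minimizing the co-occurrence of objects" concrete, here is a minimal Python sketch of a layered pooling layout. It is an illustration under stated assumptions, not the construction from the paper: the function names (`build_layers`, `run_tests`, `decode`, `max_cooccurrence`) and the polynomial shift rule used to assign objects to pools are hypothetical choices, and decoding is shown only for the noise-free case.

```python
from itertools import combinations


def build_layers(n, q, k):
    """Build k layers of q pools each for objects 0..n-1 (q assumed prime).

    Assumed shift rule (illustrative): in layer j, object i goes to the pool
    whose index is the base-q digits of i evaluated as a polynomial in j,
    taken modulo q.
    """
    if k > q:
        raise ValueError("this sketch only builds up to q layers")
    digits = 1
    while q ** digits < n:  # number of base-q digits needed for 0..n-1
        digits += 1
    layers = []
    for j in range(k):
        pools = [set() for _ in range(q)]
        for i in range(n):
            row = sum((j ** c) * ((i // q ** c) % q) for c in range(digits)) % q
            pools[row].add(i)
        layers.append(pools)
    return layers


def run_tests(layers, positives):
    """Simulate noise-free tests: a pool is positive if it holds a positive."""
    return [[bool(pool & positives) for pool in layer] for layer in layers]


def decode(layers, outcomes):
    """Noise-free decoding: keep objects whose pools all tested positive."""
    objects = set().union(*layers[0])
    return {
        i
        for i in objects
        if all(
            outcome
            for layer, result in zip(layers, outcomes)
            for pool, outcome in zip(layer, result)
            if i in pool
        )
    }


def max_cooccurrence(layers):
    """Largest number of pools shared by any two distinct objects."""
    objects = sorted(set().union(*layers[0]))
    return max(
        sum(1 for layer in layers for pool in layer if a in pool and b in pool)
        for a, b in combinations(objects, 2)
    )


if __name__ == "__main__":
    layers = build_layers(n=9, q=3, k=3)
    print("max co-occurrence:", max_cooccurrence(layers))
    print("decoded positives:", decode(layers, run_tests(layers, {2, 7})))
```

Each layer partitions the objects, so every object is tested once per layer. Under this shift rule, two distinct objects with at most two base-q digits share a pool in at most one layer, which is why an object that is not positive cannot hide behind the positives in every layer.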

    MatrixDB, a database focused on extracellular protein–protein and protein–carbohydrate interactions

    Summary: MatrixDB (http://matrixdb.ibcp.fr) is a database reporting mammalian protein–protein and protein–carbohydrate interactions involving extracellular molecules. It takes into account the full interaction repertoire of the extracellular matrix involving full-length molecules, fragments and multimers. The current version of MatrixDB contains 1972 interactions corresponding to 4412 experiments and involving 259 extracellular biomolecules.

    Mutations in DNAH1, which encodes an inner arm heavy chain dynein, lead to male infertility from multiple morphological abnormalities of the sperm flagella.

    Ten to fifteen percent of couples are confronted with infertility, and a male factor is involved in approximately half the cases. A genetic etiology is likely in most cases, yet only a few genes have been formally correlated with male infertility. Homozygosity mapping was carried out on a cohort of 20 North African individuals, including 18 index cases, presenting with primary infertility resulting from impaired sperm motility caused by a mosaic of multiple morphological abnormalities of the flagella (MMAF) including absent, short, coiled, bent, and irregular flagella. Five unrelated subjects out of 18 (28%) carried a homozygous variant in DNAH1, which encodes an inner dynein heavy chain and is expressed in testis. RT-PCR, immunostaining, and electron microscopy were carried out on samples from one of the subjects with a mutation located on a donor splice site. Neither the transcript nor the protein was observed in this individual, confirming the pathogenicity of this variant. A general axonemal disorganization including mislocalization of the microtubule doublets and loss of the inner dynein arms was observed. Although DNAH1 is also expressed in other ciliated cells, infertility was the only symptom of primary ciliary dyskinesia observed in affected subjects, suggesting that DNAH1 function in cilium is not as critical as in sperm flagellum.

    New insights into protein-protein interaction data lead to increased estimates of the S. cerevisiae interactome size

    BACKGROUND: As protein interactions mediate most cellular mechanisms, protein-protein interaction networks are essential in the study of cellular processes. Consequently, several large-scale interactome mapping projects have been undertaken, and protein-protein interactions are being distilled into databases through literature curation; yet protein-protein interaction data are still far from comprehensive, even in the model organism Saccharomyces cerevisiae. Estimating the interactome size is important for evaluating the completeness of current datasets, in order to measure the remaining efforts that are required. RESULTS: We examined the yeast interactome from a new perspective, by taking into account how thoroughly proteins have been studied. We discovered that the set of literature-curated protein-protein interactions is qualitatively different when restricted to proteins that have received extensive attention from the scientific community. In particular, these interactions are less often supported by yeast two-hybrid, and more often by more complex experiments such as biochemical activity assays. Our analysis showed that high-throughput and literature-curated interactome datasets are more correlated than commonly assumed, but that this bias can be corrected for by focusing on well-studied proteins. We thus propose a simple and reliable method to estimate the size of an interactome, combining literature-curated data involving well-studied proteins with high-throughput data. It yields an estimate of at least 37,600 direct physical protein-protein interactions in S. cerevisiae. CONCLUSIONS: Our method leads to higher and more accurate estimates of the interactome size, as it accounts for interactions that are genuine yet difficult to detect with commonly used experimental assays. This shows that we are even further from completing the yeast interactome map than previously expected.

    A new pooling strategy for high-throughput screening: the Shifted Transversal Design-0

    Copyright information: Taken from "A new pooling strategy for high-throughput screening: the Shifted Transversal Design", BMC Bioinformatics 2006;7:28. Published online 19 Jan 2006. PMCID: PMC1409803. Copyright © 2006 Thierry-Mieg; licensee BioMed Central Ltd.

    [...] prime number q and builds the set of pools STD(n; q; t·Γ+2·E+1), as specified in corollary 2. Recall that n is the total number of variables and Γ is the compression power, i.e. the smallest integer such that q^(Γ+1) ≥ n. This figure summarizes the behavior of these pools when the actual number of errors exceeds E, and distinguishes between the two types of errors: false positives and false negatives. In the dark blue region, all errors are detected and corrected. In the intermediate blue rectangles, correction is not guaranteed but detection is: in an unfavorable configuration of positives and errors, correction of all errors may fail, but this failure cannot go unnoticed, and the user can therefore plan additional experiments. In the cyan square, detection is usually also guaranteed, except if E is very small (E < 2·Γ-1): in this case, the line y = 3·E+1-x splits the square in two, and detection is only guaranteed in the bottom left portion, where the total number of errors is at most 3·E+1. Finally, in the outer pale cyan zone, no guarantee is provided.
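The third parameter of STD(n; q; t·Γ+2·E+1) in the caption above can be worked out numerically. The following Python sketch does so under two assumptions: the exponent in the definition of Γ is damaged in the caption, and the code takes Γ to be the smallest integer with q^(Γ+1) ≥ n; and t is read as the expected number of positives, with E the expected number of errors. The function names `compression_power` and `layer_count` are hypothetical.

```python
def compression_power(q, n):
    """Smallest integer gamma with q ** (gamma + 1) >= n (assumed definition)."""
    if q < 2 or n < 1:
        raise ValueError("need q >= 2 and n >= 1")
    gamma = 0
    power = q  # holds q ** (gamma + 1), using integer arithmetic only
    while power < n:
        power *= q
        gamma += 1
    return gamma


def layer_count(q, n, t, errors):
    """Third STD parameter from the caption: t * gamma + 2 * E + 1."""
    return t * compression_power(q, n) + 2 * errors + 1


if __name__ == "__main__":
    # Illustrative values only; they are not taken from the paper.
    n, q, t, errors = 10000, 13, 2, 1
    print("compression power:", compression_power(q, n))
    print("third STD parameter:", layer_count(q, n, t, errors))
```

Integer multiplication is used instead of logarithms so that exact powers of q, such as n = q², are not misclassified by floating-point rounding.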

    Interpool: interpreting smart-pooling results


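    Computational modeling and predictive analysis of protein-protein interactions in Caenorhabditis elegans (original title: Modélisation informatique et analyse prédictive des interactions protéine-protéine chez Caenorhabditis elegans)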

    The aim of this thesis is the computational modeling and predictive analysis of protein-protein interactions in Caenorhabditis elegans. The approach is as follows. First, we took part in producing interaction data within the project for systematic two-hybrid mapping of protein-protein interactions in C. elegans, led by Professor Marc Vidal at the Dana Farber Cancer Institute, Boston. We were responsible for all bioinformatics aspects, including the design of PCR primers for cloning, the development of algorithms for data production, the design of a database for storage, and the setup of a web interface for publishing the results. The next step was devoted to building a multi-organism protein-protein interaction database, federating the interaction data from Marc Vidal's laboratory with other data available in specialized databases. Particular attention was paid to the choice of descriptors used to characterize the proteins. The descriptors retained are, where available, the cellular localization and keywords from SwissProt, as well as domains from Pfam, PROSITE, and more recently InterPro. Finally, a third part concerned the design and implementation of a predictive system for protein-protein interactions. Our goal is to guide the research carried out in Professor Vidal's laboratory by proposing pairs of proteins that are likely to interact. The method developed belongs to the field of Knowledge Discovery in Databases (KDD). Most of the work concerns the design of original and relevant pre-processing and post-processing procedures. The results are encouraging, and make it possible to envisage biological validation in the near future, in collaboration with Professor Vidal's laboratory.