47 research outputs found

    Semantic and generative models for lossy text compression

    The apparent divergence between the research paradigms of text and image compression has led us to consider the potential for applying methods developed for one domain to the other. This paper examines the idea of "lossy" text compression, which transmits an approximation to the input text rather than the text itself. In image coding, lossy techniques have proven to yield compression factors that are vastly superior to those of the best lossless schemes, and we show that this is also the case for text. Two different methods are described here, one inspired by the use of fractals in image compression. They can be combined into an extremely effective technique that provides much better compression than the present state of the art and yet preserves a reasonable degree of match between the original and received text. The major challenge for lossy text compression is identified as the reliable evaluation of the quality of this match.

    Multiple novel prostate cancer susceptibility signals identified by fine-mapping of known risk loci among Europeans

    Genome-wide association studies (GWAS) have identified numerous common prostate cancer (PrCa) susceptibility loci. We have fine-mapped 64 GWAS regions known at the conclusion of the iCOGS study using large-scale genotyping and imputation in 25 723 PrCa cases and 26 274 controls of European ancestry. We detected evidence for multiple independent signals at 16 regions, 12 of which contained additional newly identified significant associations. A single signal comprising a spectrum of correlated variation was observed at 39 regions, 35 of which are now described by a novel more significantly associated lead SNP, while the originally reported variant remained as the lead SNP only in 4 regions. We also confirmed two association signals in Europeans that had been previously reported only in East-Asian GWAS. Based on statistical evidence and linkage disequilibrium (LD) structure, we have curated and narrowed down the list of the most likely candidate causal variants for each region. Functional annotation using data from ENCODE filtered for PrCa cell lines and eQTL analysis demonstrated significant enrichment for overlap with bio-features within this set. By incorporating the novel risk variants identified here alongside the refined data for existing association signals, we estimate that these loci now explain ∼38.9% of the familial relative risk of PrCa, an 8.9% improvement over the previously reported GWAS tag SNPs. This suggests that a significant fraction of the heritability of PrCa may have been hidden during the discovery phase of GWAS, in particular due to the presence of multiple independent signals within the same region.

    Semantic and Generative Models for Lossy Text Compression

    This paper investigates the resulting trade-off between subjective quality of the transmission and its compression factor. Two different methods are described, which can be combined into an extremely effective technique that provides far better compression than the present state of the art and yet preserves a reasonable degree of perceived match between the original and received text. The major challenge for lossy text compression is the quantitative evaluation of the quality of this match.

    Large-scale interaction profiling of PDZ domains through proteomic peptide-phage display using human and viral phage peptidomes

    The human proteome contains a plethora of short linear motifs (SLiMs) that serve as binding interfaces for modular protein domains. Such interactions are crucial for signaling and other cellular processes, but are difficult to detect because of their low to moderate affinities. Here we developed a dedicated approach, proteomic peptide-phage display (ProP-PD), to identify domain–SLiM interactions. Specifically, we generated phage libraries containing all human and viral C-terminal peptides using custom oligonucleotide microarrays. With these libraries we screened the nine PSD-95/Dlg/ZO-1 (PDZ) domains of human Densin-180, Erbin, Scribble, and Disks large homolog 1 for peptide ligands. We identified several known and putative interactions potentially relevant to cellular signaling pathways and confirmed interactions between full-length Scribble and the target proteins β-PIX, plakophilin-4, and guanylate cyclase soluble subunit α-2 using colocalization and coimmunoprecipitation experiments. The affinities of recombinant Scribble PDZ domains and the synthetic peptides representing the C termini of these proteins were in the 1- to 40-μM range. Furthermore, we identified several well-established host–virus protein–protein interactions, and confirmed that PDZ domains of Scribble interact with the C terminus of Tax-1 of human T-cell leukemia virus with micromolar affinity. Previously unknown putative viral protein ligands for the PDZ domains of Scribble and Erbin were also identified. Thus, we demonstrate that our ProP-PD libraries are useful tools for probing PDZ domain interactions. The method can be extended to interrogate all potential eukaryotic, bacterial, and viral SLiMs, and we suggest it will be a highly valuable approach for studying cellular and pathogen–host protein–protein interactions.

    Mammalian-Membrane-Two-Hybrid (MaMTH): a novel split-ubiquitin assay for investigation of signaling pathways in human cells

    Cell signaling, one of the key processes involved in human health and disease, is coordinated by numerous membrane protein-protein interactions (PPIs) that change in response to stimuli. Currently, there is a lack of assays that can detect these changes in stimuli- and disease-related contexts. Here, we present a novel split-ubiquitin method for the detection of integral membrane PPIs in human cells, termed Mammalian-Membrane-Two-Hybrid (MaMTH). We highlight the strength of this technology by showing that it detects stimuli (hormone/agonist)- and phosphorylation-dependent PPIs. Importantly, it can detect changes in PPIs conferred by mutations such as those in oncogenic ErbB-receptor variants or by treatment with drugs like the tyrosine-kinase inhibitor erlotinib. Using MaMTH as a screening assay, we identified CRKII as an interactor of oncogenic EGFR L858R, promoting persistent activation of aberrant signaling. In conclusion, our study illustrates that MaMTH is a powerful tool for investigating dynamic interactomes of human integral membrane proteins. The work was supported by grants from the Ontario Genomics Institute (303547), Canadian Institutes of Health Research (Catalyst - NHG99091; ppp-125785), Canadian Foundation for Innovation (IOF-LOF), Natural Sciences and Engineering Research Council of Canada (RGPIN 372393-12), Canadian Cystic Fibrosis Foundation (300348), Canadian Cancer Society (2010-700406), Novartis, University Health Network (GL2-01-018), FWF-Erwin Schrödinger Fellowship program

    Interaction domains of Sos1/Grb2 are finely tuned for cooperative control of embryonic stem cell fate

    Metazoan evolution involves increasing protein domain complexity, but how this relates to control of biological decisions remains uncertain. The Ras guanine nucleotide exchange factor (RasGEF) Sos1 and its adaptor Grb2 are multidomain proteins that couple fibroblast growth factor (FGF) signaling to activation of the Ras-Erk pathway during mammalian development and drive embryonic stem cells toward the primitive endoderm (PrE) lineage. We show that the ability of Sos1/Grb2 to appropriately regulate pluripotency and differentiation factors and to initiate PrE development requires collective binding of multiple Sos1/Grb2 domains to their protein and phospholipid ligands. This provides a cooperative system that only allows lineage commitment when all ligand-binding domains are occupied. Furthermore, our results indicate that the interaction domains of Sos1 and Grb2 have evolved so as to bind ligands not with maximal strength but with specificities and affinities that maintain cooperativity. This optimized system ensures that PrE lineage commitment occurs in a timely and selective manner during embryogenesis.