126 research outputs found

    Pivotal estimation in high-dimensional regression via linear programming

    We propose a new method of estimation in the high-dimensional linear regression model. It allows for very weak distributional assumptions, including heteroscedasticity, and does not require knowledge of the variance of the random errors. The method is based on linear programming only, so its numerical implementation is faster than that of previously known techniques using conic programs, and it allows one to deal with higher-dimensional models. We provide upper bounds for the estimation and prediction errors of the proposed estimator, showing that it achieves the same rate as in the more restrictive situation of fixed design and i.i.d. Gaussian errors with known variance. Following Gautier and Tsybakov (2011), we obtain the results under weaker sensitivity assumptions than the restricted eigenvalue or assimilated conditions.

    Molecular heterogeneity at the network level: high-dimensional testing, clustering and a TCGA case study.

    MOTIVATION: Molecular pathways and networks play a key role in basic and disease biology. An emerging notion is that networks encoding patterns of molecular interplay may themselves differ between contexts, such as cell type, tissue or disease (sub)type. However, while statistical testing of differences in mean expression levels has been extensively studied, testing of network differences remains challenging. Furthermore, since network differences could provide important and biologically interpretable information to identify molecular subgroups, there is a need to consider the unsupervised task of learning subgroups and networks that define them. This is a nontrivial clustering problem, with neither subgroups nor subgroup-specific networks known at the outset. RESULTS: We leverage recent ideas from high-dimensional statistics for testing and clustering in the network biology setting. The methods we describe can be applied directly to most continuous molecular measurements and networks do not need to be specified beforehand. We illustrate the ideas and methods in a case study using protein data from The Cancer Genome Atlas (TCGA). This provides evidence that patterns of interplay between signalling proteins differ significantly between cancer types. Furthermore, we show how the proposed approaches can be used to learn subtypes and the molecular networks that define them. AVAILABILITY AND IMPLEMENTATION: As the Bioconductor package nethet. CONTACT: [email protected] or [email protected]. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online

    Electrostatically gated membrane permeability in inorganic protocells

    Although several strategies are now available to produce functional microcompartments analogous to primitive cell-like structures, little progress has been made in generating protocell constructs with self-controlled membrane permeability. Here we describe the preparation of water-dispersible colloidosomes based on silica nanoparticles and delineated by a continuous semipermeable inorganic membrane capable of self-activated, electrostatically gated permeability. We use crosslinking and covalent grafting of a pH-responsive copolymer to generate an ultrathin elastic membrane that exhibits selective release and uptake of small molecules. This behaviour, which depends on the charge of the copolymer coronal layer, serves to trigger enzymatic dephosphorylation reactions specifically within the protocell aqueous interior. This system represents a step towards the design and construction of alternative types of artificial chemical cells and protocell models based on spontaneous processes of inorganic self-organization

    Estimating Parameters of Speciation Models Based on Refined Summaries of the Joint Site-Frequency Spectrum

    Understanding the processes and conditions under which populations diverge to give rise to distinct species is a central question in evolutionary biology. Since recently diverged populations have high levels of shared polymorphisms, it is challenging to distinguish between recent divergence with no (or very low) inter-population gene flow and older splitting events with subsequent gene flow. Recently published methods to infer speciation parameters under the isolation-migration framework are based on summarizing polymorphism data at multiple loci in two species using the joint site-frequency spectrum (JSFS). We have developed two improvements of these methods based on a more extensive use of the JSFS classes of polymorphisms for species with high intra-locus recombination rates. First, using a likelihood based method, we demonstrate that taking into account low-frequency polymorphisms shared between species significantly improves the joint estimation of the divergence time and gene flow between species. Second, we introduce a local linear regression algorithm that considerably reduces the computational time and allows for the estimation of unequal rates of gene flow between species. We also investigate which summary statistics from the JSFS allow the greatest estimation accuracy for divergence time and migration rates for low (around 10) and high (around 100) numbers of loci. Focusing on cases with low numbers of loci and high intra-locus recombination rates we show that our methods for the estimation of divergence time and migration rates are more precise than existing approaches
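
    The joint site-frequency spectrum mentioned above can be illustrated with a small sketch. This is not the authors' implementation: the function names (`joint_sfs`, `summarize_classes`), the input layout (one derived-allele count per site and per sample) and the four class labels are assumptions chosen for illustration, following the usual split into private, shared and fixed polymorphisms.

```python
def joint_sfs(counts1, counts2, n1, n2):
    """Build the joint site-frequency spectrum of two samples.

    counts1[k], counts2[k]: derived-allele counts at site k in samples
    of n1 and n2 chromosomes (hypothetical input layout).
    Returns an (n1+1) x (n2+1) matrix; entry [i][j] is the number of
    sites with i derived copies in sample 1 and j in sample 2.
    """
    if len(counts1) != len(counts2):
        raise ValueError("both samples must cover the same sites")
    jsfs = [[0] * (n2 + 1) for _ in range(n1 + 1)]
    for i, j in zip(counts1, counts2):
        if not (0 <= i <= n1 and 0 <= j <= n2):
            raise ValueError("allele count outside sample size")
        jsfs[i][j] += 1
    return jsfs


def summarize_classes(jsfs):
    """Collapse a JSFS into four coarse classes of sites.

    private_1 / private_2: segregating in one sample only
    shared: segregating in both samples
    fixed: fixed for different alleles in the two samples
    Sites monomorphic across both samples are ignored.
    """
    n1, n2 = len(jsfs) - 1, len(jsfs[0]) - 1
    classes = {"private_1": 0, "private_2": 0, "shared": 0, "fixed": 0}
    for i, row in enumerate(jsfs):
        for j, n_sites in enumerate(row):
            seg1 = 0 < i < n1  # polymorphic within sample 1
            seg2 = 0 < j < n2  # polymorphic within sample 2
            if seg1 and seg2:
                classes["shared"] += n_sites
            elif seg1:
                classes["private_1"] += n_sites
            elif seg2:
                classes["private_2"] += n_sites
            elif (i, j) in ((n1, 0), (0, n2)):
                classes["fixed"] += n_sites
    return classes


# Toy data: five sites, four chromosomes sampled per species.
example = joint_sfs([1, 0, 2, 4, 0], [0, 2, 1, 0, 0], n1=4, n2=4)
summary = summarize_classes(example)
```

    A finer summary, such as counting shared polymorphisms separately by frequency, would subdivide the `shared` class by the indices `i` and `j` rather than pooling it.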

    Complex patterns of local adaptation in teosinte

    Populations of widely distributed species often encounter and adapt to specific environmental conditions. However, comprehensive characterization of the genetic basis of adaptation is demanding, requiring genome-wide genotype data, multiple sampled populations, and a good understanding of population structure. We have used environmental and high-density genotype data to describe the genetic basis of local adaptation in 21 populations of teosinte, the wild ancestor of maize. We found that altitude, dispersal events and admixture among subspecies formed a complex hierarchical genetic structure within teosinte. Patterns of linkage disequilibrium revealed four mega-base scale inversions that segregated among populations and had altitudinal clines. Based on patterns of differentiation and correlation with environmental variation, inversions and nongenic regions play an important role in local adaptation of teosinte. Further, we note that strongly differentiated individual populations can bias the identification of adaptive loci. The role of inversions in local adaptation has been predicted by theory and requires attention as genome-wide data become available for additional plant species. These results also suggest a potentially important role for noncoding variation, especially in large plant genomes in which the gene space represents a fraction of the entire genome

    A pan-cancer proteomic perspective on The Cancer Genome Atlas.

    Protein levels and function are poorly predicted by genomic and transcriptomic analysis of patient tumours. Therefore, direct study of the functional proteome has the potential to provide a wealth of information that complements and extends genomic, epigenomic and transcriptomic analysis in The Cancer Genome Atlas (TCGA) projects. Here we use reverse-phase protein arrays to analyse 3,467 patient samples from 11 TCGA 'Pan-Cancer' diseases, using 181 high-quality antibodies that target 128 total proteins and 53 post-translationally modified proteins. The resultant proteomic data are integrated with genomic and transcriptomic analyses of the same samples to identify commonalities, differences, emergent pathways and network biology within and across tumour lineages. In addition, tissue-specific signals are reduced computationally to enhance biomarker and target discovery spanning multiple tumour lineages. This integrative analysis, with an emphasis on pathways and potentially actionable proteins, provides a framework for determining the prognostic, predictive and therapeutic relevance of the functional proteome

    Generic Delivery of Payload of Nanoparticles Intracellularly via Hybrid Polymer Capsules for Bioimaging Applications

    With the goal of developing a generic nanomaterial delivery system that can deliver ‘as prepared’ nanoparticles without ‘further surface modification’, we have fabricated a hybrid polymer capsule as a delivery vehicle in which nanoparticles are loaded within the cavity. To this end, we report a generic approach to prepare nanomaterial-loaded polyelectrolyte multilayered (PEM) capsules, in which polystyrene sulfonate (PSS)/polyallylamine hydrochloride (PAH) polymer capsules were employed as nano/microreactors to synthesize a variety of nanomaterials (metal nanoparticles; lanthanide-doped inorganic nanoparticles; gadolinium-based nanoparticles; cadmium-based nanoparticles; different shapes of nanoparticles; co-loading of two types of nanoparticles) in their hollow cavity. These nanoparticle-loaded capsules were used to demonstrate generic intracellular delivery of a nanoparticle payload to HeLa cells, without the need for individual nanoparticle surface modification. Internalization of the nanoparticle-loaded capsules by HeLa cells was confirmed by confocal laser scanning microscopy. Green emission from Tb3+ was observed after internalization of LaF3:Tb3+(5%) nanoparticle-loaded capsules by HeLa cells, which suggests that nanoparticles in hybrid capsules retain their functionality within the cells. In vitro cytotoxicity studies of these nanoparticle-loaded capsules showed little or no cytotoxicity in comparison to blank capsules or untreated cells; the biocompatible polymeric shell of the capsules thus offers a way of avoiding direct contact between nanoparticles and cells. The proposed hybrid delivery system can potentially be developed to bypass a series of biological barriers and deliver multiple cargoes (both simultaneously and individually) without the need for individual cargo design or modification.

    Transcriptome Analysis and SNP Development Can Resolve Population Differentiation of Streblospio benedicti, a Developmentally Dimorphic Marine Annelid

    Next-generation sequencing technology is now frequently being used to develop genomic tools for non-model organisms, which are generally important for advancing studies of evolutionary ecology. One such species, the marine annelid Streblospio benedicti, is an ideal system to study the evolutionary consequences of larval life history mode because the species displays a rare offspring dimorphism termed poecilogony, where females can produce either many small offspring or a few large ones. To further develop S. benedicti as a model system for studies of life history evolution, we apply 454 sequencing to characterize the transcriptome for embryos, larvae, and juveniles of this species, for which no genomic resources are currently available. Here we performed a de novo alignment of 336,715 reads generated by a quarter GS-FLX (Roche 454) run, which produced 7,222 contigs. We developed a novel approach for evaluating the site frequency spectrum across the transcriptome to identify potential signatures of selection. We also developed 84 novel single nucleotide polymorphism (SNP) markers for this species that are used to distinguish coastal populations of S. benedicti. We validated the SNPs by genotyping individuals of different developmental modes using the BeadXPress Golden Gate assay (Illumina). This allowed us to evaluate markers that may be associated with life-history mode

    The Leucine Zipper Domains of the Transcription Factors GCN4 and c-Jun Have Ribonuclease Activity

    Basic-region leucine zipper (bZIP) proteins are one of the largest transcription factor families and regulate a wide range of cellular functions. Owing to the stability of their coiled-coil structure, leucine zipper (LZ) domains of bZIP factors are widely employed as dimerization motifs in protein engineering studies. In the course of one such study, the X-ray structure of the retro-version of the LZ moiety of the yeast transcriptional activator GCN4 suggested that this retro-LZ may have ribonuclease activity. Here we show that not only the retro-LZ but also the authentic LZ of GCN4 has weak but distinct ribonuclease activity. The observed cleavage of RNA is unspecific, is not suppressed by the ribonuclease A inhibitor RNasin, and involves the breakage of 3′,5′-phosphodiester bonds with formation of 2′,3′-cyclic phosphates as the final products, as demonstrated by HPLC/electrospray ionization mass spectrometry. Several mutants of the GCN4 leucine zipper are catalytically inactive, providing important negative controls and unequivocally associating the enzymatic activity with the peptide under study. The leucine zipper moiety of the human factor c-Jun, as well as the entire c-Jun protein, is also shown to catalyze degradation of RNA. The presented data, which were obtained in test-tube experiments, add GCN4 and c-Jun to the pool of proteins with multiple functions (also known as moonlighting proteins). If expressed in vivo, the endoribonuclease activity of these bZIP-containing factors may represent a direct coupling between transcription activation and controlled RNA turnover. As an additional result of this work, the retro-leucine zipper of GCN4 can be added to the list of functional retro-peptides.

    Monoterpene Variation Mediated Attack Preference Evolution of the Bark Beetle Dendroctonus valens

    Several studies suggest that some bark beetles preferentially attack large trees. The invasive red turpentine beetle (RTB), Dendroctonus valens LeConte, one of the most destructive forest pests in China, is known to exhibit this behavior. Our previous study demonstrated that RTBs preferred to attack large-diameter trees (diameter at breast height, DBH ≥30 cm) over small-diameter trees (DBH ≤10 cm) in the field. In the current study, we examined this attack behavior and its underlying mechanisms in the laboratory. Behavioral assays showed that RTBs preferred the bark of large-DBH trees and had a higher attack rate on the bolts of these trees. Y-tube assays showed that RTBs preferred the volatiles released by large-DBH trees to those released by small-DBH trees. Subsequent analysis revealed that large- and small-DBH trees had the same composition of monoterpenes but differed in the concentration of each component; the concentrations therefore appeared to act as cues for RTBs to locate a host of the right size, which was confirmed by further behavioral assays. Moreover, large-DBH pine trees provided more spacious habitat and contained more nutrients, such as nitrogen, than did small-DBH pine trees, which benefited RTBs' fecundity and larval development. RTBs seem to have evolved mechanisms to locate large hosts that allow them to maximize their fitness. That attack preference is mediated by monoterpene variation suggests a potential avenue for the management of RTB.