42 research outputs found

    PyBDA: a command line tool for automated analysis of big biological data sets

    Analysing large and high-dimensional biological data sets poses significant computational difficulties for bioinformaticians due to the lack of accessible tools that scale to hundreds of millions of data points. We developed a novel machine learning command line tool called PyBDA for automated, distributed analysis of big biological data sets. By using Apache Spark in the backend, PyBDA scales to data sets beyond the size of current applications. It uses Snakemake to automatically schedule jobs on a high-performance computing cluster. We demonstrate the utility of the software by analyzing image-based RNA interference data of 150 million single cells. PyBDA allows automated, easy-to-use data analysis using common statistical methods and machine learning algorithms. It can be used entirely through simple command line calls, making it accessible to a broad user base. PyBDA is available at https://pybda.rtfd.io

    Microscopy-based Assays for High-throughput Screening of Host Factors Involved in Brucella Infection of Hela Cells

    Brucella species are facultative intracellular pathogens that infect animals as their natural hosts. Transmission to humans is most commonly caused by direct contact with infected animals or by ingestion of contaminated food and can lead to severe chronic infections. Brucella can invade professional and non-professional phagocytic cells and replicates within endoplasmic reticulum (ER)-derived vacuoles. The host factors required for Brucella entry into host cells, avoidance of lysosomal degradation, and replication in the ER-like compartment remain largely unknown. Here we describe two assays to identify host factors involved in Brucella entry and replication in HeLa cells. The protocols describe the use of RNA interference, although alternative screening methods could be applied. The assays are based on the detection of fluorescently labeled bacteria in fluorescently labeled host cells using automated wide-field microscopy. The fluorescent images are analyzed using a standardized image analysis pipeline in CellProfiler, which allows single cell-based infection scoring. In the endpoint assay, intracellular replication is measured two days after infection. This allows bacteria to traffic to their replicative niche, where proliferation is initiated around 12 hr after bacterial entry. Brucella that have successfully established an intracellular niche will thus have proliferated strongly inside host cells. Since intracellular bacteria will greatly outnumber individual extracellular or intracellular non-replicative bacteria, a strain constitutively expressing GFP can be used. The strong GFP signal is then used to identify infected cells. In contrast, for the entry assay it is essential to differentiate between intracellular and extracellular bacteria. Here, a strain encoding a tetracycline-inducible GFP is used. Induction of GFP with simultaneous inactivation of extracellular bacteria by gentamicin enables the differentiation between intracellular and extracellular bacteria based on the GFP signal, with only intracellular bacteria being able to express GFP. This allows the robust detection of single intracellular bacteria before intracellular proliferation is initiated.

    Improved pathway reconstruction from RNA interference screens by exploiting off-target effects

    Pathway reconstruction has proven to be an indispensable tool for analyzing the molecular mechanisms of signal transduction underlying cell function. Nested effects models (NEMs) are a class of probabilistic graphical models designed to reconstruct signalling pathways from high-dimensional observations resulting from perturbation experiments, such as RNA interference (RNAi). NEMs assume that the short interfering RNAs (siRNAs) designed to knock down specific genes are always on-target. However, it has been shown that most siRNAs exhibit strong off-target effects, which further confound the data, resulting in unreliable reconstruction of networks by NEMs. Here, we present an extension of NEMs called probabilistic combinatorial nested effects models (pc-NEMs), which capitalize on the ancillary siRNA off-target effects for network reconstruction from combinatorial gene knockdown data. Our model employs an adaptive simulated annealing search algorithm for simultaneous inference of network structure and error rates inherent to the data. Evaluation of pc-NEMs on simulated data with varying number of phenotypic effects and noise levels as well as real data demonstrates improved reconstruction compared to classical NEMs. Application to Bartonella henselae infection RNAi screening data yielded an eight-node network largely in agreement with previous works, and revealed novel binary interactions of direct impact between established components. The software used for the analysis is freely available as an R package at https://github.com/cbg-ethz/pcNEM.git. Supplementary data are available at Bioinformatics online.

    A Role for the VPS Retromer in Brucella Intracellular Replication Revealed by Genomewide siRNA Screening

    Brucella, the agent causing brucellosis, is a major zoonotic pathogen with worldwide distribution. Brucella resides and replicates inside infected host cells in membrane-bound compartments called Brucella-containing vacuoles (BCVs). Following uptake, Brucella resides in endosomal BCVs (eBCVs) that gradually mature from early to late endosomal features. Through a poorly understood process that is key to the intracellular lifestyle of Brucella, the eBCV escapes fusion with lysosomes by transitioning to the replicative BCV (rBCV), a replicative niche directly connected to the endoplasmic reticulum (ER). Despite the notion that this complex intracellular lifestyle must depend on a multitude of host factors, a holistic view on which of these components control Brucella cell entry, trafficking, and replication is still missing. Here we used a systematic cell-based small interfering RNA (siRNA) knockdown screen in HeLa cells infected with Brucella abortus and identified 425 components of the human infectome for Brucella infection. These include multiple components of pathways involved in central processes such as the cell cycle, actin cytoskeleton dynamics, or vesicular trafficking. Using assays for pathogen entry, knockdown complementation, and colocalization at single-cell resolution, we identified the requirement of the VPS retromer for Brucella to escape the lysosomal degradative pathway and to establish its intracellular replicative niche. We thus validated the VPS retromer as a novel host factor critical for Brucella intracellular trafficking. Further, our genomewide data shed light on the interplay between central host processes and the biogenesis of the Brucella replicative niche. IMPORTANCE: With >300,000 new cases of human brucellosis annually, Brucella is regarded as one of the most important zoonotic bacterial pathogens worldwide. The agent causing brucellosis resides inside host cells within vacuoles termed Brucella-containing vacuoles (BCVs). Although a few host components required to escape the degradative lysosomal pathway and to establish the ER-derived replicative BCV (rBCV) have already been identified, the global understanding of this highly coordinated process is still partial, and many factors remain unknown. To gain deeper insight into these fundamental questions, we performed a genomewide RNA interference (RNAi) screen aiming at discovering novel host factors involved in the Brucella intracellular cycle. We identified 425 host proteins that contribute to Brucella cellular entry, intracellular trafficking, and replication. Together, this study sheds light on previously unknown host pathways required for the Brucella infection cycle and highlights the VPS retromer components as critical factors for the establishment of the Brucella intracellular replicative niche.

    Specific inhibition of diverse pathogens in human cells by synthetic microRNA-like oligonucleotides inferred from RNAi screens

    Systematic genetic perturbation screening in human cells remains technically challenging. Typically, large libraries of chemically synthesized siRNA oligonucleotides are used, each designed to degrade a specific cellular mRNA via the RNA interference (RNAi) mechanism. Here, we report on data from three genome-wide siRNA screens, conducted to uncover host factors required for infection of human cells by two bacterial and one viral pathogen. We find that the majority of phenotypic effects of siRNAs are unrelated to the intended “on-target” mechanism, defined by full complementarity of the 21-nt siRNA sequence to a target mRNA. Instead, phenotypes are largely dictated by “off-target” effects resulting from partial complementarity of siRNAs to multiple mRNAs via the “seed” region (i.e., nucleotides 2–8), reminiscent of the way specificity is determined for endogenous microRNAs. Quantitative analysis enabled the prediction of seeds that strongly and specifically block infection, independent of the intended on-target effect. This prediction was confirmed experimentally by designing oligos that do not have any on-target sequence match at all, yet can strongly reproduce the predicted phenotypes. Our results suggest that published RNAi screens have primarily, and unintentionally, screened the sequence space of microRNA seeds instead of the intended on-target space of protein-coding genes. This helps to explain why previously published RNAi screens have exhibited relatively little overlap. Our analysis suggests a possible way of identifying “seed reagents” for controlling phenotypes of interest and establishes a general strategy for extracting valuable untapped information from past and future RNAi screens
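    The seed-based view described in this abstract can be illustrated with a minimal sketch. It assumes each siRNA is given as its guide strand written 5' to 3', takes the seed to be nucleotides 2–8 (1-based, inclusive) as stated above, and groups a library by shared seed. The function names and the three example sequences are hypothetical and are not taken from the study.

```python
from collections import defaultdict


def seed_region(sirna: str, start: int = 2, end: int = 8) -> str:
    """Return nucleotides start..end (1-based, inclusive) of an siRNA.

    Assumption: `sirna` is the guide strand written 5' to 3'.
    DNA-style input (T) is normalised to RNA (U).
    """
    seq = sirna.strip().upper().replace("T", "U")
    if len(seq) < end:
        raise ValueError(f"sequence too short for a {start}-{end} seed: {seq!r}")
    return seq[start - 1:end]


def group_by_seed(sirnas: dict[str, str]) -> dict[str, list[str]]:
    """Map each seed sequence to the names of the siRNAs that carry it."""
    groups: dict[str, list[str]] = defaultdict(list)
    for name, seq in sirnas.items():
        groups[seed_region(seq)].append(name)
    return dict(groups)


# Hypothetical 21-nt sequences, for illustration only.
library = {
    "siRNA_A": "UAGCUUAUCAGACUGAUGUUG",
    "siRNA_B": "CAGCUUAUGGGACUCAUCAAC",
    "siRNA_C": "UUCGAAGUAUUCCGCGUACGU",
}

if __name__ == "__main__":
    for seed, names in group_by_seed(library).items():
        print(seed, names)
```

    In this toy library, siRNA_A and siRNA_B differ in their full 21-nt sequence but share the same 7-nt seed, so a seed-level analysis would pool their phenotypes even though their intended on-target genes could differ.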

    Strategies and solutions to maintain and retain data from high content imaging, analysis, and screening assays

    Data analysis and management in high content screening (HCS) has progressed significantly in the past 10 years. The analysis of the large volume of data generated in HCS experiments represents a significant challenge and is currently a bottleneck in many screening projects. In most screening laboratories, HCS has become a standard technology applied routinely to various applications from target identification to hit identification to lead optimization. An HCS data management and analysis infrastructure shared by several research groups can allow efficient use of existing IT resources and ensures company-wide standards for data quality and result generation. This chapter outlines typical HCS workflows and presents IT infrastructure requirements for multi-well plate-based HCS

    Genome-Wide siRNA Screen Identifies Complementary Signaling Pathways Involved in Listeria Infection and Reveals Different Actin Nucleation Mechanisms during Listeria Cell Invasion and Actin Comet Tail Formation

    Listeria monocytogenes enters nonphagocytic cells by a receptor-mediated mechanism that is dependent on a clathrin-based molecular machinery and actin rearrangements. Bacterial intra- and intercellular movements are also actin dependent and rely on the actin nucleating Arp2/3 complex, which is activated by host-derived nucleation-promoting factors downstream of the cell receptor Met during entry and by the bacterial nucleation-promoting factor ActA during comet tail formation. By genome-wide small interfering RNA (siRNA) screening for host factors involved in bacterial infection, we identified diverse cellular signaling networks and protein complexes that support or limit these processes. In addition, we could refine previously described molecular pathways involved in Listeria invasion. In particular, our results show that the requirements for actin nucleators during Listeria entry and actin comet tail formation are different. Knockdown of several actin nucleators, including SPIRE2, reduced bacterial invasion while not affecting the generation of comet tails. Most interestingly, we observed that, in contrast to our expectations, not all of the seven subunits of the Arp2/3 complex are required for Listeria entry into cells or actin tail formation and that the subunit requirements for each of these processes differ, highlighting a previously unsuspected versatility in Arp2/3 complex composition and function. IMPORTANCE: Listeria is a bacterial pathogen that induces its internalization within the cytoplasm of human cells and has been used for decades as a major molecular tool to manipulate cells in order to explore and discover cellular functions. We have inactivated individually, for the first time in epithelial cells, all the genes of the human genome to investigate whether each gene modifies positively or negatively the Listeria infectious process. We identified novel signaling cascades that have never been associated with Listeria infection. We have also revisited the role of the molecular complex Arp2/3 involved in the polymerization of the actin cytoskeleton, which was shown previously to be required for Listeria entry and movement inside host cells, and we demonstrate that contrary to the general dogma, some subunits of the complex are dispensable for both Listeria entry and bacterial movement.

    PyBDA: a command line tool for automated analysis of big biological data sets

    Background: Analysing large and high-dimensional biological data sets poses significant computational difficulties for bioinformaticians due to the lack of accessible tools that scale to hundreds of millions of data points. Results: We developed a novel machine learning command line tool called PyBDA for automated, distributed analysis of big biological data sets. By using Apache Spark in the backend, PyBDA scales to data sets beyond the size of current applications. It uses Snakemake to automatically schedule jobs on a high-performance computing cluster. We demonstrate the utility of the software by analyzing image-based RNA interference data of 150 million single cells. Conclusion: PyBDA allows automated, easy-to-use data analysis using common statistical methods and machine learning algorithms. It can be used entirely through simple command line calls, making it accessible to a broad user base. PyBDA is available at https://pybda.rtfd.io. ISSN: 1471-210

    NEMix: single-cell nested effects models for probabilistic pathway stimulation

    Nested effects models have been used successfully for learning subcellular networks from high-dimensional perturbation effects that result from RNA interference (RNAi) experiments. Here, we further develop the basic nested effects model using high-content single-cell imaging data from RNAi screens of cultured cells infected with human rhinovirus. RNAi screens with single-cell readouts are becoming increasingly common, and they often reveal high cell-to-cell variation. As a consequence of this cellular heterogeneity, knock-downs result in variable effects among cells and lead to weak average phenotypes on the cell population level. To address this confounding factor in network inference, we explicitly model the stimulation status of a signaling pathway in individual cells. We extend the framework of nested effects models to probabilistic combinatorial knock-downs and propose NEMix, a nested effects mixture model that accounts for unobserved pathway activation. We analyzed the identifiability of NEMix and developed a parameter inference scheme based on the Expectation Maximization algorithm. In an extensive simulation study, we show that NEMix improves learning of pathway structures over classical NEMs significantly in the presence of hidden pathway stimulation. We applied our model to single-cell imaging data from RNAi screens monitoring human rhinovirus infection, where limited infection efficiency of the assay results in uncertain pathway stimulation. Using a subset of genes with known interactions, we show that the inferred NEMix network has high accuracy and outperforms the classical nested effects model without hidden pathway activity. NEMix is implemented as part of the R/Bioconductor package 'nem' and available at www.cbg.ethz.ch/software/NEMix