16 research outputs found

    Visualizing the Structure of Large Trees

    This study introduces a new method of visualizing complex tree-structured objects. The usefulness of this method is illustrated in the context of detecting unexpected features in a data set of very large trees. The major contribution is a novel two-dimensional graphical representation of each tree, with a covariate coded by color. The motivating data set contains three-dimensional representations of brain artery systems of 105 subjects. Due to inaccuracies inherent in the medical imaging techniques, issues with the reconstruction algorithms and inconsistencies introduced by manual adjustment, various discrepancies are present in the data. The proposed representation enables quick visual detection of the most common discrepancies. For our driving example, this tool led to the modification of 10% of the artery trees and deletion of 6.7%. The benefits of our cleaning method are demonstrated through a statistical hypothesis test on the effects of aging on vessel structure. The data cleaning resulted in improved significance levels. Comment: 17 pages, 8 figures

    Dimension reduction in principal component analysis for trees

    The statistical analysis of tree-structured data is a new topic in statistics with wide application areas. Some Principal Component Analysis (PCA) ideas were previously developed for binary tree spaces. In this study, we extend these ideas to the more general space of rooted and labeled trees. We redefine concepts such as tree-line and forward principal component tree-line for this more general space, and generalize the optimal algorithm that finds them. We then develop an analog of the classical dimension reduction technique in PCA for the tree space. To do this, we define the components that carry the least amount of variation of a tree data set, called backward principal components. We present an optimal algorithm to find them. Furthermore, we investigate the relationship of these to the forward principal components, and prove a path-independency property between the forward and backward techniques. We apply our methods to a brain artery data set of 98 subjects. Using our techniques, we investigate how aging affects the brain artery structure of males and females. We also analyze a data set of the organization structure of a large US company and explore the structural differences across different types of departments within the company.

    A Nonparametric Regression Model With Tree-Structured Response

    Advances in science and technology over the last two decades have motivated the study of complex data objects. In this paper, we consider the topological properties of a population of tree-structured objects. Our interest centers on modeling the relationship between a tree-structured response and other covariates. For tree objects, this poses serious challenges since most regression methods rely on linear operations in Euclidean space. We generalize the notion of nonparametric regression to the case of a tree-structured response variable. In addition, a fast algorithm with theoretical justification is developed. We apply the proposed method to a data set of human brain artery trees. An important lesson is that smoothing in the full tree space can reveal much deeper scientific insights than the simple smoothing of summary statistics.

    Clinical validation of a Cas13-based assay for the detection of SARS-CoV-2 RNA

    © 2020, The Author(s), under exclusive licence to Springer Nature Limited. Nucleic acid detection by isothermal amplification and the collateral cleavage of reporter molecules by CRISPR-associated enzymes is a promising alternative to quantitative PCR. Here, we report the clinical validation of the specific high-sensitivity enzymatic reporter unlocking (SHERLOCK) assay using the enzyme Cas13a from Leptotrichia wadei for the detection of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2)—the virus that causes coronavirus disease 2019 (COVID-19)—in 154 nasopharyngeal and throat swab samples collected at Siriraj Hospital, Thailand. Within a detection limit of 42 RNA copies per reaction, SHERLOCK was 100% specific and 100% sensitive with a fluorescence readout, and 100% specific and 97% sensitive with a lateral-flow readout. For the full range of viral load in the clinical samples, the fluorescence readout was 100% specific and 96% sensitive. For 380 SARS-CoV-2-negative pre-operative samples from patients undergoing surgery, SHERLOCK was in 100% agreement with quantitative PCR with reverse transcription. The assay, which we show is amenable to multiplexed detection in a single lateral-flow strip incorporating an internal control for ribonuclease contamination, should facilitate SARS-CoV-2 detection in settings with limited resources

    RNA-guided DNA insertion with CRISPR-associated transposases

    CRISPR-Cas nucleases are powerful tools for manipulating nucleic acids; however, targeted insertion of DNA remains a challenge, as it requires host cell repair machinery. Here we characterize a CRISPR-associated transposase from the cyanobacterium Scytonema hofmanni (ShCAST) that consists of Tn7-like transposase subunits and the type V-K CRISPR effector (Cas12k). ShCAST catalyzes RNA-guided DNA transposition by unidirectionally inserting segments of DNA 60 to 66 base pairs downstream of the protospacer. ShCAST integrates DNA into targeted sites in the Escherichia coli genome with frequencies of up to 80% without positive selection. This work expands our understanding of the functional diversity of CRISPR-Cas systems and establishes a paradigm for precision DNA insertion. Supported by the National Institutes of Health (Grants 1R01-HG009761, 1R01-MH110049, and 1DP1-HL141201).

    Dual modes of CRISPR-associated transposon homing

    Tn7-like transposons have co-opted CRISPR systems, including class 1 type I-F, I-B, and class 2 type V-K. Intriguingly, although these CRISPR-associated transposases (CASTs) undergo robust CRISPR RNA (crRNA)-guided transposition, they are almost never found in sites targeted by the crRNAs encoded by the cognate CRISPR array. To understand this paradox, we investigated CAST V-K and I-B systems and found two distinct modes of transposition: (1) crRNA-guided transposition and (2) CRISPR array-independent homing. We show that distinct CAST systems utilize different molecular mechanisms to target their homing site. Type V-K CAST systems use a short, delocalized crRNA for RNA-guided homing, whereas type I-B CAST systems, which contain two distinct target selector proteins, use TniQ for RNA-guided DNA transposition and TnsD for homing to an attachment site. These observations illuminate a key step in the life cycle of CAST systems and highlight the diversity of molecular mechanisms mediating transposon homing.

    Mammalian retrovirus-like protein PEG10 packages its own mRNA and can be pseudotyped for mRNA delivery

    Hitching a ride with a retroelement. Retroviruses and retroelements have inserted their genetic code into mammalian genomes throughout evolution. Although many of these integrated virus-like sequences pose a threat to genomic integrity, some have been retooled by mammalian cells to perform essential roles in development. Segel et al. found that one of these retroviral-like proteins, PEG10, directly binds to and secretes its own mRNA in extracellular virus-like capsids. These virus-like particles were then pseudotyped with fusogens to deliver functional mRNA cargos to mammalian cells. This potentially provides an endogenous vector for RNA-based gene therapy. —D