17 research outputs found

    FileS7 ensGene_revised.gtf

    Revised .gtf file of Ensembl gene predictions. Coordinates of gene predictions were converted to the revised assembly coordinates. All Ensembl-predicted genes were included except ENSGACT00000019430, which spans two scaffolds (11 and 79) that are not adjacent in the revised genome assembly. File is zipped.

    convertCoordinate.R

    This R function converts between the 'old' and 'new' stickleback assembly coordinate systems. The 'old' coordinate system is the assembly described in the Jones et al. 2012 stickleback genome paper. The function requires access to the FileS4 NewScaffoldOrder.csv file. It takes 4 inputs: chr, pos, direction, and scafFile. It returns a list of [chromosome, position]. See README or comments in convertCoordinate.R for further details.
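
The kind of conversion described above can be illustrated with a minimal Python sketch. This is not the code in convertCoordinate.R and does not read FileS4 NewScaffoldOrder.csv. The `Scaffold` fields, the `old_to_new` function, and the toy table are hypothetical names and values invented for illustration, and the sketch assumes 1-based inclusive coordinates. The only behaviour taken from the descriptions is that scaffolds of 'unknown' orientation keep their original orientation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scaffold:
    """One scaffold placement; all field names are hypothetical."""
    old_chr: str
    old_start: int
    old_end: int
    new_chr: str
    new_start: int
    new_end: int
    orientation: str  # "forward", "reverse", or "unknown"


def old_to_new(chr_name, pos, scaffolds):
    """Map a 1-based position from the old coordinates to the new ones."""
    for s in scaffolds:
        if s.old_chr == chr_name and s.old_start <= pos <= s.old_end:
            offset = pos - s.old_start
            if s.orientation == "reverse":
                # A flipped scaffold counts the offset back from its new end.
                return (s.new_chr, s.new_end - offset)
            # "forward" and "unknown" scaffolds keep their original direction.
            return (s.new_chr, s.new_start + offset)
    raise ValueError(f"{chr_name}:{pos} is not covered by any scaffold")


# Toy table: two scaffolds that swap places; the first is also flipped.
toy_table = [
    Scaffold("chrI", 1, 100, "chrI", 201, 300, "reverse"),
    Scaffold("chrI", 101, 200, "chrI", 1, 100, "unknown"),
]

print(old_to_new("chrI", 1, toy_table))
print(old_to_new("chrI", 150, toy_table))
```

The reverse direction ('new' to 'old') would apply the same logic with the roles of the old and new intervals exchanged.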

    FileS4 NewScaffoldOrder.csv

    Revised scaffold order for each chromosome (consensus of FTC and BEPA). Revised coordinates (based on this study) and original assembly coordinates are presented. Orientations are defined relative to the original genome assembly. The orientation of some scaffolds was not detected in this study. These scaffolds are labeled as having 'unknown' orientation; their orientation was not altered relative to their orientation in the original genome assembly. Chromosome 'M' is the mitochondrial genome sequence, which was not analyzed in this study but is replicated in the revised genome assembly.

    FileS6 revisedAssemblyMasked.fa.zip

    Repeat-masked fasta file containing the revised genome assembly, based on the consensus scaffold order and orientation described in File S4 of the Glazer et al. manuscript. The file is derived from the repeat-masked version of the original genome assembly, which was masked with RepeatMasker. File is zipped.

    README

    Summary of all files in this Dryad package.

    A variety of SNP genetic architectures affect exon model phenotypes.

    A–C) Diverse gene model phenotypes associated with a single SNP. A) A SNP is part of a start codon in vdB* and a splice acceptor in vdB on contig 21: 133200–133500. B) An entire five-exon gene is predicted in vdB but not vdB* on contig 24: 56700–57500. C) A rare example in which an intergenic SNP causes a gene to flip strands on contig 22: 2627100–2627839. D–F) Examples of events that are completely explained by multiple SNPs. All possible SNP combinations and associated phenotypes are shown in green. D) One SNP is part of a splice donor and a second SNP is part of a splice acceptor. Both must be present for the additional exon to be predicted. Contig 19: 82900–82200. E) Three exonic SNPs affect the presence of the first exon and the length of the middle exon. Contig 24: 503000–504500. F) Four intronic SNPs explain the presence/absence of an intron. Contig 22: 2204700–2204850. G) An example of an event that is not completely explained. The two SNPs in bold are significantly correlated with the event, but examining those SNPs alone does not fully explain the phenotype. We also observe a gene model that contains a third phenotype with a longer second exon, unobserved in either vdB or vdB*. Contig 20: 3227900–3228500.

    Locations of SNPs that perfectly associate with gene models.

    Note that because each event can belong to one or more categories (Intron gain/loss, Exon gain/loss, and/or Shift), the Total is not necessarily equal to the sum of the three categories.

    Naturally occurring SNPs cause fewer gene model differences than randomly placed SNPs.

    A) Gene model differences between vdB and vdB*, where all base pair changes were made at naturally occurring SNP positions between UCB and vdB. B) Average gene model differences between vdB and 100 genomes in which SNPs were randomly relocated. Area in circle is proportional to the number of changes observed.

    Most events are significantly correlated with one or more SNPs.

    A) Histogram of the correlation between every pair-wise event-SNP combination. A correlation of 1.0 (diamond marker) indicates perfect agreement between the SNP genotype and event phenotype in all intermediate genomes. The black line shows a theoretical binomial distribution of correlations between independent events and SNPs. B) Histogram showing the largest SNP-event correlation for each event. C) The event phenotype frequency, the fraction of intermediate genomes containing the less common phenotype of each event, binned as in B. For all graphs, the significance cutoff is shown by the red line.
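
To make the meaning of an event-SNP correlation of 1.0 concrete, here is a minimal Python sketch. The description does not state which correlation statistic was used, so the Pearson coefficient on 0/1-coded vectors is an assumption here. The function name `pearson` and the toy genotype and phenotype vectors are made up for illustration and are not data from the study.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    if n == 0 or n != len(ys):
        raise ValueError("sequences must be non-empty and of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # A constant vector has no defined correlation.
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical coding: one entry per intermediate genome.
# snp[i] is 1 if genome i carries the alternate allele, else 0.
# event[i] is 1 if genome i shows the alternate gene model, else 0.
snp = [0, 1, 1, 0]
event_same = [0, 1, 1, 0]       # phenotype tracks the genotype in every genome
event_unrelated = [0, 0, 1, 1]  # phenotype shares no linear signal with the SNP

print(pearson(snp, event_same))
print(pearson(snp, event_unrelated))
```

With this coding, a value of 1.0 arises only when the phenotype matches the genotype in every genome, which corresponds to the "perfect agreement" case in panel A.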