Abstract

This tarball contains folders with the results of running MEME and MAST on three sets of proteins (i.e., the training set, the positive control, and the negative control). The data are organized in three directories (see Methods in the manuscript for more details):

MEME: Contains the results of running MEME on a sample of 50 sequences for each of the nine repeats characteristic of the Piezo family (TC: 1.A.75), resulting in 450 sequences total. Three motifs, each 25 aa long, were identified showing an E-value …

MAST: Contains two subfolders:

1. Folder Piezo_homologs contains the results of running MAST on 1229 non-redundant homologs of the Piezo family using the motif models in folder MEME (E-value …).
2. Folder Negative_control contains the results of running MAST on the entire protein content of TCDB (22,215 proteins) using the motif models in folder MEME (E-value …).

Sequences: Contains three sequence files used in the analysis:

1. File Piezo_50seqs_per_repeat.faa contains the random samples of 50 sequences per repeat (450 sequences total) used to run MEME. The repeat to which each sequence belongs is identified with the prefix Rn_, where n = 1...9.
2. File negative_control.faa contains all 22,215 TCDB sequences in the negative control.
3. File Piezo_homologs.faa contains 1229 non-redundant homologs of the Piezo family.
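The Rn_ prefix convention makes it straightforward to check the composition of the training sample. The following is a minimal sketch, not part of the deposited data, that counts FASTA records per repeat and verifies the stated 50 sequences for each of the nine repeats. It assumes the prefix appears at the very start of each sequence identifier, directly after the ">" character. The function names and the path in the usage comment are hypothetical.

```python
import re
from collections import Counter

# Assumption: the repeat prefix (R1_ ... R9_) starts the sequence ID,
# immediately after the '>' of each FASTA header line.
REPEAT_PREFIX = re.compile(r"^>(R[1-9])_")


def count_sequences_per_repeat(lines):
    """Count FASTA records per repeat label from an iterable of text lines."""
    counts = Counter()
    for line in lines:
        match = REPEAT_PREFIX.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def check_sample(counts, expected_per_repeat=50, n_repeats=9):
    """Return True when every repeat R1..Rn has the expected record count."""
    return all(
        counts.get(f"R{i}", 0) == expected_per_repeat
        for i in range(1, n_repeats + 1)
    )


# Hypothetical usage; the path depends on where the tarball was extracted:
# with open("Sequences/Piezo_50seqs_per_repeat.faa") as handle:
#     counts = count_sequences_per_repeat(handle)
# print(dict(counts), check_sample(counts))
```

Header lines that do not match the prefix pattern are ignored, so the same function can be pointed at the other two sequence files without raising an error, although it would then report no repeat labels.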
