3 research outputs found
Microsatellite_genotypes_txt
This file contains the microsatellite genotypes for each individual penguin, grouped by colony, in a format that can be used directly in Arlequin (presented here as a .txt file). The 8 microsatellite loci are labelled in a comment at the top of the file.
Python script for filtering .SAM formatted mapping files aligned with BWA mem
The filter.py script works on sorted SAM-formatted mapping files produced by BWA mem alignment. For every pair of mapped forward and reverse reads, it parses the CIGAR field (column 6 of the SAM file) and the MD tag to calculate the number of insertions, deletions, and mismatches. If a pair of reads has five or fewer mismatches and two or fewer insertions/deletions, the pair is kept and printed to standard output. SAM header lines are ignored by the parser but are also printed to standard output, so that the output remains compatible with downstream analysis.
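The counting step described above can be sketched as follows. This is a minimal illustration, not the filter.py script itself: the function names are hypothetical, insertions and deletions are counted as CIGAR operations rather than bases, and the thresholds are assumed to apply to the combined totals of both mates. The original script may differ on any of these points.

```python
import re

# One CIGAR operation: a length followed by an operation code.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
# A deletion in the MD tag: '^' followed by the deleted reference bases.
MD_DELETION = re.compile(r"\^[A-Z]+")


def count_indels(cigar):
    """Return (insertions, deletions), counted as CIGAR operations (assumption)."""
    ops = [op for _length, op in CIGAR_OP.findall(cigar)]
    return ops.count("I"), ops.count("D")


def count_mismatches(md):
    """Count mismatched bases in an MD tag value, ignoring deleted bases."""
    return len(re.findall(r"[A-Z]", MD_DELETION.sub("", md)))


def read_stats(sam_line):
    """Return (mismatches, indels) for one SAM alignment line."""
    fields = sam_line.rstrip("\n").split("\t")
    cigar = fields[5]  # column 6 of the SAM file
    # Optional tags start at column 12; a missing MD tag is treated as empty.
    md = next((f[5:] for f in fields[11:] if f.startswith("MD:Z:")), "")
    insertions, deletions = count_indels(cigar)
    return count_mismatches(md), insertions + deletions


def keep_pair(line1, line2, max_mismatches=5, max_indels=2):
    """Assumed rule: thresholds apply to the totals over both mates."""
    mm1, indel1 = read_stats(line1)
    mm2, indel2 = read_stats(line2)
    return mm1 + mm2 <= max_mismatches and indel1 + indel2 <= max_indels


# Two made-up alignment records for illustration only.
forward = "r1\t99\tchr1\t100\t60\t10M1I5M\t=\t200\t150\tACGT\tIIII\tMD:Z:7A7"
reverse = "r1\t147\tchr1\t200\t60\t8M2D8M\t=\t100\t-150\tACGT\tIIII\tMD:Z:8^AC0T7"
print(keep_pair(forward, reverse))
```

Removing the `^`-prefixed runs before counting letters keeps deleted reference bases from being mistaken for mismatches in the MD tag.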
GPS_Coordinates_Colonies
This file contains the GPS coordinates (longitude west and latitude south) and UTM coordinates for the 14 colonies where samples were obtained.