Abstract

By limiting sequencing to those sequences transcribed as mRNA, whole exome sequencing is a cost-efficient technique often used in disease-association studies. We developed two target enrichment designs based on the recently released annotation of the canine genome: the exome-plus design and the exome-CDS design. The exome-plus design combines the exons of the CanFam 3.1 Ensembl annotation, more recently discovered protein-coding exons and a variety of non-coding RNA regions (microRNAs, long non-coding RNAs and antisense transcripts), leading to a total size of ≈152 Mb. The exome-CDS was designed as a subset of the exome-plus by omitting all 3′ and 5′ untranslated regions. This reduced the size of the exome-CDS to ≈71 Mb. To test the capturing performance, four exome-plus captures were sequenced on a NextSeq 500, with each capture containing four pre-capture pooled, barcoded samples. At an average sequencing depth of 68.3x, 80% of the regions and well over 90% of the targeted base pairs were completely covered at least 5 times with high reproducibility. Based on the performance of the exome-plus, we estimated the performance of the exome-CDS. Overall, these designs provide flexible solutions for a variety of research questions and are likely to be reliable tools in disease studies.

In 2014, the first report detailing the design and performance of a whole exome sequencing (WES) enrichment assay for the dog was published by our group 1. Aiming to selectively sequence all the regions that are transcribed to mRNA, WES is a reliable tool used to identify disease-causing or predisposing mutations at a fraction of the price of whole genome sequencing (WGS) studies. A limitation of WES is that it is based on our current knowledge of the annotation of the genome and that many disease-causing mutations are likely to fall outside protein-coding regions. With new information becoming available, updates and extensions are required.
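The coverage figures quoted above distinguish between target regions that are completely covered and individual targeted base pairs that are covered at least 5 times. The following is a minimal sketch of how the two metrics differ. It is not the pipeline used in this study: the function name `coverage_summary`, the in-memory per-base depth lists and the toy values are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical region type: (chromosome, start, end), 0-based and half-open.
Region = Tuple[str, int, int]


def coverage_summary(
    regions: List[Region],
    depth: Dict[str, List[int]],
    min_depth: int = 5,
) -> Tuple[float, float]:
    """Return (fraction of regions completely covered,
    fraction of targeted bases covered) at >= min_depth.

    Assumes `regions` is non-empty and that `depth[chrom][i]` holds the
    read depth at position i of that chromosome.
    """
    regions_ok = 0
    bases_ok = 0
    bases_total = 0
    for chrom, start, end in regions:
        per_base = depth[chrom][start:end]
        covered = sum(1 for d in per_base if d >= min_depth)
        bases_ok += covered
        bases_total += end - start
        # A region counts only if every one of its bases reaches min_depth.
        if covered == end - start:
            regions_ok += 1
    return regions_ok / len(regions), bases_ok / bases_total


# Toy data (invented for illustration only).
toy_depth = {"chr1": [0, 6, 7, 8, 5, 5, 2, 9, 9, 9]}
toy_regions = [("chr1", 1, 4), ("chr1", 4, 7), ("chr1", 7, 10)]
region_fraction, base_fraction = coverage_summary(toy_regions, toy_depth)
```

In this toy case a single low-depth base disqualifies the whole middle region, so the per-region fraction is lower than the per-base fraction. The same relationship holds between the 80% and >90% figures reported above.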
Recently, an improved annotation for the dog genome has been published and new data on non-protein-coding genes have been obtained 2. Based on these data, two new target enrichment designs for dogs, called the exome-plus and the exome-CDS, were developed. The exome-plus offers the most comprehensive design. The exome-CDS is a subset of the exome-plus, focusing on the coding DNA sequences (CDS) by excluding the 3′ and 5′ untranslated regions (UTRs). Thes
