Multiple species comparative analysis of human chromosome 22 between markers D22S1687 and D22S419 and gene expression profiling in zebrafish.

Abstract

Comparison of a 4.5 Mb region of human chromosome 22 between markers D22S1687 and D22S419 with the syntenic region in chimpanzee revealed an overall DNA sequence identity of approximately 97.6% and a Ka/Ks ratio of approximately 0.25 for known protein coding genes. Most amino acid changes were between hydrophilic amino acids, followed by changes between hydrophobic amino acids; changes from hydrophobic to hydrophilic amino acids, or vice versa, were the least frequent. Thus, the first major conclusion of this study is that, overall, this chromosomal region is highly conserved between human and chimpanzee, and the known protein coding genes are undergoing purifying selection, in which approximately 75% of the nucleotide substitutions that would have led to amino acid changes were eliminated by selection.

Major large scale insertions or deletions that resulted in gene number differences between human and chimpanzee were discovered in the IGLL region and the LCR22s within this region: four human insertions of 6 Kb to 75 Kb and three chimpanzee insertions of 12 Kb to 74 Kb in the IGLL region, two human insertions of 59 Kb and 36 Kb in LCR22-6, and a 67 Kb chimpanzee insertion in LCR22-8. Small scale insertions and deletions, exon shuffling, an elevated nucleotide divergence rate, and positive selection were also observed in the putative genes, partially duplicated genes, and pseudogenes in the IGLL region and the LCR22s. Thus, the second major conclusion of this study is that the major differences between human and chimpanzee in this region lie in the highly repetitive regions of the IGLL and the LCR22s.

In whole mount in situ hybridization studies, a total of 12 zebrafish orthologs of human genes, including 4 newly predicted putative genes with no previously known expression profile or function, showed specific expression in the developing zebrafish embryonic central nervous system, optic system, neural crest cells, otic vesicle, liver, and notochord. Thus, the third major conclusion of this study is that many predicted genes that currently lack expression data and functional information are likely expressed in a time- and tissue-specific manner during development.
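
As a rough illustration of the arithmetic behind the first conclusion, the 75% figure can be read directly from the Ka/Ks ratio: if synonymous substitutions are treated as approximately neutral (an assumption of this sketch, not a statement of the study's method), then a ratio of 0.25 implies that about 1 - 0.25 = 0.75 of nonsynonymous substitutions were not retained. The function name below is hypothetical and chosen only for this example.

```python
def fraction_removed_by_selection(ka_ks: float) -> float:
    """Estimate the fraction of nonsynonymous substitutions removed by selection.

    Assumes synonymous substitutions are effectively neutral, so Ks serves as
    the baseline rate. Ratios above 1 (suggesting positive selection) are
    clamped to 0 removed, since this simple estimate does not apply there.
    """
    if ka_ks < 0:
        raise ValueError("Ka/Ks cannot be negative")
    return max(0.0, 1.0 - ka_ks)


# Ka/Ks of approximately 0.25, as reported for the known protein coding genes
print(f"{fraction_removed_by_selection(0.25):.0%}")  # prints "75%"
```

The clamp at zero is a design choice for this sketch: a ratio above 1, such as might be seen in regions under positive selection, is outside what this simple estimate can describe.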
