Functional Annotation of Proteome Encoded by Human Chromosome 22
Abstract
As part of the chromosome-centric human proteome project (C-HPP)
initiative, we report our progress on the annotation of chromosome 22.
Chromosome 22, spanning 51 million base pairs, was the first chromosome
to be sequenced. Gene dosage alterations on this chromosome have been
shown to be associated with a number of congenital anomalies. In addition,
several rare but aggressive tumors have been associated with this
chromosome. A number of important gene families, including the immunoglobulin
lambda locus, the crystallin beta family, and the APOBEC gene family, are located
on this chromosome. On the basis of proteomic profiling of 30 histologically
normal tissues and cells using high-resolution mass spectrometry,
we show protein-level evidence for 367 genes on chromosome 22. Importantly,
this includes 47 proteins that are currently annotated as “missing”
proteins. We also confirmed the translation start sites of 120 chromosome 22-encoded
proteins. Employing a comprehensive proteogenomics analysis pipeline,
we provide evidence of novel coding regions on this chromosome,
including upstream ORFs and novel exons, and correct existing
gene structures. We describe tissue-wise expression of the proteins
and the distribution of gene families on this chromosome. These data
have been deposited to ProteomeXchange with the identifier PXD000561.