5 research outputs found
Proteogenomic Analysis of Human Colon Carcinoma Cell Lines LIM1215, LIM1899, and LIM2405
As part of the genome-wide and chromosome-centric human proteomic
project (C-HPP), we have integrated a shotgun proteomics approach and
a genome-wide transcriptomic approach (RNA-Seq) of a set of human
colon cancer cell lines (LIM1215, LIM1899 and LIM2405) that were selected
to represent a wide range of pathological states of colorectal cancer.
The combination of a standard proteomics approach (1D-gel electrophoresis
coupled to LC/ion trap mass spectrometry) and RNA-Seq allowed us to
exploit the greater depth of the transcriptomics measurement (∼9800
transcripts per cell line) versus the protein observations (∼1900
protein identifications per cell line). Conversely, the proteomics
data were helpful in identifying both cancer-associated proteins with
differential expression patterns and protein networks and pathways
that appear to be deregulated in these cell lines. Examples of potential
markers include mortalin, nucleophosmin, ezrin, LASP1, alpha and beta
forms of spectrin, exportin, the carcinoembryonic antigen family,
EGFR and MET. Interaction analyses identified the large intermediate
filament family, the protein folding network and adapter proteins
in focal adhesion networks, which included the CDC42 and RHOA signaling
pathways that may have potential for identifying phenotypic states
representing poorly and moderately differentiated states of CRC, with
or without metastases.
Functional Annotation of Proteome Encoded by Human Chromosome 22
As part of the chromosome-centric human proteome project (C-HPP)
initiative, we report our progress on the annotation of chromosome 22.
Chromosome 22, spanning 51 million base pairs, was the first chromosome
to be sequenced. Gene dosage alterations on this chromosome have been
shown to be associated with a number of congenital anomalies. In addition,
several rare but aggressive tumors have been associated with this
chromosome. A number of important gene families, including the immunoglobulin
lambda locus, the crystallin beta family, and the APOBEC gene family, are located
on this chromosome. On the basis of proteomic profiling of 30 histologically
normal tissues and cells using high-resolution mass spectrometry,
we show protein evidence of 367 genes on chromosome 22. Importantly,
this includes 47 proteins that are currently annotated as “missing”
proteins. We also confirmed the translation start sites of 120 chromosome 22-encoded
proteins. Employing a comprehensive proteogenomics analysis pipeline,
we provide evidence of novel coding regions on this chromosome which
include upstream ORFs and novel exons in addition to correcting existing
gene structures. We describe tissue-wise expression of the proteins
and the distribution of gene families on this chromosome. These data
have been deposited to ProteomeXchange with the identifier PXD000561.
A Chromosome-centric Human Proteome Project (C-HPP) to Characterize the Sets of Proteins Encoded in Chromosome 17
We report progress assembling the parts list for chromosome
17 and illustrate the various processes that we have developed to
integrate available data from diverse genomic and proteomic knowledge
bases. As primary resources, we have used GPMDB, neXtProt, PeptideAtlas,
Human Protein Atlas (HPA), and GeneCards. All sites share the common
resource of Ensembl for the genome modeling information. We have defined
the chromosome 17 parts list with the following information: 1169
protein-coding genes, the numbers of proteins confidently identified
by various experimental approaches as documented in GPMDB, neXtProt,
PeptideAtlas, and HPA, examples of typical data sets obtained by RNASeq
and proteomic studies of epithelial derived tumor cell lines (disease
proteome) and a normal proteome (peripheral mononuclear cells), reported
evidence of post-translational modifications, and examples of alternative
splice variants (ASVs). We have constructed a list of the 59 “missing”
proteins as well as 201 proteins that have inconclusive mass spectrometric
(MS) identifications. In this report we have defined a process to
establish a baseline for the incorporation of new evidence on protein
identification and characterization as well as related information
from transcriptome analyses. This initial list of “missing”
proteins will guide the selection of appropriate samples for
discovery studies as well as antibody reagents. We have also illustrated
the significant diversity of protein variants (including post-translational modifications, PTMs) using regions on chromosome 17 that contain important oncogenes. We emphasize the need for mandated deposition of proteomics data in public databases, the further development of improved PTM, ASV, and single nucleotide variant (SNV) databases, and the construction of Web sites that can integrate and regularly update such information. In addition, we describe the distribution of both clustered and scattered sets of protein families on the chromosome. Since chromosome 17 is rich in cancer-associated genes, we have focused on the clustering of cancer-associated genes in such genomic regions and have used the ERBB2 amplicon as an example of the value of a proteogenomic approach in which one integrates transcriptomic with proteomic information and captures evidence of coexpression through coordinated regulation.