23 research outputs found
Comparison of exon 5 sequences from 35 class I genes of the BALB/c mouse
DNA sequences of the fifth exon, which encodes the transmembrane domain, were determined for the BALB/c mouse class I MHC genes and used to study the relationships between them. Based on nucleotide sequence similarity, the exon 5 sequences can be divided into seven groups. Although most members within each group are at least 80% similar to each other, comparison between groups reveals that the groups share little similarity. However, in spite of the extensive variation of the fifth exon sequences, analysis of their predicted amino acid translations reveals that only four class I gene fifth exons have frameshifts or stop codons that terminate their translation and prevent them from encoding a domain that is both hydrophobic and long enough to span a lipid bilayer. Exactly 27 of the remaining fifth exons could encode a domain that is similar to those of the transplantation antigens in that it consists of a proline-rich connecting peptide, a transmembrane segment, and a cytoplasmic portion with membrane-anchoring basic residues. The conservation of this motif in the majority of the fifth exon translations in spite of extensive variation suggests that selective pressure exists for these exons to maintain their ability to encode a functional transmembrane domain, raising the possibility that many of the nonclassical class I genes encode functionally important products
Diversity and evolution of the immunoglobulin gene superfamily
The Immunoglobulin Gene Superfamily is characterized by a common protein
homology unit that is present in arguably the largest and most diverse set of genes and
gene families of any protein motif. This distribution indicates that the homology unit is
a remarkably versatile functional unit. Its central role in defining the complex
phenotypes of the immune and nervous systems, likewise, is testament to the ability of
the motif to support an amazing and unique degree of diversification. Understanding
more about the function, structure and evolution of the Immunoglobulin Gene
Superfamily can provide insights into both the general issues of complex system
evolution as well as the specific nature of the various systems the superfamily plays a
central role in. This thesis is a collection of work aimed at a more thorough
understanding of these elements. Particularly, these works summarize much of our
current understanding of the members of the Immunoglobulin Gene Superfamily along
with speculations on their evolutionary history as well as both the evolutionary and
somatic mechanisms responsible for their diversity. This work includes initial
descriptions of several features relevant to somatic diversification of rearranging
immune receptors, including: 1) the role of joining imprecision in the generation of
junctional diversity in immunoglobulin kappa chain; 2) the initial description of the T-cell
beta chain J/C locus; 3) the translation of T-cell beta chain D gene segments in all
three reading frames; 4) the occurrence of a cryptic rearrangement signal in most
rearranging V families; 5) the first description of the mechanisms of class switching
between heavy chain mu and delta genes; 6) the limited diversity of germline T-cell
beta chains; 7) the shared complementarity-determining region structure of T-cell beta
chains and immunoglobulin heavy chains. Also, from these efforts, new members of
the superfamily have been identified including MHC class I molecules, L3T4 and
Myelin Associated Glycoprotein. Various observations concerning the evolutionary
relationships of these molecules and motifs have been made. Particularly, a variation
on the basic homology unit motif has been proposed that probably more nearly
represents the primordial sequence and function.
As a result of these discoveries, a new, comprehensive picture of the
immunoglobulin superfamily is emerging that has implications for interpreting current
functional relationships in the context of the evolutionary history of the members.
Particularly, it is suggested from this work that the ability of the homology unit to
accommodate diversity has made possible the evolution of the superfamily. Given the
tremendous diversity within the superfamily, it might be assumed that selective
pressures favoring diversity have driven its evolution. However, much of the analysis
within this collection suggests that, on the contrary, diversity is an inherent feature of
the conserved protein and gene structure of the homology unit and that it was the a
priori diversity itself that drove and shaped the evolution of the complex systems that
employ the homology unit today. This basic diversity is the consequence of three
characteristics of the homology unit. First, the tertiary structure of the protein motif is
such that homology units tend to interact preferentially to form homo- or heterodimers,
forming the basis of many of the receptors and the receptor/ligand interactions common
within the superfamily. These combinatorial associations increase both the somatic and
evolutionary potential for diversification. This can lead to the rather sudden
appearance of new functional associations between existing members of the superfamily
preadapted for otherwise unrelated functions. Second, except for a minimal number of
amino acid residues involved in critical intra- and interchain interactions, the primary
structure of these units can vary dramatically and still provide for essentially the same
tertiary structure. This has been borne out by various crystallographic studies. The
variability is particularly true of the loop structures normally identified with antigen
specificity, but seen in other extended families as well. Reduced constraints on
structural sequences inherently promote the establishment of variation within
populations. Third, with very few exceptions among the genes of the superfamily, the
homology units are not only encoded by discrete exons, but these exons have a shared
1/2 splicing rule. That is, each is begun with the second 2 bases of a codon and ended
with the first base. This allows the in-frame splicing of any number of tandem
homology units, while maintaining functional protein domains. This rule generally
applies to the non-homology unit exons of member genes as well. This allows, through
relatively simple genetic events, the development of new contexts for homology unit
expression, both by simple expansion and contraction of homology unit number and
exon shuffling. This is probably at work, as well, in the frequent occurrence and
utilization of alternative transcripts seen throughout the superfamily. Many of the
recognized occurrences of alternative splicing, such as that between membrane-bound
and secreted forms, indicate that this gene structure provides for a further level of
functional diversity and the expansion of the virtual genetic information.
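
The 1/2 splicing rule described above can be illustrated with a little phase arithmetic. The following is a minimal sketch, not taken from the thesis: the function name and the exon lengths are hypothetical, and "phase" here means the number of bases of a codon already supplied before an exon begins. An exon that begins with the second two bases of a codon starts in phase 1, and it ends with the first base of a codon exactly when the next exon would also start in phase 1.

```python
from typing import List


def phases_after_each_exon(exon_lengths: List[int], start_phase: int = 1) -> List[int]:
    """Return the codon phase left over after each exon in a tandem array.

    Phase = number of bases of the current codon already supplied.
    Under the 1/2 rule every exon starts in phase 1 (it supplies the
    last two bases of a split codon) and ends in phase 1 (it supplies
    the first base of the next split codon).
    """
    phases = []
    phase = start_phase
    for length in exon_lengths:
        phase = (phase + length) % 3  # bases consumed, modulo codon size
        phases.append(phase)
    return phases


# Hypothetical exon lengths, chosen only for illustration.
tandem = [300, 285, 330]
print(phases_after_each_exon(tandem))            # every exon ends in phase 1
print(phases_after_each_exon([300, 330]))        # dropping an exon keeps the frame
print(phases_after_each_exon([300, 286, 330]))   # a non-conforming exon breaks it
```

Because every conforming exon hands phase 1 to its successor, units can be duplicated, deleted, or shuffled without disturbing the downstream reading frame, which is the property the paragraph above attributes to the shared exon structure.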
Beyond the explicit discussion of the superfamily members, this work also
speaks to various issues of evolution in general. In particular, the history of the
superfamily suggests the importance of canalization and non-gradual episodes of
evolutionary change. It can contribute, as well, to the discussion of adaptive versus
neutral change.
Nucleotide Sequence of a Light Chain Gene of the Mouse I-A Subregion: Aβ^d
Ia (I region-associated) antigens are cell-surface glycoproteins involved in the regulation of immune responsiveness. They are composed of one heavy (α) and one light (β) polypeptide chain. We have sequenced the gene encoding the Aβ^d chain of the BALB/c mouse. The presence of six exons is predicted by comparison with the complementary DNA sequences of human β chains and with partial protein sequence data for the Aβ^d polypeptide. Sequence comparisons have been made to other proteins involved in immune responses and the consequent implications for the evolutionary relationships of these genes are discussed
DNA Sequence of the Gene Encoding the E_α Ia Polypeptide of the BALB/c Mouse
A 3.4-kilobase DNA fragment containing the gene coding for the Eα chain of an Ia (I region-associated) antigen from the BALB/c mouse has been sequenced. It contains at least three exons, which correlate with the major structural domains of the Eα chain: the two external domains α1 and α2, and the transmembrane-cytoplasmic domain. The coding sequence of the mouse Eα gene shows striking homology to its human counterpart at the DNA and protein levels. The translated α2 exon demonstrates significant similarity to β_2-microglobulin, to immunoglobulin constant region domains, and to certain domains of transplantation antigens. These observations and those of others suggest that the Ia antigen, transplantation antigen, and immunoglobulin gene families share a common ancestor
Hidden Markov Models in Molecular Biology: New Algorithms and Applications
Hidden Markov Models (HMMs) can be applied to several important problems in molecular biology. We introduce a new convergent learning algorithm for HMMs that, unlike the classical Baum-Welch algorithm, is smooth and can be applied on-line or in batch mode, with or without the usual Viterbi most likely path approximation. Left-right HMMs with insertion and deletion states are then trained to represent several protein families including immunoglobulins and kinases. In all cases, the models derived capture all the important statistical properties of the families and can be used efficiently in a number of important tasks such as multiple alignment, motif detection, and classification. Additional author affiliations: Division of Biology, California Institute of Technology, and Department of Psychology, Stanford University. 1 INTRODUCTION Hidden Markov Models (e.g., Rabiner, 1989) and the more general EM algorithm in statistics can be applied to the modeling and analysis of biological primary sequenc..
Isolation and sequence of L3T4 complementary DNA clones: expression in T cells and brain
T lymphocytes express on their surface not only a specific receptor for antigen and major histocompatibility complex proteins, but also a number of additional glycoproteins that are thought to play accessory roles in the processes of recognition and signal transduction. L3T4 is one such T-cell surface protein that is expressed on most mouse thymocytes and on mature mouse T cells that recognize class II (Ia) major histocompatibility complex proteins. Such cells are predominantly of the helper/inducer phenotype. In this study, complementary DNA clones encoding L3T4 were isolated and sequenced. The predicted protein sequence shows that L3T4 is a member of the immunoglobulin gene superfamily. It is encoded by a single gene that does not require rearrangement prior to expression. Although the protein has not previously been demonstrated on nonhematopoietic cells, two messenger RNA species specific for L3T4 are found in brain. The minor species comigrates with the L3T4 transcript in T cells, whereas the major species is 1 kilobase smaller
A systolic array processor for biological information signal processing
The Biological Information Signal Processing (BISP) system is designed for high-speed sequence comparisons, supporting the computational requirements of mapping and sequencing the human and other genomes. The heart of a BISP system is a versatile processor chip that can conduct the most time-consuming sequence comparison functions, establishing both global and local relationships between two DNA or protein sequences. Because of the application's strong computation and communication requirements, a programmable systolic array architecture was developed. A BISP system can include a large number of processing elements; the initial BISP demonstration system consists of 768 BISP elements, capable of delivering more than 6.25 x 10^9 integer operations per second. The system can be expanded to include over 4,000 elements. This paper describes the comparison algorithm and outlines the BISP chip and system designs. Estimated performance of the BISP system is compared with several different computer architectures
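
The abstract above does not spell out the comparison algorithm, so the following is only a generic sketch of how global and local relationships between two sequences are commonly scored with dynamic programming (in the style of Needleman-Wunsch and Smith-Waterman). It is not the BISP algorithm itself. The function name and the scoring values (match +1, mismatch -1, gap -1) are assumptions chosen for illustration.

```python
def alignment_score(a: str, b: str, local: bool = False,
                    match: int = 1, mismatch: int = -1, gap: int = -1) -> int:
    """Score the best global (default) or local alignment of a and b.

    Uses a linear gap penalty and keeps only two rows of the
    dynamic-programming table. Scoring values are illustrative.
    """
    cols = len(b) + 1
    # First row: aligning a prefix of b against nothing costs gaps
    # in global mode and is free (score 0) in local mode.
    prev = [0 if local else j * gap for j in range(cols)]
    best = 0  # best cell seen so far, used only in local mode
    for i in range(1, len(a) + 1):
        curr = [0 if local else i * gap] + [0] * (cols - 1)
        for j in range(1, cols):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            cell = max(diag, prev[j] + gap, curr[j - 1] + gap)
            if local:
                cell = max(cell, 0)      # a local alignment may restart anywhere
                best = max(best, cell)
            curr[j] = cell
        prev = curr
    return best if local else prev[-1]


print(alignment_score("ACGT", "AGT"))                       # global score
print(alignment_score("TTTACGTTTT", "GGACGGG", local=True)) # local score
```

Each cell depends only on its left, upper, and upper-left neighbours, so all cells on one anti-diagonal can be computed at the same time. That independence is the usual reason recurrences of this kind map well onto an array of simple processing elements.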