5 research outputs found

    Transcriptional landscape estimation from tiling array data using a model of signal shift and drift

    Motivation: High-density oligonucleotide tiling array technology holds the promise of a better description of the complexity and the dynamics of transcriptional landscapes. In organisms such as bacteria and yeasts, transcription can be measured on a genome-wide scale with a resolution >25 bp. The statistical models currently used to handle these data remain, however, very simple, the most popular being the piecewise constant Gaussian model with a fixed number of breakpoints.
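
The piecewise constant Gaussian model with a fixed number of breakpoints, named above as the common baseline, can be sketched as a least-squares segmentation solved by dynamic programming. This is only an illustration of that baseline, not the shift-and-drift model of the paper; the function name `segment` and the toy signal are assumptions made for this example.

```python
def segment(signal, n_breakpoints):
    """Least-squares piecewise constant fit with a fixed number of breakpoints.

    Hypothetical helper for illustration. Returns the sorted breakpoint
    indices; a breakpoint at index i means a new segment starts at signal[i].
    """
    n = len(signal)
    n_segments = n_breakpoints + 1
    if n_segments > n:
        raise ValueError("more segments than data points")

    # Prefix sums give the squared-error cost of any segment in O(1).
    s1 = [0.0] * (n + 1)
    s2 = [0.0] * (n + 1)
    for i, x in enumerate(signal):
        s1[i + 1] = s1[i] + x
        s2[i + 1] = s2[i] + x * x

    def cost(i, j):
        # Sum of squared deviations from the mean over signal[i:j].
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    inf = float("inf")
    # best[k][j]: minimal cost of splitting signal[:j] into k segments.
    best = [[inf] * (n + 1) for _ in range(n_segments + 1)]
    prev = [[0] * (n + 1) for _ in range(n_segments + 1)]
    best[0][0] = 0.0
    for k in range(1, n_segments + 1):
        for j in range(k, n + 1):
            for i in range(k - 1, j):
                c = best[k - 1][i] + cost(i, j)
                if c < best[k][j]:
                    best[k][j] = c
                    prev[k][j] = i

    # Trace back where each segment after the first one starts.
    breakpoints = []
    j = n
    for k in range(n_segments, 1, -1):
        j = prev[k][j]
        breakpoints.append(j)
    return sorted(breakpoints)


# Toy probe intensities with two level changes (made-up data).
print(segment([0, 0, 0, 5, 5, 5, 1, 1], n_breakpoints=2))  # [3, 6]
```

Under Gaussian noise with a common variance, minimizing the summed squared error is equivalent to maximizing the likelihood of the piecewise constant model, which is why the sketch uses a least-squares cost. The triple loop is cubic in the signal length for a fixed number of segments, so it is meant for illustration rather than genome-scale data.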

    Alignment, Clustering and Extraction of Structured Motifs in DNA Promoter Sequences

    A simple motif is a short DNA sequence found in the promoter region and believed to act as a binding site for a transcription factor protein. A structured motif is a sequence of simple motifs (boxes) separated by short sequences (gaps). Biologists theorize that the presence of these motifs plays a key role in gene expression regulation. Discovering these patterns is an important step towards understanding protein-gene and gene-gene interactions and thus facilitates the building of accurate gene regulatory network models. DNA sequence motif extraction is an important problem in bioinformatics. Many studies have proposed algorithms to solve the simple motif extraction problem. Only in the past decade has the more complex structured motif extraction problem been examined by researchers. The problem is inherently challenging because structured motif patterns are segmented into several boxes separated by variable-size gaps for each instance. These boxes may not be exact copies, but may have multiple mismatched positions. The challenge is exacerbated by the lack of resources for real datasets covering a wide range of possible cases. Also, incomplete annotation of real data leads to the discovery of unknown motifs that may be regarded as false positives. Furthermore, current algorithms demand an unreasonable amount of prior knowledge to successfully extract the target pattern. The contributions of this research are four new algorithms. First, SMGenerate generates simulated datasets of implanted motifs that cover a wide range of biologically possible cases. Second, SMAlign aligns a pair of structured motifs optimally and efficiently given their gap constraints. Third, SMCluster produces a multiple alignment of structured motifs through hierarchical clustering using SMAlign's affinity score. Finally, SMExtract extracts structured motifs from a set of sequences by using SMCluster to construct the target pattern from the top reported two-box patterns (fragments), extracted using an existing algorithm (Exmotif) and a two-box template. The main advantage of SMExtract is its efficiency in extracting longer degenerate patterns while requiring less prior knowledge about the pattern to be extracted than current algorithms.
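
To make the definition of a structured motif concrete (boxes separated by variable-size gaps, each box tolerating some mismatched positions), the following is a minimal sketch of a brute-force occurrence search. It is not SMAlign, SMCluster, SMExtract or Exmotif; the function names, the per-box mismatch budget, and the toy sequence are all assumptions made for illustration.

```python
def hamming(a, b):
    """Number of mismatched positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def find_structured_motif(sequence, boxes, gaps, max_mismatches=0):
    """Return the start positions at which the structured motif occurs.

    Hypothetical helper for illustration.
    boxes: list of simple motifs, in order.
    gaps: list of (min_gap, max_gap) pairs, one per adjacent pair of boxes.
    max_mismatches: mismatches tolerated in each box (assumed per-box budget).
    """
    if len(gaps) != len(boxes) - 1:
        raise ValueError("need exactly one gap range between adjacent boxes")

    def box_matches(pos, box):
        window = sequence[pos:pos + len(box)]
        return len(window) == len(box) and hamming(window, box) <= max_mismatches

    def extend(pos, index):
        # Box number `index` must start exactly at `pos`.
        if not box_matches(pos, boxes[index]):
            return False
        if index == len(boxes) - 1:
            return True
        end = pos + len(boxes[index])
        lo, hi = gaps[index]
        # Try every allowed gap length before the next box.
        return any(extend(end + g, index + 1) for g in range(lo, hi + 1))

    return [start for start in range(len(sequence)) if extend(start, 0)]


# Toy sequence: two boxes separated by a 3-nucleotide gap (made-up data).
seq = "AATTGACAGGCTATAATCC"
print(find_structured_motif(seq, ["TTGACA", "TATAAT"], [(2, 4)]))  # [2]
print(find_structured_motif(seq, ["TTGACA", "TATAAT"], [(0, 2)]))  # []
```

The second call returns no hit because the actual gap of 3 falls outside the allowed range, which shows how the gap constraints are part of the pattern itself. A search like this requires the boxes and gap ranges to be known in advance, which is exactly the kind of prior knowledge the abstract says extraction algorithms should need less of.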

    Stability and specificity of transmembrane domain self-association by mutagenesis and protein design

    In this thesis, I investigate the sequence dependence of homodimerization of the transmembrane domain of the pro-apoptotic C. elegans protein BNIP3. Using site-directed mutagenesis and two assays for dimerization, I show that the tight association of the CeBNIP3 transmembrane domain relies on overlapping but distinct sets of residues depending on the assay: in membranes, the critical residues are N183xxSFxxGxxxG 194, whereas in detergents, the key residues are S186FxxGxxxGxxxS 198. The small residue Ser 186, the bulky residue Phe 187, and small residues Gly 190 and Gly 194 play key roles in CeBNIP3 dimerization in both assays. However, CeBNIP3 TMD self-association in detergents, but not membranes, depends critically on Ser 198; self-association in lipid bilayers, but not detergents, depends on Asn 183. Comparison with the previously identified dimerization motif for the human BNIP3 ortholog (SHxxAIxxGIxxG) shows that the residues that drive CeBNIP3 dimerization in membranes are chemically similar to, but distinct from, those that drive HsBNIP3 association. To explore how interfacial BNIP3 residues determine dimer stability and specificity, I generated a combinatorial library from the CeBNIP3 and HsBNIP3 motifs, (STT)(H/N)xx(A/S)(I/F)xxG(I/A)xxG, and tested the hybrid sequences for dimerization. All combinations of interfacial residues support strong to extremely strong dimerization in membranes, suggesting that the two parental sequences adopt similar structures. Not all sequences form dimers in detergents, and dimerization propensity correlates weakly with sequence hydrophobicity. Manipulating the solvent conditions to enhance the hydrophobic effect increases dimerization of some sequences but not others. The CeBNIP3 and HsBNIP3 transmembrane domains form homodimers but not heterodimers in detergents. Hybrid motif sequences show differing propensities to form heterodimers with wild-type CeBNIP3 TMD and HsBNIP3 TMD: some hybrids discriminate, binding only one wild-type sequence, and some interact with both. My findings identify the sequence elements responsible for stability and specificity of BNIP3-type transmembrane domain dimerization. My results also show that the hydrophobicity of membrane spans strongly influences their behavior in detergent assays of protein-protein interactions. The demonstration that altering the aqueous solvent conditions can improve the stability of integral membrane proteins in detergents may be of general importance in membrane protein biochemistry.