
    QUANTITATION AND IMMUNOCYTOCHEMICAL LOCALIZATION OF HUMAN SKIN COLLAGENASE IN BASAL CELL CARCINOMA

    Human skin collagenase was quantitated by radioimmunoassay in 21 basal cell carcinomas. Immunoreactive collagenase protein was found to be approximately 2-fold greater in extracts of these tumors than in extracts of normal skin, suggesting that this enzyme may be important in the pathogenesis of soft tissue destruction in vivo. To further define the role of collagenase in such destruction, immunofluorescent staining with specific antiserum to human skin collagenase was used to localize collagenase in the basal cell carcinomas. The enzyme was found only in the stromal elements surrounding the tumor islands. No staining of the epithelial components of the basal cell carcinomas was found. These findings suggest that the normal connective tissue elements may have been stimulated to produce an increased amount of collagenase and emphasize the importance of epithelial-stromal interaction in soft tissue invasiveness

    Accelerating Bayesian hierarchical clustering of time series data with a randomised algorithm

    We live in an era of abundant data. This has necessitated the development of new and innovative statistical algorithms to get the most from experimental data. For example, faster algorithms make practical the analysis of larger genomic data sets, allowing us to extend the utility of cutting-edge statistical methods. We present a randomised algorithm that accelerates the clustering of time series data using the Bayesian Hierarchical Clustering (BHC) statistical method. BHC is a general method for clustering any discretely sampled time series data. In this paper we focus on a particular application to microarray gene expression data. We define and analyse the randomised algorithm, before presenting results on both synthetic and real biological data sets. We show that the randomised algorithm leads to substantial gains in speed with minimal loss in clustering quality. The randomised time series BHC algorithm is available as part of the R package BHC, which is available for download from Bioconductor (version 2.10 and above) via http://bioconductor.org/packages/2.10/bioc/html/BHC.html. We have also made available a set of R scripts which can be used to reproduce the analyses carried out in this paper. These are available from the following URL. https://sites.google.com/site/randomisedbhc/

    Affinity Inequality among Serum Antibodies That Originate in Lymphoid Germinal Centers

    Upon natural infection with pathogens or vaccination, antibodies are produced by a process called affinity maturation. As affinity maturation ensues, average affinity values between an antibody and ligand increase with time. Purified antibodies isolated from serum are invariably heterogeneous with respect to their affinity for the ligands they bind, whether macromolecular antigens or haptens (low molecular weight approximations of epitopes on antigens). However, less is known about how the extent of this heterogeneity evolves with time during affinity maturation. To shed light on this issue, we have taken advantage of previously published data from Eisen and Siskind (1964). Using the ratio of the strongest to the weakest binding subsets as a metric of heterogeneity (or affinity inequality), we analyzed antibodies isolated from individual serum samples. The ratios were initially as high as 50-fold, and decreased over a few weeks after a single injection of small antigen doses to around unity. This decrease in the effective heterogeneity of antibody affinities with time is consistent with Darwinian evolution in the strong selection limit. By contrast, neither the average affinity nor the heterogeneity evolves much with time for high doses of antigen, as competition between clones of the same affinity is minimal. Funding: Ragon Institute of MGH, MIT and Harvard; Samsung Scholarship Foundation; National Science Foundation (U.S.) Graduate Research Fellowship (Grant 1122374)
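The heterogeneity metric described in this abstract (the ratio of the strongest to the weakest binding subset) can be illustrated with a minimal sketch. It assumes each antibody subset is summarized by a single positive affinity value; the function name `affinity_inequality` and the sample numbers are hypothetical and are not taken from the paper or from the Eisen and Siskind data.

```python
def affinity_inequality(affinities):
    """Return the ratio of the strongest to the weakest affinity in one sample.

    Assumption: each antibody subset is represented by one positive affinity
    value (for example an association constant); the paper may define the
    subsets differently.
    """
    if not affinities:
        raise ValueError("need at least one affinity value")
    if min(affinities) <= 0:
        raise ValueError("affinities must be positive")
    return max(affinities) / min(affinities)


# Hypothetical values in arbitrary units, chosen only for illustration.
early_sample = [2.0, 10.0, 100.0]    # broad spread soon after injection
late_sample = [90.0, 100.0, 110.0]   # narrow spread some weeks later

early_ratio = affinity_inequality(early_sample)
late_ratio = affinity_inequality(late_sample)
print(early_ratio, late_ratio)
```

A ratio near 1 means the subsets bind with nearly equal strength, which corresponds to the "around unity" end state described in the abstract.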

    Gene Function Classification Using Bayesian Models with Hierarchy-Based Priors

    We investigate the application of hierarchical classification schemes to the annotation of gene function based on several characteristics of protein sequences including phylogenic descriptors, sequence based attributes, and predicted secondary structure. We discuss three Bayesian models and compare their performance in terms of predictive accuracy. These models are the ordinary multinomial logit (MNL) model, a hierarchical model based on a set of nested MNL models, and a MNL model with a prior that introduces correlations between the parameters for classes that are nearby in the hierarchy. We also provide a new scheme for combining different sources of information. We use these models to predict the functional class of Open Reading Frames (ORFs) from the E. coli genome. The results from all three models show substantial improvement over previous methods, which were based on the C5 algorithm. The MNL model using a prior based on the hierarchy outperforms both the non-hierarchical MNL model and the nested MNL model. In contrast to previous attempts at combining these sources of information, our approach results in a higher accuracy rate when compared to models that use each data source alone. Together, these results show that gene function can be predicted with higher accuracy than previously achieved, using Bayesian models that incorporate suitable prior information

    The Program of Gene Transcription for a Single Differentiating Cell Type during Sporulation in Bacillus subtilis

    Asymmetric division during sporulation by Bacillus subtilis generates a mother cell that undergoes a 5-h program of differentiation. The program is governed by a hierarchical cascade consisting of the transcription factors: σ(E), σ(K), GerE, GerR, and SpoIIID. The program consists of the activation and repression of 383 genes. The σ(E) factor turns on 262 genes, including those for GerR and SpoIIID. These DNA-binding proteins downregulate almost half of the genes in the σ(E) regulon. In addition, SpoIIID turns on ten genes, including genes involved in the appearance of σ(K). Next, σ(K) activates 75 additional genes, including that for GerE. This DNA-binding protein, in turn, represses half of the genes that had been activated by σ(K) while switching on a final set of 36 genes. Evidence is presented that repression and activation contribute to proper morphogenesis. The program of gene expression is driven forward by its hierarchical organization and by the repressive effects of the DNA-binding proteins. The logic of the program is that of a linked series of feed-forward loops, which generate successive pulses of gene transcription. Similar regulatory circuits could be a common feature of other systems of cellular differentiation
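As a quick arithmetic check, the numbers of genes switched on by each regulator, as quoted in this abstract, can be tallied. The sketch below assumes the four activated sets do not overlap, which the abstract suggests ("additional", "final set") but does not state outright; the dictionary keys are illustrative labels only.

```python
# Counts of genes turned on by each regulator, as quoted in the abstract.
# Assumption: the four sets are disjoint, so the counts can simply be added.
activated_genes = {
    "sigma_E": 262,
    "SpoIIID": 10,
    "sigma_K": 75,
    "GerE": 36,
}

total_activated = sum(activated_genes.values())
print(total_activated)
```

Under that assumption the tally equals the 383 genes the abstract gives for the whole program.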

    Variable strength of forest stand attributes and weather conditions on the questing activity of Ixodes ricinus ticks over years in managed forests

    Given the ever-increasing human impact through land use and climate change on the environment, we crucially need to achieve a better understanding of those factors that influence the questing activity of ixodid ticks, a major disease-transmitting vector in temperate forests. We investigated variation in the relative questing nymph densities of Ixodes ricinus in differently managed forest types for three years (2008–2010) in SW Germany by drag sampling. We used a hierarchical Bayesian modeling approach to examine the relative effects of habitat and weather and to consider possible nested structures of habitat and climate forces. The questing activity of nymphs was considerably larger in young forest successional stages of thicket compared with pole wood and timber stages. Questing nymph density increased markedly with milder winter temperatures. Generally, the relative strength of the various environmental forces on questing nymph density differed across years. In particular, winter temperature had a negative effect on tick activity across sites in 2008 in contrast to the overall effect of temperature across years. Our results suggest that forest management practices have important impacts on questing nymph density. Variable weather conditions, however, might override the effects of forest management practices on the fluctuations and dynamics of tick populations and activity over years, in particular, the preceding winter temperatures. Therefore, robust predictions and the detection of possible interactions and nested structures of habitat and climate forces can only be quantified through the collection of long-term data. Such data are particularly important with regard to future scenarios of forest management and climate warming

    Phosphate transporters in marine phytoplankton and their viruses: Cross-domain commonalities in viral-host gene exchanges

    Phosphate (PO4) is an important limiting nutrient in marine environments. Marine cyanobacteria scavenge PO4 using the high-affinity periplasmic phosphate binding protein PstS. The pstS gene has recently been identified in genomes of cyanobacterial viruses as well. Here, we analyse genes encoding transporters in genomes from viruses that infect eukaryotic phytoplankton. We identified inorganic PO4 transporter-encoding genes from the PHO4 superfamily in several virus genomes, along with other transporter-encoding genes. Homologues of the viral pho4 genes were also identified in genome sequences from the genera that these viruses infect. Genome sequences were available from host genera of all the phytoplankton viruses analysed except the host genus Bathycoccus. Pho4 was recovered from Bathycoccus by sequencing a targeted metagenome from an uncultured Atlantic Ocean population. Phylogenetic reconstruction showed that pho4 genes from pelagophytes, haptophytes and infecting viruses were more closely related to homologues in prasinophytes than to those in what, at the species level, are considered to be closer relatives (e.g. diatoms). We also identified PHO4 superfamily members in ocean metagenomes, including new metagenomes from the Pacific Ocean. The environmental sequences grouped with pelagophytes, haptophytes, prasinophytes and viruses as well as bacteria. The analyses suggest that multiple independent pho4 gene transfer events have occurred between marine viruses and both eukaryotic and bacterial hosts. Additionally, pho4 genes were identified in available genomes from viruses that infect marine eukaryotes but not those that infect terrestrial hosts. Commonalities in marine host-virus gene exchanges indicate that manipulation of host PO4 uptake is an important adaptation for viral proliferation in marine systems. Our findings suggest that PO4 availability may not serve as a simple bottom-up control of marine phytoplankton.
© 2011 Society for Applied Microbiology and Blackwell Publishing Ltd

    UNCLES: Method for the identification of genes differentially consistently co-expressed in a specific subset of datasets

    Background: Collective analysis of the increasingly emerging gene expression datasets is required. The recently proposed binarisation of consensus partition matrices (Bi-CoPaM) method can combine clustering results from multiple datasets to identify the subsets of genes which are consistently co-expressed in all of the provided datasets in a tuneable manner. However, results validation and parameter setting are issues that complicate the design of such methods. Moreover, although it is a common practice to test methods by application to synthetic datasets, the mathematical models used to synthesise such datasets are usually based on approximations which may not always be sufficiently representative of real datasets. Results: Here, we propose an unsupervised method for the unification of clustering results from multiple datasets using external specifications (UNCLES). This method has the ability to identify the subsets of genes consistently co-expressed in a subset of datasets while being poorly co-expressed in another subset of datasets, and to identify the subsets of genes consistently co-expressed in all given datasets. We also propose the M-N scatter plots validation technique and adopt it to set the parameters of UNCLES, such as the number of clusters, automatically. Additionally, we propose an approach for the synthesis of gene expression datasets using real data profiles in a way which combines the ground-truth-knowledge of synthetic data and the realistic expression values of real data, and therefore overcomes the problem of faithfulness of synthetic expression data modelling. By application to those datasets, we validate UNCLES while comparing it with other conventional clustering methods, and of particular relevance, biclustering methods. We further validate UNCLES by application to a set of 14 real genome-wide yeast datasets as it produces focused clusters that conform well to known biological facts. Furthermore, in-silico-based hypotheses regarding the function of a few previously unknown genes in those focused clusters are drawn. Conclusions: The UNCLES method, the M-N scatter plots technique, and the expression data synthesis approach will have wide application for the comprehensive analysis of genomic and other sources of multiple complex biological datasets. Moreover, the derived in-silico-based biological hypotheses represent subjects for future functional studies. Funding: The National Institute for Health Research (NIHR) under its Programme Grants for Applied Research Programme (Grant Reference Number RP-PG-0310-1004)

    Gene encoder: a feature selection technique through unsupervised deep learning-based clustering for large gene expression data

    © 2020, Springer-Verlag London Ltd., part of Springer Nature. Cancer is a severe condition of uncontrolled cell division that results in tumor formation that spreads to other tissues of the body. Therefore, new medications and treatment methods for it are in demand. Classification of microarray data plays a vital role in handling such situations. Relevant gene selection is an important step in the classification of microarray data. This work presents gene encoder, an unsupervised two-stage feature selection technique for the classification of cancer samples. The first stage aggregates three filter methods, namely principal component analysis, correlation, and spectral-based feature selection techniques. Next, a genetic algorithm is used, which evaluates each chromosome using autoencoder-based clustering. The resultant feature subset is used for the classification task. Three classifiers, namely support vector machine, k-nearest neighbors, and random forest, are used in this work to avoid dependency on any one classifier. Six benchmark gene expression datasets are used for the performance evaluation, and a comparison is made with four state-of-the-art related algorithms. Three sets of experiments are carried out to evaluate the proposed method: evaluating the selected features based on sample-based clustering, adjusting optimal parameters, and selecting the better-performing classifier. The comparison is based on accuracy, recall, false positive rate, precision, F-measure, and entropy. The obtained results suggest better performance of the current proposal