
    Acoustic absorption of hemp-lime construction

    Hemp-lime concrete is a sustainable alternative to standard wall construction materials. It boasts excellent hygrothermal properties, deriving in part from its porous structure. This paper investigates the acoustic properties of hemp-lime concrete, using binders developed from hydrated lime and pozzolans as well as hydraulic and cementitious binders. To assess the acoustic absorption of hemp-lime walls as they are commonly finished in practical construction, wall sections are rendered and the resulting impact on absorption is evaluated. Hemp concretes with lime-pozzolan binders display superior acoustic properties relative to those with more hydraulic binders. These properties are diminished when rendered, as the open surface porosity is affected; however, hemp-lime construction offers the potential to meet standard and guideline targets for spaces requiring acoustic treatment.
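    For context, the quantity studies of this kind report is the sound absorption coefficient. A standard normal-incidence formulation (general acoustics, not taken from this paper) relates it to the complex pressure reflection coefficient R measured at the sample surface, e.g. in an impedance tube:

        \[
          \alpha(f) = 1 - \lvert R(f) \rvert^{2},
          \qquad
          R(f) = \frac{Z_s(f) - \rho_0 c_0}{Z_s(f) + \rho_0 c_0},
        \]

    where \(Z_s\) is the surface impedance of the specimen and \(\rho_0 c_0\) the characteristic impedance of air. Rendering a hemp-lime wall changes \(Z_s\) by closing surface pores, which is consistent with the drop in measured absorption described above.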

    Adaptive assembly of genomes and metagenomes by message passing

    Generally speaking, current processes – industrial, direct-to-consumer, or research-related – yield far more data than humans can manage. Big Data is a trend of its own and concerns itself with the betterment of humankind through a better understanding of processes and systems; the means to that end is leveraging massive amounts of data to comprehend what they contain, mean, and imply. DNA sequencing is such a process and contributes to the discovery of knowledge in genetics and other fields, for example through systems biology. High-throughput DNA sequencing instruments are massively parallel and output unprecedented volumes of data. Computing infrastructures such as supercomputers and cloud platforms are likewise massively parallel by virtue of their distributed nature, and open the door to the analysis of these data. However, computers do not understand natural languages – they must be programmed. Software systems for analysing genomic data on supercomputers must therefore be massively parallel as well. The message passing interface (MPI) allows one to create such tools, and a granular design makes it possible to interleave communication and computation within the processes of a computing system, so that results are produced quickly from data. Herein, a line of products that includes RayPlatform, Ray (with the workflows Ray Meta and Ray Communities for metagenomics) and Ray Cloud Browser is presented. Its main application is scalable (adaptive) assembly and profiling of genomes using message passing.
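    As an illustration of the message-passing style described here, the sketch below distributes k-mer counting across MPI ranks by hashing each k-mer to an owner rank. It is a minimal toy using mpi4py, not Ray's actual design, and the input reads are placeholders:

        # Toy sketch (not Ray's code) of hash-based k-mer ownership over MPI.
        import zlib
        from collections import Counter
        from mpi4py import MPI

        K = 21
        comm = MPI.COMM_WORLD
        rank, size = comm.Get_rank(), comm.Get_size()

        # Hypothetical per-rank input; real inputs would be FASTQ shards.
        local_reads = ["ACGTACGTACGTACGTACGTACGT"] if rank == 0 else []

        # Bucket each k-mer by destination rank. zlib.crc32 is deterministic
        # across processes (unlike Python's salted hash()), so every rank
        # agrees on which rank owns a given k-mer.
        outgoing = [[] for _ in range(size)]
        for read in local_reads:
            for i in range(len(read) - K + 1):
                kmer = read[i:i + K]
                outgoing[zlib.crc32(kmer.encode()) % size].append(kmer)

        # All-to-all message exchange, then each rank counts what it owns.
        received = comm.alltoall(outgoing)
        owned = Counter(kmer for batch in received for kmer in batch)
        print(f"rank {rank} owns {len(owned)} distinct {K}-mers")

    Run with, e.g., `mpiexec -n 4 python kmer_owners.py`; this owner-by-hash routing is the basic pattern behind building a distributed k-mer graph without shared memory.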

    Algorithms for scalable and efficient population genomics and metagenomics

    Microbes strongly impact human health and the ecosystems of which they are a part. Rapid improvements and decreasing costs in sequencing technologies have revolutionized the field of genomics and enabled important insights into microbial genome biology and microbiomes. However, new tools and approaches are needed to facilitate the efficient analysis of large sets of genomes and to better associate genomic features with phenotypic characteristics. Here, we built and utilized several tools for large-scale whole-genome analysis of microbial characteristics important for human health, such as antimicrobial resistance and pathogenicity. Chapters 2 and 3 demonstrate the needs and challenges of population genomics in associating antimicrobial resistance with genomic features. Our results highlight important limitations of reference-database-driven analysis for genotype-phenotype association studies and demonstrate the utility of whole-genome population genomics in uncovering novel genomic factors associated with antimicrobial resistance. Chapter 4 describes PRAWNS, a fast and scalable bioinformatics tool that generates compact pan-genomic features. Existing approaches are unable to meet the needs of large-scale whole-genome analyses, either due to scalability limitations or because the genomic features they generate cannot support a thorough whole-genome assessment. We demonstrate that PRAWNS scales to thousands of genomes and provides a concise collection of genomic features that support downstream analyses. In Chapter 5, we assess whether combining long- and short-read sequencing can expedite the accurate reconstruction of a pathogen genome from a microbial community. We describe the challenges for pathogen detection in current foodborne illness outbreak monitoring. Our results show that the recovery of a pathogen genome can be accelerated by combining long- and short-read sequencing after limited culturing of the microbial community. We evaluated several popular genome assembly approaches and identified areas for improvement. In Chapter 6, we describe SIMILE, a fast and scalable bioinformatics tool that enables the detection of genomic regions shared between several assembled metagenomes. In metagenomics, microbial communities are sequenced directly without culturing. Although metagenomics has furthered our understanding of the microbiome, comparing metagenomic samples is extremely difficult. We describe the need for and challenges of comparing several metagenomic samples and present an approach that facilitates large-scale metagenomic comparisons.
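    The core comparison primitive behind tools of this kind can be illustrated with exact k-mer sharing between two assemblies; the real tools (PRAWNS, SIMILE) use far more elaborate, scalable constructions, so treat this Python fragment purely as a sketch with placeholder contigs:

        # Toy illustration of shared-sequence detection via exact k-mers.
        def kmers(seq, k=31):
            """Set of all k-length substrings of seq."""
            return {seq[i:i + k] for i in range(len(seq) - k + 1)}

        # Placeholder contigs, not real assemblies.
        assembly_a = ["ACGT" * 20, "TTGCA" * 16]
        assembly_b = ["ACGT" * 20, "GGGTTTCCC" * 10]

        kmers_a = set().union(*(kmers(c) for c in assembly_a))
        kmers_b = set().union(*(kmers(c) for c in assembly_b))
        shared = kmers_a & kmers_b
        jaccard = len(shared) / len(kmers_a | kmers_b)
        print(f"{len(shared)} shared 31-mers, Jaccard {jaccard:.3f}")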

    De novo draft assembly of the Botrylloides leachii genome provides further insight into tunicate evolution

    Tunicates are marine invertebrates that compose the closest phylogenetic group to the vertebrates. These chordates present a particularly diverse range of regenerative abilities and life-history strategies; consequently, tunicates provide an extraordinary perspective into the emergence and diversity of these traits. Here we describe the genome sequencing, annotation and analysis of the Stolidobranchian Botrylloides leachii. We have produced a high-quality 159 Mb assembly, covering 82% of the predicted 194 Mb genome. Analyses of genome size, gene number, repetitive elements, ortholog clustering and gene ontology terms show that B. leachii has a genomic architecture similar to that of most solitary tunicates, while other recently sequenced colonial ascidians have undergone genome expansion. In addition, ortholog clustering has identified groups of candidate genes for the study of colonialism and whole-body regeneration. By analysing the structure and composition of conserved gene linkages, we observed examples of cluster breaks and gene dispersions, suggesting that several lineage-specific genome rearrangements occurred during tunicate evolution. We also found lineage-specific gene gains and losses within conserved cell-signalling pathways. Such genetic changes in pathways commonly associated with regeneration and development may underlie some of the diverse regenerative abilities observed in tunicates. Overall, these results provide a novel resource for the study of tunicates and of colonial ascidians.
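    A standard way to summarise the contiguity of a draft assembly such as this one is the N50 statistic (not quoted in the abstract itself); a minimal computation, with placeholder contig lengths, is:

        # N50: the smallest contig length L such that contigs of length >= L
        # together cover at least half of the total assembly.
        def n50(contig_lengths):
            total = sum(contig_lengths)
            running = 0
            for length in sorted(contig_lengths, reverse=True):
                running += length
                if 2 * running >= total:
                    return length
            return 0

        print(n50([100, 80, 60, 40, 20]))  # -> 80 (100 + 80 = 180; 2*180 >= 300)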

    Indexing and analysis of large sequencing collections via k-mer matrices

    The 21st century is bringing a tsunami of data in many fields, especially in bioinformatics. This paradigm shift requires the development of new processing methods capable of scaling up to such data. This work mainly considers massive, terabyte-scale datasets from genomic sequencing. A common way to process these data is to represent them as a set of words of a fixed size, called k-mers. K-mers are widely used as building blocks by many sequencing data analysis techniques. The challenge is to represent the k-mers and their abundances across a large number of datasets. One possibility is the k-mer matrix, where each row is a k-mer associated with a vector of abundances and each column corresponds to a sample. Some k-mers are erroneous due to sequencing errors and must be discarded. The usual technique consists of discarding low-abundance k-mers. On complex datasets such as metagenomes, such a filter is not effective and discards too many k-mers. The holistic view of abundances across samples afforded by the matrix representation also enables a new error-detection procedure for such datasets. In summary, we explore the concept of the k-mer matrix and show its scalability in various applications, from indexing to analysis, and propose different tools for this purpose. On the indexing side, our tools have allowed indexing a large metagenomic dataset from the Tara Ocean project while keeping additional k-mers usually discarded by the classical k-mer filtering technique. The next and important step is to make the index publicly available. On the analysis side, our matrix construction technique speeds up the differential k-mer analysis of a state-of-the-art tool by an order of magnitude.
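    A minimal in-memory rendering of the k-mer matrix idea: rows are k-mers, columns are samples, and cells hold abundances. Production tools work on-disk at terabyte scale, so this Python sketch (with placeholder reads) only shows the shape of the data structure:

        from collections import Counter

        K = 5

        def count_kmers(reads):
            counts = Counter()
            for read in reads:
                for i in range(len(read) - K + 1):
                    counts[read[i:i + K]] += 1
            return counts

        # Placeholder samples standing in for sequencing datasets.
        samples = {
            "sample_1": ["ACGTACGTAC"],
            "sample_2": ["ACGTACGTAC", "TTTTTACGTA"],
        }
        per_sample = {name: count_kmers(reads) for name, reads in samples.items()}
        all_kmers = sorted(set().union(*per_sample.values()))

        # One abundance row per k-mer; seeing the whole row at once is what
        # enables error filtering smarter than a per-sample abundance cutoff.
        for kmer in all_kmers:
            print(kmer, [per_sample[name][kmer] for name in samples])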

    Bayesian and machine learning approaches in metagenomics

    In this doctoral thesis, we present a novel set of bioinformatics tools to address key problems in the field of metagenomics. This set includes a fully probabilistic framework for estimating the number of genomes present at the species level in a metagenomic sample, the use of variational autoencoders as an alternative method for dimensionality reduction of the coverage and tetramer composition of metagenomic samples, and a natural language processing method for compressing gene frequencies in metagenomes for better prediction of their phenotypic traits. The first tool tackles the problem of metagenomic binning. A Bayesian non-parametric method is used in conjunction with a Gaussian mixture model to estimate the number of genomes present more accurately and to cluster the contigs into the appropriate bins. We call this method the DP (Dirichlet Process) algorithm. An attempt was made to improve its accuracy by incorporating extra information from the edges of the assembly graph, but this addition was not included in the final model, as the signal in the data used was too weak. The method is validated on a simulated mock community of 20 genomes and compared against state-of-the-art binners on a simulated community of 100 genomes under different scenarios using different numbers of samples. The results show that it performs at least to the same standard as the state-of-the-art methods, while outperforming them in some scenarios. The method is also applied to a real 11-sample infant gut dataset. The second tool concerns the prediction of phenotypic traits in metagenomes. Here, we build on the idea of using the frequencies of genes annotated with Kyoto Encyclopedia of Genes and Genomes (KEGG) orthologs (KOs) to predict the presence or absence of 83 functional and metabolic traits. We apply the doc2vec algorithm as a dimensionality reduction method to 9407 prokaryotic genomes, experimenting with different compression dimensions and training various machine learning algorithms for trait prediction. We conclude that the dimensionality reduction improves the performance of the classifiers, achieving the best results when combined with L1-regularised logistic regression on 100 dimensions. In addition, we train the classifiers on the uncompressed KO frequencies and identify the traits for which compression offers no improvement, comparing the number of KOs present in each case. The third tool uses variational autoencoders to compress the coverage and tetramer composition of metagenomic samples before binning. We combine the variational autoencoder architecture used in the VAMB binner for dimensionality reduction with the Bayesian non-parametric binning approach presented above. We tested this combination on the same simulated 20-genome mock community as before and concluded that it clusters contigs at the species level more accurately than the DP algorithm alone. However, it does not perform well on real datasets, being unable to identify any 'good' bins as assessed by the percentage of single-copy core genes present. The last part of this work is a case study of the oral microbiome, which is estimated to host over 700 species of bacteria. In this study, we analyze 131 oral metagenomic samples from 68 individuals. We follow an assembly-based approach and then split the analysis in two directions. In the first, the contigs are binned and the abundance of each bin in each sample is calculated. In the second, the contigs are not binned; open reading frames are called and mapped to KEGG genes, and the coverage of each gene in every sample is calculated. We associate these coverages with various metadata and attribute their variation to the presence of different species or KOs.
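    The Dirichlet-process binning idea can be approximated off the shelf with scikit-learn's truncated DP mixture; the sketch below stands in for, and is not, the thesis's DP algorithm, and uses random vectors in place of real tetramer and coverage features:

        # DP-style binning sketch with a truncated Dirichlet-process mixture.
        import numpy as np
        from sklearn.mixture import BayesianGaussianMixture

        rng = np.random.default_rng(0)
        # 300 "contigs" drawn from 3 hypothetical bins in a 10-D feature space.
        features = np.vstack([
            rng.normal(loc=mu, scale=0.3, size=(100, 10))
            for mu in (0.0, 2.0, 4.0)
        ])

        dpgmm = BayesianGaussianMixture(
            n_components=20,  # truncation level; the DP prior empties unused bins
            weight_concentration_prior_type="dirichlet_process",
            covariance_type="diag",
            random_state=0,
        ).fit(features)

        labels = dpgmm.predict(features)
        print("bins actually used:", np.unique(labels).size)

    The appeal of the DP prior is exactly the property the thesis exploits: the number of bins does not have to be fixed in advance.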

    Using MapReduce Streaming for Distributed Life Simulation on the Cloud

    Distributed software simulations are indispensable in the study of large-scale life models but often require the use of technically complex lower-level distributed computing frameworks, such as MPI. We propose to overcome this complexity challenge by applying the emerging MapReduce (MR) model to distributed life simulations and by running such simulations on the cloud. Technically, we design optimized MR streaming algorithms for discrete and continuous versions of Conway’s life according to a general MR streaming pattern. We chose life because it is simple enough to serve as a testbed for MR’s applicability to a-life simulations and general enough to make our results applicable to various lattice-based a-life models. We implement and empirically evaluate our algorithms’ performance on Amazon’s Elastic MR cloud. Our experiments demonstrate that a single MR optimization technique called strip partitioning can reduce the execution time of continuous life simulations by 64%. To the best of our knowledge, we are the first to propose and evaluate MR streaming algorithms for lattice-based simulations. Our algorithms can serve as prototypes in the development of novel MR simulation algorithms for large-scale lattice-based a-life models.
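    One MR generation of Conway’s life can be written as a mapper that scatters neighbour contributions and a reducer that applies the birth/survival rule. The sketch below is a plain restatement of that general pattern (the paper’s optimized, strip-partitioned streaming algorithms differ), with an in-process shuffle standing in for the cluster so it runs locally:

        from collections import defaultdict

        def mapper(live_cells):
            """Emit (neighbour, 1) per live cell, plus (cell, 0) as a liveness marker."""
            for x, y in live_cells:
                yield (x, y), 0
                for dx in (-1, 0, 1):
                    for dy in (-1, 0, 1):
                        if (dx, dy) != (0, 0):
                            yield (x + dx, y + dy), 1

        def reducer(cell, values):
            """B3/S23: live next step with 3 neighbours, or 2 if currently alive."""
            neighbours = sum(values)
            alive = 0 in values
            if neighbours == 3 or (alive and neighbours == 2):
                yield cell

        def life_step(live_cells):
            shuffle = defaultdict(list)  # stands in for the framework's sort/shuffle
            for key, value in mapper(live_cells):
                shuffle[key].append(value)
            return {c for key, vals in shuffle.items() for c in reducer(key, vals)}

        blinker = {(1, 0), (1, 1), (1, 2)}
        print(life_step(blinker))  # -> {(0, 1), (1, 1), (2, 1)}

    In Hadoop streaming, mapper and reducer would be separate scripts reading and writing key-value lines on stdin/stdout; strip partitioning then reduces cross-worker traffic by assigning contiguous strips of the lattice to each mapper.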

    Investigating the Stress and Strain Fields in Porous Synthetic Bone Graft Substitute Materials with Varied Porosity Levels

    PhD Thesis. Porous hydroxyapatite (PHA) ceramic granules have been found to be highly successful synthetic bone graft substitute (BGS) materials, encouraging rapid, good-quality bone healing. Key to the success of these materials is a hierarchical, multi-scale pore structure consisting of both macro pores (larger than 50 µm in diameter), which give the granules their characteristic foam-like structure, and smaller micro pores (less than 20 µm in diameter), which are found within the ‘struts’ or ‘body’ of the foam structure. It has been widely reported that control of the level of total porosity (dominated by the macro pores) has an impact on bone healing in BGS, the rate of remodelling and the nature of bone growth. However, within PHA granules the levels of strut porosity and micro porosity have also been found to be key to the rate and pattern of bone growth. This has been hypothesised to be due to variation in the macro-structure and micro-architecture of the BGS resulting in different levels of strain within the niche environments of the implanted granule masses, which in turn stimulate or suppress bone growth through mechano-transduction pathways. The aim of this study was to develop, simulate and analyse finite element models of PHA granule masses, in order to identify whether changes in the levels of strut and total porosity could alter the patterns of stresses and strains exhibited within granule masses and thereby affect local bone formation. Models for finite element analysis (FEA) were generated from micro-CT scans of cylinders packed with granule masses combining different total and strut porosities. The procedure captured the natural porous architecture in a novel approach to the analysis of PHA. The study demonstrated that PHA granules as a material maintain their heterogeneity and density at different scales and thus lend themselves to homogenisation techniques for creating representative volume elements (RVEs). The analysis incorporated RVEs of different sizes to investigate the continuity of the material behaviour. All the models were energetically validated. They were modelled using a linear elastic model as well as a non-linear plastic one typically used in soil and powder modelling applications. The non-linear Drucker-Prager cap model was utilised, combining a mathematical approach and mechanical testing techniques to obtain the model’s parameters, in an attempt to eliminate the need for extensive mechanical tests. FEA on the representative volumes demonstrated a wholesale change in strain levels and distribution associated with the level of porosity. Changes in strut porosity had a direct effect on the peak strain levels within the porous structures, and the locations of both stress and strain peaks, as well as the fields themselves, favoured the pore waists throughout all simulations, with slight variation in the concentration in response to changes in strut porosity. These observations could explain the differences observed in the structure of bone growth within BGS materials with matched total porosities but varied levels of strut porosity. Moreover, they may also explain the phenomenon whereby bone formation within PHA has been observed to occur simultaneously within a single pore via both endochondral and mesenchymal pathways. These results suggest that the models generated in this PhD could be used to further investigate the effect of structure and strain manipulation to control the rate and quality of bone regeneration within bone graft substitutes.
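    As a back-of-envelope illustration (not the thesis’s micro-CT-based FEA) of why porosity governs strain levels, the Gibson-Ashby open-cell foam scaling makes effective stiffness fall with the square of relative density, so the same applied stress yields much larger average strains at higher total porosity. All numbers below are assumed orders of magnitude, not values from the thesis:

        # Illustrative only: Gibson-Ashby scaling, E_eff ~ E_solid * (1 - p)^2.
        E_SOLID = 100e9  # Pa; rough order of magnitude for dense hydroxyapatite
        STRESS = 1e6     # Pa; illustrative applied stress

        for porosity in (0.5, 0.6, 0.7, 0.8):
            relative_density = 1.0 - porosity
            e_eff = E_SOLID * relative_density ** 2   # open-cell foam, constant ~ 1
            strain = STRESS / e_eff                   # uniaxial linear elasticity
            print(f"porosity {porosity:.0%}: E_eff = {e_eff / 1e9:5.1f} GPa, "
                  f"strain = {strain:.2e}")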