55 research outputs found

    Inference of Many-Taxon Phylogenies

    Phylogenetic trees are tree topologies that represent the evolutionary history of a set of organisms. In this thesis, we address computational challenges related to the analysis of large-scale datasets with Maximum Likelihood-based phylogenetic inference. We have approached this using different strategies: reduction of memory requirements, reduction of running time, and reduction of man-hours.

    Pertanika Journal of Science & Technology



    High-Performance and Power-Aware Graph Processing on GPUs

    Graphs are a common representation in many problem domains, including engineering, finance, medicine, and scientific applications. Many problems map to very large graphs, often involving millions of vertices. Even though very efficient sequential implementations of graph algorithms exist, they become impractical when applied to such very large real-world graphs. On the other hand, graphics processing units (GPUs) have become widespread architectures, as they provide massive parallelism at low cost. Parallel execution on GPUs may achieve speedups of up to three orders of magnitude with respect to the sequential counterparts. Nevertheless, accelerating efficient and optimized sequential algorithms and porting (i.e., parallelizing) their implementations to such many-core architectures is a very challenging task. The task is made even harder since energy and power consumption are becoming constraints in addition to, or in some cases as an alternative to, performance. This work aims at developing a platform that provides (I) a library of parallel, efficient, and tunable implementations of the most important graph algorithms for GPUs, and (II) an advanced profiling model to analyze both the performance and the power consumption of the algorithm implementations. The platform's goal is twofold. Through the library, it aims at saving development effort in the parallelization task through a primitive-based approach. Through the profiling framework, it aims at customizing such primitives by considering both the architectural details and the target efficiency metrics (i.e., performance or power).
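    As a rough illustration of the primitive-based approach such a library might expose (a sketch, not code from this work), consider level-synchronous BFS over a graph stored in compressed sparse row (CSR) form: the frontier-expansion step is exactly the kind of data-parallel primitive that can be offloaded to a GPU, since every frontier vertex can be expanded independently.

    ```python
    # Hypothetical sketch (not from the cited work): level-synchronous BFS
    # over a CSR graph. The frontier-expansion loop is the "primitive" a
    # GPU library would parallelize, one thread per frontier vertex.

    def bfs_levels(row_offsets, col_indices, source):
        """Return the BFS level of every vertex (-1 if unreachable)."""
        n = len(row_offsets) - 1
        levels = [-1] * n
        levels[source] = 0
        frontier = [source]
        depth = 0
        while frontier:
            depth += 1
            next_frontier = []
            # Primitive: expand every frontier vertex (data-parallel on a GPU).
            for u in frontier:
                for v in col_indices[row_offsets[u]:row_offsets[u + 1]]:
                    if levels[v] == -1:
                        levels[v] = depth
                        next_frontier.append(v)
            frontier = next_frontier
        return levels

    # CSR encoding of a small directed graph: 0->1, 0->2, 1->3, 2->3, 3->4
    row_offsets = [0, 2, 3, 4, 5, 5]
    col_indices = [1, 2, 3, 3, 4]
    ```

    Tuning such a primitive (e.g., how work is mapped to threads for high- versus low-degree vertices) is where the profiling framework described above would come in.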

    Fundamentals

    Volume 1 establishes the foundations of this new field. It goes through all the steps from data collection, through summarization and clustering, to different aspects of resource-aware learning, i.e., hardware, memory, energy, and communication awareness. Machine learning methods are inspected with respect to their resource requirements and how to enhance scalability on diverse computing architectures, ranging from embedded systems to large computing clusters.

    Development and validation of in silico tools for efficient library design and data analysis in high throughput screening campaigns

    My PhD project findings have their major application in the early phase of the drug discovery process. In particular, we have developed and validated two computational tools (Molecular Assemblies and LiGen) to support the hit finding and hit-to-lead phases. I report here novel methods to first design chemical libraries optimized for HTS and then profile them for a specific target receptor or enzyme. I also analyzed the generated biochemical data in order to obtain robust SARs and to select the most promising hits for follow-up. The described methods support the iterative process of validated hit series optimization up to the identification of a lead.

    In chapter 3, Ligand Generator (LiGen), a de novo tool for structure-based virtual screening, is presented. The development of LiGen is based on a collaboration among Dompé Farmaceutici SpA, CINECA and the University of Parma. In this multidisciplinary group, the integration of different skills has allowed the development, from scratch, of a virtual screening tool able to compete in terms of performance with long-standing, well-established molecular docking tools such as Glide, Autodock and PLANTS. LiGen, using a novel docking algorithm, is able to perform flexible ligand docking without performing a conformational sampling. LiGen also has other distinctive features with respect to other molecular docking programs:
    • LiGen uses the inverse pharmacophore derived from the binding site to identify the putative bioactive conformation of the molecules, thus avoiding the evaluation of molecular conformations which do not match the key features of the binding site.
    • LiGen implements a de novo molecule builder based on the accurate definition of chemical rules taking account of building block (reagent) reactivity.
    • LiGen is natively a multi-platform, portable C++ code designed for HPC applications and optimized for the most recent hardware architectures, such as the Xeon Phi accelerators.
    Chapter 3 also reports the further development and optimization of the software, starting from the results obtained in the first optimization step performed to validate the software and to derive the default parameters.

    In chapter 4, the application of LiGen in the discovery and optimization of novel inhibitors of the complement factor 5 receptor (C5aR) is reported. Briefly, the C5a anaphylatoxin acting on its cognate G protein-coupled receptor C5aR is a potent pronociceptive mediator in several models of inflammatory and neuropathic pain. Although there has long been interest in the identification of C5aR inhibitors, their development has been complicated, as is the case with many peptidomimetic drugs, mostly due to the poor drug-like properties of these molecules. Herein, we report the de novo design of a potent and selective C5aR noncompetitive allosteric inhibitor, DF2593A. The design of DF2593A was guided by the hypothesis that an allosteric site, the “minor pocket”, previously characterized in CXCR1 and CXCR2, could be functionally conserved in the GPCR class. DF2593A potently inhibited C5a-induced migration of human and rodent neutrophils in vitro. Moreover, oral administration of DF2593A effectively reduced mechanical hyperalgesia in several models of acute and chronic inflammatory and neuropathic pain in vivo, without any apparent side effects.

    Chapter 5 describes another tool: Molecular Assemblies (MA), a novel metric based on a hierarchical representation of the molecule, built from different representations of its scaffold and a set of pruning rules. The algorithm used by MA, defining a metric (a set of rules) a priori, creates a representation of the chemical structure through hierarchical decomposition of the scaffold into fragments in a pathway-invariant way (this feature is novel with respect to the other algorithms reported in the literature). Such structure decomposition is applied to nine hierarchical representations of the scaffold of the reference molecule, differing in the content of structural information: atom typing and bond order (this feature is also novel with respect to the other algorithms reported in the literature). The algorithm (metric) generates a multi-dimensional hierarchical representation of the molecule. Applied to a library of compounds, this descriptor is able to extract structural (molecules having the same scaffold, wireframe or framework) and substructural (molecules having the same fragments in common) relations among all the molecules. Finally, this method generates relations among molecules based on identities (scaffolds or fragments). Such an approach produces a unique representation of the reference chemical space, not biased by the threshold used to define the similarity cut-off between two molecules. This is in contrast to other methods, which generate representations based on similarities. The MA procedure, retrieving all scaffold representations, fragments and fragmentation patterns (according to the predefined rules) from a molecule, creates a molecular descriptor useful for several cheminformatics applications:
    • Visualization of the chemical space. The scaffold relations (Figure 7) and the fragmentation patterns can be plotted using a network representation. The obtained graphs are useful depictions of the chemical space, highlighting the relations that occur among the molecules in a two-dimensional space.
    • Clustering of the chemical space. The relations among the molecules are based on identities. This means that the scaffold representations and their fragments can be used as a hierarchical clustering method. This descriptor produces clusters that are independent of the number of, and similarity among, closest neighbors, because belonging to a cluster is a property of the single molecule (Figure 8). This intrinsic feature makes scaffold-based clustering much faster than other methods in producing “stable” clusters: adding or removing molecules increases or decreases the number of clusters and adds or removes relations among the clusters, but these changes do not affect the cluster membership or the relations of the other molecules in the dataset.
    • Generation of scaffold-based fingerprints. The descriptor can be used as a fingerprint of the molecule and to generate a similarity index able to compare single molecules, or to compare the diversity of two libraries as a whole.

    Chapter 6 reports an application of MA in the design of a diverse, drug-like, scaffold-based library optimized for HTS campaigns. A well-designed, sizeable and properly organized chemical library is a fundamental prerequisite for any HTS project. Building a collection of chemical compounds with high chemical diversity was the aim of the Italian Drug Discovery Network (IDDN) initiative. A structurally diverse collection of about 200,000 chemical molecules was designed and built, taking into account practical aspects related to experimental HTS procedures. Algorithms and procedures were developed and implemented to address compound filtering, selection, clustering and plating. Chapter 7 collects concluding remarks and plans for the further development of the tools.
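    The stability property of identity-based clustering can be sketched in a few lines (a hedged illustration, not the MA implementation: the scaffold function here is a toy lookup standing in for the hierarchical decomposition rules defined in chapter 5). Each molecule maps to a canonical scaffold key, so cluster membership is a property of the single molecule rather than of its neighbors.

    ```python
    # Hypothetical sketch (not the actual MA code): identity-based clustering.
    # Membership depends only on the molecule's own scaffold key, so adding
    # or removing compounds never reshuffles the clusters of the others.

    from collections import defaultdict

    def cluster_by_scaffold(molecules, scaffold_of):
        """Group molecule names by the scaffold key returned by scaffold_of."""
        clusters = defaultdict(list)
        for name in molecules:
            clusters[scaffold_of(name)].append(name)
        return dict(clusters)

    # Toy data: an assumed molecule -> scaffold mapping for illustration.
    SCAFFOLDS = {
        "mol-A": "benzene",
        "mol-B": "benzene",
        "mol-C": "quinoline",
    }

    clusters = cluster_by_scaffold(SCAFFOLDS, SCAFFOLDS.get)
    # Adding a new molecule only creates (or extends) its own cluster;
    # the existing "benzene" and "quinoline" clusters are untouched.
    SCAFFOLDS["mol-D"] = "naphthalene"
    ```

    This is why similarity-threshold choices never bias the result: no pairwise comparison between neighbors is ever performed.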

    Computational Methods in Science and Engineering : Proceedings of the Workshop SimLabs@KIT, November 29 - 30, 2010, Karlsruhe, Germany

    In this proceedings volume we provide a compilation of contributions covering, in equal measure, applications from different research fields and ranging from capacity up to capability computing. Besides classical computing aspects such as parallelization, the focus of these proceedings is on multi-scale approaches and methods for tackling algorithm and data complexity. Practical aspects regarding the usage of the HPC infrastructure and the tools and software available at the SCC are also presented.