
    Computational Approaches to Drug Profiling and Drug-Protein Interactions

    Despite substantial increases in R&D spending within the pharmaceutical industry, de novo drug design has become a time-consuming endeavour, and high attrition rates have led to a long period of stagnation in drug approvals. Given the extreme costs of bringing a drug to market, locating and understanding the reasons for clinical failure is key to future productivity. As part of this PhD, three main contributions were made in this respect. First, the web platform LigNFam enables users to interactively explore similarity relationships between ‘drug-like’ molecules and the proteins they bind. Secondly, two deep-learning-based binding-site comparison tools were developed, competing with the state of the art on benchmark datasets; these models can predict off-target interactions and potential candidates for target-based drug repurposing. Finally, the open-source ScaffoldGraph software was presented for the analysis of hierarchical scaffold relationships; it has already been used in multiple projects, including integration into a virtual screening pipeline to increase the tractability of ultra-large screening experiments. Together with existing tools, these contributions will aid the understanding of drug-protein relationships, particularly in the fields of off-target prediction and drug repurposing, helping to design better drugs faster.

    Partitioning algorithms for induced subgraph problems

    This dissertation introduces the MCSPLIT family of algorithms for two closely related NP-hard problems that involve finding a large induced subgraph contained in each of two input graphs: the induced subgraph isomorphism problem and the maximum common induced subgraph problem. The MCSPLIT algorithms resemble forward-checking constraint programming algorithms, but use problem-specific data structures that allow multiple identical domains to be stored without duplication. These data structures enable fast, simple constraint propagation algorithms and very fast calculation of upper bounds. Versions of these algorithms for both sparse and dense graphs are described and implemented. The resulting algorithms are over an order of magnitude faster than the best existing algorithm for maximum common induced subgraph on unlabelled graphs, and outperform the state of the art on several classes of induced subgraph isomorphism instances. A further advantage of the MCSPLIT data structures is that variables and values are treated identically; this allows us to branch on variables representing vertices of either input graph with no overhead. An extensive set of experiments shows that such two-sided branching can be particularly beneficial if the two input graphs have very different orders or densities. Finally, we turn from subgraphs to supergraphs, tackling the problem of finding a small graph that contains every member of a given family of graphs as an induced subgraph. Exact and heuristic techniques are developed for this problem, in each case using an MCSPLIT algorithm as a subroutine. These algorithms allow us to add new terms to two entries of the On-Line Encyclopedia of Integer Sequences.
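    The core MCSPLIT idea, storing one shared label class instead of many identical per-vertex domains and bounding the search by the sum of min(|L|, |R|) over classes, can be sketched in a few dozen lines. The following is a minimal illustrative Python implementation for small unlabelled undirected graphs; it is not the optimised version from the dissertation, which adds bit-parallel data structures and tuned branching heuristics.

```python
def mcsplit(adj_g, adj_h):
    """Maximum common induced subgraph of two undirected graphs, given as
    {vertex: set_of_neighbours} dicts, using McSplit-style label classes."""
    best = []

    def search(mapping, classes):
        nonlocal best
        if len(mapping) > len(best):
            best = mapping[:]
        # Bound: each class contributes at most min(|L|, |R|) extra pairs.
        if len(mapping) + sum(min(len(L), len(R)) for L, R in classes) <= len(best):
            return
        # Branch on the class whose larger side is smallest.
        idx = min(range(len(classes)),
                  key=lambda i: max(len(classes[i][0]), len(classes[i][1])))
        L, R = classes[idx]
        v = L[0]
        for w in R:
            # Split every class by adjacency to v (in G) and w (in H);
            # vertices with identical connectivity stay in one shared class.
            new_classes = []
            for Lc, Rc in classes:
                for adj in (True, False):
                    L2 = [u for u in Lc if u != v and (u in adj_g[v]) == adj]
                    R2 = [u for u in Rc if u != w and (u in adj_h[w]) == adj]
                    if L2 and R2:
                        new_classes.append((L2, R2))
            search(mapping + [(v, w)], new_classes)
        # Also explore leaving v unmatched.
        rest = classes[:idx] + classes[idx + 1:]
        if len(L) > 1:
            rest.append((L[1:], R))
        search(mapping, rest)

    search([], [(list(adj_g), list(adj_h))])
    return best
```

    Because the two sides of a class play symmetric roles, the same structure supports the two-sided branching described above.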

    Medical translation: a (curricular) internship experience at the School of Medicine

    Master's dissertation in Translation and Multilingual Communication. As part of a growing internationalization effort, the School of Medicine of the University of Minho created the International Affairs Office (IAO), whose main objective is to support all activities carried out in the School's existing international networks and to encourage new initiatives. In addition, it also intends to support the promotion of a global learning environment in which students, teachers and administrative staff can enjoy the transformative privileges of an international and expanding education, thus building a space for dialogue and sharing. With the aim of promoting the internationalization of the School, the School of Medicine promotes and participates in several national and international mobility programmes. The objective of this internship report is to present and describe the work carried out during the curricular internship corresponding to the second semester of the second year of the Master in Translation and Multilingual Communication at the University of Minho. The internship lasted four months, at the International Affairs Office of the School of Medicine, located at UMinho's Gualtar Campus in Braga. The report provides a brief theoretical framework on medical translation, its main characteristics and difficulties, the role of the translator as a specialist in this area, and the role of CAT tools in this field of translation. The advantages and disadvantages of using translation tools, and their usefulness in medical translation, particularly during my internship, are also discussed.

    Fundamentals

    Volume 1 establishes the foundations of this new field. It covers every step from data collection, through summarization and clustering, to the different aspects of resource-aware learning, i.e., hardware, memory, energy, and communication awareness. Machine learning methods are examined with respect to their resource requirements and to how scalability can be enhanced on diverse computing architectures, ranging from embedded systems to large computing clusters.

    The Multi-Maximum and Quasi-Maximum Common Subgraph Problem

    The Maximum Common Subgraph problem has long been known to be NP-hard. Nevertheless, it has countless practical applications, and researchers are still searching for exact solutions and scalable heuristic approaches. Driven by applications in molecular science and cyber-security, we concentrate on the Maximum Common Subgraph among an indefinite number of graphs. We first extend a state-of-the-art branch-and-bound procedure working on two graphs to N graphs. Then, given the high computational cost of this approach, we trade complexity for accuracy and propose a set of heuristics to approximate the exact solution for N graphs. We analyse sequential, parallel multi-core, and parallel many-core (GPU-based) approaches, applying several techniques to decrease contention among threads, improve the workload balance of the different tasks, reduce the computation time, and increase the final result size. We also present several sorting heuristics to order the vertices of the graphs and the graphs themselves. We compare our algorithms with a state-of-the-art method on publicly available benchmark sets. On graph pairs, we are able to speed up the exact computation by a factor of 2, pruning the search space by more than 60%. On sets of more than two graphs, all exact solutions are extremely time-consuming and difficult to apply in many real cases. Our heuristics, in contrast, are far less expensive (with speed-ups of at least 10×), have far better asymptotic complexity (with speed-ups of up to several orders of magnitude in our experiments), and obtain excellent approximations of the maximal solution, recovering 98.5% of the nodes on average.
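    To make the N-graph setting concrete, here is a small illustrative greedy heuristic (not the authors' algorithm): vertices are matched as N-tuples, and a tuple is accepted only if its edge/non-edge pattern against every accepted tuple agrees in all N graphs, so the common subgraph stays induced. The degree-difference ordering is one example of the kind of vertex-sorting heuristic mentioned above.

```python
from itertools import product

def greedy_common_subgraph(graphs):
    """Greedy heuristic for a large common induced subgraph of N graphs,
    each given as a {vertex: set_of_neighbours} dict.  Returns a list of
    N-tuples, one matched vertex per graph."""
    chosen = []
    used = [set() for _ in graphs]  # vertices already matched, per graph
    # Prefer candidate tuples whose vertex degrees are most similar.
    cands = sorted(product(*graphs),
                   key=lambda t: max(len(g[v]) for g, v in zip(graphs, t))
                              - min(len(g[v]) for g, v in zip(graphs, t)))
    for t in cands:
        if any(v in used[i] for i, v in enumerate(t)):
            continue
        # Accept t only if every graph agrees on edge vs non-edge
        # between t and each already-chosen tuple.
        ok = all(len({t[i] in graphs[i][s[i]] for i in range(len(graphs))}) == 1
                 for s in chosen)
        if ok:
            chosen.append(t)
            for i, v in enumerate(t):
                used[i].add(v)
    return chosen
```

    Enumerating all vertex tuples is only viable for small inputs; the point is the consistency check, not the enumeration strategy.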

    Enabling Scalability: Graph Hierarchies and Fault Tolerance

    In this dissertation, we explore two techniques for building scalable algorithms. First, we look at graph problems and show how to exploit an input graph's inherent hierarchy to obtain scalable graph algorithms. The second technique takes a step back from concrete algorithmic problems: we consider node failures in large distributed systems and present techniques to recover from them quickly. In the first part of the dissertation, we investigate how hierarchies in graphs can be used to scale algorithms to large inputs. We develop algorithms for three graph problems based on two approaches to building hierarchies. The first approach reduces instance sizes for NP-hard problems by applying so-called reduction rules, which can be applied in polynomial time. They either find parts of the input that can be solved in polynomial time, or they identify structures that can be contracted (reduced) into smaller structures without loss of information for the specific problem. After solving the reduced instance using an exponential-time algorithm, the previously contracted structures can be uncontracted to obtain an exact solution for the original input. Beyond simple preprocessing, reduction rules can also be used in branch-and-reduce algorithms, where they are applied after each branching step to build a hierarchy of problem kernels of increasing computational hardness. We develop reduction-based algorithms for the classical NP-hard problems Maximum Independent Set and Maximum Cut. The second approach is used for route planning in road networks, where we build a hierarchy of road segments based on their importance for long-distance shortest paths. By considering only important road segments when far from the source and destination, we can substantially speed up shortest-path queries.
    In the second part of this dissertation, we take a step back from concrete graph problems and look at more general problems in high-performance computing (HPC). Due to the ever-increasing size and complexity of HPC clusters, we expect hardware and software failures to become more common in massively parallel computations. We present two techniques that allow applications to recover from failures and resume computation. Both are based on in-memory storage of redundant information and a data distribution that enables fast recovery. The first technique can be used in general-purpose distributed processing frameworks: we identify data that is redundantly available on multiple machines and introduce additional work only for the remaining data that is available on a single machine. The second technique is a checkpointing library engineered for fast recovery, using a data distribution method that achieves balanced communication loads. Both techniques work in settings where computation after a failure is continued with fewer machines than before. This is in contrast to many previous approaches that, in particular for checkpointing, assume systems keep spare resources available to replace failed machines. Overall, we present different techniques that enable scalable algorithms. While some of these techniques are specific to graph problems, we also present tools for fault-tolerant algorithms and applications in a distributed setting. To show that these can be helpful in many different domains, we evaluate them on graph problems and other applications such as phylogenetic tree inference.
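    As a concrete illustration of the reduction-rule idea, the sketch below applies two textbook rules for Maximum Independent Set: an isolated vertex can always be taken, and a degree-one vertex can always be taken at the expense of its single neighbour. These are simple examples of the polynomial-time rules described above, not the full rule set used in the dissertation.

```python
def reduce_mis(adj):
    """Exhaustively apply degree-0 and degree-1 reductions for Maximum
    Independent Set on a {vertex: set_of_neighbours} dict.
    Returns (forced_vertices, residual_graph); the forced vertices belong
    to some optimum solution of the original instance."""
    adj = {v: set(ns) for v, ns in adj.items()}  # work on a copy
    forced = []
    changed = True
    while changed:
        changed = False
        for v in list(adj):
            if v not in adj:
                continue  # deleted earlier in this pass
            if not adj[v]:            # degree 0: always take v
                forced.append(v)
                del adj[v]
                changed = True
            elif len(adj[v]) == 1:    # degree 1: take v, drop neighbour u
                u = adj[v].pop()
                forced.append(v)
                for w in adj[u]:
                    adj[w].discard(u)
                del adj[u]
                del adj[v]
                changed = True
    return forced, adj
```

    On instances with many low-degree vertices, such rules can shrink the kernel dramatically before any exponential-time search is needed.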

    Scalable Graph Algorithms using Practically Efficient Data Reductions


    Performance Assessment of Masonry School Buildings to Seismic and Flood Hazards Using Bayesian Networks

    Performance assessment of schools is an integral part of reducing communities' disaster risk from natural hazards such as earthquakes and floods. In regions of high exposure, these hazards may act concurrently: yearly flood events weaken masonry school buildings, rendering them more vulnerable to frequent earthquake shaking. This recurring damage, combined with other functional losses, ultimately results in disruption to education delivery, affecting vulnerable schoolchildren. This project examines the behaviour of school buildings under seismic and flood loading, and the associated disruption to education, from a structural and functional perspective. The study is based on a case study of school buildings in Guwahati, India, where the majority of the buildings can be classified as confined masonry (CM). The project presents three stages of analysis of the performance of these CM school buildings and the school system, summarised as follows. The first stage involves refinement of the World Bank's Global Library of School Infrastructure taxonomy to widen its scope and fit the CM school typology. This leads to the identification of index buildings: single-storey buildings with flexible diaphragms, differing mainly in their level of seismic design. In the second stage, a novel numerical modelling platform based on the Applied Element Method is used to analyse the index buildings under simplified lateral loads from both hazards. Seismic loading is applied as ground acceleration, while flood loading is applied as hydrostatic pressure. Sequential scenarios are simulated by subjecting the building to varying flood depths followed by lateral ground acceleration, after accounting for material degradation due to past flooding.
    Analytical fragility curves are derived for each case of analysis to quantify physical performance, using a non-linear static procedure (the N2 method) and least-squares regression. The third stage employs a Bayesian network (BN) based methodology to model education disruption at the school-system level arising from the exposure of schools to flood and seismic hazards. The methodology integrates qualitative and quantitative system variables, such as the physical fragility of school buildings (derived in the second stage), accessibility loss, change of use as shelters, and the socio-economic condition of the user community. The performance of the education system under the sequential hazards is quantified through the probabilities of the various states of disruption duration. The BN also explores the effectiveness of non-structural mitigation measures, such as the transfer of students between schools in the system. The framework proves to be a useful tool for decision-making with regard to disaster preparedness and recovery, hence contributing to the development of resilient education systems.
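    The flavour of the BN computation can be illustrated with a deliberately tiny network: hazard states feed a damage node, which feeds a disruption-duration node, and the disruption distribution is obtained by summing out the intermediate variables. The structure and all probabilities below are invented placeholders for illustration, not values from the study.

```python
# Toy BN: flood and earthquake states -> building damage -> disruption length.
# Every probability here is an invented placeholder, not a value from the study.
P_quake = {"none": 0.8, "strong": 0.2}
P_damage = {  # P(damage | flood, quake)
    ("none", "none"):     {"slight": 0.9, "heavy": 0.1},
    ("none", "strong"):   {"slight": 0.5, "heavy": 0.5},
    ("severe", "none"):   {"slight": 0.6, "heavy": 0.4},
    ("severe", "strong"): {"slight": 0.2, "heavy": 0.8},
}
P_disrupt = {  # P(disruption duration | damage)
    "slight": {"short": 0.8, "long": 0.2},
    "heavy":  {"short": 0.3, "long": 0.7},
}

def disruption_given(flood):
    """P(disruption duration | flood state), summing out quake and damage."""
    out = {"short": 0.0, "long": 0.0}
    for quake, pq in P_quake.items():
        for damage, pd in P_damage[(flood, quake)].items():
            for dur, ps in P_disrupt[damage].items():
                out[dur] += pq * pd * ps
    return out
```

    Conditioning on a "severe" flood rather than "none" shifts probability toward long disruption; this is the kind of query the study's network answers at system scale, with many more nodes and evidence-based CPTs.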

    Data-Driven Optimal Sensor Placement for High-Dimensional System Using Annealing Machine

    We propose a novel method for solving the optimal sensor placement problem for high-dimensional systems using an annealing machine. The sensor points are calculated as a maximum clique of a graph whose edge weights are determined by the proper orthogonal decomposition (POD) modes obtained from data, based on the fact that a high-dimensional system usually has a low-dimensional representation. Since the maximum clique problem is equivalent to the independent set problem on the complement graph, the independent set problem is solved using the Fujitsu Digital Annealer. As a demonstration of the proposed method, the pressure distribution induced by the Kármán vortex street behind a square cylinder is reconstructed from the pressure data at the calculated sensor points. The pressure distribution is measured by the pressure-sensitive paint (PSP) technique, an optical flow-diagnostic method. The root mean square errors (RMSEs) between the pressure measured by a pressure transducer and the reconstructed pressures (calculated by the proposed method and by an existing greedy method) at the same location are compared. The proposed method achieves a similar RMSE using approximately one fifth of the number of sensor points required by the existing method. This method is of great importance as a novel approach to the optimal sensor placement problem and as a new engineering application of an annealing machine.
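    The clique-to-independent-set reduction at the heart of the method can be sketched without an annealer: an independent set of the complement graph is exactly a clique of the original graph, so even a greedy independent-set heuristic on the complement yields a clique. The adjacency below is a hypothetical stand-in for the graph the paper builds from POD-mode weights.

```python
def greedy_clique_via_complement(adj, n):
    """Approximate a maximum clique of a graph on vertices 0..n-1 (adjacency
    given as {vertex: set_of_neighbours}) by running a greedy independent-set
    heuristic on the complement graph."""
    comp = {v: {u for u in range(n) if u != v and u not in adj[v]}
            for v in range(n)}
    chosen, forbidden = [], set()
    # Greedily take the vertex with fewest complement-neighbours that is
    # still allowed; its complement-neighbours can never join the set.
    for v in sorted(range(n), key=lambda v: len(comp[v])):
        if v not in forbidden:
            chosen.append(v)
            forbidden |= comp[v] | {v}
    return chosen
```

    Any two chosen vertices are non-adjacent in the complement, hence adjacent in the original graph, so the result is always a valid clique; the annealer replaces this greedy step with a global optimisation of the same independent-set formulation.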