    Prediction of secondary structures for large RNA molecules

    The prediction of correct secondary structures of large RNAs is one of the unsolved challenges of computational molecular biology. Among the major obstacles is the fact that accurate calculations scale as O(n⁴), so the computational requirements become prohibitive as the sequence length increases. We present a new parallel, multicore, scalable program called GTfold, which is one to two orders of magnitude faster than the de facto standard programs mfold and RNAfold for folding large RNA viral sequences and achieves comparable prediction accuracy. We analyze the algorithm's concurrency and describe the parallelism for a shared-memory environment such as a symmetric multiprocessor or multicore chip. We are seeing a paradigm shift to multicore chips, and parallelism must be explicitly addressed to continue gaining performance with each new generation of systems. We provide a rigorous proof of correctness of an optimized algorithm for internal loop calculations called the internal loop speedup algorithm (ILSA), which reduces the time complexity of internal loop computations from O(n⁴) to O(n³), and show that exact algorithms such as ILSA can be executed with our method in an affordable amount of time. The proof gives insight into solving these kinds of combinatorial problems. We have documented detailed pseudocode of the algorithm for predicting minimum free energy secondary structures, which provides a base for implementing future algorithmic improvements and improved thermodynamic models in GTfold. GTfold is written in C/C++ and is freely available as open source from our website.

    M.S. Committee Chair: Bader, David; Committee Co-Chair: Heitsch, Christine; Committee Member: Harvey, Stephen; Committee Member: Vuduc, Richar
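To give a feel for the dynamic-programming structure behind secondary structure prediction, here is a minimal sketch of Nussinov-style base-pair maximization. This is a deliberately simplified stand-in and not GTfold's algorithm: it counts base pairs instead of minimizing free energy, it has no internal loop model (so nothing like ILSA appears in it), and the function name `max_base_pairs` and the `min_loop` parameter are assumptions made for illustration only.

```python
def max_base_pairs(seq, min_loop=3):
    """Maximum number of nested base pairs in an RNA sequence (Nussinov-style DP).

    Simplified illustration only: it maximizes the pair count rather than
    minimizing free energy, and runs in O(n^3) time.
    """
    # Watson-Crick pairs plus the G-U wobble pair.
    pairs = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}
    n = len(seq)
    if n == 0:
        return 0
    # dp[i][j] = best pair count for the subsequence seq[i..j], inclusive.
    dp = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best = dp[i][j - 1]  # case 1: base j stays unpaired
            # case 2: base j pairs with some k, leaving at least
            # min_loop unpaired bases between k and j.
            for k in range(i, j - min_loop):
                if (seq[k], seq[j]) in pairs:
                    left = dp[i][k - 1] if k > i else 0
                    best = max(best, left + 1 + dp[k + 1][j - 1])
            dp[i][j] = best
    return dp[0][n - 1]


if __name__ == "__main__":
    print(max_base_pairs("GGGAAAUCC"))
```

The table is filled diagonal by diagonal, so every cell on one diagonal depends only on shorter spans. That independence is the kind of concurrency a shared-memory implementation can exploit.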

    Analysis and Implementation of Room Assignment Problem and Cannon's Algorithm on General Purpose Programmable Graphical Processing Units with CUDA

    General-purpose Graphics Processing Units (GP-GPUs) have emerged as a popular computing paradigm for high-performance computing over the last few years. The increased interest in GP-GPUs for parallel computing mirrors the trend in general computing, where multi-core processors have risen as an alternative approach to increasing processor performance. Many applications that were previously accelerated on distributed processing platforms with MPI, or with multithreaded techniques such as OpenMP, are now being investigated to assess their performance on GP-GPU platforms. Since the GP-GPU platform is designed to give higher performance for parallel problems, applications developed for other parallel architectures are good candidates for performance studies on GP-GPUs. The first case study in this research is a GP-GPU implementation of a Simulated Annealing-based solution of the Room Assignment problem using CUDA. The Room Assignment problem attempts to arrange N people in N/2 rooms, taking into consideration each person's preference for a roommate. To evaluate the implementation, it was compared against the serial implementation for problem sizes of 5000, 10000, 15000 and 20000 people. The GP-GPU implementation achieved as much as a 78% higher improvement ratio than the serial version in comparable execution time. The second case study is a GP-GPU implementation of Cannon's Algorithm using CUDA. The GP-GPU implementation is compared with a serial implementation of conventional O(n³) matrix multiplication. The GP-GPU implementation achieved up to a 6.2x speedup over the conventional serial multiplication. The results for both applications with varying problem sizes are presented and discussed.
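The data movement in Cannon's Algorithm can be illustrated with a short serial simulation. The sketch below is not the CUDA implementation described in the abstract: it assumes square matrices and treats every grid cell as one "processor" holding a 1x1 block, and the function name `cannon_matmul` is hypothetical.

```python
def cannon_matmul(a, b):
    """Multiply two n x n matrices by simulating Cannon's algorithm serially.

    Assumption: each grid cell (i, j) stands in for one processor that holds
    a single element (a 1x1 block) of A and of B.
    """
    n = len(a)
    # Initial skew: row i of A rotates left by i, column j of B rotates up by j.
    a_loc = [[a[i][(j + i) % n] for j in range(n)] for i in range(n)]
    b_loc = [[b[(i + j) % n][j] for j in range(n)] for i in range(n)]
    c = [[0] * n for _ in range(n)]
    for _ in range(n):
        # Every cell multiplies its local pair and accumulates the product.
        for i in range(n):
            for j in range(n):
                c[i][j] += a_loc[i][j] * b_loc[i][j]
        # Rotate each row of A left by one and each column of B up by one.
        a_loc = [[a_loc[i][(j + 1) % n] for j in range(n)] for i in range(n)]
        b_loc = [[b_loc[(i + 1) % n][j] for j in range(n)] for i in range(n)]
    return c


if __name__ == "__main__":
    print(cannon_matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]]))
```

After the skew, cell (i, j) holds A[i][k] and B[k][j] for the same k, and each rotation advances k by one, so n rounds cover every term of the dot product. On a GPU the inner double loop would run concurrently across cells.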


    There has been strong interest in modeling a mammalian brain in order to study its architectural and functional principles and to offer tools to neuroscientists and medical researchers for related studies. Artificial Neural Networks (ANNs) are computational models that try to simulate the structure and/or the functional behavior of neurons and process information using the connectionist approach to computation. Hence, ANNs are viable options for such studies. Of the many classes of ANNs, Spiking Neuron Network models (SNNs) have been employed to simulate the mammalian brain, capturing its functionality and inference capabilities. In this class of neuron models, some of the biologically accurate models are the Hodgkin-Huxley (HH) model, the Morris-Lecar (ML) model, the Wilson model, and the Izhikevich model. The HH model is the oldest, the most biologically accurate, and the most compute-intensive of the listed models. The Izhikevich model, a more recent development, is sufficiently accurate and involves the least computation. Accurate modeling of neurons calls for compute-intensive models, and hence single-core processors are not suitable for large-scale SNN simulations due to their serial computation and low memory bandwidth. Graphical Processing Units have been used for general-purpose computing because they offer raw computing power, with a majority of their logic dedicated solely to computation. The work presented in this thesis implements two-level character recognition networks using the four previously mentioned SNN models on Nvidia's Tesla C870 card and investigates performance improvements over the equivalent software implementation on a 2.66 GHz Intel Core 2 Quad. The work probes some of the important parameters, such as the kernel time, the memory transfer time, and the flops offered by the GPU device for the implementations.
In this work, we report speed-ups as high as 576x on a single GPU device for the most compute-intensive, highly biologically realistic Hodgkin-Huxley model. These results demonstrate the potential of GPUs for large-scale, accurate modeling of the mammalian brain. The research in this thesis also presents several optimization techniques and strategies, and discusses the major bottlenecks that must be avoided in order to achieve maximum performance benefits for applications involving complex computations. The research also investigates an initial multi-GPU implementation to study the problem partitioning for simulating biological-scale neuron networks on a cluster of GPU devices.
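To show why the Izhikevich model is the cheapest of the four, the following sketch integrates a single neuron with forward Euler. It is an illustration and not the thesis's GPU code: the parameter values are the commonly published "regular spiking" set (a = 0.02, b = 0.2, c = -65, d = 8), and the function name, the step size, and the input current values are assumptions.

```python
def simulate_izhikevich(current, t_end_ms=1000.0, dt_ms=0.5,
                        a=0.02, b=0.2, c=-65.0, d=8.0):
    """Forward-Euler simulation of one Izhikevich neuron.

    Returns the list of spike times in milliseconds. The default parameters
    are the widely used regular-spiking values; the step size is an assumption.
    """
    v = c          # membrane potential (mV)
    u = b * v      # recovery variable
    spike_times = []
    steps = int(t_end_ms / dt_ms)
    for step in range(steps):
        # Two coupled ODEs: a quadratic voltage equation and a linear recovery.
        v += dt_ms * (0.04 * v * v + 5.0 * v + 140.0 - u + current)
        u += dt_ms * a * (b * v - u)
        if v >= 30.0:
            # Spike detected: record it, then reset v and bump u.
            spike_times.append(step * dt_ms)
            v = c
            u += d
    return spike_times


if __name__ == "__main__":
    print(len(simulate_izhikevich(current=10.0)), "spikes in 1 s")
```

Each time step costs only a handful of multiply-adds per neuron. The Hodgkin-Huxley model needs several exponential evaluations per step for its gating variables, which is why it gains the most from GPU acceleration.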

    Acceleration of Spiking Neural Networks on Multicore Architectures

    The human cortex is the seat of learning and cognition. Biological-scale implementations of cortical models have the potential to provide significantly more powerful problem-solving capabilities than traditional computing algorithms. The large-scale implementation and design of these models has attracted significant attention recently. High-performance implementations of the models are needed to enable such large-scale designs. This thesis examines the acceleration of the spiking neural network class of cortical models on several modern multicore processors. The models examined are the Izhikevich, Wilson, Morris-Lecar, and Hodgkin-Huxley models. The architectures examined are the STI Cell, Sun UltraSPARC T2+, and Intel Xeon E5345. Results indicate that these modern multicore processors can provide significant speed-ups and thus are useful in developing large-scale cortical models. The models are then implemented on a 50 TeraFLOPS, 336-node PlayStation 3 cluster. Results indicate that the models scale well on this cluster and can emulate 10⁸ neurons and 10¹⁰ synapses. These numbers are comparable to the large-scale cortical model implementation studies performed by IBM using the Blue Gene/L supercomputer. This study indicates that a cluster of PlayStation 3s can provide an economical, yet powerful, platform for simulating large-scale biological models.

    Investigation of service selection algorithms for grid services

    Grid computing has emerged as a global platform to support organizations in the coordinated sharing of distributed data, applications, and processes. Grid computing has also leveraged web services to define standard interfaces for Grid services, adopting the service-oriented view. Consequently, there have been significant efforts to offer applications capable of tackling computationally intensive problems as services on the Grid. To ensure that the available services are assigned efficiently to the high volume of incoming requests, it is important to have a robust service selection algorithm. The selection algorithm should not only increase access to the distributed services, promoting operational flexibility and collaboration, but should also allow service providers to scale efficiently to meet a variety of demands while adhering to current Quality of Service (QoS) standards. In this research, two service selection algorithms are proposed: the Particle Swarm Intelligence based Service Selection Algorithm (PSI Selection Algorithm), based on the Multiple Objective Particle Swarm Optimization algorithm using the Crowding Distance technique, and the Constraint Satisfaction based Selection (CSS) algorithm. The proposed selection algorithms are designed to achieve the following goals: handling a large number of incoming requests simultaneously; achieving high match scores in the case of competitive matching of similar types of incoming requests; assigning services efficiently to all the incoming requests; giving service requesters the flexibility to provide multiple service selection criteria based on a QoS metric; and selecting the appropriate services for the incoming requests within a reasonable time. Next, the two algorithms are verified against a standard assignment problem algorithm, the Munkres algorithm. The feasibility and the accuracy of the proposed algorithms are then tested using various evaluation methods.
These evaluations are based on various real-world scenarios to check the accuracy of the algorithms, which is primarily based on how closely the requests are matched to the available services according to the QoS parameters provided by the requesters.
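Verifying a selection heuristic against an optimal assignment can be sketched as follows. This is not the Munkres algorithm and not the thesis's evaluation code: it is a brute-force reference that is only practical for a handful of requests, and the match-score matrix and the function name `best_assignment` are made up for illustration.

```python
from itertools import permutations


def best_assignment(score):
    """Return (total, assignment) maximizing the summed match score.

    score[r][s] is a hypothetical match score of request r against service s.
    Brute force over all permutations: a reference baseline for tiny inputs,
    standing in for a polynomial-time method such as Munkres.
    """
    n = len(score)
    best_total, best_perm = None, None
    for perm in permutations(range(n)):
        # perm[r] is the service assigned to request r.
        total = sum(score[r][perm[r]] for r in range(n))
        if best_total is None or total > best_total:
            best_total, best_perm = total, perm
    return best_total, list(best_perm)


if __name__ == "__main__":
    scores = [[9, 2, 7],
              [6, 4, 3],
              [5, 8, 1]]
    print(best_assignment(scores))
```

A heuristic's result can then be compared with `best_assignment` on small instances, for example by reporting the ratio of the heuristic's total score to the optimal total.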

    Using delay differential equations in models of cardiac electrophysiology

    In cardiac physiology, electrical alternans is a phenomenon characterized by long-short alternations in the action potential duration of cardiac myocytes that give rise to complex spatiotemporal dynamics in tissue. Experiments and clinical measurements indicate that alternans can be a precursor of life-threatening arrhythmias, such as cardiac fibrillation. Despite the importance of alternans in the study of cardiac disease, many mathematical models developed to describe cardiac electrophysiology at the cellular level are not able to produce this phenomenon. As a potential remedy to this deficiency, we introduce short time delays in some formulations of existing cardiac cell models that are based on Ordinary Differential Equations (ODEs). Many processes within cardiac cells involve delays in sensing and responding to changes. In addition, delay differential equations (DDEs) are known to give rise to complex dynamical properties in mathematical models. In biological modeling, DDEs have been applied to epidemiology, population dynamics, immunology, and neural networks. Therefore, DDEs can potentially represent mechanisms that result in complex dynamics both at the cellular level and at the tissue level. In this thesis, we propose DDE-based formulations for ion channel models based on the Hodgkin-Huxley formalism that can induce alternans in single-cell simulations in many models found in the literature. We also show that these modifications can destabilize spiral waves and produce spiral breakups in two-dimensional simulations, which is a typical model of cardiac fibrillation. However, the new DDE-based formulations introduce new computational challenges due to the need for storing and retrieving past values of variables. Therefore, we present novel numerical methods to overcome these challenges and enable efficient DDE-based studies at the tissue level in standard computational environments.
We find that the proposed methods decrease memory usage by up to 95% in cardiac tissue simulations compared to straightforward history management algorithms available in widely used DDE solvers. Thus, the new DDE-based models and the efficient numerical methods proposed in this thesis contribute to the study of fatal cardiac arrhythmias through computational modeling.
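The history-storage cost of a DDE can be illustrated with a fixed-step integrator that keeps only the samples it still needs. This is a generic sketch and not the numerical method proposed in the thesis: it assumes a scalar equation, a constant history for t <= 0, a delay that is a whole multiple of the step size, and a hypothetical function name `integrate_dde`.

```python
from collections import deque


def integrate_dde(f, history_value, tau, dt, t_end):
    """Forward-Euler integration of x'(t) = f(x(t), x(t - tau)).

    Assumptions: constant history x(t) = history_value for t <= 0, and
    tau >= dt with tau / dt close to an integer. Only the last `lag` samples
    are stored, instead of the full trajectory.
    Returns the final value and the number of stored past samples.
    """
    lag = round(tau / dt)
    # Ring buffer holding x(t - tau), ..., x(t - dt); oldest value on the left.
    past = deque([history_value] * lag, maxlen=lag)
    x = history_value
    for _ in range(round(t_end / dt)):
        x_delayed = past[0]   # x(t - tau)
        past.append(x)        # maxlen makes the deque drop the oldest sample
        x = x + dt * f(x, x_delayed)
    return x, len(past)


if __name__ == "__main__":
    # x'(t) = -x(t - 1) with x = 1 for t <= 0 gives x(t) = 1 - t on [0, 1].
    print(integrate_dde(lambda x, xd: -xd, 1.0, tau=1.0, dt=0.01, t_end=1.0))
```

The buffer length depends only on tau / dt, not on the simulated time span. In a tissue simulation every grid point needs its own buffer for every delayed variable, which is why history management dominates memory use.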

    Models of coupled smooth muscle and endothelial cells

    Impaired mass transfer characteristics of blood-borne vasoactive species such as ATP in regions such as an arterial bifurcation have been hypothesized as a prospective mechanism in the aetiology of atherosclerotic lesions. Arterial endothelial cells (ECs) and smooth muscle cells (SMCs) respond differentially to altered local hemodynamics and produce coordinated macro-scale responses via intercellular communication. Using a computationally designed arterial segment comprising large populations of mathematically modelled, coupled ECs and SMCs, we investigate their response to spatial gradients of blood-borne agonist concentrations and the effect of micro-scale driven perturbation on the macro-scale. By altering homocellular (between the same cell type) and heterocellular (between different cell types) intercellular coupling, we simulated four cases of normal and pathological arterial segments experiencing an identical gradient in the concentration of the agonist. Results show that the heterocellular calcium (Ca²⁺) coupling between ECs and SMCs is important in eliciting a rapid response when the vessel segment is stimulated by the agonist gradient. In the absence of heterocellular coupling, homocellular Ca²⁺ coupling between smooth muscle cells is necessary for the axial propagation of Ca²⁺ waves from downstream to upstream cells. Desynchronized intracellular Ca²⁺ oscillations in coupled smooth muscle cells are mandatory for this propagation. Upon decoupling the heterocellular membrane potential, the arterial segment loses the inhibitory effect of endothelial cells on the Ca²⁺ dynamics of the underlying smooth muscle cells. The full system, comprising hundreds of thousands of coupled nonlinear ordinary differential equations, was simulated on the massively parallel Blue Gene architecture. The use of massively parallel computational architectures shows the capability of this approach to address macro-scale phenomena driven by elementary micro-scale components of the system.