187 research outputs found
Parallel multi-swarm cooperative particle swarm optimization for protein–ligand docking and virtual screening
BACKGROUND: A high-quality docking method can yield large gains in new drug development for comparatively little effort. Over the past few decades, great efforts have been made to develop novel docking programs with high efficiency and accuracy. AutoDock Vina (Vina) is one of these achievements, with improved speed and accuracy compared to AutoDock4. Since it was proposed, some of its variants, such as PSOVina and GWOVina, have also been developed. However, for all these docking programs, there is still considerable room for performance improvement. RESULTS: In this work, we propose a parallel multi-swarm cooperative particle swarm model, in which one master swarm and several slave swarms mutually cooperate and co-evolve. Our experiments show that multi-swarm programs possess better docking robustness than PSOVina. Moreover, the multi-swarm program based on random drift PSO achieves the highest protein–ligand docking accuracy, an outstanding enrichment effect for drug-like active compounds, and the second-best AUC screening accuracy among all the compared docking programs, with less computational cost than most of the other docking programs. CONCLUSION: The proposed multi-swarm cooperative model is a novel algorithmic model suitable for protein–ligand docking and virtual screening. Owing to the coevolution between the master and the slave swarms, the parallel model delivers remarkable docking performance. The source code can be freely downloaded from https://github.com/li-jin-xing/MPSOVina. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12859-022-04711-0
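The master/slave cooperation described in this abstract can be illustrated with a minimal sketch. It is not the MPSOVina implementation: the toy `sphere` objective stands in for a docking scoring function, and the update rule, the coefficients `w`, `c1`, `c2`, `c3`, and the names `Swarm` and `multi_swarm_minimize` are all assumptions made for illustration. The swarms also run sequentially here, not in parallel.

```python
import random
from typing import Callable, List, Optional, Tuple

Vector = List[float]


def sphere(x: Vector) -> float:
    """Toy objective (assumption): stands in for a docking scoring function."""
    return sum(v * v for v in x)


class Swarm:
    def __init__(self, n: int, dim: int, f: Callable[[Vector], float],
                 rng: random.Random) -> None:
        self.f, self.rng = f, rng
        self.pos = [[rng.uniform(-5.0, 5.0) for _ in range(dim)] for _ in range(n)]
        self.vel = [[0.0] * dim for _ in range(n)]
        self.pbest = [p[:] for p in self.pos]
        self.pbest_val = [f(p) for p in self.pos]
        self.gbest, self.gbest_val = self.pos[0][:], float("inf")
        self._refresh_gbest()

    def _refresh_gbest(self) -> None:
        # Only replace the swarm best when a particle has improved on it.
        i = min(range(len(self.pbest)), key=self.pbest_val.__getitem__)
        if self.pbest_val[i] < self.gbest_val:
            self.gbest, self.gbest_val = self.pbest[i][:], self.pbest_val[i]

    def step(self, extra_attractor: Optional[Vector] = None,
             w: float = 0.7, c1: float = 1.4, c2: float = 1.4,
             c3: float = 0.8) -> None:
        for i, x in enumerate(self.pos):
            for d in range(len(x)):
                v = (w * self.vel[i][d]
                     + c1 * self.rng.random() * (self.pbest[i][d] - x[d])
                     + c2 * self.rng.random() * (self.gbest[d] - x[d]))
                if extra_attractor is not None:
                    # Cooperation term: pull towards the best slave position.
                    v += c3 * self.rng.random() * (extra_attractor[d] - x[d])
                self.vel[i][d] = v
                x[d] += v
            val = self.f(x)
            if val < self.pbest_val[i]:
                self.pbest[i], self.pbest_val[i] = x[:], val
        self._refresh_gbest()


def multi_swarm_minimize(f: Callable[[Vector], float], dim: int = 3,
                         n_slaves: int = 3, n_particles: int = 10,
                         iters: int = 50, seed: int = 0
                         ) -> Tuple[Vector, float, List[float]]:
    rng = random.Random(seed)
    master = Swarm(n_particles, dim, f, rng)
    slaves = [Swarm(n_particles, dim, f, rng) for _ in range(n_slaves)]
    history: List[float] = []
    for _ in range(iters):
        for s in slaves:
            s.step()
        best_slave = min(slaves, key=lambda s: s.gbest_val)
        master.step(extra_attractor=best_slave.gbest)
        # Feedback: a slave adopts the master's best when it is better.
        for s in slaves:
            if master.gbest_val < s.gbest_val:
                s.gbest, s.gbest_val = master.gbest[:], master.gbest_val
        history.append(min(sw.gbest_val for sw in [master] + slaves))
    best = min([master] + slaves, key=lambda sw: sw.gbest_val)
    return best.gbest, best.gbest_val, history


if __name__ == "__main__":
    position, value, _ = multi_swarm_minimize(sphere)
    print(position, value)
```

In this sketch the slaves explore independently, the master is additionally attracted to the best slave, and the master's best is fed back to the slaves. That is one plausible reading of "mutually cooperate and co-evolve"; the paper's random drift PSO variant is not reproduced.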
Supervised Design-Space Exploration
Low-cost Very Large Scale Integration (VLSI) electronics have revolutionized daily life and expanded the role of computation in science and engineering. Meanwhile, process-technology scaling has changed VLSI design to an exploration process that strives for the optimal balance among multiple objectives, such as power, performance, and area, i.e. multi-objective Pareto-set optimization. Besides, modern VLSI design has shifted to synthesis-centric methodologies in order to boost the design productivity, which leads to better design quality given limited time and resources. However, current decade-old synthesis-centric design methodologies suffer from: (i) long synthesis tool runtime, (ii) elusive optimal setting of many synthesis knobs, (iii) limitation to one design implementation per synthesis run, and (iv) limited capability of digesting only component-level designs as opposed to holistic system-wide synthesis. These challenges make Design Space Exploration (DSE) with synthesis tools a daunting task for both novice and experienced VLSI designers, thus stagnating the development of more powerful (i.e. more complex) computer systems.
To address these challenges, I propose Supervised Design-Space Exploration (SDSE), an abstraction layer between a designer and a synthesis tool, aiming to autonomously supervise synthesis jobs for DSE. For system-level exploration, SDSE can approximate a system Pareto set given limited information: only lightweight component characterization is required, yet the necessary component synthesis jobs are discovered on-the-fly in order to compose the system Pareto set. For component-level exploration, SDSE can approximate a component Pareto set by iteratively refining the approximation with promising knob settings, guided by synthesis-result estimation with machine-learning models. Combined, SDSE has been applied with the three major synthesis stages, namely high-level, logic, and physical synthesis, to the design of heterogeneous accelerator cores as well as high-performance processor cores. In particular, SDSE has been successfully integrated into the IBM Synthesis Tuning System, yielding 20% better circuit performance than the original system on the design of a 22nm server processor that is currently in production.
Looking ahead, SDSE can be applied to other VLSI designs beyond the accelerator and the programmable cores. Moreover, SDSE opens several research avenues for: (i) new development and deployment platforms of synthesis tools, (ii) large-scale collaborative design engineering, and (iii) new computer-aided design approaches for new classes of systems beyond VLSI chips
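As a small illustration of the multi-objective Pareto-set idea this abstract refers to, the sketch below filters a list of candidate design points down to the non-dominated ones. The three objectives (power, delay, area, all minimized), the sample values, and the function names are assumptions for illustration only; the sketch does not reflect how SDSE itself selects or schedules synthesis jobs.

```python
from typing import List, Tuple

# (power, delay, area): hypothetical objectives, all to be minimized.
Point = Tuple[float, float, float]


def dominates(a: Point, b: Point) -> bool:
    """True if a is no worse than b in every objective and strictly better in one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_set(points: List[Point]) -> List[Point]:
    """Keep the points that no other point dominates (O(n^2) scan)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


if __name__ == "__main__":
    designs = [(1.0, 5.0, 3.0), (2.0, 4.0, 3.0), (2.0, 6.0, 4.0), (1.0, 5.0, 3.5)]
    print(pareto_set(designs))  # [(1.0, 5.0, 3.0), (2.0, 4.0, 3.0)]
```

The quadratic scan is adequate for the handful of points a few synthesis runs would produce; larger design spaces would call for a sorted or incremental approach.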
Computational Investigations of Biomolecular Motions and Interactions in Genomic Maintenance and Regulation
The most critical biochemistry in an organism supports the central dogma of molecular biology: transcription of DNA to RNA and translation of RNA to peptide sequence. Proteins are then responsible for catalyzing, regulating and ensuring the fidelity of transcription and translation. At the heart of these processes lie selective biomolecular interactions and specific dynamics that are necessary for complex formation and catalytic activity. Through advanced biophysical and computational methods, it has become possible to probe these macromolecular dynamics and interactions at the molecular and atomic levels to tease out their underlying physical bases. To the end of a more thorough understanding of these physical bases, we have performed studies to probe the motions and interactions intrinsic to the function of biomolecular complexes: modeling the dual-base flipping strategy of alkylpurine glycosylase D, dynamically tracing evolution and epistasis in the 3-ketosteroid family of nuclear receptors, discovering the allosteric and conformational aspects of transcription regulation in liver receptor homologue 1, leveraging specific contacts in tyrosyl-DNA phosphodiesterase 2 for the development of novel inhibitor scaffolds, and detailing the experimentally observed connection between solvation and sequence-specific binding affinity in PU.1-DNA complexes at the atomic level. While each study seeks to solve system-specific problems, the collection outlines a general and broadly applicable description of the biophysical motivations of biochemical processes
ANALYSIS AND DEVELOPMENT OF A MATHEMATICAL STRUCTURE TO DESCRIBE ENERGY CONSUMPTION OF SENSOR NETWORKS
Sensor networks, collections of several hundred, thousands, or even millions of small devices scattered or placed throughout an area to monitor the environment, have several useful applications. Until recently, the economic cost of development, manufacture, and deployment limited the use of sensor networks to military and government applications. Recent advances in technology provide a means for economical development, deployment, and manufacture of sensor networks. Current methodology designs, then implements and simulates the sensor network, then goes back and redesigns to better meet the specifications. The model developed in this dissertation provides an early indication of what types of solutions will meet the requirements and what types of solutions will not. With this ability, the time required for simulation and proof of concept is reduced, allowing more time and money for design and testing of the real-world system. A model that characterizes the energy consumption of a sensor or RFID network as a whole is extremely beneficial and is needed. The model provides a means to benchmark different types of sensor networks (i.e. different protocols, hardware, software) and to determine which type is the better solution. A model such as this removes the requirement to develop a simulation to compare different types. Using the model reduces the time (and saves money) needed to verify the solution and helps with development, as multiple designs can be quickly tested and compared, possibly at a much earlier stage in the development cycle, allowing a thorough investigation of different design alternatives
Structural and mechanistic studies of DNA repair proteins
Project 1: Small molecule inhibitors of TDP2
DNA Topoisomerase II (TOP2) has important roles in many cellular processes such as DNA replication and transcription, as well as in chromosome segregation. The main enzymatic function of TOP2 is to alter DNA topology and release torsional stress, by transiently introducing a double-strand break (DSB) into a DNA duplex, passing a second intact duplex through the break, and then re-sealing the break. This enzymatic process involves the formation of TOP2-DNA covalent complexes, where the catalytic tyrosine (Y821) is linked to the 5' phosphate group of a substrate DNA. TOP2 'poisons' such as etoposide, doxorubicin and mitozantrone, which have found utility as anti-cancer agents, cause these covalent complexes to accumulate, eventually leading to cell death in rapidly replicating and dividing cells.
As many tumours treated with TOP2 poisons go on to develop chemo-resistance, it is postulated that dual-combination therapy with inhibitors of a second enzyme, 5'-tyrosyl DNA phosphodiesterase-2 (TDP2) may prevent this from occurring; TDP2 acts to remove TOP2-DNA adducts, liberating DNA ends for repair. Inhibitors of TDP2 may also prove useful as a mono-therapy in defined tumour types.
As part of an ongoing collaboration with the Sussex Drug Discovery Centre (SDDC), the aim of the project was to determine high-resolution X-ray crystal structures of TDP2 in complex with a series of deazaflavin inhibitors. The information acquired will guide ongoing structure-based drug design, with the aim of developing and nominating a hit-to-lead compound in the near future.
Project 2: The XRCC1 phosphate-binding pocket binds poly(ADP-ribose)
In living organisms, genomic DNA is constantly exposed to both endogenous and exogenous sources of DNA-damaging agents, which, if the damage is not repaired, can result in the accumulation of mutations and chromosomal aberrations. Cells have evolved a series of DNA-damage repair enzymes and pathways to cope with this perpetual threat. Poly(ADP-ribose) polymerase 1 (PARP1) is the founding member of the large ADP ribosyl transferase superfamily. Among its broad range of functions, PARP1 can detect the presence of both single- and double-strand breaks (SSBs and DSBs) in DNA, upon which it becomes catalytically activated. As a result, PARP1 then synthesises poly(ADP-ribose) polymer using NAD+ as a co-factor, thereby modifying both itself (auto-ribosylation) and other proteins (trans-ribosylation) in the vicinity of the DNA break.
During the initial phases of single-strand break repair (SSBR), the scaffold protein XRCC1 is recruited by PARP1, via an interaction between poly(ADP-ribose) (PAR) and the central BRCT1 domain in XRCC1. However, further investigation is required to elucidate the mechanism by which the BRCT1 domain interacts with PAR. This project aims to address this question
Understanding and Optimizing Python-Based Applications - A Case Study on PYPY
Python is nowadays one of the most popular programming languages. It has been used extensively for rapid prototyping and developing real-world applications. Unfortunately, very few empirical studies have been conducted on Python-based applications. There are various Python implementations (e.g., CPython and PyPy). Among them, PyPy is generally the fastest due to its efficient tracing-based Just-in-Time (JIT) compiler. Understanding how PyPy has evolved and the rationale behind its high performance would be very useful for Python application developers and researchers.
In the first part of the thesis, we conducted a replication study on mining the historical code changes of PyPy and compared our findings against Python-based applications from five other application domains. In the second part, we conducted a detailed empirical study on the performance impact of the JIT configuration settings of PyPy. The findings and the techniques in this thesis will be useful for Python application developers and researchers
Molecular characterization and functional analysis of ORF P1192R from African swine fever virus
Doctoral thesis in Veterinary Sciences, specialty in Biological and Biomedical Sciences. African swine fever virus (ASFV) is a nucleo-cytoplasmic large DNA arbovirus and the
single member of the family Asfarviridae. It infects soft ticks of the genus Ornithodoros as
well as all members of the family Suidae, representing a global threat for pig husbandry for
which there is currently no effective vaccine or treatment. Since the ASFV viral cycle is
mainly cytoplasmic, it has been found/predicted to code for many components of the
replicative and transcriptional machineries. Of these, and based on sequence homology, a
putative type II DNA topoisomerase-coding ORF (P1192R) was identified in the ASFV
genome. DNA topoisomerases are enzymes that modulate the topological state of DNA
molecules. They are ubiquitous and essential, participating in processes such as DNA
replication, recombination and repair and also in transcription. Since ASFV has a large linear
genome, with 170 to 190 kbp depending on the isolate, containing terminal inverted repeats
and covalently closed ends, a type II topoisomerase may be indispensable for viral replication
and/or transcriptional events. The main objectives of this work were to deepen the study on
ORF P1192R and determine if it indeed codes for a type II DNA topoisomerase and, if so, to
characterize its activity. Bioinformatics and phylogenetic analyses showed that ORF P1192R
is highly conserved among the fourteen ASFV isolates analyzed and, although its amino acid
sequence clearly diverges from other type II topoisomerases, the structural organization is
preserved and conserved motifs and domains essential for activity are present. Transient
expression of GFP-pP1192R in COS-7 cells revealed an exclusively cytoplasmic distribution
of the protein, which remained unaltered by treatment with leptomycin B. Using Vero cells or
swine macrophages infected with ASFV isolate Ba71V or L60, respectively, expression of
pP1192R was observed in the late phase of infection, co-localizing with the viral factories,
where the bulk of viral replication and transcription occurs. Heterologous expression of
pP1192R in Saccharomyces cerevisiae demonstrated that it functionally complements a top2
thermo-sensitive mutation and that it exhibits ATP-dependent decatenation activity. The
purified recombinant pP1192R was found to efficiently decatenate kDNA and to processively
relax supercoiled plasmid DNA, which are characteristics of a type II topoisomerase. The
optimal requirements in terms of pH, temperature and salt, divalent ions and ATP
concentrations for pP1192R activity in vitro were determined and its sensitivity to a panel of
topoisomerase poisons and inhibitors was tested. Our results indicate that P1192R may be a
target for studying, and possibly controlling, ASFV transcription and replication.
SUMMARY - African swine fever virus (ASFV) is an icosahedral, nucleo-cytoplasmic, double-stranded DNA arbovirus classified in the genus Asfivirus of the family Asfarviridae, of which it is the only known member. The virus infects ticks of the genus Ornithodoros as well as all members of the family Suidae, and is a global threat to pig farming for which there is currently no vaccine or treatment. Prevention of African swine fever relies on measures that reduce the risk of introducing infected animals or animal products into disease-free regions, whereas outbreak control relies exclusively on measures that include the sanitary slaughter of all susceptible animals in the outbreak area and a ban on the movement and trade of animals.
Although ASFV was initially described as a virus that replicates exclusively in the cytoplasm, it is now known that the host cell nucleus is indispensable in the early phase of infection. Nevertheless, the great majority of the infectious cycle takes place in the cytoplasm of the infected cell, so it is not surprising that, of the 150 to 167 open reading frames (ORFs) identified in the ASFV genome, some encode components of the replication and transcription machineries. Among these, ORF P1192R is predicted, on the basis of amino acid sequence homology, to encode a type II DNA topoisomerase.
DNA topoisomerases are present in all cells and are responsible for modulating the topological state of DNA, which changes during processes such as DNA replication, recombination and repair, as well as transcription. These processes generate torsional strain in DNA molecules which, if left unresolved, can compromise genomic integrity and, consequently, cell viability. All topoisomerases act by creating breaks in DNA through the nucleophilic attack of a catalytic tyrosine residue on the phosphodiester backbone of the DNA molecule, generating a covalent phosphotyrosine bond. Topoisomerases are classified into two types according to how they break the DNA molecule: type I topoisomerases, whose activity is ATP-independent and which generate single-strand breaks in DNA, thereby facilitating unwinding; and type II topoisomerases, which require ATP to generate a break in both DNA strands, through which they pass an intact duplex. Given that ASFV has a large linear genome, of 170 to 190 kilobase pairs depending on the isolate, containing covalently closed terminal inverted repeats, a type II topoisomerase may indeed be essential for viral replication and/or transcription events.
The central objectives of this work were: (i) to carry out an in-depth bioinformatic and phylogenetic study of ASFV ORF P1192R; (ii) to study the protein encoded by this ORF (pP1192R) through cloning, expression in a heterologous system, purification of the recombinant protein and in vitro characterization of its activity; (iii) to determine the effect on the activity of the recombinant protein of a panel of chemical compounds described as topoisomerase inhibitors; (iv) to identify the expression levels and intracellular localization of pP1192R in ASFV-infected cells at different times of infection; (v) to evaluate the effect of site-directed mutations in residues or motifs identified as regulators of the enzymatic activity or subcellular localization of pP1192R, based on the information generated in the bioinformatic studies mentioned above.
ORF P1192R from ASFV isolate L60 was amplified by PCR and cloned, and its nucleotide sequence was determined and used in bioinformatic and phylogenetic analyses. The ORF was found to be highly conserved among the fourteen ASFV isolates whose genomes were available in the databases and, although its amino acid sequence clearly diverges from those of the other type II topoisomerases included in this study, whether of prokaryotic, eukaryotic or viral origin, the structural organization of the protein is preserved and conserved motifs and domains essential for enzymatic activity are present. The study of the cellular localization of pP1192R began with the construction of chimeric plasmids for expressing pP1192R fused to green fluorescent protein (GFP) or to a red variant (RFP). COS-7 cells were transiently transfected with these constructs, and the fusion protein was observed to be distributed exclusively in the cytoplasm. This distribution was not altered by treatment with leptomycin B, which blocks one of the nuclear protein export pathways. In contrast, infection of cells expressing GFP-pP1192R with a Vero-adapted ASFV isolate (Ba71V) induced a redistribution of the fusion protein, which was no longer homogeneously distributed in the cytoplasm but was mainly concentrated in the viral factories from 8 hours post-infection onwards. Using Vero cells infected with isolate Ba71V, used as an infection model, or macrophages derived from swine peripheral blood monocytes (the target cells of the virus in natural infection) infected with the virulent isolate L60, together with an anti-pP1192R serum produced in the course of this work, it was found that viral pP1192R is produced in the intermediate/late phase of infection (detectable from 6 to 8 hours post-infection) and accumulates in the viral factories throughout infection.
Heterologous expression of pP1192R was first attempted in a prokaryotic system based on Escherichia coli. Although large amounts of recombinant protein were obtained, purification was only achieved with denaturing agents, which prevented active protein from being obtained. A new expression system was therefore adopted, based on the yeast Pichia pastoris, which has several advantages over the former, notably that it is a eukaryotic system and thus closer to the context in which pP1192R is expressed under natural conditions. However, no recombinant protein could be obtained in this system, and it was abandoned. Finally, heterologous expression was attempted in the yeast Saccharomyces cerevisiae. In this organism, the use of strains JCW26 and SD117, which carry a temperature-sensitive mutation in the gene encoding the endogenous type II topoisomerase, made it possible to demonstrate, both in vivo through complementation of the temperature-sensitive mutation and in vitro through functional decatenation assays, that pP1192R is indeed a functional type II topoisomerase.
Still using S. cerevisiae as the expression system, recombinant pP1192R was obtained and purified to characterize its activity in functional in vitro assays. pP1192R was found to relax supercoiled DNA, to decatenate catenated DNA and, at high concentrations, to catenate plasmid DNA; no supercoiling activity on relaxed DNA was detected. The optimal operating conditions were also determined in terms of temperature, pH and concentrations of salt (NaCl or KCl), ATP and divalent ions (Mg2+, Mn2+, Zn2+, Cu2+ and Ca2+), and were subsequently used to assess the sensitivity of recombinant pP1192R to a panel of topoisomerase inhibitors, including drugs frequently used as antimicrobial or antitumour agents. Of the compounds tested, those giving the most promising results, i.e., those showing the highest levels of inhibition, were coumermycin A1, doxorubicin, amsacrine and genistein. In contrast, the quinolones, normally used as antibiotics against infections caused by prokaryotic organisms, were among the least effective compounds.
In summary, the results of this work indicate that ORF P1192R is a promising target for studying, and possibly controlling, the replicative and transcriptional processes of African swine fever virus
- …