703 research outputs found
On the role of metaheuristic optimization in bioinformatics
Metaheuristic algorithms are employed to solve complex and large-scale optimization problems in many different fields, from transportation and smart cities to finance. This paper discusses how metaheuristic algorithms are applied to solve different optimization problems in the area of bioinformatics. While the text provides references to many optimization problems in the area, it focuses on those that have attracted the most interest from the optimization community. Among the problems analyzed, the paper discusses in more detail the molecular docking problem, protein structure prediction, phylogenetic inference, and several string problems. In addition, references to other relevant optimization problems are given, including those related to medical imaging or gene selection for classification. From this analysis, the paper derives insights on research opportunities for the Operations Research and Computer Science communities in the field of bioinformatics.
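The string problems mentioned in the abstract, such as the closest-string problem, make compact illustrations of how a metaheuristic searches a discrete space. Below is a minimal hill-climbing sketch, not any specific algorithm from the paper; the DNA alphabet, instance and iteration budget are illustrative choices.

```python
import random

def max_hamming(candidate, strings):
    # Objective of the closest-string problem: worst-case Hamming distance
    # between the candidate and any string in the set (to be minimized).
    return max(sum(a != b for a, b in zip(candidate, s)) for s in strings)

def hill_climb(strings, alphabet="ACGT", iters=2000, seed=0):
    # Single-solution hill climber: mutate one random position and keep
    # the move whenever the objective does not get worse.
    rng = random.Random(seed)
    n = len(strings[0])
    current = [rng.choice(alphabet) for _ in range(n)]
    score = max_hamming(current, strings)
    for _ in range(iters):
        i = rng.randrange(n)
        old = current[i]
        current[i] = rng.choice(alphabet)
        new_score = max_hamming(current, strings)
        if new_score <= score:
            score = new_score      # accept improving or sideways moves
        else:
            current[i] = old       # revert worsening moves
    return "".join(current), score
```

Accepting sideways moves lets the search drift along plateaus of the max-based objective instead of stalling at the first flat region.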
A Comprehensive Survey on Particle Swarm Optimization Algorithm and Its Applications
Particle swarm optimization (PSO) is a heuristic global optimization method, originally proposed by Kennedy and Eberhart in 1995. It is now one of the most commonly used optimization techniques. This survey presents a comprehensive investigation of PSO. On one hand, we cover advances in PSO, including its modifications (quantum-behaved PSO, bare-bones PSO, chaotic PSO, and fuzzy PSO), population topologies (fully connected, von Neumann, ring, star, random, etc.), hybridizations (with the genetic algorithm, simulated annealing, tabu search, artificial immune systems, the ant colony algorithm, the artificial bee colony, differential evolution, harmony search, and biogeography-based optimization), extensions (to multiobjective, constrained, discrete, and binary optimization), theoretical analysis (parameter selection and tuning, and convergence analysis), and parallel implementations (in multicore, multiprocessor, GPU, and cloud computing forms). On the other hand, we survey applications of PSO in the following nine fields: electrical and electronic engineering, automation control systems, communication theory, operations research, mechanical engineering, fuel and energy, medicine, chemistry, and biology. It is hoped that this survey will be beneficial for researchers studying PSO algorithms.
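The canonical global-best PSO that these variants extend can be sketched in a few lines. This is a generic textbook formulation, not a method prescribed by the survey; the inertia weight and acceleration coefficients below are common defaults, and the sphere function is an illustrative benchmark.

```python
import random

def pso(f, dim, n_particles=30, iters=200, w=0.7, c1=1.5, c2=1.5,
        bounds=(-5.0, 5.0), seed=0):
    # Global-best PSO: inertia weight w, cognitive coefficient c1 (pull
    # toward each particle's own best), social coefficient c2 (pull
    # toward the swarm's best). Positions are clamped to `bounds`.
    rng = random.Random(seed)
    lo, hi = bounds
    xs = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vs = [[0.0] * dim for _ in range(n_particles)]
    pbest = [x[:] for x in xs]
    pbest_f = [f(x) for x in xs]
    g = min(range(n_particles), key=lambda i: pbest_f[i])
    gbest, gbest_f = pbest[g][:], pbest_f[g]
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vs[i][d] = (w * vs[i][d]
                            + c1 * r1 * (pbest[i][d] - xs[i][d])
                            + c2 * r2 * (gbest[d] - xs[i][d]))
                xs[i][d] = min(hi, max(lo, xs[i][d] + vs[i][d]))
            fx = f(xs[i])
            if fx < pbest_f[i]:
                pbest[i], pbest_f[i] = xs[i][:], fx
                if fx < gbest_f:
                    gbest, gbest_f = xs[i][:], fx
    return gbest, gbest_f

def sphere(x):
    # Simple convex benchmark: global minimum 0 at the origin.
    return sum(v * v for v in x)
```

With w = 0.7 and c1 = c2 = 1.5 the update satisfies the usual convergence conditions, so the swarm contracts onto the best region found.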
Knowledge management overview of feature selection problem in high-dimensional financial data: Cooperative co-evolution and Map Reduce perspectives
The term big data characterizes the massive amounts of data generated by advanced technologies in different domains, using the 4Vs (volume, velocity, variety, and veracity) to indicate the amount of data that can only be processed via computationally intensive analysis, the speed of its creation, the different types of data, and their accuracy. High-dimensional financial data, such as time-series and space-time data, contain a large number of features (variables) while having a small number of samples, and are used to measure various real-time business situations for financial organizations. Such datasets are normally noisy, complex correlations may exist between their features, and many domains, including finance, lack the analytic tools to mine the data for knowledge discovery because of the high dimensionality. Feature selection is an optimization problem: find a minimal subset of relevant features that maximizes classification accuracy and reduces computation. Traditional statistical feature selection approaches are not adequate to deal with the curse of dimensionality associated with big data. Cooperative co-evolution, a metaheuristic algorithm that takes a divide-and-conquer approach, decomposes high-dimensional problems into smaller sub-problems. Further, MapReduce, a programming model, offers a ready-to-use distributed, scalable, and fault-tolerant infrastructure for parallelizing the developed algorithm. This article presents a knowledge management overview of evolutionary feature selection approaches, state-of-the-art cooperative co-evolution and MapReduce-based feature selection techniques, and future research directions.
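The divide-and-conquer idea behind cooperative co-evolution can be illustrated with a toy sketch: the feature indices are partitioned into sub-problems, each sub-population evolves its own slice of the selection mask, and candidates are evaluated in the context of the best slices of the other groups. The grouping scheme, mutation rate and fitness function here are illustrative stand-ins, not the article's MapReduce-based method (which would distribute these evaluations and score masks with a real classifier).

```python
import random

def cc_feature_selection(fitness, n_features, n_groups=4, gens=30,
                         pop_size=10, seed=0):
    # Cooperative co-evolution sketch: split the feature indices into
    # n_groups sub-problems and evolve one sub-population per group.
    rng = random.Random(seed)
    groups = [list(range(g, n_features, n_groups)) for g in range(n_groups)]
    pops = [[[rng.random() < 0.5 for _ in grp] for _ in range(pop_size)]
            for grp in groups]
    best = [pop[0][:] for pop in pops]  # current best sub-mask per group

    def assemble(g, sub):
        # Build a full mask from one candidate sub-mask plus the best
        # sub-masks of every other group (the "cooperative" context).
        mask = [False] * n_features
        for gi, grp in enumerate(groups):
            part = sub if gi == g else best[gi]
            for bit, idx in zip(part, grp):
                mask[idx] = bit
        return mask

    for _ in range(gens):
        for g, pop in enumerate(pops):
            scored = sorted(pop, key=lambda s: fitness(assemble(g, s)))
            best[g] = scored[0][:]
            # Replace the worst half with mutated copies of the best half.
            half = pop_size // 2
            pop[:] = scored[:half] + [
                [bit ^ (rng.random() < 0.1) for bit in scored[i % half]]
                for i in range(half)]
    return assemble(0, best[0])
```

The round-robin loop over groups is what keeps each sub-problem small while the full mask is always evaluated as a whole.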
Termite Gut Microbes as Tools and Targets for Termite Control
The Formosan subterranean termite (FST), Coptotermes formosanus, is an invasive urban pest in the United States. Colonies of the FST depend on symbiotic gut protozoa for cellulose digestion in the workers' guts, and the gut bacterial community is known to provide essential nutrients to the termite. The objectives of this PhD research were to develop and evaluate paratransgenesis and phage therapy for termite control. During this study, a termite gut bacterium, Trabulsiella odontotermitis, was genetically engineered and evaluated as a 'Trojan horse' for paratransgenesis. We showed that T. odontotermitis can tolerate a concentration of ligand-Hecate 50 times higher than that required to kill the gut protozoa. We also engineered T. odontotermitis to express green fluorescent protein (GFP) and visualized GFP expression in the termite gut. We created a strain of T. odontotermitis expressing a kanamycin-resistance gene using the Tn7 transposon and used this strain to show that, once ingested, T. odontotermitis stays in the termite gut for at least three weeks and is horizontally transferred among nest mates. We also engineered T. odontotermitis to express a functional ligand-Hecate-GFP fusion protein. Removal of the bacterial community from the gut also has a negative impact on the survival of the termites. The presence of a diverse and rich bacterial community makes the termite gut a perfect niche for bacteriophages, viruses that infect bacteria. So far, there has been no research on the presence and role of bacteriophages in the termite gut. Bacteriophages have the potential to be used in 'phage therapy' targeting essential termite gut bacteria. During this study, three novel bacteriophages were isolated from the termite gut and sequenced. A meta-virome sequencing of the termite gut was also performed, which revealed the presence of previously unknown bacteriophages and other viruses associated with the termites.
This is the first study elucidating the presence of a diverse and largely unexplored bacteriophage community in the termite gut. The study suggests that termites can serve as a model system to study the effect of bacteriophages on bacteria and, ultimately, on the host harboring the microbial community.
Intelligent optimisation of analogue circuits using particle swarm optimisation, genetic programming and genetic folding
This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University London. This research presents several intelligent optimisation methods: the genetic algorithm (GA), particle swarm optimisation (PSO), the artificial bee colony algorithm (ABCA), the firefly algorithm (FA) and bacterial foraging optimisation (BFO). It attempts to minimise analogue electronic filter and amplifier circuits, taking a cascode amplifier design as a case study and utilising the above-mentioned intelligent optimisation algorithms with the aim of determining the best among them. Small signal analysis (SSA) conversion of the cascode circuit is performed, while mesh analysis is applied to transform the circuit into matrix form. Computer programs are developed in Matlab using these algorithms to minimise the cascode amplifier circuit. The objective function is based on input resistance, output resistance, power consumption, gain, and the upper and lower frequency bands. The cascode circuit results apply the existing intelligent optimisation algorithms to the same circuit and compare the techniques with Nelder-Mead and with the original circuit simulated in PSpice. Four circuit element types (resistors, capacitors, transistors and operational amplifiers (op-amps)) are targeted by the optimisation techniques and subsequently compared to the initial circuit. The PSO-based result proved best, followed by the GA-based one, regarding power consumption reduction and frequency response. This work modifies the symbolic circuit analysis in Matlab (MSCAM) tool, which utilises a Netlist from PSpice or from simulation to generate matrices. These matrices are used for optimisation or to compute circuit parameters. The tool is modified to handle both active and passive elements such as inductors, resistors, capacitors, transistors and op-amps.
The transistors are transformed into their SSA representations, and op-amps use an SSA form that is easy to implement in programming. Results are presented to illustrate the potential of the algorithm. They are compared to PSpice simulations, and the approach handles larger matrix dimensions than the existing symbolic circuit analysis in Matlab tool (SCAM). SCAM forms its matrices by adding extra rows and columns, owing to how its algorithm was developed, which consumes more computer resources and limits its performance. Next, this work attempts to reduce the component count in high-pass, low-pass, and all-pass active filters, and uses a lower-order filter to realise the same frequency response curve as a higher-order filter. The optimisers applied are GA and PSO (the best two methods) and Nelder-Mead (the worst method), used subsequently for the filter optimisation. The filters are converted into their SSA, while nodal analysis is applied to transform the circuit into matrix form. High-pass, low-pass, and all-pass active filter results are presented to demonstrate the effectiveness of the technique. The results show that, with a computer code, a lower-order op-amp filter can be applied to realise the same results as a higher-order one. Furthermore, PSO realises the best frequency response results for all three filters, followed by GA, whereas Nelder-Mead gives the worst results. This research also introduces genetic folding (GF), MSCAM, and an automatically simulated Netlist into existing genetic programming (GP), which is a new contribution of this work and enables the development of an independent Matlab toolbox for the evolution of passive and active filter circuits. The evolution of active filter circuits, especially with the operational amplifier as a component, is the first of its kind in circuit evolution. In this work, only one software package is used instead of combining PSpice and Matlab in electronic circuit simulation. This saves the elapsed time for moving the simulation
between the two platforms and reduces subscription costs. The circuit evolved by GP in Matlab is automatically transformed into a symbolic Netlist, also within Matlab. The Netlist is fed into MSCAM, which uses it to generate matrices for the simulation. The matrices support frequency response analysis of low-pass, high-pass, band-pass, and band-stop active and passive filter circuits. After circuit evolution using the developed GP, PSO is applied to optimise some of the circuits. The algorithm is tested with twelve different circuits (five active filter examples, four passive filter circuit examples and three transistor amplifier circuit examples), and the results show that the algorithm is efficient regarding design. Funded by the Tertiary Education Trust Fund (TETFUND) through the University of Calabar, Nigeria.
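The matrix-based analysis flow described above (circuit → nodal/mesh equations → matrix solve → frequency response) can be illustrated with a small stdlib sketch. This is not the MSCAM tool itself; the solver, the one-node RC low-pass example and all component values are illustrative stand-ins.

```python
import math

def solve_linear(A, b):
    # Gauss-Jordan elimination with partial pivoting for small (possibly
    # complex-valued) systems, standing in for the matrix solve step.
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(n):
            if r != col and M[r][col] != 0:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * c for a, c in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

def rc_lowpass_gain(freq_hz, R=1e3, C=1e-6):
    # Nodal analysis of a one-node RC low-pass driven by a 1 V source:
    # (1/R + jwC) * V1 = Vin/R, written as a 1x1 system for illustration.
    w = 2 * math.pi * freq_hz
    A = [[1 / R + 1j * w * C]]
    b = [1.0 / R]
    (v1,) = solve_linear(A, b)
    return abs(v1)
```

With R = 1 kΩ and C = 1 µF the cutoff is 1/(2πRC) ≈ 159 Hz, where the gain drops to 1/√2 ≈ 0.707; larger circuits simply produce larger A matrices from the same nodal equations.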
Development of a novel platform for high-throughput gene design and artificial gene synthesis to produce large libraries of recombinant venom peptides for drug discovery
Doctoral thesis in Veterinary Sciences, specialty in Biological and Biomedical Sciences. Animal venoms are complex mixtures of biologically active molecules that, while presenting low immunogenicity, target a variety of membrane receptors with high selectivity and efficacy. It is believed that animal venoms comprise a natural library of more than 40 million different natural compounds that have been continuously fine-tuned during the evolutionary process to disturb cellular function. Within animal venoms, reticulated peptides are the most attractive class of molecules for drug discovery. However, the use of animal venoms to develop novel pharmacological compounds is still hampered by difficulties in obtaining these low-molecular-mass cysteine-rich polypeptides in sufficient amounts. Here, a high-throughput gene synthesis platform was developed to produce synthetic genes encoding venom peptides. The final goal of this project is the production of large libraries of recombinant venom peptides that can be screened for drug discovery. A robust and efficient polymerase chain reaction (PCR) methodology was refined to assemble overlapping oligonucleotides into small artificial genes (< 500 bp) with high fidelity. In addition, two bioinformatics tools were constructed to design multiple optimized genes (ATGenium) and overlapping oligonucleotides (NZYOligo designer), in order to allow automation of the high-throughput gene synthesis platform. The platform can assemble 96 synthetic genes encoding venom peptides simultaneously, with an error rate of 1.1 mutations per kb. To decrease the error rate associated with artificial gene synthesis, an error removal step using phage T7 endonuclease I was designed and integrated into the gene synthesis methodology. T7 endonuclease I was shown to be highly effective at specifically recognizing and cleaving DNA mismatches, allowing a dramatic reduction of the error frequency in large synthetic genes, from 3.45 to 0.43 errors per kb.
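The oligonucleotide-overlap idea behind assembly PCR can be sketched as a toy splitter: tile the gene with oligos whose ends overlap, alternating strands so that neighbouring oligos can anneal and be extended by the polymerase. The oligo length, overlap width and alternation scheme below are illustrative, not the design rules used by NZYOligo designer.

```python
def revcomp(seq):
    # Reverse complement of a DNA sequence.
    comp = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(comp[b] for b in reversed(seq))

def design_oligos(gene, oligo_len=60, overlap=20):
    # Tile the gene with oligos whose ends overlap by `overlap` bases,
    # alternating between the sense strand and the reverse complement
    # so adjacent oligos can anneal during assembly PCR.
    step = oligo_len - overlap
    oligos = []
    start, sense = 0, True
    while start < len(gene):
        chunk = gene[start:start + oligo_len]
        oligos.append(chunk if sense else revcomp(chunk))
        sense = not sense
        start += step
    return oligos
```

A production design tool would additionally balance melting temperatures across the overlaps and avoid hairpins and mispriming sites; this sketch only captures the tiling geometry.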
Combining the knowledge acquired in the initial stages of the work, a comprehensive study was performed to investigate the influence of gene design, the presence of fusion tags, the cellular localization of expression, and the usage of Tobacco Etch Virus (TEV) protease for tag removal on the recombinant expression of disulfide-rich venom peptides in Escherichia coli. Codon usage dramatically affected the levels of recombinant expression in E. coli. In addition, a significant pressure on the usage of the two cysteine codons suggests that both need to be present at equivalent levels in genes designed de novo to ensure high levels of expression. This study also revealed that DsbC was the best fusion tag for recombinant expression of disulfide-rich peptides, in particular when expression of the fusion peptide was directed to the bacterial periplasm. TEV protease was highly effective for tag removal, and its recognition site can tolerate all residues at its C-terminus, with the exception of proline, confirming that no extra residues need to be incorporated at the N-terminus of recombinant venom peptides. This study revealed that E. coli is a convenient heterologous host for the expression of soluble and potentially functional venom peptides. Thus, this novel high-throughput gene synthesis platform was used to produce ~5,000 synthetic genes with a low error rate. This genetic library supported the production of the largest library of recombinant venom peptides constructed to date. The library contains 2736 animal venom peptides and is presently being screened for the discovery of novel drug leads for different diseases. ABSTRACT (translated from Portuguese) - Development of a novel high-throughput platform for the design and synthesis of artificial genes for the production of recombinant venom peptides - Animal venoms are complex mixtures of biologically active molecules that bind with high selectivity and efficacy to a wide variety of membrane receptors.
Although they present low immunogenicity, venoms can affect cellular function by acting on these receptors. It is currently thought that animal venoms constitute a natural library of more than 40 million different molecules that have been continuously refined throughout the evolutionary process. Given the composition of venoms, reticulated peptides are the most attractive class of molecules of pharmacological interest. However, the use of venoms for the development of new drugs is limited by difficulties in obtaining these molecules in quantities adequate for their study. In this work, a high-throughput platform was developed for the synthesis of synthetic genes encoding venom peptides, with the aim of producing libraries of recombinant venom peptides that can be screened for the discovery of new drugs. To synthesize small genes (< 500 base pairs) with high fidelity and in parallel, a robust and efficient PCR (polymerase chain reaction) methodology based on the extension of overlapping oligonucleotides was developed. To enable automation of the gene synthesis platform, two bioinformatics tools were built to simultaneously design tens to thousands of genes optimized for expression in Escherichia coli (ATGenium) and the corresponding overlapping oligonucleotides (NZYOligo designer). This platform was optimized to synthesize 96 synthetic genes simultaneously, achieving an error rate of 1.1 mutations per kb of synthesized DNA. To reduce the error rate associated with the production of synthetic genes, an error removal method using the enzyme T7 endonuclease I was developed.
The T7 endonuclease I enzyme proved highly effective at recognizing and cleaving DNA molecules containing mismatches, drastically reducing the frequency of errors identified in large genes, from 3.45 to 0.43 errors per kb of synthesized DNA. The influence of gene design, the presence of fusion tags, the cellular localization of expression, and the activity of the Tobacco Etch Virus (TEV) protease for efficient tag removal on the expression of cysteine-rich venom peptides in E. coli was also investigated. The use of carefully chosen codons dramatically affected expression levels in E. coli. Furthermore, the results show a significant pressure on the usage of the two codons encoding cysteine, suggesting that both codons must be present, at equivalent levels, in genes designed and optimized to guarantee high expression levels. This work also indicated that the DsbC fusion tag was the most appropriate for the efficient expression of cysteine-rich venom peptides, particularly when the recombinant peptides were expressed in the bacterial periplasm. TEV protease was confirmed to be effective at removing fusion tags, and its recognition site can contain any amino acid at the C-terminus, with the exception of proline. Thus, it was verified that no extra amino acid needs to be incorporated at the N-terminus of the recombinant venom peptides. Taken together, the results showed that E. coli is a suitable host for the expression, in soluble form, of potentially functional venom peptides. Finally, ~5000 synthetic genes encoding venom peptides were produced, with a reduced error rate, using the novel high-throughput gene synthesis platform developed here.
The new library of synthetic genes was used to produce the largest library of recombinant venom peptides built to date, comprising 2736 venom peptides. This recombinant library is presently being screened with the aim of discovering new drug candidates of interest for human health.