Search CORE

870 research outputs found

An agent-based hybrid system for microarray data analysis

Author: Wu Xindong
Yang Pengyi
Zhang Chengqi
Zhang Zili
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2009
Field of study

This article reports our experience in agent-based hybrid construction for microarray data analysis. The contributions are twofold: We demonstrate that agent-based approaches are suitable for building hybrid systems in general, and that a genetic ensemble system is appropriate for microarray data analysis in particular. Created using an agent-based framework, this genetic ensemble system for microarray data analysis excels in both sample classification accuracy and gene selection reproducibility.<br /

Deakin Research Online

OPUS - University of Technology Sydney

Algorithms for Analysis of Heterogeneous Cancer and Viral Populations Using High-Throughput Sequencing Data

Author: Tsyvina Viachaslau
Publication venue: ScholarWorks @ Georgia State University
Publication date: 13/12/2021
Field of study

Next-generation sequencing (NGS) technologies experienced giant leaps in recent years. Short read samples reach millions of reads, and the number of samples has been growing enormously in the wake of the COVID-19 pandemic. This data can expose essential aspects of disease transmission and development and reveal the key to its treatment. At the same time, single-cell sequencing saw the progress of getting from dozens to tens of thousands of cells per sample. These technological advances bring new challenges for computational biology and require the development of scalable, robust methods to deal with a wide range of problems varying from epidemiology to cancer studies. The first part of this work is focused on processing virus NGS data. It proposes algorithms that can facilitate the initial data analysis steps by filtering genetically related sequencing and the tool investigating intra-host virus diversity vital for biomedical research and epidemiology. The second part addresses single-cell data in cancer studies. It develops evolutionary cancer models involving new quantitative parameters of cancer subclones to understand the underlying processes of cancer development better

ScholarWorks @ Georgia State University

Genetic Improvement of Software: From Program Landscapes to the Automatic Improvement of a Live System

Author: Haraldsson Saemundur Oskar
Publication venue: University of Stirling
Publication date: 31/05/2017
Field of study

In today’s technology driven society, software is becoming increasingly important in more areas of our lives. The domain of software extends beyond the obvious domain of computers, tablets, and mobile phones. Smart devices and the internet-of-things have inspired the integra- tion of digital and computational technology into objects that some of us would never have guessed could be possible or even necessary. Fridges and freezers connected to social media sites, a toaster activated with a mobile phone, physical buttons for shopping, and verbally asking smart speakers to order a meal to be delivered. This is the world we live in and it is an exciting time for software engineers and computer scientists. The sheer volume of code that is currently in use has long since outgrown beyond the point of any hope for proper manual maintenance. The rate of which mobile application stores such as Google’s and Apple’s have expanded is astounding. The research presented here aims to shed a light on an emerging field of research, called Genetic Improvement ( GI ) of software. It is a methodology to change program code to improve existing software. This thesis details a framework for GI that is then applied to explore fitness landscape of bug fixing Python software, reduce execution time in a C ++ program, and integrated into a live system. We show that software is generally not fragile and although fitness landscapes for GI are flat they are not impossible to search in. This conclusion applies equally to bug fixing in small programs as well as execution time improvements. The framework’s application is shown to be transportable between programming languages with minimal effort. Additionally, it can be easily integrated into a system that runs a live web service. The work within this thesis was funded by EPSRC grant EP/J017515/1 through the DAASE project

Stirling Online Research Repository

Recommended from our members

A niching memetic algorithm for simultaneous clustering and feature selection

Author: Fairhurst M
Liu X
Sheng W
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/07/2008
Field of study

Clustering is inherently a difficult task, and is made even more difficult when the selection of relevant features is also an issue. In this paper we propose an approach for simultaneous clustering and feature selection using a niching memetic algorithm. Our approach (which we call NMA_CFS) makes feature selection an integral part of the global clustering search procedure and attempts to overcome the problem of identifying less promising locally optimal solutions in both clustering and feature selection, without making any a priori assumption about the number of clusters. Within the NMA_CFS procedure, a variable composite representation is devised to encode both feature selection and cluster centers with different numbers of clusters. Further, local search operations are introduced to refine feature selection and cluster centers encoded in the chromosomes. Finally, a niching method is integrated to preserve the population diversity and prevent premature convergence. In an experimental evaluation we demonstrate the effectiveness of the proposed approach and compare it with other related approaches, using both synthetic and real data

Brunel University Research Archive

A Survey of Genetic Improvement Search Spaces

Author: Alexander Brad
Barr Earl T
Brownlee Alexander E I
Petke Justyna
Wagner Markus
White David R
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2019
Field of study

Genetic Improvement (GI) uses automated search to improve existing software. Most GI work has focused on empirical studies that successfully apply GI to improve software's running time, fix bugs, add new features, etc. There has been little research into why GI has been so successful. For example, genetic programming has been the most commonly applied search algorithm in GI. Is genetic programming the best choice for GI? Initial attempts to answer this question have explored GI's mutation search space. This paper summarises the work published on this question to date

Stirling Online Research Repository (RIOXX)

Adelaide Research & Scholarship

UCL Discovery

Stirling Online Research Repository

Multi-Objective Improvement of Android Applications

Author: Callan James
Petke Justyna
Publication venue
Publication date: 22/08/2023
Field of study

Non-functional properties, such as runtime or memory use, are important to mobile app users and developers, as they affect user experience. Previous work on automated improvement of non-functional properties in mobile apps failed to address the inherent trade-offs between such properties. We propose a practical approach and the first open-source tool, GIDroid (2023), for multi-objective automated improvement of Android apps. In particular, we use Genetic improvement, a search-based technique that navigates the space of software variants to find improved software. We use a simulation-based testing framework to greatly improve the speed of search. GIDroid contains three state-of-the-art multi-objective algorithms, and two new mutation operators, which cache the results of method calls. Genetic improvement relies on testing to validate patches. Previous work showed that tests in open-source Android applications are scarce. We thus wrote tests for 21 versions of 7 Android apps, creating a new benchmark for performance improvements. We used GIDroid to improve versions of mobile apps where developers had previously found improvements to runtime, memory, and bandwidth use. Our technique automatically re-discovers 64% of existing improvements. We then applied our approach to current versions of software in which there were no known improvements. We were able to improve execution time by up to 35%, and memory use by up to 33% in these apps.Comment: 32 pages, 8 Figure

arXiv.org e-Print Archive

Evaluation of Existing Methods for High-Order Epistasis Detection

Author: Carvajal-Rodriguez Antonio
González-Domínguez Jorge
Martín María J.
Ponte-Fernández Christian
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 15/10/2020
Field of study

[Abstract] Finding epistatic interactions among loci when expressing a phenotype is a widely employed strategy to understand the genetic architecture of complex traits in GWAS. The abundance of methods dedicated to the same purpose, however, makes it increasingly difficult for scientists to decide which method is more suitable for their studies. This work compares the different epistasis detection methods published during the last decade in terms of runtime, detection power and type I error rate, with a special emphasis on high-order interactions. Results show that in terms of detection power, the only methods that perform well across all experiments are the exhaustive methods, although their computational cost may be prohibitive in large-scale studies. Regarding non-exhaustive methods, not one could consistently find epistasis interactions when marginal effects are absent. If marginal effects are present, there are methods that perform well for high-order interactions, such as BADTrees, FDHE-IW, SingleMI or SNPHarvester. As for false-positive control, only SNPHarvester, FDHE-IW and DCHE show good results. The study concludes that there is no single epistasis detection method to recommend in all scenarios. Authors should prioritize exhaustive methods when sufficient computational resources are available considering the data set size, and resort to non-exhaustive methods when the analysis time is prohibitive.10.13039/501100010801-Xunta de Galicia (Grant Number: ED431C2016-037, ED431C2017/04 and ED431G2019/01) 10.13039/501100003176-Ministerio de Educacion Cultura y Deporte (Grant Number: FPU16/01333) 10.13039/501100003329-Ministerio de Economia y Competitividad (Grant Number: CGL2016-75482-P, PID2019-104184RB-I00, AEI/FEDER/EU, 10.13039/50110 and TIN2016-75845-P)Xunta de Galicia; ED431C2016-037Xunta de Galicia; ED431G2019/01Xunta de Galicia; ED431C 2017/0

Repositorio da Universidade da Coruña

Rapid forward-in-time simulation at the chromosome and genome level

Author: Aberer Andre J
Stamatakis Alexandros
Publication venue: BioMed Central
Publication date: 11/01/2017
Field of study

Background: In population genetics, simulation is a fundamental tool for analyzing how basic evolutionary forces such as natural selection, recombination, and mutation shape the genetic landscape of a population. Forward simulation represents the most powerful, but, at the same time, most compute-intensive approach for simulating the genetic material of a population. Results: We introduce AnA-FiTS, a highly optimized forward simulation software, that is up to two orders of magnitude faster than current state-of-the-art software. In addition, we present a novel algorithm that further improves runtimes by up to an additional order of magnitude, for simulations where a fraction of the mutations is neutral (e.g., only 10% of mutations have an effect on fitness). Apart from simulated sequences, our tool also generates a graph structure that depicts the complete observable history of neutral mutations. Conclusions: The substantial performance improvements allow for conducting forward simulations at the chromosome and genome level. The graph structure generated by our algorithm can give rise to novel approaches for visualizing and analyzing the output of forward simulations

KITopen