Single- and multi-objective genetic programming: new bounds for weighted order and majority
We consolidate the existing computational complexity analysis of genetic programming (GP) by bringing together sound theoretical proofs and empirical analysis. In particular, we address computational complexity issues that arise when coupling algorithms using a variable-length representation, such as GP itself, with different bloat-control techniques. To accomplish this, we first introduce several novel upper bounds for two single- and multi-objective GP algorithms on the generalised Weighted ORDER and Weighted MAJORITY problems. To obtain these, we employ well-established computational complexity analysis techniques such as fitness-based partitions and, for the first time, additive and multiplicative drift. The bounds we identify depend on two measures, the maximum tree size and the maximum population size, which arise during the optimization run and are key to determining the runtime of the studied GP algorithms. To understand the impact of these measures on a typical run, we study their magnitude experimentally and discuss the obtained findings.
Anh Nguyen, Tommaso Urli, Markus Wagner
http://www.sigevo.org/foga-2013
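As a minimal illustration of the benchmark problems analysed above, the sketch below evaluates the Weighted ORDER and Weighted MAJORITY fitness of a GP individual. It assumes the tree's leaves are given as an in-order sequence of signed integers (+i for variable x_i, -i for its complement); this encoding and the function names are illustrative conventions, not taken from the paper.

```python
from collections import Counter

def weighted_order_fitness(leaves, weights):
    """Weighted ORDER: variable x_i earns weight w_i iff its positive
    literal +i occurs before its complement -i in the in-order leaf list."""
    seen = set()
    expressed = set()
    for lit in leaves:
        var = abs(lit)
        if var in seen:
            continue  # only the first occurrence of x_i or its complement counts
        seen.add(var)
        if lit > 0:
            expressed.add(var)
    return sum(weights[v] for v in expressed)

def weighted_majority_fitness(leaves, weights):
    """Weighted MAJORITY: x_i earns w_i iff +i occurs at least once and
    at least as often as -i among the leaves."""
    pos, neg = Counter(), Counter()
    for lit in leaves:
        (pos if lit > 0 else neg)[abs(lit)] += 1
    return sum(w for v, w in weights.items() if pos[v] >= 1 and pos[v] >= neg[v])
```

For the leaf sequence `[1, -2, 2, -1]` with weights `{1: 3.0, 2: 5.0}`, ORDER rewards only x_1 (the complement of x_2 comes first), while MAJORITY rewards both variables, since each positive literal occurs as often as its complement.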
Examining applying high performance genetic data feature selection and classification algorithms for colon cancer diagnosis
Background and Objectives: This paper examines the accuracy and efficiency (time complexity) of high-performance genetic data feature selection and classification algorithms for colon cancer diagnosis. The need for this research derives from the urgent and increasing need for accurate and efficient algorithms. Colon cancer is a leading cause of death worldwide; it is therefore vitally important that cancer tissues be expertly identified and classified in a rapid and timely manner, both to assure fast detection of the disease and to expedite the drug discovery process.
Methods: In this research, a three-phase approach was proposed and implemented: Phases One and Two examined the feature selection algorithms and classification algorithms employed separately, and Phase Three examined the performance of the combination of these.
Results: It was found from Phase One that the Particle Swarm Optimization (PSO) algorithm performed best on the colon dataset as a feature selector (29 genes selected), and from Phase Two that the Support Vector Machine (SVM) algorithm outperformed the other classifiers, with an accuracy of almost 86%. It was also found from Phase Three that the combined use of PSO and SVM surpassed the other algorithms in accuracy and performance, and was faster in terms of run time (94%).
Conclusions: It is concluded that applying feature selection algorithms prior to classification algorithms yields better accuracy than applying the latter alone. This conclusion is important and significant to industry and society.
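The wrapper pipeline described above (swarm-based feature selection around a classifier) can be sketched in a few lines. The sketch below uses a binary PSO with a sigmoid transfer function and, purely to keep the example self-contained, a leave-one-out 1-NN classifier as a lightweight stand-in for the paper's SVM; all parameter values (inertia 0.7, acceleration coefficients 1.5, swarm size, iteration count) are illustrative defaults, not the study's settings.

```python
import math
import random

def knn_loo_accuracy(X, y, mask):
    """Leave-one-out accuracy of a 1-NN classifier on the selected features."""
    feats = [j for j, m in enumerate(mask) if m]
    if not feats:
        return 0.0
    correct = 0
    for i in range(len(X)):
        best, best_d = None, float("inf")
        for k in range(len(X)):
            if k == i:
                continue
            d = sum((X[i][j] - X[k][j]) ** 2 for j in feats)
            if d < best_d:
                best, best_d = y[k], d
        correct += (best == y[i])
    return correct / len(X)

def binary_pso(X, y, n_particles=10, iters=30, seed=0):
    """Binary PSO over feature masks, scored by the wrapped classifier."""
    rng = random.Random(seed)
    d = len(X[0])
    pos = [[rng.random() < 0.5 for _ in range(d)] for _ in range(n_particles)]
    vel = [[0.0] * d for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_f = [knn_loo_accuracy(X, y, p) for p in pos]
    g = pbest_f.index(max(pbest_f))
    gbest, gbest_f = pbest[g][:], pbest_f[g]
    for _ in range(iters):
        for i in range(n_particles):
            for j in range(d):
                r1, r2 = rng.random(), rng.random()
                vel[i][j] = (0.7 * vel[i][j]
                             + 1.5 * r1 * (pbest[i][j] - pos[i][j])
                             + 1.5 * r2 * (gbest[j] - pos[i][j]))
                # sigmoid transfer: velocity -> probability of selecting feature j
                pos[i][j] = rng.random() < 1 / (1 + math.exp(-vel[i][j]))
            f = knn_loo_accuracy(X, y, pos[i])
            if f > pbest_f[i]:
                pbest[i], pbest_f[i] = pos[i][:], f
                if f > gbest_f:
                    gbest, gbest_f = pos[i][:], f
    return gbest, gbest_f
```

On a toy two-feature dataset where either feature separates the classes, the swarm quickly converges to a non-empty mask with perfect leave-one-out accuracy; on real expression data the fitness would instead be the SVM's cross-validated accuracy, optionally penalised by subset size.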
Computational complexity analysis of genetic programming
Genetic programming (GP) is an evolutionary computation technique for solving problems in an automated, domain-independent way. Rather than identifying the optimum of a function, as in more traditional evolutionary optimization, the aim of GP is to evolve computer programs with a given functionality. While many GP applications have produced human-competitive results, the theoretical understanding of which problem characteristics and algorithm properties allow GP to be effective is comparatively limited. Compared with traditional evolutionary algorithms for function optimization, the analysis of GP is further complicated by two additional factors: the variable-length representation of candidate programs, and the difficulty of evaluating their quality efficiently. These difficulties considerably impact the runtime analysis of GP, where space complexity also comes into play. As a result, initial complexity analyses of GP focused on restricted settings such as the evolution of trees with given structures or the estimation of solution quality using only a small polynomial number of input/output examples. However, the first computational complexity analyses of GP for evolving proper functions with defined input/output behavior have recently appeared. In this chapter, we present an overview of the state of the art.
Conceptual graph-based knowledge representation for supporting reasoning in African traditional medicine
Although African patients use both conventional (modern) and traditional healthcare simultaneously, it has been shown that 80% of people rely on African traditional medicine (ATM). ATM includes medical activities stemming from practices, customs and traditions which were integral to the distinctive African cultures. It is based mainly on the oral transfer of knowledge, with the attendant risk of losing critical knowledge. Moreover, practices differ according to region and the availability of medicinal plants. It is therefore necessary to compile the tacit, disseminated and complex knowledge of various Tradi-Practitioners (TPs) in order to determine interesting patterns for treating a given disease. Knowledge engineering methods for traditional medicine are useful for suitably modelling complex information needs, formalizing the knowledge of domain experts and highlighting the effective practices for their integration into conventional medicine. The work described in this paper presents an approach which addresses two issues. First, it aims at proposing a formal representation model of ATM knowledge and practices to facilitate their sharing and reuse. Second, it aims at providing a visual reasoning mechanism for selecting the best available procedures and medicinal plants to treat diseases. The approach is based on the use of the Delphi method for capturing knowledge from various experts, which necessitates reaching a consensus. Conceptual graph formalism is used to model ATM knowledge with visual reasoning capabilities and processes. Nested conceptual graphs are used to visually express the semantic meaning of Computational Tree Logic (CTL) constructs that are useful for the formal specification of temporal properties of ATM domain knowledge. Our approach has the advantage of mitigating knowledge loss, with conceptual development assistance to improve both the quality of ATM care (medical diagnosis and therapeutics) and patient safety (drug monitoring).
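To make the representation concrete, the toy sketch below encodes a fragment of ATM-style knowledge as relation triples and answers a query by a simplified projection (graph homomorphism), the core reasoning operation of conceptual graphs. The plant/disease pairings and relation names are invented for illustration and are not claims from the paper; real conceptual graphs also carry typed concepts and nested contexts that this flat encoding omits.

```python
def match(pattern, graph, binding=None):
    """Yield variable bindings under which every (relation, arg1, arg2)
    triple in `pattern` maps onto a triple of `graph` (a toy projection).
    Variables are strings starting with '?'."""
    binding = binding or {}
    if not pattern:
        yield binding
        return
    rel, a, b = pattern[0]
    for (r, x, y) in graph:
        if r != rel:
            continue
        b2 = dict(binding)
        ok = True
        for var, val in ((a, x), (b, y)):
            if var.startswith("?"):
                if b2.get(var, val) != val:
                    ok = False  # variable already bound to something else
                else:
                    b2[var] = val
            elif var != val:
                ok = False      # constant must match exactly
        if ok:
            yield from match(pattern[1:], graph, b2)

# Illustrative knowledge fragment (hypothetical pairings, not real ATM data)
graph = {
    ("treats", "Kinkeliba", "Malaria"),
    ("treats", "Neem", "Malaria"),
    ("part_used", "Kinkeliba", "Leaves"),
}

# Query: "Which plant treats Malaria using its leaves?"
query = [("treats", "?p", "Malaria"), ("part_used", "?p", "Leaves")]
answers = [b["?p"] for b in match(query, graph)]
```

Projection is exactly this kind of constrained pattern matching: a query graph is satisfied when it can be mapped into the knowledge graph while respecting shared variables, which is what lets TP-contributed facts be pooled and then interrogated for effective practices.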
Human-microbiota interactions in health and disease: bioinformatics analyses of gut microbiome datasets
EngD Thesis
The human gut harbours a vast diversity of microbial cells, collectively known as the gut microbiota,
that are crucial for human health and dysfunctional in many of the most prevalent chronic diseases.
Until recently, culture-dependent methods limited our ability to study the microbiota in depth, including
the collective genomes of the microbiota, the microbiome. Advances in culture-independent
metagenomic sequencing technologies have since provided new insights into the microbiome and
led to a rapid expansion of data-rich resources for microbiome research. These high-throughput
sequencing methods and large datasets provide new opportunities for research with an emphasis on
bioinformatics analyses and a novel field for drug discovery through data mining.
In this thesis I explore a range of metagenomics analyses to extract insights from metagenomics
data and inform drug discovery in the microbiota. Firstly I survey the existing technologies and
data sources available for data mining therapeutic targets. Then I analyse 16S metagenomics data
combined with metabolite data from mice to investigate the treatment model of a proposed antibiotic
treatment targeting the microbiota. Then I investigate the occurrence frequency and diversity of
proteases in metagenomics data in order to inform understanding of host-microbiota-diet interactions
through protein and peptide associated glycan degradation by the gut microbiota. Finally I develop a
system to facilitate the process of integrating metagenomics data for gene annotations.
One of the main challenges in leveraging the scale of data availability in microbiome research is
managing the data resources from microbiome studies. Through a series of analytical studies I used
metagenomics data to identify community trends, to demonstrate therapeutic interventions and to do
a wide scale screen for proteases that are central to human-microbiota interactions. These studies
articulated the requirement for a computational framework to integrate and access metagenomics
data in a reproducible way using a scalable data store. The thesis concludes by explaining how data
integration in microbiome research is needed to provide the insights into metagenomics data that are
required for drug discovery.
Improving fault coverage and minimising the cost of fault identification when testing from finite state machines
This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University.
Software needs to be adequately tested in order to increase confidence that the system being developed is reliable. However, testing is a complicated and expensive process. Formal specification-based models such as finite state machines have been widely used in system modelling and testing. In this PhD thesis, we primarily investigate fault detection and identification when testing from finite state machines. The research in this thesis comprises three main topics: the construction of multiple Unique Input/Output (UIO) sequences using Metaheuristic Optimisation Techniques (MOTs), improved fault coverage through robust Unique Input/Output Circuit (UIOC) sequences, and fault diagnosis when testing from finite state machines. In the studies of the construction of UIOs, a model is proposed in which a fitness function is defined to guide the search for input sequences that are potentially UIOs. In the studies of improved fault coverage, a new type of UIOC is defined. Based upon the Rural Chinese Postman Algorithm (RCPA), a new approach is proposed for the construction of more robust test sequences. In the studies of fault diagnosis, heuristics are defined that attempt to lead to failures being observed in shorter test sequences, which helps to reduce the cost of fault isolation and identification. The proposed approaches and techniques were evaluated on a set of case studies, which provides experimental evidence for their efficacy.
Brunel Research Initiative and Enterprise Fund (BRIEF) Award from Brunel University and departmental bursary from the Department of Information Systems and Computing, Brunel University.
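For readers unfamiliar with UIOs, the sketch below checks whether an input sequence is a Unique Input/Output sequence for a state of a completely specified FSM, and exposes the natural fitness signal (number of states distinguished) that a metaheuristic search of the kind described above could maximise. The example machine and function names are illustrative, not taken from the thesis.

```python
def output_seq(fsm, state, inputs):
    """Output sequence produced from `state` on `inputs`, where `fsm` maps
    (state, input) -> (output, next_state) and is completely specified."""
    outs = []
    for a in inputs:
        out, state = fsm[(state, a)]
        outs.append(out)
    return outs

def distinguished(fsm, s, inputs, states):
    """States whose response to `inputs` differs from state s's response."""
    target = output_seq(fsm, s, inputs)
    return [t for t in states if t != s and output_seq(fsm, t, inputs) != target]

def is_uio(fsm, s, inputs, states):
    """`inputs` is a UIO for s iff it distinguishes s from every other state."""
    return len(distinguished(fsm, s, inputs, states)) == len(states) - 1

states = ["s1", "s2", "s3"]
fsm = {
    ("s1", "a"): ("1", "s2"), ("s1", "b"): ("0", "s1"),
    ("s2", "a"): ("0", "s3"), ("s2", "b"): ("1", "s1"),
    ("s3", "a"): ("0", "s1"), ("s3", "b"): ("0", "s2"),
}
```

Here `["a"]` is a UIO for s1 (only s1 outputs "1"), while s3 needs the longer sequence `["a", "a"]`. The count returned by `distinguished` serves directly as a fitness value: a candidate sequence scoring |S| - 1 is a UIO, and partial scores give the search gradient that a MOT can exploit.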
Continuous-time temporal logic specification and verification for nonlinear biological systems in uncertain contexts
In this thesis we introduce a complete framework for modelling and verification of biological systems in uncertain contexts based on the bond-calculus process algebra and
the LBUC spatio-temporal logic. The bond-calculus is a biological process algebra which
captures complex patterns of interaction based on affinity patterns, a novel communication
mechanism using pattern matching to express multiway interaction affinities and general
kinetic laws, whilst retaining an agent-centric modelling style for biomolecular species.
The bond-calculus is equipped with a novel continuous semantics which maps models to
systems of Ordinary Differential Equations (ODEs) in a compositional way.
We then extend the bond-calculus to handle uncertain models, featuring interval uncertainties in their species concentrations and reaction rate parameters. Our semantics is also
extended to handle uncertainty in every aspect of a model, producing non-deterministic
continuous systems whose behaviour depends either on time-independent uncertain parameters and initial conditions, corresponding to our partial knowledge of the system at
hand, or time-varying uncertain inputs, corresponding to genuine variability in a system's
behaviour based on environmental factors.
This language is then coupled with the LBUC spatio-temporal logic which combines
Signal Temporal Logic (STL) temporal operators with an uncertain context operator
which quantifies over an uncertain context model describing the range of environments
over which a property must hold. We develop model-checking procedures for STL and
LBUC properties based on verified signal monitoring over flowpipes produced by the
Flow* verified integrator, including the technique of masking which directs monitoring for
atomic propositions to time regions relevant to the overall verification problem at hand.
This allows us to monitor many interesting nested contextual properties and frequently
reduces monitoring costs by an order of magnitude. Finally, we explore the technique
of contextual signal monitoring which can use a single Flow* flowpipe representing a
functional dependency to complete a whole tree of signals corresponding to different
uncertain contexts. This allows us to produce refined monitoring results over the whole
space and to explore the variation in system behaviour in different contexts.
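As a concrete, heavily simplified illustration of signal monitoring, the sketch below computes discrete-time quantitative robustness for an atomic predicate and the bounded "always"/"eventually" operators of STL over a sampled trace. A real monitor such as the thesis's flowpipe-based one works over validated enclosures from Flow* rather than point samples; truncating windows at the end of the trace, as done here, is a pragmatic simplification of the boundary semantics.

```python
def atom(signal, mu):
    """Robustness of the predicate mu(x) >= 0 at each sample of the trace."""
    return [mu(x) for x in signal]

def g_op(r, lo, hi):
    """Robustness of G[lo,hi] phi at each time: minimum over the window
    (truncated at the end of the trace)."""
    n = len(r)
    return [min(r[t + k] for k in range(lo, hi + 1) if t + k < n)
            for t in range(n)]

def f_op(r, lo, hi):
    """Robustness of F[lo,hi] phi: maximum over the (truncated) window."""
    n = len(r)
    return [max(r[t + k] for k in range(lo, hi + 1) if t + k < n)
            for t in range(n)]

signal = [0.2, 0.5, 1.2, 0.8, 0.3]        # sampled concentration trace
safe = atom(signal, lambda x: 1.0 - x)    # predicate: x <= 1

# Nested property F[0,2] G[0,1] (x <= 1):
# "within two steps, x stays below 1 for one full step"
rob = f_op(g_op(safe, 0, 1), 0, 2)
```

A positive robustness value at time 0 (here 0.5) means the nested property holds on the trace with that margin; a negative value would quantify the worst violation. Masking, as described above, would restrict the `atom` computation to only those time regions that the outer operators can actually inspect.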
Extending Epigenesis: From Phenotypic Plasticity to the Bio-Cultural Feedback
The paper aims at proposing an extended notion of epigenesis that acknowledges an actual causal import of the phenotypic dimension for the evolutionary diversification of life forms. Section 1 offers introductory remarks on the issue of epigenesis, contrasting it with ancient and modern preformationist views. In Section 2 we propose to understand epigenesis as a process of phenotypic formation and diversification (a) dependent on environmental influences, (b) independent of changes in the genomic nucleotide sequence, and (c) occurring during the whole life span. Section 3 then focuses on phenotypic plasticity and offers an overview of basic properties (such as robustness, modularity and degeneracy) that allow biological systems to be evolvable, i.e. to have the potential to produce phenotypic variation. Subsequently (Section 4), the emphasis is put on environmentally induced modifications in the regulation of gene expression giving rise to phenotypic variation and diversification. After some brief considerations on the debated issue of epigenetic inheritance (Section 5), the issue of culture (kept in the background of the preceding sections) is considered. The key point is that, in the case of humans and of the evolutionary history of the genus Homo at least, the environment is also, importantly, the cultural environment. Thus, Section 6 argues that a bio-cultural feedback should be acknowledged in the 'epigenic' processes leading to phenotypic diversification and innovation in Homo evolution. Finally, Section 7 introduces the notion of 'cultural neural reuse', which refers to phenotypic/neural modifications induced by specific features of the cultural environment that are effective in human cultural evolution without involving genetic changes. Therefore, cultural neural reuse may be regarded as a key instance of the bio-cultural feedback and ultimately of the extended notion of epigenesis proposed in this work.