14 research outputs found

    A tandem evolutionary algorithm for identifying causal rules from complex data

    Get PDF
    We propose a new evolutionary approach for discovering causal rules in complex classification problems from batch data. Key aspects include (a) the use of a hypergeometric probability mass function as a principled fitness statistic that quantifies the probability that the observed association between a given clause and the target class is due to chance, taking into account the size of the dataset, the amount of missing data, and the distribution of outcome categories; (b) tandem age-layered evolutionary algorithms for evolving parsimonious archives of conjunctive clauses, and disjunctions of these conjunctions, each of which has a probabilistically significant association with an outcome class; and (c) separate archive bins for clauses of different orders, with dynamically adjusted order-specific thresholds. The method is validated on majority-on and multiplexer benchmark problems exhibiting various combinations of heterogeneity, epistasis, overlap, noise in class associations, missing data, extraneous features, and imbalanced classes. We also validate on a more realistic synthetic genome dataset with heterogeneity, epistasis, extraneous features, and noise. In all synthetic epistatic benchmarks, we consistently recover the true causal rule sets used to generate the data. Finally, we discuss an application to a complex real-world survey dataset designed to inform possible ecohealth interventions for Chagas disease.
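As a minimal sketch of the fitness idea described above (the function and variable names are illustrative, not the authors' implementation), the hypergeometric PMF can score how likely the observed overlap between a clause's matches and a target class is under pure chance:

```python
# A minimal sketch (not the authors' code): score a conjunctive clause by the
# hypergeometric probability that its observed association with a target class
# arose by chance, given dataset size and class distribution.
from scipy.stats import hypergeom

def clause_fitness(n_matched, n_matched_in_class, n_total, n_total_in_class):
    """Return the hypergeometric PMF of the observed overlap.

    n_matched          -- records matched by the clause (non-missing values only)
    n_matched_in_class -- matched records that belong to the target class
    n_total            -- records with non-missing values for the clause
    n_total_in_class   -- records in the target class

    Smaller values indicate an association less likely to be due to chance,
    so lower is fitter.
    """
    return hypergeom.pmf(n_matched_in_class, n_total, n_total_in_class, n_matched)

# Example: a clause matching 40 of 1000 records, 30 of which fall in a class of 100.
print(clause_fitness(40, 30, 1000, 100))
```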

    Epidemiology of Injury in English Women's Super League Football: A Cohort Study

    Get PDF
    INTRODUCTION: The epidemiology of injury in male professional football has been well documented (Ekstrand, Hägglund, & Waldén, 2011) and used as a basis to understand injury trends for a number of years. The prevalence and incidence of injuries occurring in women's Super League football is unknown. The aim of this study was to estimate the prevalence and incidence of injury in an English Women's Super League football squad. METHODS: Following ethical approval from Leeds Beckett University, players (n = 25) signed to a Women's Super League football club provided written informed consent to complete a self-administered injury survey. Measures of exposure, injury and performance over a 12-month period were gathered. Participants were classified as injured if they reported a football injury that required medical attention or withdrawal from participation for one day or more. Injuries were categorised as either traumatic or overuse, and as a new injury and/or a re-injury of the same anatomical site. RESULTS: 43 injuries, including re-injuries, were reported by the 25 participants, giving a clinical incidence of 1.72 injuries per player. Total incidence of injury was 10.8/1000 h (95% CI: 7.5 to 14.03). Participants were at higher risk of injury during a match compared with training (32.4 (95% CI: 15.6 to 48.4) vs 8.0 (95% CI: 5.0 to 10.85)/1000 hours, p < 0.05). Severe injuries (>28 days) included three non-contact anterior cruciate ligament (ACL) injuries. The epidemiological incidence proportion was 0.80 (95% CI: 0.64 to 0.95) and the average probability that any player on this team will sustain at least one injury was 80.0% (95% CI: 64.3% to 95.6%). CONCLUSION: This is the first report capturing exposure and injury incidence by anatomical site from a cohort of English players, and the incidence is comparable to that found in Europe (6.3/1000 h (95% CI 5.4 to 7.36); Larruskain et al., 2017). The number of ACL injuries highlights a potential injury burden for a squad of this size. Multi-site prospective investigations into the incidence and prevalence of injury in women's football are required.
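For readers unfamiliar with the reporting convention, the sketch below shows how an incidence of injuries per 1000 exposure hours and an approximate 95% confidence interval can be computed; the exposure figure is hypothetical and the normal approximation to a Poisson count is an assumption, not necessarily the study's method.

```python
# Illustrative only: injury incidence per 1000 exposure hours with an
# approximate 95% confidence interval (normal approximation to a Poisson count).
import math

def incidence_per_1000h(n_injuries, exposure_hours, z=1.96):
    rate = n_injuries / exposure_hours * 1000.0
    # Standard error of the rate, treating the injury count as Poisson.
    se = math.sqrt(n_injuries) / exposure_hours * 1000.0
    return rate, rate - z * se, rate + z * se

# Hypothetical figures of the same order as the study (43 injuries, ~4000 h).
rate, lo, hi = incidence_per_1000h(43, 4000)
print(f"{rate:.1f}/1000 h (95% CI {lo:.1f} to {hi:.1f})")
```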

    A New Evolutionary Algorithm For Mining Noisy, Epistatic, Geospatial Survey Data Associated With Chagas Disease

    Get PDF
    The scientific community is just beginning to understand some of the profound effects that feature interactions and heterogeneity have on natural systems. Despite the belief that these nonlinear and heterogeneous interactions exist across numerous real-world systems (e.g., from the development of personalized drug therapies to market predictions of consumer behaviors), the tools for analysis have not kept pace. This research was motivated by the desire to mine data from large socioeconomic surveys aimed at identifying the drivers of household infestation by a Triatomine insect that transmits the life-threatening Chagas disease. To decrease the risk of transmission, our colleagues at the laboratory of applied entomology and parasitology have implemented mitigation strategies (known as Ecohealth interventions); however, limited resources necessitate the search for better risk models. Mining these complex Chagas survey data for potential predictive features is challenging due to imbalanced class outcomes, missing data, heterogeneity, and the non-independence of some features. We develop an evolutionary algorithm (EA) to identify feature interactions in Big Datasets with desired categorical outcomes (e.g., disease or infestation). The method is non-parametric and uses the hypergeometric PMF as a fitness function to tackle challenges associated with using p-values in Big Data (e.g., p-values decrease inversely with the size of the dataset). To demonstrate the EA's effectiveness, we first test the algorithm on three benchmark datasets. These include two classic Boolean classifier problems: (1) the majority-on problem and (2) the multiplexer problem, as well as (3) a simulated single nucleotide polymorphism (SNP) disease dataset. Next, we apply the EA to real-world Chagas disease survey data and successfully archive numerous high-order feature interactions associated with infestation that would not have been discovered using traditional statistics. These feature interactions are also explored using network analysis. The spatial autocorrelation of the genetic data (SNPs of Triatoma dimidiata) was captured using geostatistics. Specifically, a modified semivariogram analysis was performed to characterize the SNP data and help elucidate the movement of the vector within two villages. For both villages, the SNP information showed strong spatial autocorrelation, albeit with different geostatistical characteristics (sills, ranges, and nuggets). These metrics were leveraged to create risk maps that suggest the more forested village had a sylvatic source of infestation, while the other village had a domestic/peridomestic source. This initial exploration into using Big Data to analyze disease risk shows that novel and modified existing statistical tools can improve the assessment of risk on a fine scale.
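The geostatistical step can be illustrated with a plain empirical semivariogram; the authors' modified analysis is not reproduced here, and the function name and distance-bin choices are illustrative:

```python
# A minimal empirical-semivariogram sketch (not the authors' modified version):
# bin squared differences of a measured value by pairwise distance to expose
# spatial autocorrelation; sill, range and nugget are then read off the curve.
import numpy as np

def empirical_semivariogram(coords, values, bin_edges):
    """coords: (n, 2) array of locations; values: (n,) measurements;
    bin_edges: increasing distance-bin edges. Returns gamma per bin."""
    coords = np.asarray(coords, float)
    values = np.asarray(values, float)
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(-1))
    sqdiff = (values[:, None] - values[None, :]) ** 2
    iu = np.triu_indices(len(values), k=1)          # count each pair once
    dist, sqdiff = dist[iu], sqdiff[iu]
    gamma = []
    for lo, hi in zip(bin_edges[:-1], bin_edges[1:]):
        mask = (dist >= lo) & (dist < hi)
        gamma.append(sqdiff[mask].mean() / 2.0 if mask.any() else np.nan)
    return np.array(gamma)
```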

    Learning Feature Selection and Combination Strategies for Generic Salient Object Detection

    No full text
    For a diverse range of applications in machine vision, from social media searches to robotic home care providers, it is important to replicate the mechanism by which the human brain selects the most important visual information while suppressing the remaining non-usable information. Many computational methods attempt to model this process by following the traditional model of visual attention, which involves feature extraction, conditioning and combination to capture this behaviour of human visual attention. Consequently, the model has inherent design choices at its various stages. These choices include selection of parameters related to the feature computation process, setting a conditioning approach, assigning feature importance and setting a combination approach. Despite rapid research and substantial improvements in benchmark performance, the performance of many models depends upon tuning these design choices in an ad hoc fashion. Additionally, these design choices are heuristic in nature, resulting in good performance only in certain settings. As a result, many such models exhibit low robustness to difficult stimuli and the complexities of real-world imagery. Machine learning and optimisation techniques have long been used to increase the generalisability of a system to unseen data. Surprisingly, artificial learning techniques have not been investigated to their full potential to improve the generalisation of visual attention methods. The proposed thesis is that artificial learning can increase the generalisability of the traditional model of visual attention by effective selection and optimal combination of features. The following new techniques have been introduced at various stages of the traditional model of visual attention to improve its generalisation performance, specifically on challenging cases of saliency detection: 1. Joint optimisation of feature-related parameters and feature importance weights is introduced for the first time to improve the generalisation of the traditional model of visual attention. To evaluate the joint learning hypothesis, a new method, GAOVSM, is introduced for the task of eye fixation prediction. By finding the relationships between feature-related parameters and feature importance, the developed method improves the generalisation performance of the baseline method (which employs human-encoded parameters). 2. Spectral-matting-based figure-ground segregation (FGS) is introduced to overcome the artifacts encountered by region-based salient object detection approaches. By suppressing unwanted background information and assigning saliency to object parts in a uniform manner, the developed FGS approach overcomes the limitations of region-based approaches. 3. Joint optimisation of feature computation parameters and feature importance weights is introduced for optimal combination of FGS with complementary features, for the first time for salient object detection. By learning feature-related parameters and their respective importance at multiple segmentation thresholds, and by considering the performance gaps amongst features, the developed FGSopt method improves the object detection performance of the FGS technique while also improving upon several state-of-the-art salient object detection models. 4. The introduction of multiple combination schemes/rules further extends the generalisability of the traditional attention model beyond that of single rules obtained by joint optimisation. The introduction of feature-composition-based grouping of images enables the developed IGA method to autonomously identify an appropriate combination strategy for an unseen image. The results of a pairwise rank-sum test confirm that the IGA method is significantly better than the deterministic and classification-based benchmark methods at the 99% confidence level. Extending this line of research, a novel relative encoding approach enables the adapted XCSCA method to group images having similar saliency prediction ability. By keeping track of previous inputs, the introduced action part of the XCSCA approach enables learning of generalised feature importance rules. Through more accurate grouping of images than IGA, generalised learnt rules and appropriate application of feature importance rules, the XCSCA approach improves upon the generalisation performance of the IGA method. 5. The introduced uniform saliency assignment and segmentation quality cues enable label-free evaluation of a feature/saliency map. By accurate ranking and effective clustering, the developed DFS method successfully solves the complex problem of finding appropriate features for combination (on an image-by-image basis) for the first time in saliency detection. The DFS method enables ground-truth-free evaluation of saliency methods and advances the state of the art in data-driven saliency aggregation by detecting and deselecting redundant information. The final contribution is that the developed methods are formed into a complete system, where analysis shows the effects of their interactions on the system. Based on the trade-off between saliency prediction accuracy and computational time, specialised variants of the proposed methods are presented along with recommendations for further use by other saliency detection systems. This research work has shown that artificial learning can increase the generalisation of the traditional model of attention by effective selection and optimal combination of features. Overall, this thesis has shown that it is the ability to autonomously segregate images based on their types, and the subsequent learning of appropriate combinations, that aid generalisation on difficult unseen stimuli.
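A hedged sketch of the core combination step shared by these methods follows; the per-channel normalisation, the weight vector, and the idea of searching weights with an evolutionary learner are illustrative assumptions, and the GAOVSM, FGSopt, IGA and XCSCA details are not reproduced:

```python
# Illustrative only: combine normalised feature maps into a saliency map
# using learned importance weights.
import numpy as np

def combine_feature_maps(feature_maps, weights):
    """feature_maps: list of 2-D arrays (one per feature channel);
    weights: importance weight per channel. Returns a [0, 1] saliency map."""
    saliency = np.zeros_like(feature_maps[0], dtype=float)
    for fmap, w in zip(feature_maps, weights):
        fmap = fmap.astype(float)
        rng = fmap.max() - fmap.min()
        if rng > 0:                                  # normalise each channel
            fmap = (fmap - fmap.min()) / rng
        saliency += w * fmap
    rng = saliency.max() - saliency.min()
    return (saliency - saliency.min()) / rng if rng > 0 else saliency

# A learner (e.g. a genetic algorithm) would search the weight vector (and
# feature parameters) to maximise agreement with fixation or object masks.
```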

    Human inspired robotic path planning and heterogeneous robotic mapping

    No full text
    One of the biggest challenges facing robotics is the ability of a robot to autonomously navigate unknown real-world environments, which is considered by many to be a key prerequisite of truly autonomous robots. Autonomous navigation is a complex problem that requires a robot to solve the three problems of navigation: localisation, goal recognition, and path-planning. Conventional approaches to these problems rely on computational techniques that are inherently rigid and brittle. That is, the underlying models cannot adapt to novel input, nor can they account for all potential external conditions, which could result in erroneous or misleading decision making. In contrast, humans are capable of learning from their prior experiences and adapting to novel situations. Humans are also capable of sharing their experiences and knowledge with other humans to bootstrap their learning. This is widely thought to underlie the success of humanity by allowing high-fidelity transmission of information and skills between individuals, facilitating cumulative knowledge gain. Furthermore, human cognition is influenced by internal emotion states. Although historically considered a detriment to a person's cognitive process, emotions are regarded by recent research as a beneficial mechanism in decision making, facilitating the communication of simple but high-impact information. Human-created control approaches are inherently rigid and cannot account for the complexity of behaviours required for autonomous navigation. The proposed thesis is that cognitively inspired mechanisms can address limitations in current robotic navigation techniques by allowing robots to autonomously learn beneficial behaviours from interacting with their environments. The first objective is to enable the sharing of navigation information between heterogeneous robotic platforms. The second objective is to add flexibility to rigid path-planning approaches by utilising emotions as low-level but high-impact behavioural responses. Inspired by the cognitive sciences, a novel cognitive mapping approach is presented that functions in conjunction with current localisation techniques. The cognitive mapping stage utilises an Anticipatory Classifier System (ACS) to learn the novel Cognitive Action Map (CAM) of decision points, areas in which a robot must determine its next action (direction of travel). These physical actions provide a shared means of understanding the environment to allow for communicating learned navigation information. The presented cognitive mapping approach has been trained and evaluated on real-world robotic platforms. The results show the successful sharing of navigation information between two heterogeneous robotic platforms with different sensing capabilities. The results have also demonstrated the novel contribution of autonomously sharing navigation information between a range-based (GMapping) and a vision-based (RatSLAM) localisation approach for the first time. Sharing information between localisation techniques allows an individual robotic platform to utilise the best-fit localisation approach for its sensors while still providing useful navigation information for robots with different sensor types. Inspired by theories on natural emotions, this work presents a novel emotion model designed to improve a robot's navigation performance through learning to adapt a rigid path-planning approach. The model is based on the concept of a bow-tie structure, linking emotional reinforcers and behavioural modifiers through intermediary emotion states. An important function of the emotions in the model is to provide a compact set of high-impact behaviour adaptations, reducing an otherwise tangled web of stimulus-response patterns. Crucially, the system learns these emotional responses with no human pre-specifying the behaviour of the robot, hence avoiding human bias. The results of training the emotion model demonstrate that it is capable of learning up to three emotion states for robotic navigation without human bias: fear, apprehension, and happiness. The fear and apprehension responses slow the robot's speed and drive the robot away from obstacles when the robot experiences pain or is uncertain of its current position. The happiness response increases the speed of the robot and reduces the safety margins around obstacles when pain is absent, allowing the robot to drive closer to obstacles. These learned emotion responses improved the navigation performance of the robot by reducing collisions and navigation times in both simulated and real-world experiments. The two-emotion model (fear and happiness) improved performance the most, indicating that a robot may require only two emotion states for navigation in common, static domains.
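The bow-tie idea can be sketched as a small mapping from reinforcers to emotion states to behaviour modifiers; the state names follow the abstract, but the thresholds and multipliers below are assumptions for illustration only, not the learned values from the thesis:

```python
# Illustrative bow-tie sketch (thresholds and multipliers are assumptions):
# reinforcers map to a small set of emotion states, which in turn modulate
# speed and obstacle safety margins.
def emotion_state(pain, position_uncertainty):
    if pain > 0.5:
        return "fear"
    if position_uncertainty > 0.5:
        return "apprehension"
    return "happiness"

def behaviour_modifiers(state):
    # (speed multiplier, safety-margin multiplier)
    return {
        "fear":         (0.4, 1.5),   # slow down, keep well clear of obstacles
        "apprehension": (0.7, 1.2),
        "happiness":    (1.2, 0.8),   # speed up, allow closer passes
    }[state]

speed_scale, margin_scale = behaviour_modifiers(
    emotion_state(pain=0.1, position_uncertainty=0.2))
```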

    A Survey on Evolutionary Computation Approaches to Feature Selection

    Get PDF
    Feature selection is an important task in data mining and machine learning to reduce the dimensionality of the data and increase the performance of an algorithm, such as a classification algorithm. However, feature selection is a challenging task due mainly to the large search space. A variety of methods have been applied to solve feature selection problems, where evolutionary computation (EC) techniques have recently gained much attention and shown some success. However, there are no comprehensive guidelines on the strengths and weaknesses of alternative approaches. This leads to a disjointed and fragmented field with ultimately lost opportunities for improving performance and successful applications. This paper presents a comprehensive survey of the state-of-the-art work on EC for feature selection, which identifies the contributions of these different algorithms. In addition, current issues and challenges are also discussed to identify promising areas for future research.
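As a minimal, hedged example of the wrapper-style EC approaches such a survey covers (the wrapped classifier, operators and parameters are illustrative choices, not a recommendation from the paper), a simple genetic algorithm can evolve binary feature masks:

```python
# Illustrative GA-based wrapper feature selection; assumes scikit-learn.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

def ga_feature_selection(X, y, pop_size=20, generations=30, p_mut=0.05, rng=None):
    rng = rng or np.random.default_rng(0)
    n_features = X.shape[1]
    pop = rng.integers(0, 2, size=(pop_size, n_features))

    def fitness(mask):
        if not mask.any():
            return 0.0
        clf = KNeighborsClassifier()
        return cross_val_score(clf, X[:, mask.astype(bool)], y, cv=3).mean()

    for _ in range(generations):
        scores = np.array([fitness(ind) for ind in pop])
        parents = pop[np.argsort(scores)[::-1][: pop_size // 2]]   # truncation selection
        children = []
        for _ in range(pop_size - len(parents)):
            a, b = parents[rng.integers(len(parents), size=2)]
            cut = rng.integers(1, n_features)                      # one-point crossover
            child = np.concatenate([a[:cut], b[cut:]])
            flip = rng.random(n_features) < p_mut                  # bit-flip mutation
            child[flip] ^= 1
            children.append(child)
        pop = np.vstack([parents, children])
    scores = np.array([fitness(ind) for ind in pop])
    return pop[scores.argmax()].astype(bool)                       # best feature mask
```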

    A brief history of learning classifier systems: from CS-1 to XCS and its variants

    Get PDF
    © 2015, Springer-Verlag Berlin Heidelberg. The direction set by Wilson’s XCS is that modern Learning Classifier Systems can be characterized by their use of rule accuracy as the utility metric for the search algorithm(s) discovering useful rules. Such searching typically takes place within the restricted space of co-active rules for efficiency. This paper gives an overview of the evolution of Learning Classifier Systems up to XCS, and then of some of the subsequent developments of Wilson’s algorithm for different types of learning.

    Proceedings of the 8th International Conference on Energy Efficiency in Domestic Appliances and Lighting

    Get PDF
    At the EEDAL'15 conference, 128 papers dealing with energy consumption and energy efficiency improvements in the residential sector were presented. Papers focused on policies and programmes, technologies, and consumer behaviour. Special focus was on standards and labels, demand response, and smart meters. All the papers have been peer reviewed by experts in the sector. JRC.F.7-Renewables and Energy Efficiency

    Improving the Scalability of XCS-Based Learning Classifier Systems

    No full text
    Using evolutionary intelligence and machine learning techniques, a broad range of intelligent machines have been designed to perform different tasks. An intelligent machine learns by perceiving its environmental status and taking an action that maximizes its chances of success. Human beings have the ability to apply knowledge learned from a smaller problem to more complex, large-scale problems of the same or a related domain, but currently the vast majority of evolutionary machine learning techniques lack this ability. This inability to apply already learned knowledge of a domain results in consuming more than the necessary resources and time to solve complex, large-scale problems of the domain. As the problem increases in size, it becomes difficult and sometimes even impractical (if not impossible) to solve due to the needed resources and time. Therefore, in order to scale in a problem domain, a system is needed that has the ability to reuse the learned knowledge of the domain and/or encapsulate the underlying patterns in the domain. To extract and reuse building blocks of knowledge or to encapsulate the underlying patterns in a problem domain, a rich encoding is needed, but the search space could then expand undesirably and cause bloat, e.g. as in some forms of genetic programming (GP). Learning classifier systems (LCSs) are a well-structured, evolutionary-computation-based learning technique with pressures that implicitly avoid bloat, such as fitness sharing through niche-based reproduction. The proposed thesis is that an LCS can scale to complex problems in a domain by reusing the learnt knowledge from simpler problems of the domain and/or encapsulating the underlying patterns in the domain. Wilson's XCS, a well-tested, online-learning, accuracy-based LCS model, is used to implement and test the proposed systems. To extract the reusable building blocks of knowledge, GP-tree-like code fragments are introduced, which are more than simply another representation (e.g. ternary or real-valued alphabets). The thesis is extended to capture the underlying patterns in a problem using a cyclic representation. Hard problems are used to test the newly developed scalable systems and to compare them with benchmark techniques. Specifically, this work develops four systems to improve the scalability of XCS-based classifier systems. (1) Building blocks of knowledge are extracted from smaller problems of a Boolean domain and reused in learning more complex, large-scale problems in the domain, for the first time. By utilizing the learnt knowledge from small-scale problems, the developed XCSCFC (i.e. XCS with Code-Fragment Conditions) system readily solves problems of a scale that existing LCS and GP approaches cannot, e.g. the 135-bit MUX problem. (2) The introduction of code fragments in classifier actions in XCSCFA (i.e. XCS with Code-Fragment Actions) enables the rich representation of GP, which, when coupled with the divide-and-conquer approach of LCS, successfully solves various complex, overlapping and niche-imbalanced Boolean problems that are difficult to solve using numeric-action-based XCS. (3) The underlying patterns in a problem domain are encapsulated in classifier rules encoded by a cyclic representation. The developed XCSSMA system produces general solutions of any scale n for a number of important Boolean problems, for the first time in the field of LCS, e.g. parity problems. (4) Optimal solutions for various real-valued problems are evolved by extending the existing real-valued XCSR system with code-fragment actions to XCSRCFA. Exploiting the combined power of GP and LCS techniques, XCSRCFA successfully learns various continuous-action and function approximation problems that are difficult to learn using the base techniques. This research work has shown that LCSs can scale to complex, large-scale problems through reusing learnt knowledge. The messy nature, disassociation of message-to-condition order, masking, feature construction, and reuse of extracted knowledge add further abilities to the XCS family of LCSs. The ability to use rich encoding in antecedent GP-like code fragments or a consequent cyclic representation leads to the evolution of accurate, maximally general and compact solutions in learning various complex Boolean as well as real-valued problems. Effectively exploiting the combined power of GP and LCS techniques, various continuous-action and function approximation problems are solved in a simple and straightforward manner. The analysis of the evolved rules reveals, for the first time in XCS, that no matter how specific or general the initial classifiers are, all optimal classifiers converge through the mechanism 'be specific then generalize' near the final stages of evolution. It also reveals that standard XCS does not use all available information or all available genetic operators to evolve optimal rules, whereas the developed code-fragment-action-based systems effectively use figure and ground information during the training process. This work has created a platform to explore the reuse of learnt functionality, not just terminal knowledge as at present, which is needed to replicate human capabilities.
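To make the code-fragment idea concrete, the sketch below shows a small GP-tree-like Boolean condition over input bits; the operator set, class names and representation are illustrative assumptions, not the thesis's exact encoding:

```python
# Illustrative code-fragment condition: a small GP-like tree over input bits
# acts as a classifier condition, matching when it evaluates to True.
class Node:
    def __init__(self, op, children=(), index=None):
        self.op, self.children, self.index = op, children, index

    def evaluate(self, bits):
        if self.op == "bit":                       # leaf: read one input bit
            return bool(bits[self.index])
        vals = [c.evaluate(bits) for c in self.children]
        if self.op == "NOT":
            return not vals[0]
        if self.op == "AND":
            return vals[0] and vals[1]
        if self.op == "OR":
            return vals[0] or vals[1]
        raise ValueError(self.op)

# A fragment reusable as a building block: (bit0 AND NOT bit2)
fragment = Node("AND", (Node("bit", index=0), Node("NOT", (Node("bit", index=2),))))
print(fragment.evaluate([1, 0, 0, 1]))   # True -> the classifier matches this input
```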