6 research outputs found

    Combining Genomics, Metabolome Analysis, and Biochemical Modelling to Understand Metabolic Networks

    Now that complete genome sequences are available for a variety of organisms, the elucidation of gene functions involved in metabolism necessarily includes a better understanding of cellular responses upon mutations on all levels of gene products, mRNA, proteins, and metabolites. Such progress is essential since the observable properties of organisms – the phenotypes – are produced by the genotype in juxtaposition with the environment. Whereas much has been done to make mRNA and protein profiling possible, considerably less effort has been put into profiling the end products of gene expression, metabolites. To date, analytical approaches have been aimed primarily at the accurate quantification of a number of pre-defined target metabolites, or at producing fingerprints of metabolic changes without individually determining metabolite identities. Neither of these approaches allows the formation of an in-depth understanding of the biochemical behaviour within metabolic networks. Yet, by carefully choosing protocols for sample preparation and analytical techniques, a number of chemically different classes of compounds can be quantified simultaneously to enable such understanding. In this review, the terms describing various metabolite-oriented approaches are given, and the differences among these approaches are outlined. Metabolite target analysis, metabolite profiling, metabolomics, and metabolic fingerprinting are considered. For each approach, a number of examples are given, and potential applications are discussed.

    Field Guide to Genetic Programming

    Event and state detection in time series by genetic programming

    Event and state detection in time series has significant value in scientific areas and real-world applications. The aim of detecting time series event and state patterns is to identify particular variations of interest to the user in one or more channels of time series streams. For example, dangerous driving behaviours such as sudden braking and harsh acceleration can be detected from continuous recordings from inertial sensors. However, the existing methods are highly dependent on domain knowledge such as the size of the time series pattern and a set of effective features. Furthermore, they are not directly suitable for multi-channel time series data. In this study, we establish a genetic programming based method which can perform classification on multi-channel time series data. It does not need the domain knowledge required by the existing methods. The investigation consists of four parts: the methodology, an evaluation on event detection tasks, an evaluation on state detection tasks and an analysis of the suitability for real-world applications. In the methodology, a GP based method is proposed for processing and analysing multi-channel time series streams. The function set includes basic mathematical operations. In addition, specific functions and terminals are introduced to retain historical information, capture temporal dependency across time points and handle dependency between channels. These functions and terminals help the GP based method to automatically find the pattern size and extract features. This study also investigates two different fitness functions: accuracy and area under the curve. The proposed method is investigated on a range of event detection tasks. The investigation starts from synthetic tasks such as detecting complete sine waves. The performance of the GP based method is compared to traditional classification methods.
On the raw data the GP based method achieves 100 percent accuracy, which outperforms all the non-GP methods. The performance of the non-GP methods is comparable to the GP based method only with suitable features. In addition, the GP based method is investigated on two complex real-world event detection tasks: dangerous driving behaviour detection and video shot detection. In the task of detecting three dangerous driving behaviours from 21-channel time series data, the GP based method performs consistently better than the non-GP classifiers even when features are provided. In the video shot detection task, the GP based method achieves comparable performance on 11200-channel time series to the non-GP classifiers on 28 features. The GP based method is more accurate than a commercial product. The GP based method has also been investigated on state detection tasks. This involves synthetic tasks such as detecting concurrent high values in four of five channels and a real-world activity recognition problem. The results also show that the GP based method consistently outperforms the non-GP methods even in the presence of manually constructed features. As part of the investigation, a mobile phone based activity recognition data set was collected as there was no existing publicly available data set. The suitability of the GP based method for solving real-world problems is further analysed. Our analysis shows that the GP based method can be successfully extended for multi-class classification. The analysis of the evolved programs demonstrates that they do capture time series patterns. On synthetic data sets, the injected regularities are revealed in understandable individuals. The best programs for three real-world problems are more difficult to explain but still provide some insight. The selection of relevant channels and data points by the programs is consistent with domain knowledge.
In addition, the analysis shows that the proposed method still performs well for time series patterns of different sizes. The effective window sizes of the evolved GP programs are close to the pattern size. Finally, our study on execution performance of the evolved programs shows that these programs are fast in execution and are suitable for real-time applications. In summary, the GP based method is suitable for the kinds of real-world applications studied in this thesis. This thesis concludes that, with a suitable representation, genetic programming can be an effective method for event and state detection in multi-channel time series for a range of synthetic and real-world tasks. This method does not require much domain knowledge such as the pattern size and suitable features. It offers an effective classification method for tasks similar to those studied in this thesis.
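
The abstract above names two fitness functions, accuracy and area under the curve, without giving their details. As a rough illustration only (not the thesis's implementation), the sketch below assumes an evolved program emits one real-valued output per time point, that outputs above a hypothetical threshold of 0 count as the positive class, and that labels are 0 or 1. The function names `accuracy_fitness` and `auc_fitness` are invented for this example.

```python
from typing import Sequence


def accuracy_fitness(outputs: Sequence[float], labels: Sequence[int],
                     threshold: float = 0.0) -> float:
    """Fraction of time points classified correctly.

    Assumption: an output above `threshold` is read as the positive class.
    """
    correct = sum((out > threshold) == bool(lab)
                  for out, lab in zip(outputs, labels))
    return correct / len(labels)


def auc_fitness(outputs: Sequence[float], labels: Sequence[int]) -> float:
    """Area under the ROC curve, computed by pairwise comparison.

    Equals the probability that a randomly chosen positive example scores
    higher than a randomly chosen negative one (ties count as half).
    """
    pos = [o for o, lab in zip(outputs, labels) if lab == 1]
    neg = [o for o, lab in zip(outputs, labels) if lab == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data: the last positive example falls below the threshold of 0,
# yet every positive still scores above the single negative.
outputs = [-1.0, 0.5, 2.0, -0.2]
labels = [0, 1, 1, 1]
print(accuracy_fitness(outputs, labels))  # 0.75
print(auc_fitness(outputs, labels))       # 1.0
```

In this toy case the two measures disagree: accuracy penalises the program for a poorly placed threshold, while AUC depends only on how the outputs rank positives against negatives.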