28 research outputs found

    A review on probabilistic graphical models in evolutionary computation

    Thanks to their inherent properties, probabilistic graphical models are among the prime candidates for machine learning and decision-making tasks, especially in uncertain domains. Their capabilities, such as representation, inference and learning, can, if used effectively, greatly help to build intelligent systems that act appropriately in different problem domains. Evolutionary computation is one discipline that has employed probabilistic graphical models to improve the search for optimal solutions in complex problems. This paper shows how probabilistic graphical models have been used in evolutionary algorithms to improve their performance on such problems. Specifically, we survey evolutionary algorithms based on probabilistic model building, called estimation of distribution algorithms, and compare different methods for probabilistic modeling in these algorithms.
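
The sample-select-re-estimate loop that defines an estimation of distribution algorithm can be sketched as follows. This is a minimal illustration only: it assumes the simplest possible model (independent Bernoulli marginals, as in a univariate marginal distribution algorithm) and the OneMax toy problem, neither of which is taken from the paper. The function name and all parameter values are hypothetical.

```python
import random


def umda_onemax(n_vars=20, pop_size=100, n_select=50, generations=30, seed=0):
    """Univariate EDA sketch: the model is one Bernoulli probability per bit."""
    rng = random.Random(seed)
    probs = [0.5] * n_vars  # initial model: uniform over all bit strings
    best = None
    for _ in range(generations):
        # 1. Sample a population from the current probabilistic model.
        population = [[1 if rng.random() < p else 0 for p in probs]
                      for _ in range(pop_size)]
        # 2. Truncation selection; OneMax fitness is the number of ones.
        population.sort(key=sum, reverse=True)
        selected = population[:n_select]
        if best is None or sum(selected[0]) > sum(best):
            best = selected[0]
        # 3. Re-estimate each marginal from the selected individuals,
        #    clamped away from 0 and 1 so no bit value is lost for good.
        probs = [min(max(sum(ind[i] for ind in selected) / n_select, 0.05), 0.95)
                 for i in range(n_vars)]
    return best, probs
```

Richer EDAs replace step 3 with the estimation of a graphical model (for example a Bayesian network or a Markov network) so that dependencies between variables are captured as well.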

    Regularized model learning in EDAs for continuous and multi-objective optimization

    Probabilistic modeling is the defining characteristic of estimation of distribution algorithms (EDAs) which determines their behavior and performance in optimization. Regularization is a well-known statistical technique used for obtaining an improved model by reducing the generalization error of estimation, especially in high-dimensional problems. ℓ1-regularization is a type of this technique with the appealing variable selection property which results in sparse model estimations. In this thesis, we study the use of regularization techniques for model learning in EDAs. Several methods for regularized model estimation in continuous domains based on a Gaussian distribution assumption are presented, and analyzed from different aspects when used for optimization in a high-dimensional setting, where the population size of EDA has a logarithmic scale with respect to the number of variables. The optimization results obtained for a number of continuous problems with an increasing number of variables show that the proposed EDA based on regularized model estimation performs a more robust optimization, and is able to achieve significantly better results for larger dimensions than other Gaussian-based EDAs. We also propose a method for learning a marginally factorized Gaussian Markov random field model using regularization techniques and a clustering algorithm. The experimental results show notable optimization performance on continuous additively decomposable problems when using this model estimation method. Our study also covers multi-objective optimization and we propose joint probabilistic modeling of variables and objectives in EDAs based on Bayesian networks, specifically models inspired from multi-dimensional Bayesian network classifiers. 
It is shown that with this approach to modeling, two new types of relationships are encoded in the estimated models in addition to the variable relationships captured in other EDAs: objective-variable and objective-objective relationships. An extensive experimental study shows the effectiveness of this approach for multi- and many-objective optimization. With the proposed joint variable-objective modeling, in addition to the Pareto set approximation, the algorithm is also able to obtain an estimation of the multi-objective problem structure. Finally, the study of multi-objective optimization based on joint probabilistic modeling is extended to noisy domains, where the noise in objective values is represented by intervals. A new version of the Pareto dominance relation for ordering the solutions in these problems, namely α-degree Pareto dominance, is introduced and its properties are analyzed. We show that the ranking methods based on this dominance relation can result in competitive performance of EDAs with respect to the quality of the approximated Pareto sets. This dominance relation is then used together with a method for joint probabilistic modeling based on ℓ1-regularization for multi-objective feature subset selection in classification, where six different measures of accuracy are considered as objectives with interval values. The individual assessment of the proposed joint probabilistic modeling and solution ranking methods on datasets with small-medium dimensionality, when using two different Bayesian classifiers, shows that comparable or better Pareto sets of feature subsets are approximated in comparison to standard methods.

    Approaches For Capturing Time-Varying Functional Network Connectivity With Application to Normative Development and Mental Illness

    Since the beginning of medical science, the human brain has remained an unsolved puzzle: an elusive organ that controls everything, from breathing to heartbeats, from emotion to anger, and more. With the power of advanced neuroimaging techniques, scientists have now started to solve this nearly impossible puzzle, piece by piece. Over the past decade, various in vivo techniques, including functional magnetic resonance imaging (fMRI), have been increasingly used to understand brain function. fMRI is being used extensively to facilitate the identification of various neuropsychological disorders such as schizophrenia (SZ), bipolar disorder (BP) and autism spectrum disorder (ASD). These disorders are currently diagnosed based on patients’ self-reported experiences, and on observed symptoms and behaviors over the course of the illness. Efficient identification of biologically based markers (biomarkers) can therefore lead to early diagnosis of these mental disorders and provide a trajectory for disease progression. By applying advanced machine learning techniques to fMRI data, significant differences in brain function between patients with mental disorders and healthy controls can be identified. Moreover, by jointly estimating information from multiple modalities, such as functional brain data and genetic factors, we can now investigate the relationship between brain function and genes. Functional connectivity (FC) has become a very common measure to characterize brain function, where FC is defined as the temporal covariance of neural signals between multiple spatially distinct brain regions. Recently, researchers have begun studying the FC among functionally specialized brain networks, which can be regarded as a higher level of FC and is termed functional network connectivity (FNC, defined as the correlation value that summarizes the overall connection between brain ‘networks’ over time). 
Most functional connectivity studies have made the limiting assumption that connectivity is stationary over multiple minutes, and fail to identify the time-varying and recurring patterns of FNC among brain regions (known as time-varying FNC). In this dissertation, we demonstrate the use of time-varying FNC features as potential biomarkers to differentiate between patients with mental disorders and healthy subjects. The developmental characteristics of time-varying FNC in typically developing children and in children with ASD have been extensively studied in a cross-sectional framework, and age-, sex- and disease-related FNC profiles have been proposed. Time-varying FNC is also characterized in healthy adults and in patients with severe mental disorders (SZ and BP). Moreover, an efficient classification algorithm is designed to identify patients and controls at the individual level. Finally, a new framework is proposed to jointly utilize information from the brain’s functional network connectivity and genetic features to find the associations between them. The frameworks presented here can help us understand the important role played by time-varying FNC in identifying potential biomarkers for the diagnosis of severe mental disorders.
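
To make the difference between stationary and time-varying FNC concrete, here is a minimal sketch that computes one correlation matrix per sliding window over a set of network time courses. The abstract does not say how time-varying FNC was estimated; sliding-window Pearson correlation is one common choice and is assumed here. The function names, window length and step are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # correlation is undefined; this sketch reports 0
    return cov / sqrt(vx * vy)


def sliding_window_fnc(timecourses, window, step=1):
    """Return one network-by-network correlation matrix per window.

    timecourses: list of equal-length time courses, one per brain network.
    """
    n_net = len(timecourses)
    n_time = len(timecourses[0])
    matrices = []
    for start in range(0, n_time - window + 1, step):
        segment = [tc[start:start + window] for tc in timecourses]
        matrices.append([[1.0 if i == j else pearson(segment[i], segment[j])
                          for j in range(n_net)]
                         for i in range(n_net)])
    return matrices
```

A stationary analysis corresponds to a single window covering the whole scan. Recurring connectivity patterns are usually then sought by grouping the per-window matrices, a step this sketch leaves out.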

    The association between stress and mood across the adult lifespan on default mode network

    Aging of brain structure and function is a complex process characterized by high inter- and intra-individual variability. Such variability may arise from the interaction of multiple factors, including exposure to stressful experience and mood variation, across the lifespan. Using a multimodal neuroimaging and neurocognitive approach, we investigated the association of stress, mood and their interaction, in the structure and function of the default mode network (DMN), both during rest and task-induced deactivation, throughout the adult lifespan. Data confirmed a decreased functional connectivity (FC) and task-induced deactivation of the DMN during the aging process and in subjects with lower mood; on the contrary, an increased FC was observed in subjects with higher perceived stress. Surprisingly, the association of aging with DMN was altered by stress and mood in specific regions. An increased difficulty to deactivate the DMN was noted in older participants with lower mood, contrasting with an increased deactivation in individuals presenting high stress, independently of their mood levels, with aging. Interestingly, this constant interaction across aging was globally most significant in the combination of high stress levels with a more depressed mood state, both during resting state and task-induced deactivations. The present results contribute to characterize the spectrum of FC and deactivation patterns of the DMN, highlighting the crucial association of stress and mood levels, during the adult aging process. 
These combinatorial approaches may help to understand the heterogeneity of the aging process in brain structure and function and several states that may lead to neuropsychiatric disorders. The work was supported by SwitchBox-FP7-HEALTH-2010-Grant 259772-2 and by ON.2, O NOVO NORTE, North Portugal Regional Operational Programme 2007/2013, of the National Strategic Reference Framework (NSRF) 2007/2013, through the European Regional Development Fund (ERDF).

    Innovative hybrid MOEA/D variants for solving multi-objective combinatorial optimization problems

    Advisor: Aurora Trinidad Ramirez Pozo. Co-advisor: Roberto Santana. Thesis (doctorate), Universidade Federal do Paraná, Setor de Ciências Exatas, Programa de Pós-Graduação em Informática. Defense: Curitiba, 16/12/2016. Includes references: f. 103-116. Abstract: Many real-world problems can be stated as combinatorial optimization problems. Very often, they are characterized by a large number of variables and the presence of multiple conflicting objectives to be optimized at the same time. Such problems are usually hard to solve optimally, and solving them has been considered a challenge for decades. Metaheuristic algorithms aim at finding an acceptable approximation to the optimal solution in a reasonable computational time. Research on metaheuristics remains an attractive area and receives growing attention. One trend in this area is hybrid approaches, in which different methods and concepts are combined with the aim of proposing more efficient metaheuristics. In this thesis, we propose hybrid metaheuristic algorithms for solving multi-objective combinatorial optimization problems. Our proposals are based on (i) the multi-objective evolutionary algorithm based on decomposition (MOEA/D framework), (ii) the bio-inspired metaheuristic ant colony optimization, and (iii) the probabilistic models from estimation of distribution algorithms. Our algorithms are considered MOEA/D variants. In these variants, besides the traditional genetic operators, we can instantiate different models as the variation step (reproduction). 
Moreover, we include some design modifications in the frameworks to balance convergence and diversity during the search (evolution). We address problems considered hard in the literature: multi-objective unconstrained binary quadratic programming, the multi-objective permutation flowshop scheduling problem, and problems characterized by deception. Through experimental studies, we show that the proposed frameworks solve these problems efficiently, outperforming state-of-the-art approaches in most of the cases considered. We show that hybridizing the MOEA/D guidelines with other metaheuristic components and concepts is a powerful strategy for solving multi-objective combinatorial optimization problems. Keywords: metaheuristics, multi-objective optimization, combinatorial problems, MOEA/D, ant colony optimization, estimation of distribution algorithms, multi-objective unconstrained binary quadratic programming, multi-objective permutation flowshop scheduling problem, hybrid approaches.
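
The decomposition idea behind MOEA/D can be illustrated with a short sketch: the multi-objective problem is split into scalar subproblems, one per weight vector, and each subproblem keeps the candidate that scores best under its own scalarization. The abstract does not say which scalarizing function the thesis uses; the Tchebycheff function below is one standard option and is an assumption here, as are the function names, the two-objective weight grid and the minimization convention.

```python
def tchebycheff(objectives, weights, ideal):
    """Tchebycheff scalarization: largest weighted distance to the ideal point."""
    return max(w * abs(f - z) for f, w, z in zip(objectives, weights, ideal))


def uniform_weights(n_subproblems):
    """Evenly spread weight vectors for a two-objective problem."""
    last = n_subproblems - 1
    return [(i / last, 1 - i / last) for i in range(n_subproblems)]


def best_for_each_subproblem(candidates, weights, ideal):
    """For every weight vector, pick the candidate with the lowest scalar value.

    candidates: objective vectors (minimization), one tuple per solution.
    """
    return [min(candidates, key=lambda f: tchebycheff(f, w, ideal))
            for w in weights]
```

In a full MOEA/D loop, new candidates for a subproblem are produced by a variation step that draws on neighbouring subproblems. That step is the slot where a hybrid variant could plug in ant colony optimization or a probabilistic model in place of genetic operators.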

    Characterizing model uncertainty in ensemble learning


    New Approaches for Data-mining and Classification of Mental Disorder in Brain Imaging Data

    Brain imaging data are incredibly complex, and new information is being learned as approaches to mine these data are developed. In addition to studying the healthy brain, new approaches are needed for using these data to provide information about complex mental illnesses such as schizophrenia. Functional magnetic resonance imaging (fMRI) and magnetoencephalography (MEG) are two well-known neuroimaging approaches that provide complementary information, and both produce a huge amount of data that are not easily modelled. Currently, diagnosis of mental disorders is based on patients' self-reported experiences and observed behavior over the longitudinal course of the illness. There is great interest in identifying biologically based markers of illness, rather than relying on symptoms, which are a very indirect manifestation of the illness. The hope is that biological markers will lead to earlier diagnosis and improved treatment as well as reduced costs. Understanding mental disorders is a challenging task due to the complexity of brain structure and function, overlapping features between disorders, small numbers of data sets for training, heterogeneity within disorders, and a very large amount of high-dimensional data. This doctoral work proposes machine learning and data mining based algorithms to detect abnormal functional network connectivity patterns of patients with schizophrenia and distinguish them from healthy controls using 1) independent components obtained from task-related fMRI data, 2) functional network correlations based on resting-state and a hierarchy of tasks, and 3) functional network correlations in both fMRI and MEG data. The abnormal activation patterns of the functional network correlation of patients are characterized using a statistical analysis and then used as an input to classification algorithms. 
The framework presented in this doctoral study is able to achieve good characterization of schizophrenia and provides an initial step towards designing an objective biological marker-based diagnostic test for schizophrenia. The methods we develop can also help us to more fully leverage available imaging technology in order to better understand the mystery of the human brain, the most complex organ in the human body.

    Study on probabilistic model building genetic network programming

    Degree system: new; Report number: Kō No. 3776; Degree type: Doctor of Engineering; Date conferred: 2013/3/15; Waseda University diploma number: Shin 6149. Waseda University

    Model-driven analysis of gene expression control

    During this PhD, I worked on three different aspects in the broad field of experimental and theoretical analysis of gene regulation. The first part, "Quantifying the strength of miRNA-target interactions", addresses the problem of predicting mRNA targets of miRNAs. I show that biochemical measurements of miRNA-mRNA interactions can be used to optimise the parameter inference of a pre-existing model of miRNA target prediction. This model, named MIRZA, predicts miRNA-mRNA binding using 25 energy parameters that describe the miRNA-mRNA hybrid structure, with 2 base pairing parameters for the AU and GC pairs, 3 configuration parameters for the symmetric and asymmetric loops, and 21 positional parameters for the 21 nucleotides of the miRNA sequence. MIRZA was built to infer these parameters from Argonaute protein CLIP data, which captures potential targets of miRNAs. Upon the publication of precise measurements of chemical kinetic constants of miRNA-mRNA binding interactions between an mRNA target and a set of systematically mutated miRNA sequences, we reasoned that such data could be used to improve the parameter inference of the MIRZA model. After showing that the prediction of the existing model on the set of measured miRNA-mRNA pairs shows high correlation with the binding energy calculated from the measurements, I used simulations as a proof of principle of the inference procedure and to design measurements that would be needed to infer the parameters of the MIRZA model. Staying in the field of miRNA, in "Single cell mRNA profiling reveals the hierarchical response of miRNA targets to miRNA induction", I developed an approach to infer miRNA targets based on scRNA-seq data from cells that express the miRNA at different levels. 
A miRNA can target several hundred different mRNAs and is present in the cell in limited quantities, implying that the interaction of a target mRNA with a specific miRNA depends on its concentration and on the interactions of the miRNA with its other targets. In other words, since miRNA binding is exclusive, mRNA targets compete for the same miRNA pool. Therefore, the concentrations of the thereby coupled mRNAs depend not only on the miRNA concentration but also on the concentration of every competing mRNA that is targeted by the same miRNA. To study this, HEK 293 cell lines were constructed to inducibly express a miRNA (hsa-miR-199a) as well as the mRNA encoding a green fluorescent protein. Expressed from the same promoter as the miRNA, this mRNA allows the monitoring of the miRNA concentration. The study aimed not only to determine the parameters of individual miRNA-mRNA interactions, but also to assess the degree to which mRNAs act in a competitive manner to influence each other's expression. scRNA-seq was chosen to bring the resolution needed to reach these goals. The effect of the miRNA on a bound target is to increase its decay rate, hence the expression levels of the targets depend on the miRNA concentration and their binding energy. To gain insight into the target binding energy, we constructed a model considering the mRNA transcription rate, the miRNA-mRNA binding/unbinding rate, the mRNA decay rates in the bound and unbound state, and the free/bound concentration of miRNA. We showed that the model can be factored in terms of the miRNA concentrations in individual cells and the miRNA-mRNA target interaction parameters, and we solved the model to obtain estimates of miRNA-mRNA interaction parameters, which we showed explain the mRNA levels in cells more accurately than the sequence-based computationally predicted interaction energies. 
Finally, in "Bayesian inference of the gene expression states from single-cell RNA-seq data" I carried out fundamental technical work on the normalisation of count data obtained in scRNA-seq experiments. As introduced above, multiple strategies have been developed with the aim of reducing the high level of noise present in such data, and estimating a 'true' biological state of expression for each gene in each cell. While the project aimed to reconstruct the Waddington landscape of regulator activity based on the single cell gene expression measurements, at the start of the project we realised that there is no satisfactory solution to gene expression normalisation in single cells in the literature. Thus, we tackled this problem with a Bayesian model, considering each gene independently and inferring a posterior probability of gene expression in each cell. Our model assumes a log-normal distribution of gene expression across cells and additional Poisson noise caused by the stochastic process of gene expression and the sampling process introduced by the mRNA capture in experimental protocols. These normalised gene expression values are the basis of a motif-activity response based approach for inferring the activity of TFs and miRNAs in individual cells, and for reconstructing the underlying landscape. The application of this normalisation algorithm to reconstruct a landscape is presented in the last part, "Realizing Waddington’s metaphor: Inferring regulatory landscapes from single-cell gene expression data". There I present the mathematical principles needed to formally define a landscape following the idea of Waddington from 1957, and I propose two applications of the landscape. First, I show that it defines cell types as local minima, and second, in the case of cells undergoing differentiation, I show how the landscape can be used to find developmental paths and the transcription factors associated with the differentiation process.
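
The two noise sources named in the last part (log-normal variation of expression across cells, followed by Poisson sampling of captured transcripts) can be made concrete with a small generative simulation for a single gene. This is a sketch of the assumed data-generating process only, not the thesis's inference procedure. The parameterization, including the hypothetical `capture_rate` scaling factor, is an assumption made for illustration.

```python
import math
import random


def sample_poisson(lam, rng):
    """Knuth's multiplication method; adequate for the small rates used here."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def simulate_gene_counts(mean_log, sd_log, capture_rate, n_cells, seed=0):
    """Simulate (true expression, observed count) pairs for one gene.

    True expression is log-normal across cells; the observed count is a
    Poisson draw whose rate is the expression scaled by capture_rate.
    """
    rng = random.Random(seed)
    cells = []
    for _ in range(n_cells):
        expression = rng.lognormvariate(mean_log, sd_log)
        count = sample_poisson(capture_rate * expression, rng)
        cells.append((expression, count))
    return cells
```

Normalisation is the inverse problem: given only the counts, infer a posterior over the unobserved expression value in each cell.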