181 research outputs found
MATEDA: A suite of EDA programs in Matlab
This paper describes MATEDA-2.0, a suite of programs in Matlab for
estimation of distribution algorithms. The package allows the optimization of single and multi-objective problems with estimation of distribution
algorithms (EDAs) based on undirected graphical models and Bayesian
networks. The implementation is conceived for allowing the incorporation
by the user of different combinations of selection, learning, sampling, and
local search procedures. Other included methods allow the analysis of the
structures learned by the probabilistic models, the visualization of particular features of these structures and the use of the probabilistic models
as fitness modeling tools
A review on probabilistic graphical models in evolutionary computation
Thanks to their inherent properties, probabilistic graphical models are one of the prime candidates for machine learning and decision making tasks especially in uncertain domains. Their capabilities, like representation, inference and learning, if used effectively, can greatly help to build intelligent systems that are able to act accordingly in different problem domains. Evolutionary algorithms is one such discipline that has employed probabilistic graphical models to improve the search for optimal solutions in complex problems. This paper shows how probabilistic graphical models have been used in evolutionary algorithms to improve their performance in solving complex problems. Specifically, we give a survey of probabilistic model building-based evolutionary algorithms, called estimation of distribution algorithms, and compare different methods for probabilistic modeling in these algorithms
Distributed Low-rank Subspace Segmentation
Vision problems ranging from image clustering to motion segmentation to
semi-supervised learning can naturally be framed as subspace segmentation
problems, in which one aims to recover multiple low-dimensional subspaces from
noisy and corrupted input data. Low-Rank Representation (LRR), a convex
formulation of the subspace segmentation problem, is provably and empirically
accurate on small problems but does not scale to the massive sizes of modern
vision datasets. Moreover, past work aimed at scaling up low-rank matrix
factorization is not applicable to LRR given its non-decomposable constraints.
In this work, we propose a novel divide-and-conquer algorithm for large-scale
subspace segmentation that can cope with LRR's non-decomposable constraints and
maintains LRR's strong recovery guarantees. This has immediate implications for
the scalability of subspace segmentation, which we demonstrate on a benchmark
face recognition dataset and in simulations. We then introduce novel
applications of LRR-based subspace segmentation to large-scale semi-supervised
learning for multimedia event detection, concept detection, and image tagging.
In each case, we obtain state-of-the-art results and order-of-magnitude speed
ups
New methods for generating populations in Markov network based EDAs: Decimation strategies and model-based template recombination
Methods for generating a new population are a fundamental component of estimation of distribution algorithms (EDAs). They serve to transfer the information contained in the probabilistic model to the new generated population. In EDAs based on Markov networks, methods for generating new populations usually discard information contained in the model to gain in efficiency. Other methods like Gibbs sampling use information about all interactions in the model but are computationally very costly. In this paper we propose new methods for generating new solutions in EDAs based on Markov networks. We introduce approaches based on inference methods for computing the most probable configurations and model-based template recombination. We show that the application of different variants of inference methods can increase the EDAs’ convergence rate and reduce the number of function evaluations needed to find the optimum of binary and non-binary discrete functions
Adaptive Graph via Multiple Kernel Learning for Nonnegative Matrix Factorization
Nonnegative Matrix Factorization (NMF) has been continuously evolving in
several areas like pattern recognition and information retrieval methods. It
factorizes a matrix into a product of 2 low-rank non-negative matrices that
will define parts-based, and linear representation of nonnegative data.
Recently, Graph regularized NMF (GrNMF) is proposed to find a compact
representation,which uncovers the hidden semantics and simultaneously respects
the intrinsic geometric structure. In GNMF, an affinity graph is constructed
from the original data space to encode the geometrical information. In this
paper, we propose a novel idea which engages a Multiple Kernel Learning
approach into refining the graph structure that reflects the factorization of
the matrix and the new data space. The GrNMF is improved by utilizing the graph
refined by the kernel learning, and then a novel kernel learning method is
introduced under the GrNMF framework. Our approach shows encouraging results of
the proposed algorithm in comparison to the state-of-the-art clustering
algorithms like NMF, GrNMF, SVD etc.Comment: This paper has been withdrawn by the author due to the terrible
writin
Adaptive semi-supervised affinity propagation clustering algorithm based on structural similarity
Uzimajući u obzir nezadovoljavajuće djelovanje grupiranja srodnog širenja algoritma grupiranja, kada se radi o nizovima podataka složenih struktura, u ovom se radu predlaže prilagodljivi nadzirani algoritam grupiranja srodnog širenja utemeljen na strukturnoj sličnosti (SAAP-SS). Najprije se predlaže nova strukturna sličnost rješavanjem nelinearnog problema zastupljenosti niskoga ranga. Zatim slijedi srodno širenje na temelju podešavanja matrice sličnosti primjenom poznatih udvojenih ograničenja. Na kraju se u postupak algoritma uvodi ideja eksplozija kod vatrometa. Prilagodljivo pretražujući preferencijalni prostor u dva smjera, uravnotežuju se globalne i lokalne pretraživačke sposobnosti algoritma u cilju pronalaženja optimalne strukture grupiranja. Rezultati eksperimenata i sa sintetičkim i s realnim nizovima podataka pokazuju poboljšanja u radu predloženog algoritma u usporedbi s AP, FEO-SAP i K-means metodama.In view of the unsatisfying clustering effect of affinity propagation (AP) clustering algorithm when dealing with data sets of complex structures, an adaptive semi-supervised affinity propagation clustering algorithm based on structural similarity (SAAP-SS) is proposed in this paper. First, a novel structural similarity is proposed by solving a non-linear, low-rank representation problem. Then we perform affinity propagation on the basis of adjusting the similarity matrix by utilizing the known pairwise constraints. Finally, the idea of fireworks explosion is introduced into the process of the algorithm. By adaptively searching the preference space bi-directionally, the algorithm’s global and local searching abilities are balanced in order to find the optimal clustering structure. The results of the experiments with both synthetic and real data sets show performance improvements of the proposed algorithm compared with AP, FEO-SAP and K-means methods
From timeout-based to item-by-item analysis : investigating methodologies for splitting user sessions originated from shared accounts in online platforms
Although some content providers register stream data from its users and can track their profile style for content recommendation, when two or more users share a same account, their true profile activity is obfuscated and fuzzed. This user behavior hinders the recommender systems from providers, moreover, the growing concerns on user privacy poses a risk to current models that rely on unconcealed user identity. This work proposes a way of classifying users’ stream data trough sessions, based only on its media content, opening the possibility for breaking a same account profile within multiple user profiles and consequently identifying this activity. In this work dimensionality reduction and clustering methods are used to classify user stream data into sessions that correspond to each respective user profile. Experiments show that the event-driven nature of news content can challenge the construction of a session splitting method based exclusively on content-type without user profiling.Embora as provedoras de conteúdos registram dados de acessos de seus usuários e consigam analisar seus perfis para recomendações de conteúdo, quando duas ou mais pessoas compartilham da mesma conta a atividade e perfil original e individual de cada usuário é obfuscada e difusa por essas contas compartilhadas. Este comportamento confunde os sistemas de recomendação existentes, além disso, o aumento da preocupação com a privacidade dos usuários coloca em risco os modelos atuais que são dependentes de reconhecimento explícito dos usuários. Este trabalho propõe uma maneira de classificar o fluxo de dados dos usuários em sessões baseando-se apenas em seu conteúdo, abrindo portas para quebrar a mesma conta em múltiplos perfis de usuários e consequentemente identificando esta atividade. Neste trabalho técnicas de redução de dimensionalidade e métodos de clusterização são utilizados para classificar o fluxo de dados em sessões que correspondem respectivamente a cada perfil de usuário. Experimentos mostram que a natureza guiada a eventos dos conteúdos de notícias tornam desafiador a construção de um método de quebra de sessões exclusivamente baseado em categorização de conteúdo sem perfilização de usuário
- …