555 research outputs found

    Efficient mining of maximal biclusters in mixed-attribute datasets

    This paper presents a novel enumerative biclustering algorithm to directly mine all maximal biclusters in mixed-attribute datasets (containing both numerical and categorical attributes), with or without missing values. The proposal is an extension of RIn-Close_CVC, which was originally conceived to mine perfect or perturbed biclusters with constant values on columns solely from numerical datasets without missing values. Even endowed with additional and more general features, the extended RIn-Close_CVC retains four key properties: (1) efficiency, (2) completeness, (3) correctness, and (4) non-redundancy. Our proposal is the first to deal with mixed-attribute datasets without requiring any pre-processing step, such as discretization or itemization of real-valued attributes. This is a decisive aspect, because discretization and itemization imply a priori decisions, with information loss and no clear control over the consequences. On the other hand, although an individual threshold must be specified a priori for each numerical attribute, indicating the required internal consistency per attribute, each threshold is applied during the construction of the biclusters, thus conforming to the peculiarities of the data distribution. We also explore the strong connection between biclustering and frequent pattern mining to (1) provide filters that select a compact bicluster set exhibiting high relevance and low redundancy, and (2) in the case of labeled datasets, automatically present the biclusters in a user-friendly and intuitive form, by means of quantitative class association rules. Our experimental results show that the biclusters yield a parsimonious set of relevant rules, providing useful and interpretable models for five mixed-attribute labeled datasets.
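
    As a rough illustration of how per-attribute thresholds can drive consistency checks during bicluster construction, the sketch below tests whether a candidate submatrix is internally consistent, applying a user-defined threshold to each numerical attribute and exact equality to each categorical attribute. The function name, the policy of ignoring missing values, and the equality rule for categorical values are illustrative assumptions, not the algorithm's actual implementation.

    # Illustrative sketch (not the RIn-Close_CVC implementation): checks whether
    # the rows of a candidate bicluster are mutually consistent on each selected
    # attribute, using a per-attribute threshold for numerical columns and exact
    # equality for categorical ones. Missing values are simply ignored here,
    # which is only one of several possible policies.
    import pandas as pd

    def is_consistent(data: pd.DataFrame, rows, cols, eps: dict) -> bool:
        for col in cols:
            values = data.loc[rows, col].dropna()      # ignore missing entries
            if values.empty:
                continue
            if pd.api.types.is_numeric_dtype(values):
                # Constant values on columns, up to the attribute's own threshold
                if values.max() - values.min() > eps.get(col, 0.0):
                    return False
            else:
                # Categorical attribute: all present values must coincide
                if values.nunique() > 1:
                    return False
        return True

    # Toy usage with one numerical and one categorical attribute
    df = pd.DataFrame({"age": [30, 32, 31, 55], "job": ["nurse", "nurse", "nurse", "clerk"]})
    print(is_consistent(df, rows=[0, 1, 2], cols=["age", "job"], eps={"age": 3.0}))  # True
    print(is_consistent(df, rows=[0, 1, 3], cols=["age", "job"], eps={"age": 3.0}))  # False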

    RIn-Close_CVC2: an even more efficient enumerative algorithm for biclustering of numerical datasets

    RIn-Close_CVC is an efficient (it takes polynomial time per bicluster), complete (it finds all maximal biclusters), correct (all biclusters satisfy the user-defined level of consistency), and non-redundant (all the obtained biclusters are maximal and the same bicluster is not enumerated more than once) enumerative algorithm for mining maximal biclusters with constant values on columns in numerical datasets. Although RIn-Close_CVC has all these outstanding properties, it has a high computational cost in terms of memory usage, because it must keep a symbol table in memory to prevent a maximal bicluster from being found more than once. In this paper, we propose a new version of RIn-Close_CVC, named RIn-Close_CVC2, that does not use a symbol table to prevent redundant biclusters, while keeping all four properties. We also prove that both algorithms actually possess these properties. Experiments are carried out with synthetic and real-world datasets to compare RIn-Close_CVC and RIn-Close_CVC2 in terms of memory usage and runtime. The experimental results show that RIn-Close_CVC2 brings a large reduction in memory usage and, on average, a significant runtime gain when compared to its predecessor.
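
    To make the memory issue concrete, here is a minimal sketch of the kind of symbol-table bookkeeping the abstract refers to: every maximal bicluster found is keyed by its row set, and candidates whose row set was already registered are discarded. The data structures and the reporting function are hypothetical; they only illustrate why storing one entry per enumerated bicluster can dominate memory usage, which is precisely the cost RIn-Close_CVC2 avoids.

    # Hypothetical sketch of symbol-table-based duplicate suppression (details
    # are illustrative, not the actual RIn-Close_CVC implementation). Each
    # maximal bicluster is identified by its row set; re-deriving the same row
    # set from a different branch of the search is detected and skipped, at the
    # price of keeping every key in memory.
    seen = set()  # one entry per enumerated maximal bicluster

    def report_if_new(rows: frozenset, cols: frozenset) -> bool:
        """Return True (and record the bicluster) only on its first discovery."""
        if rows in seen:
            return False          # already enumerated via another branch
        seen.add(rows)
        print(f"bicluster: rows={sorted(rows)}, cols={sorted(cols)}")
        return True

    report_if_new(frozenset({1, 4, 7}), frozenset({"a", "c"}))   # reported
    report_if_new(frozenset({1, 4, 7}), frozenset({"a", "c"}))   # suppressed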

    Necessary and Sufficient Conditions for Surrogate Functions of Pareto Frontiers and Their Synthesis Using Gaussian Processes

    This paper introduces the necessary and sufficient conditions that surrogate functions must satisfy to properly define frontiers of non-dominated solutions in multi-objective optimization problems. These new conditions work directly on the objective space, thus being agnostic about how the solutions are evaluated. Therefore, real objectives or user-designed objective surrogates are allowed, opening the possibility of linking independent objective surrogates. To illustrate the practical consequences of adopting the proposed conditions, we use Gaussian processes as surrogates endowed with monotonicity soft constraints and an adjustable degree of flexibility, and compare them to regular Gaussian processes and to the frontier surrogate method in the literature that is closest to the one proposed in this paper. Results show that the necessary and sufficient conditions proposed here are finely managed by the constrained Gaussian process, leading to high-quality surrogates capable of suitably synthesizing an approximation to the Pareto frontier in challenging instances of multi-objective optimization, while an existing approach that does not take the proposed theory into consideration produces surrogates that greatly violate the conditions required to describe a valid frontier.
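
    As a rough, bi-objective illustration of what "properly defining a frontier of non-dominated solutions" demands, the sketch below samples a surrogate curve f2 = g(f1) and verifies that no sampled point dominates another (minimization assumed), which in particular requires g to decrease over the sampled range. This is only a plausible reading of the conditions for illustration; the paper's exact formulation is not reproduced in the abstract.

    # Illustrative check (not the paper's formal conditions): in a bi-objective
    # minimization problem, points sampled from a frontier surrogate f2 = g(f1)
    # should be mutually non-dominated.
    import numpy as np

    def mutually_non_dominated(points: np.ndarray) -> bool:
        """True if no point dominates another (minimization, strict dominance)."""
        for p in points:
            for q in points:
                if np.all(p <= q) and np.any(p < q):
                    return False
        return True

    f1 = np.linspace(0.0, 1.0, 50)
    valid = np.column_stack([f1, 1.0 - f1])          # decreasing surrogate
    invalid = np.column_stack([f1, 0.5 + 0.0 * f1])  # flat surrogate: dominated points
    print(mutually_non_dominated(valid))    # True
    print(mutually_non_dominated(invalid))  # False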

    Multi-Objective Optimization for Self-Adjusting Weighted Gradient in Machine Learning Tasks

    Much of the focus in machine learning research is placed on creating new architectures and optimization methods, but the overall loss function is seldom questioned. This paper interprets machine learning from a multi-objective optimization perspective, showing the limitations of the default linear combination of loss functions over a data set and introducing the hypervolume indicator as an alternative. It is shown that the gradient of the hypervolume is defined by a self-adjusting weighted mean of the individual loss gradients, making it similar to the gradient of a weighted mean loss but without requiring the weights to be defined a priori. This enables an inner boosting-like behavior, where the current model is used to automatically place higher weights on samples with higher losses, but without requiring the use of multiple models. Results on a denoising autoencoder show that the new formulation is able to achieve a better mean loss than the direct optimization of the mean loss, providing evidence for the conjecture that self-adjusting the weights creates a smoother loss surface.
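
    For intuition, consider the simplifying case (an assumption here, not necessarily the paper's exact construction) in which the hypervolume of a single model with per-sample losses l_1, ..., l_N and reference point \eta > \max_i l_i reduces to the product of the gaps:

        H(\theta) = \prod_{i=1}^{N} \bigl(\eta - l_i(\theta)\bigr), \qquad \nabla_\theta H = -H(\theta) \sum_{i=1}^{N} \frac{\nabla_\theta l_i(\theta)}{\eta - l_i(\theta)}.

    Ascending H therefore descends a weighted combination of the individual loss gradients with weights proportional to 1/(\eta - l_i): samples with larger losses automatically receive larger weights, with no weights fixed a priori.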

    MONISE - Many Objective Non-Inferior Set Estimation

    This work proposes a novel many-objective optimization approach that globally finds a set of efficient solutions, also known as Pareto-optimal solutions, by automatically formulating and solving a sequence of weighted problems. The approach is called MONISE (Many-Objective NISE), because it represents an extension of the well-known non-inferior set estimation (NISE) algorithm, which was originally conceived to deal with two-dimensional objective spaces. For theoretical support, we demonstrate that being a solution of the weighted problem is a necessary condition, and that it is also a sufficient condition on the convex hull of the feasible set. The proposal is conceived to operate in more than two dimensions, thus properly supporting many objectives. Moreover, when specifically dealing with two objectives, some additional properties of the estimated non-inferior set are established. Experimental results are used to validate the proposal and indicate that MONISE is competitive both in terms of computational cost and in the overall quality of the non-inferior set, as measured by the hypervolume.
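
    To make the "sequence of weighted problems" concrete in the simplest (two-objective) case, the sketch below follows the classical NISE recursion that MONISE generalizes: solve the two single-objective extremes, then, for each pair of adjacent solutions in objective space, pick a weight vector normal to the segment connecting them and solve the corresponding weighted problem. The toy objectives, the solver, and the stopping rule are illustrative assumptions.

    # Illustrative two-objective NISE-style recursion (MONISE extends this idea
    # to many objectives); the toy objectives and the tolerance are assumptions.
    from scipy.optimize import minimize_scalar

    f = [lambda x: x ** 2, lambda x: (x - 2.0) ** 2]   # toy convex objectives

    def solve_weighted(w):
        """Solve min_x  w[0]*f1(x) + w[1]*f2(x) and return the objective vector."""
        res = minimize_scalar(lambda x: w[0] * f[0](x) + w[1] * f[1](x))
        return (f[0](res.x), f[1](res.x))

    # Extreme points of the frontier: each objective optimized alone.
    frontier = [solve_weighted((1.0, 0.0)), solve_weighted((0.0, 1.0))]
    frontier.sort()

    # Recursively refine: the weight vector is normal to the segment joining two
    # adjacent non-inferior points (classical NISE rule for two objectives).
    segments = [(frontier[0], frontier[1])]
    while segments:
        a, b = segments.pop()
        w = (abs(a[1] - b[1]), abs(a[0] - b[0]))       # normal direction, positive weights
        c = solve_weighted(w)
        # Keep refining only while the new solution is distinct from both endpoints.
        if min(abs(c[0] - a[0]), abs(c[0] - b[0])) > 0.05:
            frontier.append(c)
            segments += [(a, c), (c, b)]

    print(sorted(frontier))   # approximation of the Pareto frontier in objective space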

    Single-Solution Hypervolume Maximization and its use for Improving Generalization of Neural Networks

    This paper introduces hypervolume maximization with a single solution as an alternative to mean loss minimization. The relationship between the two problems is proved through bounds on the cost function when an optimal solution to one of the problems is evaluated on the other, with a hyperparameter to control the similarity between the two problems. This same hyperparameter allows higher weight to be placed on samples with higher loss when computing the hypervolume's gradient, whose normalized version can range from the mean loss to the max loss. An experiment on MNIST with a neural network is used to validate the theory developed, showing that hypervolume maximization can behave similarly to mean loss minimization and can also provide better performance, resulting in a 20% reduction of the classification error on the test set.
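
    A small numeric sketch of the role of the hyperparameter, assuming the single-solution hypervolume is the product of the gaps eta - l_i (an illustrative reading; the paper's exact construction may differ): the normalized gradient weight of sample i is then proportional to 1/(eta - l_i), so a large eta recovers nearly uniform weights (mean loss), while eta close to the maximum loss concentrates almost all the weight on the worst sample (max loss).

    # Illustrative numeric check of how the reference point eta interpolates the
    # hypervolume gradient weights between the mean loss and the max loss
    # (product-of-gaps reading of the single-solution hypervolume; an assumption).
    import numpy as np

    losses = np.array([0.2, 0.5, 0.9])          # per-sample losses, max = 0.9

    def weights(eta: float) -> np.ndarray:
        w = 1.0 / (eta - losses)                # gradient weight of each sample
        return w / w.sum()                      # normalized weights

    print(weights(100.0))   # ~[0.33, 0.33, 0.33]: close to uniform -> mean loss
    print(weights(0.91))    # ~[0.01, 0.02, 0.96]: dominated by worst sample -> max loss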

    Hybrid Algorithm for Multi-Objective Optimization by Greedy Hypervolume Maximization

    This paper introduces a high-performance hybrid algorithm, called the Hybrid Hypervolume Maximization Algorithm (H2MA), for multi-objective optimization that alternates between exploring the decision space and exploiting the already obtained non-dominated solutions. The proposal is centered on maximizing the hypervolume indicator, thus converting the multi-objective problem into a single-objective one. The exploitation employs gradient-based methods, but considers a single candidate efficient solution at a time, to overcome limitations associated with population-based approaches and also to allow easy control of the number of solutions provided. There is an interchange between two steps. The first step is a deterministic local exploration, endowed with an automatic procedure to detect stagnation. When stagnation is detected, the search switches to a second step characterized by a stochastic global exploration using an evolutionary algorithm. Using five ZDT benchmarks with 30 variables, the performance of the new algorithm is compared to state-of-the-art algorithms for multi-objective optimization, more specifically NSGA-II, SPEA2, and SMS-EMOA. The solutions found by H2MA lead to a higher hypervolume and a smaller distance to the true Pareto frontier with significantly fewer function evaluations, even when the gradient is estimated numerically. Furthermore, although only continuous decision spaces have been considered here, discrete decision spaces could also be handled by replacing the gradient-based search with hill-climbing. Finally, a thorough explanation is provided to support the expressive gain in performance that was achieved.
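
    A bare-bones sketch of the alternation the abstract describes, for a toy bi-objective problem: one candidate at a time is improved by numerically estimated gradient ascent on its hypervolume contribution, and stagnation triggers a switch to a stochastic restart that stands in for the evolutionary global step. Everything concrete here (problem, reference point, step size, stagnation test, the restart itself) is an assumption for illustration, not the H2MA implementation.

    # Toy sketch of gradient-based hypervolume exploitation with a stagnation
    # switch (illustrative only; H2MA's actual operators are not reproduced here).
    import numpy as np

    rng = np.random.default_rng(0)
    REF = np.array([2.5, 2.5])                       # hypervolume reference point (assumed)

    def objectives(x):                               # toy convex bi-objective problem
        return np.array([x[0] ** 2 + x[1] ** 2, (x[0] - 1) ** 2 + (x[1] - 1) ** 2])

    def hypervolume(points):                         # exact 2-D hypervolume, minimization
        pts = sorted(tuple(p) for p in points if np.all(p < REF))
        hv, prev = 0.0, REF[1]
        for f1, f2 in pts:                           # sweep in increasing f1
            hv += (REF[0] - f1) * (prev - min(f2, prev))
            prev = min(f2, prev)
        return hv

    def contribution(x, archive):                    # hypervolume gained by adding x
        return hypervolume(archive + [objectives(x)]) - hypervolume(archive)

    archive, x = [], rng.uniform(0.0, 1.0, size=2)
    for _ in range(5):                               # build five solutions, one at a time
        stagnant = 0
        for _ in range(300):                         # deterministic local exploitation
            grad = np.zeros(2)
            for d in range(2):                       # numerical gradient of the contribution
                e = np.zeros(2); e[d] = 1e-4
                grad[d] = (contribution(x + e, archive) - contribution(x - e, archive)) / 2e-4
            if contribution(x + 0.05 * grad, archive) > contribution(x, archive) + 1e-9:
                x, stagnant = x + 0.05 * grad, 0     # improving step: accept and reset counter
            else:
                stagnant += 1
            if stagnant >= 20:                       # stagnation detected: stop exploiting
                break
        archive.append(objectives(x))
        x = rng.uniform(0.0, 1.0, size=2)            # stochastic restart, standing in for the
                                                     # evolutionary global exploration of H2MA
    print(f"hypervolume of the 5 solutions found: {hypervolume(archive):.3f}")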

    Reducing the Training Time of Neural Networks by Partitioning

    This paper presents a new method for pre-training neural networks that can decrease the total training time while maintaining the final performance, which motivates its use on deep neural networks. By partitioning the training task into multiple training subtasks with sub-models, which can be performed independently and in parallel, it is shown that the size of the sub-models decreases almost quadratically with the number of subtasks created, quickly scaling down the sub-models used for the pre-training. The sub-models are then merged to provide a pre-trained initial set of weights for the original model. The proposed method is independent of the other aspects of the training, such as the architecture of the neural network, the training method, and the objective, making it compatible with a wide range of existing approaches. The speedup without loss of performance is validated experimentally on the MNIST and CIFAR10 data sets, also showing that even performing the subtasks sequentially can decrease the training time. Moreover, we show that larger models may present higher speedups, and we conjecture about the benefits of the method in distributed learning systems.
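
    The "almost quadratic" reduction can be illustrated with simple parameter counting, under the assumption (ours, purely for illustration; the paper's exact partitioning scheme may differ) that each hidden layer and the output of a two-hidden-layer network are split evenly among the k sub-models, so the hidden-to-hidden weight blocks shrink by a factor of roughly k in each dimension.

    # Back-of-the-envelope parameter counting for the "almost quadratic" claim,
    # under an assumed partitioning scheme (hidden layers and outputs split
    # evenly among k sub-models). When hidden-to-hidden weights dominate, the
    # per-sub-model count approaches 1/k**2 of the full model.
    def params(n_in, h1, h2, n_out):
        return n_in * h1 + h1 * h2 + h2 * n_out      # weights of a 2-hidden-layer MLP

    full = params(784, 4096, 4096, 10)
    for k in (2, 4, 8):
        sub = params(784, 4096 // k, 4096 // k, max(10 // k, 1))
        print(f"k={k}: one sub-model holds {sub / full:.1%} of the full model's weights")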

    Enumerating all maximal biclusters in numerical datasets

    Biclustering has proved to be a powerful data analysis technique due to its wide success in various application domains. However, the existing literature presents efficient solutions only for enumerating maximal biclusters with constant values, or heuristic-based approaches that cannot find all biclusters or even guarantee the maximality of the obtained biclusters. Here, we present a general family of biclustering algorithms for enumerating all maximal biclusters with (i) constant values on rows, (ii) constant values on columns, or (iii) coherent values. Versions for perfect and for perturbed biclusters are provided. Our algorithms have four key properties (only the algorithm for perturbed biclusters with coherent values fails to exhibit the first property): they are (1) efficient (taking polynomial time per pattern), (2) complete (finding all maximal biclusters), (3) correct (all biclusters satisfy the user-defined measure of similarity), and (4) non-redundant (all the obtained biclusters are maximal and the same bicluster is not enumerated twice). They are based on a generalization of an efficient formal concept analysis algorithm called In-Close2. Experimental results point to the necessity of having efficient enumerative biclustering algorithms and provide valuable insight into the scalability of our family of algorithms and its sensitivity to user-defined parameters.
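
    For reference, here is a small sketch of the three value models in the perturbed (epsilon-tolerant) setting, using standard definitions of constant rows, constant columns, and coherent (additive) values; the checks are illustrative rather than the algorithms' actual routines.

    # Illustrative checks for the three bicluster value models (perturbed case);
    # B is the submatrix induced by the bicluster's rows and columns.
    import numpy as np

    def constant_rows(B, eps):      # every row is (nearly) constant
        return np.all(B.max(axis=1) - B.min(axis=1) <= eps)

    def constant_columns(B, eps):   # every column is (nearly) constant
        return np.all(B.max(axis=0) - B.min(axis=0) <= eps)

    def coherent_values(B, eps):    # additive model: b_ij ~ row_i + col_j
        R = B - B.mean(axis=1, keepdims=True) - B.mean(axis=0, keepdims=True) + B.mean()
        return np.all(np.abs(R) <= eps)   # residuals of the additive fit stay within eps

    B = np.array([[1.0, 4.0, 6.0],
                  [3.0, 6.0, 8.0]])       # rows differ by a constant shift
    print(constant_rows(B, 0.1), constant_columns(B, 0.1), coherent_values(B, 0.1))
    # -> False False True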

    Online Social Network Analysis: A Survey of Research Applications in Computer Science

    The emergence and popularization of online social networks suddenly made available a large amount of data about social organization, interaction, and human behavior. All this information opens new perspectives and challenges for the study of social systems, being of interest to many fields. Although most online social networks are recent (less than fifteen years old), a vast number of scientific papers has already been published on this topic, covering a broad range of analytical methods and applications. This work describes how computational research has approached this subject and the methods used to analyze such systems. Based on a wide though non-exhaustive review of the literature, a taxonomy is proposed to classify and describe different categories of research. Each research category is described, and the main works, discoveries, and perspectives are highlighted.