Efficient mining of maximal biclusters in mixed-attribute datasets
This paper presents a novel enumerative biclustering algorithm to directly
mine all maximal biclusters in mixed-attribute datasets (containing both
numerical and categorical attributes), with or without missing values. The
proposal is an extension of RIn-Close_CVC, which was originally conceived to
mine perfect or perturbed biclusters with constant values on columns solely
from numerical datasets, and without missing values. Even endowed with
additional and more general features, the extended RIn-Close_CVC retains four
key properties: (1) efficiency, (2) completeness, (3) correctness, and (4)
non-redundancy. Our proposal is the first one to deal with mixed-attribute
datasets without requiring any pre-processing step, such as discretization and
itemization of real-valued attributes. This is a decisive aspect, because
discretization and itemization imply a priori decisions, with information loss and no clear control over the consequences. On the other hand, even though an individual threshold indicating the required internal consistency must be specified a priori for each numerical attribute, each threshold is applied during the construction of the biclusters themselves, thus adapting to the peculiarities of the data distribution. We also explore the strong connection
between biclustering and frequent pattern mining to (1) provide filters to
select a compact bicluster set that exhibits high relevance and low redundancy,
and (2) in the case of labeled datasets, automatically present the biclusters
in a user-friendly and intuitive form, by means of quantitative class
association rules. Our experimental results showed that the biclusters yield a
parsimonious set of relevant rules, providing useful and interpretable models
for five mixed-attribute labeled datasets.
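A minimal sketch of this per-attribute consistency test, with hypothetical names and a plain list-of-lists data layout in which missing values are None (the enumeration itself is not shown):

```python
def column_is_consistent(values, is_categorical, eps):
    """Internal consistency of one attribute over the candidate rows of a
    bicluster; missing values (None) are simply ignored."""
    observed = [v for v in values if v is not None]
    if not observed:
        return True  # only missing values: nothing to contradict
    if is_categorical:
        return len(set(observed)) == 1           # a single category
    return max(observed) - min(observed) <= eps  # spread within threshold


def is_valid_bicluster(data, rows, cols, categorical, eps_by_col):
    """A (rows, cols) submatrix is accepted when every selected column is
    consistent under its own criterion (categorical equality or the
    per-attribute numerical threshold)."""
    return all(
        column_is_consistent([data[r][c] for r in rows],
                             categorical[c], eps_by_col.get(c, 0.0))
        for c in cols
    )
```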
RIn-Close_CVC2: an even more efficient enumerative algorithm for biclustering of numerical datasets
RIn-Close_CVC is an efficient (it takes polynomial time per bicluster), complete (it finds all maximal biclusters), correct (all biclusters satisfy the user-defined level of consistency) and non-redundant (all the obtained biclusters are maximal and the same bicluster is not enumerated more than once) enumerative algorithm for mining maximal biclusters with constant values on columns in numerical datasets. Although RIn-Close_CVC has all these outstanding properties, it has a high computational cost in terms of memory usage, because it must keep a symbol table in memory to prevent a maximal bicluster from being found more than once. In this paper, we propose a new version of RIn-Close_CVC, named RIn-Close_CVC2, that does not use a symbol table to prevent redundant biclusters, while keeping all four properties. We also prove that these
algorithms actually possess these properties. Experiments are carried out with
synthetic and real-world datasets to compare RIn-Close_CVC and RIn-Close_CVC2
in terms of memory usage and runtime. The experimental results show that
RIn-Close_CVC2 brings a large reduction in memory usage and, on average, a significant runtime gain when compared to its predecessor.
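For illustration, one standard way enumerators of the In-Close family avoid a symbol table is a canonicity test, where each concept is generated only from the branch of its smallest admissible attribute; the sketch below shows that generic test and is not necessarily the exact mechanism adopted by RIn-Close_CVC2:

```python
def is_canonical(extent, intent, j, has_attribute):
    """Generic canonicity test in the style of In-Close-type enumerators:
    after closing with attribute j, the concept is kept only if no earlier
    attribute k < j outside the current intent also covers the whole extent;
    otherwise the very same concept is reachable through that earlier branch
    and would be a duplicate."""
    for k in range(j):
        if k in intent:
            continue
        if all(has_attribute(r, k) for r in extent):
            return False  # duplicate: owned by an earlier branch
    return True
```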
Necessary and Sufficient Conditions for Surrogate Functions of Pareto Frontiers and Their Synthesis Using Gaussian Processes
This paper introduces the necessary and sufficient conditions that surrogate
functions must satisfy to properly define frontiers of non-dominated solutions
in multi-objective optimization problems. These new conditions work directly on
the objective space, thus being agnostic about how the solutions are evaluated.
Therefore, real objectives or user-designed objectives' surrogates are allowed,
opening the possibility of linking independent objective surrogates. To
illustrate the practical consequences of adopting the proposed conditions, we
use Gaussian processes as surrogates endowed with monotonicity soft constraints
and with an adjustable degree of flexibility, and compare them to regular
Gaussian processes and to a frontier surrogate method in the literature that is
the closest to the method proposed in this paper. Results show that the
necessary and sufficient conditions proposed here are properly handled by the constrained Gaussian process, leading to high-quality surrogates capable of suitably synthesizing an approximation to the Pareto frontier in challenging instances of multi-objective optimization, while an existing approach that does not take the proposed theory into consideration produces surrogates that greatly violate the conditions required to describe a valid frontier.
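As a minimal illustration of one such condition for two objectives to be minimized, a frontier surrogate f2 = g(f1) can only describe mutually non-dominated points if g is non-increasing; the hypothetical helper below measures how often a candidate surrogate breaks this on a grid (the paper's full set of conditions is richer than this single check):

```python
import numpy as np

def monotonicity_violation_rate(f1_grid, surrogate):
    """For two objectives to be minimized, a frontier surrogate f2 = g(f1)
    can only describe mutually non-dominated points if g is non-increasing.
    Returns the fraction of adjacent grid points where this is violated."""
    f2 = np.asarray([surrogate(x) for x in f1_grid])
    return float(np.mean(np.diff(f2) > 0.0))

# a surrogate that increases somewhere cannot describe a valid frontier
print(monotonicity_violation_rate(np.linspace(0.0, 1.0, 50),
                                  lambda x: (x - 0.5) ** 2))
```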
Multi-Objective Optimization for Self-Adjusting Weighted Gradient in Machine Learning Tasks
Much of the focus in machine learning research is placed on creating new
architectures and optimization methods, but the overall loss function is seldom
questioned. This paper interprets machine learning from a multi-objective
optimization perspective, showing the limitations of the default linear
combination of loss functions over a data set and introducing the hypervolume
indicator as an alternative. It is shown that the gradient of the hypervolume
is defined by a self-adjusting weighted mean of the individual loss gradients,
making it similar to the gradient of a weighted mean loss but without requiring
the weights to be defined a priori. This enables an inner boosting-like
behavior, where the current model is used to automatically place higher weights
on samples with higher losses but without requiring the use of multiple models.
Results on a denoising autoencoder show that the new formulation is able to
achieve better mean loss than the direct optimization of the mean loss,
providing evidence to the conjecture that self-adjusting the weights creates a
smoother loss surface.
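A minimal sketch of these self-adjusting weights, assuming the hypervolume over per-sample losses takes the single-reference-point product form H = prod_i (mu - l_i) with mu larger than every loss; under that assumption the log-hypervolume gradient weighs each sample by 1/(mu - l_i), so harder samples automatically receive larger weights:

```python
import numpy as np

def self_adjusting_weights(losses, mu):
    """Weights induced by the gradient of log H with H = prod_i (mu - l_i)
    and reference point mu > max(losses):
        d log H / d theta = - sum_i w_i * d l_i / d theta,  w_i ~ 1 / (mu - l_i),
    so samples with larger losses automatically receive larger weights."""
    losses = np.asarray(losses, dtype=float)
    raw = 1.0 / (mu - losses)
    return raw / raw.sum()

# toy illustration: the hardest sample gets the largest weight
print(self_adjusting_weights([0.1, 0.5, 0.9], mu=1.0))
```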
MONISE - Many Objective Non-Inferior Set Estimation
This work proposes a novel many objective optimization approach that globally
finds a set of efficient solutions, also known as Pareto-optimal solutions, by
automatically formulating and solving a sequence of weighted problems. The
approach is called MONISE (Many-Objective NISE), because it represents an
extension of the well-known non-inferior set estimation (NISE) algorithm, which
was originally conceived to deal with two-dimensional objective spaces. Looking
for theoretical support, we demonstrate that being a solution of the weighted
problem is a necessary condition, and it will also be a sufficient condition at
the convex hull of the feasible set. The proposal is conceived to operate in
more than two dimensions, thus properly supporting many objectives. Moreover, when specifically dealing with two objectives, some nice additional properties are presented for the estimated non-inferior set. Experimental results are used to
validate the proposal and have indicated that MONISE is competitive both in
terms of computational cost and considering the overall quality of the
non-inferior set, measured by the hypervolume.
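For the two-objective case that MONISE extends, the classic NISE step can be sketched as follows (hypothetical helper; the many-objective weight selection performed by MONISE is more involved and not shown):

```python
import numpy as np

def next_weights_2d(za, zb):
    """Bi-objective NISE step: the weight vector that values the two adjacent
    non-inferior points equally (w . za == w . zb, w >= 0, sum(w) == 1).
    Minimizing the corresponding weighted sum either finds a new non-inferior
    point between za and zb or certifies that none exists on the convex hull."""
    za, zb = np.asarray(za, float), np.asarray(zb, float)
    w = np.array([abs(za[1] - zb[1]), abs(za[0] - zb[0])])
    return w / w.sum()

print(next_weights_2d([0.0, 1.0], [1.0, 0.0]))  # -> [0.5 0.5]
```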
Single-Solution Hypervolume Maximization and its use for Improving Generalization of Neural Networks
This paper introduces the hypervolume maximization with a single solution as
an alternative to the mean loss minimization. The relationship between the two
problems is proved through bounds on the cost function when an optimal solution
to one of the problems is evaluated on the other, with a hyperparameter to
control the similarity between the two problems. This same hyperparameter
allows higher weight to be placed on samples with higher loss when computing
the hypervolume's gradient, whose normalized version can range from the mean
loss to the max loss. An experiment on MNIST with a neural network is used to
validate the theory developed, showing that the hypervolume maximization can
behave similarly to the mean loss minimization and can also provide better
performance, resulting in a 20% reduction of the classification error on the test set.
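A small numerical illustration of that range, assuming the same product-form hypervolume as above and that the reference point mu is the hyperparameter controlling the trade-off:

```python
import numpy as np

def hv_weights(losses, mu):
    """Normalized weights from the single-solution log-hypervolume gradient,
    assuming H = prod_i (mu - l_i) with reference point mu > max(losses)."""
    losses = np.asarray(losses, dtype=float)
    w = 1.0 / (mu - losses)
    return w / w.sum()

losses = [0.2, 0.4, 0.9]
print(hv_weights(losses, mu=100.0))  # large mu: nearly uniform -> mean loss
print(hv_weights(losses, mu=0.91))   # mu near max loss: concentrated -> max loss
```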
Hybrid Algorithm for Multi-Objective Optimization by Greedy Hypervolume Maximization
This paper introduces a high-performance hybrid algorithm, called Hybrid
Hypervolume Maximization Algorithm (H2MA), for multi-objective optimization
that alternates between exploring the decision space and exploiting the already
obtained non-dominated solutions. The proposal is centered on maximizing the
hypervolume indicator, thus converting the multi-objective problem into a
single-objective one. The exploitation employs gradient-based methods, but
considering a single candidate efficient solution at a time, to overcome
limitations associated with population-based approaches and also to allow an
easy control of the number of solutions provided. There is an interchange
between two steps. The first step is a deterministic local exploration, endowed
with an automatic procedure to detect stagnation. When stagnation is detected,
the search is switched to a second step characterized by a stochastic global
exploration using an evolutionary algorithm. Using five ZDT benchmarks with 30
variables, the performance of the new algorithm is compared to state-of-the-art
algorithms for multi-objective optimization, more specifically NSGA-II, SPEA2,
and SMS-EMOA. The solutions found by H2MA lead to higher hypervolume and smaller distance to the true Pareto frontier with significantly fewer function evaluations, even when the gradient is estimated numerically. Furthermore,
although only continuous decision spaces have been considered here, discrete
decision spaces could also have been treated, replacing gradient-based search
by hill-climbing. Finally, a thorough explanation is provided to account for the significant gain in performance that was achieved.
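For reference, the indicator being maximized can be sketched for two objectives as follows (a standard sweep with hypothetical names; the gradient-based exploitation, stagnation detection, and evolutionary exploration of H2MA are not reproduced here):

```python
import numpy as np

def hypervolume_2d(points, ref):
    """Hypervolume dominated by a set of 2-objective (minimization) points
    with respect to a reference point, computed by a standard sweep."""
    pts = np.asarray(points, float)
    pts = pts[np.all(pts < ref, axis=1)]      # keep points inside the box
    if len(pts) == 0:
        return 0.0
    pts = pts[np.argsort(pts[:, 0])]          # sweep along the first objective
    hv, best_f2 = 0.0, ref[1]
    for f1, f2 in pts:
        if f2 < best_f2:                      # non-dominated so far
            hv += (ref[0] - f1) * (best_f2 - f2)
            best_f2 = f2
    return hv

print(hypervolume_2d([[1, 3], [2, 2], [3, 1]], ref=[4, 4]))  # -> 6.0
```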
Reducing the Training Time of Neural Networks by Partitioning
This paper presents a new method for pre-training neural networks that can
decrease the total training time for a neural network while maintaining the
final performance, which motivates its use on deep neural networks. By
partitioning the training task in multiple training subtasks with sub-models,
which can be performed independently and in parallel, it is shown that the size
of the sub-models reduces almost quadratically with the number of subtasks
created, quickly scaling down the sub-models used for the pre-training. The
sub-models are then merged to provide a pre-trained initial set of weights for
the original model. The proposed method is independent of the other aspects of
the training, such as architecture of the neural network, training method, and
objective, making it compatible with a wide range of existing approaches. The
speedup without loss of performance is validated experimentally on MNIST and on
CIFAR10 data sets, also showing that even performing the subtasks sequentially
can decrease the training time. Moreover, we show that larger models may
present higher speedups and conjecture about the benefits of the method in
distributed learning systems.
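One plausible reading of the merge step, offered only as a sketch under assumed details (each subtask trains a slice of a single hidden layer on the full data, and the slices are stacked to initialize the original model; the paper's actual partitioning and merging may differ):

```python
import numpy as np

def merge_pretrained_parts(parts):
    """Stack independently pre-trained hidden-layer slices into one model
    (hypothetical merge rule, not necessarily the paper's exact procedure).
    parts: list of (W1_k, W2_k), W1_k of shape (n_in, h_k), W2_k of (h_k, n_out)."""
    W1 = np.concatenate([w1 for w1, _ in parts], axis=1)   # (n_in, sum h_k)
    W2 = np.concatenate([w2 for _, w2 in parts], axis=0)   # (sum h_k, n_out)
    # averaging the output blocks keeps the merged pre-activation on the same
    # scale as each sub-model produced on its own
    return W1, W2 / len(parts)

rng = np.random.default_rng(0)
parts = [(rng.normal(size=(784, 8)), rng.normal(size=(8, 10))) for _ in range(4)]
W1, W2 = merge_pretrained_parts(parts)   # 4 sub-models of 8 units -> 32 units
```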
Enumerating all maximal biclusters in numerical datasets
Biclustering has proved to be a powerful data analysis technique due to its
wide success in various application domains. However, the existing literature
presents efficient solutions only for enumerating maximal biclusters with
constant values, or heuristic-based approaches which cannot find all biclusters or even guarantee the maximality of the obtained biclusters. Here, we
present a general family of biclustering algorithms for enumerating all maximal
biclusters with (i) constant values on rows, (ii) constant values on columns,
or (iii) coherent values. Versions for perfect and for perturbed biclusters are
provided. Our algorithms have four key properties (just the algorithm for
perturbed biclusters with coherent values fails to exhibit the first property):
they are (1) efficient (take polynomial time per pattern), (2) complete (find
all maximal biclusters), (3) correct (all biclusters satisfy the user-defined measure of similarity), and (4) non-redundant (all the obtained biclusters are
maximal and the same bicluster is not enumerated twice). They are based on a
generalization of an efficient formal concept analysis algorithm called
In-Close2. Experimental results point to the necessity of having efficient
enumerative biclustering algorithms and provide a valuable insight into the
scalability of our family of algorithms and its sensitivity to user-defined
parameters.
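Two of these consistency criteria written out as minimal checks on a numerical submatrix sub (common maximum-spread and maximum-residual formulations; the exact user-defined measure of similarity in the paper may differ):

```python
import numpy as np

def is_cvr_bicluster(sub, eps):
    """Perturbed 'constant values on rows': the spread within each row of the
    candidate submatrix must not exceed eps ('constant values on columns' is
    the symmetric check over columns)."""
    return bool(np.all(sub.max(axis=1) - sub.min(axis=1) <= eps))

def is_coherent_bicluster(sub, eps):
    """Perturbed 'coherent values' (additive model a_ij ~ mu + alpha_i + beta_j):
    after removing row and column effects, residuals must stay within eps."""
    residual = (sub - sub.mean(axis=1, keepdims=True)
                    - sub.mean(axis=0, keepdims=True) + sub.mean())
    return bool(np.abs(residual).max() <= eps)
```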
Online Social Network Analysis: A Survey of Research Applications in Computer Science
The emergence and popularization of online social networks suddenly made
available a large amount of data on social organization, interaction, and
human behavior. All this information opens new perspectives and challenges to
the study of social systems, being of interest to many fields. Although most
online social networks are recent (less than fifteen years old), a vast number of scientific papers has already been published on this topic, dealing with a broad
range of analytical methods and applications. This work describes how
computational research has approached this subject and the methods used to analyze such systems. Based on a wide though non-exhaustive review of the
literature, a taxonomy is proposed to classify and describe different
categories of research. Each research category is described and the main works,
discoveries and perspectives are highlighted.