1,162 research outputs found
Knowledge management overview of feature selection problem in high-dimensional financial data: Cooperative co-evolution and Map Reduce perspectives
The term big data characterizes the massive amounts of data generated by advanced technologies in different domains, using the 4Vs (volume, velocity, variety, and veracity) to indicate the amount of data that can only be processed via computationally intensive analysis, the speed of their creation, the different types of data, and their accuracy. High-dimensional financial data, such as time-series and space-time data, contain a large number of features (variables) but a small number of samples, and are used to measure various real-time business situations for financial organizations. Such datasets are normally noisy, and complex correlations may exist between their features; many domains, including finance, lack the analytic tools to mine the data for knowledge discovery because of the high dimensionality. Feature selection is an optimization problem: find a minimal subset of relevant features that maximizes classification accuracy and reduces computation. Traditional statistical feature selection approaches are not adequate to deal with the curse of dimensionality associated with big data. Cooperative co-evolution, a meta-heuristic algorithm and a divide-and-conquer approach, decomposes high-dimensional problems into smaller sub-problems. Further, MapReduce, a programming model, offers a ready-to-use distributed, scalable, and fault-tolerant infrastructure for parallelizing the developed algorithm. This article presents a knowledge management overview of evolutionary feature selection approaches, state-of-the-art cooperative co-evolution and MapReduce-based feature selection techniques, and future research directions.
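The decomposition idea in this abstract can be illustrated with a minimal sketch. It is not the article's method: the interleaved grouping, the mutation scheme, the parameter names, and the toy fitness (which rewards a hypothetical set of relevant features and penalizes subset size, standing in for classifier accuracy) are all assumptions made for illustration. Each feature group evolves its own population of partial masks, and a partial mask is scored by combining it with the current best partial masks of the other groups.

```python
import random

RELEVANT = {0, 3, 5}  # hypothetical "truly relevant" feature indices

def toy_fitness(mask):
    # Stand-in for classification accuracy: reward relevant features,
    # penalize the size of the selected subset.
    hits = sum(1 for i in RELEVANT if mask[i])
    return hits - 0.1 * sum(mask)

def assemble(groups, parts, n_features):
    # Merge one partial mask per group into a full feature mask.
    mask = [False] * n_features
    for group, part in zip(groups, parts):
        for idx, bit in zip(group, part):
            mask[idx] = bit
    return mask

def cooperative_coevolution(n_features, fitness, n_groups=2,
                            pop_size=6, generations=20, seed=0):
    rng = random.Random(seed)
    # Static decomposition: interleave feature indices across groups.
    groups = [list(range(g, n_features, n_groups)) for g in range(n_groups)]
    # One population of partial masks per group.
    pops = [[[rng.random() < 0.5 for _ in grp] for _ in range(pop_size)]
            for grp in groups]
    # Representative (collaborator) of each group, used to score the others.
    reps = [pop[0] for pop in pops]

    def score(g, part):
        # Evaluate a partial mask in the context of the other representatives.
        trial = reps[:g] + [part] + reps[g + 1:]
        return fitness(assemble(groups, trial, n_features))

    for _ in range(generations):
        for g, grp in enumerate(groups):
            # Bit-flip mutation, on average one flip per partial mask.
            children = [[(not b) if rng.random() < 1.0 / len(grp) else b
                         for b in p] for p in pops[g]]
            ranked = sorted(pops[g] + children,
                            key=lambda p: score(g, p), reverse=True)
            pops[g] = ranked[:pop_size]
            reps[g] = ranked[0]

    best = assemble(groups, reps, n_features)
    return best, fitness(best)

best_mask, best_fit = cooperative_coevolution(8, toy_fitness)
print(best_mask, best_fit)
```

Because each group's current representative always stays in its population, the fitness of the assembled mask never decreases from one update to the next. In a MapReduce setting the per-group evaluations would be the natural unit to distribute, but that part is not sketched here.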
Fast Distributed Approximation for Max-Cut
Finding a maximum cut is a fundamental task in many computational settings.
Surprisingly, it has been insufficiently studied in the classic distributed
settings, where vertices communicate by synchronously sending messages to their
neighbors according to the underlying graph, known as the LOCAL or CONGEST
models. We amend this by obtaining almost optimal
algorithms for Max-Cut on a wide class of graphs in these models. In
particular, for any ε > 0, we develop randomized approximation
algorithms achieving a ratio of (1 − ε) to the optimum for Max-Cut on
bipartite graphs in the CONGEST model, and on general graphs in the LOCAL
model.
We further present efficient deterministic algorithms, including a
1/3-approximation for Max-Dicut in our models, thus improving the best known
(randomized) ratio of 1/4. Our algorithms make non-trivial use of the greedy
approach of Buchbinder et al. (SIAM Journal on Computing, 2015) for maximizing
an unconstrained (non-monotone) submodular function, which may be of
independent interest.
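For orientation, here is a minimal sequential sketch of the deterministic double-greedy approach of Buchbinder et al. for unconstrained submodular maximization, applied to the directed cut function. It is not the distributed algorithm of this paper: there is no message passing and no LOCAL or CONGEST round structure, and the function names and edge-list representation are assumptions made for illustration. Edge weights are assumed non-negative, so the cut function is a non-negative submodular function.

```python
def dicut_value(edges, s):
    # Total weight of directed edges (u, v, w) leaving the set s.
    return sum(w for (u, v, w) in edges if u in s and v not in s)

def double_greedy_dicut(vertices, edges):
    # x grows from the empty set, y shrinks from the full vertex set;
    # after every vertex has been processed the two sets coincide.
    x = set()
    y = set(vertices)
    for u in vertices:
        gain_add = dicut_value(edges, x | {u}) - dicut_value(edges, x)
        gain_remove = dicut_value(edges, y - {u}) - dicut_value(edges, y)
        if gain_add >= gain_remove:
            x.add(u)        # keep u on the source side of the cut
        else:
            y.discard(u)    # drop u from the source side
    return x

# Directed 3-cycle with unit weights.
cycle = [(0, 1, 1.0), (1, 2, 1.0), (2, 0, 1.0)]
side = double_greedy_dicut([0, 1, 2], cycle)
print(side, dicut_value(cycle, side))
```

The deterministic variant carries a 1/3 guarantee for non-negative submodular functions; recomputing the cut value from scratch at every step keeps the sketch short at the cost of efficiency.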
Investigating Decision Support Techniques for Automating Cloud Service Selection
The range of Cloud infrastructure services is expanding steadily, leaving users
in the agony of choice. To select the best mix of service offerings
from an abundance of possibilities, users must consider complex dependencies
and heterogeneous sets of criteria. Therefore, we present a PhD thesis proposal
on investigating an intelligent decision support system for selecting
Cloud-based infrastructure services (e.g. storage, network, CPU).
Comment: Accepted by IEEE Cloudcom 2012 - PhD consortium track