Knowledge management overview of feature selection problem in high-dimensional financial data: Cooperative co-evolution and Map Reduce perspectives

Author: Bazlur Rashid A. N. M.
Choudhury Tonmoy
Publication venue: Edith Cowan University, Research Online, Perth, Western Australia
Publication date: 01/01/2019

The term big data characterizes the massive amounts of data generated by advanced technologies in different domains, using the 4Vs (volume, velocity, variety, and veracity) to indicate the amount of data that can only be processed via computationally intensive analysis, the speed of their creation, the different types of data, and their accuracy. High-dimensional financial data, such as time-series and space-time data, contain a large number of features (variables) while having a small number of samples, which are used to measure various real-time business situations for financial organizations. Such datasets are normally noisy, and complex correlations may exist between their features. Many domains, including finance, lack the analytic tools to mine the data for knowledge discovery because of the high dimensionality. Feature selection is an optimization problem to find a minimal subset of relevant features that maximizes the classification accuracy and reduces the computations. Traditional statistical-based feature selection approaches are not adequate to deal with the curse of dimensionality associated with big data. Cooperative co-evolution, a meta-heuristic algorithm and a divide-and-conquer approach, decomposes high-dimensional problems into smaller sub-problems. Further, MapReduce, a programming model, offers a ready-to-use distributed, scalable, and fault-tolerant infrastructure for parallelizing the developed algorithm. This article presents a knowledge management overview of evolutionary feature selection approaches, state-of-the-art cooperative co-evolution and MapReduce-based feature selection techniques, and future research directions.
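To make the decomposition idea in this abstract concrete, the following is a minimal sketch and not the article's method. It assumes a static round-robin split of features into groups, reduces each subpopulation to a single individual improved by bit-flip mutation, and uses a hypothetical `toy_fitness` (reward for two assumed relevant features minus a size penalty) in place of real classification accuracy. The names `split_features`, `cooperative_coevolution`, `toy_fitness`, and `RELEVANT` are all illustrative.

```python
import random
from typing import Callable, List

Mask = List[int]  # 1 = feature selected, 0 = feature dropped


def split_features(n_features: int, n_groups: int) -> List[List[int]]:
    """Static decomposition: deal feature indices round-robin into groups."""
    return [list(range(g, n_features, n_groups)) for g in range(n_groups)]


def cooperative_coevolution(
    n_features: int,
    n_groups: int,
    fitness: Callable[[Mask], float],
    generations: int = 50,
    seed: int = 0,
) -> Mask:
    """Optimize one feature group at a time against a shared context mask."""
    rng = random.Random(seed)
    groups = split_features(n_features, n_groups)
    context = [1] * n_features  # start with every feature selected
    best = fitness(context)
    for _ in range(generations):
        for group in groups:
            if not group:
                continue  # more groups than features: nothing to evolve
            candidate = context[:]
            # Mutate only this group's bits; the other groups' bits act as
            # collaborators taken from the current context mask.
            for i in group:
                if rng.random() < 1.0 / len(group):
                    candidate[i] ^= 1
            score = fitness(candidate)
            if score > best:  # keep the change only if the full mask improves
                context, best = candidate, score
    return context


# Hypothetical stand-in for "accuracy minus a cost per selected feature".
RELEVANT = {0, 5}


def toy_fitness(mask: Mask) -> float:
    hits = sum(mask[i] for i in RELEVANT)
    return hits - 0.1 * sum(mask)


selected = cooperative_coevolution(n_features=8, n_groups=2, fitness=toy_fitness)
```

Because a candidate replaces the context only when the full mask scores strictly higher, the returned mask never scores below the all-features starting point. A realistic setup would evolve a population per group and evaluate fitness with a trained classifier.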

Research Online @ ECU

The data cyclotron : juggling data and queries for a data warehouse audience

Author: Pereira Goncalves R.A. (Romulo Antonio)
Publication date: 22/03/2013

CWI's Institutional Repository

Fast Distributed Approximation for Max-Cut

Author: C Lenzen
F Barahona
F Hadlock
F Kuhn
G Xue
J Håstad
K Chang
KW Chin
L Trevisan
L Wang
M Elkin
M Ghaffari
M Grötschel
M Åstrand
MR Garey
MX Goemans
N Buchbinder
N Linial
S Khot
S Matuura
S Sahni
S Saurabh
U Feige
Y Xu
Publication date: 26/07/2017

Finding a maximum cut is a fundamental task in many computational settings. Surprisingly, it has been insufficiently studied in the classic distributed settings, where vertices communicate by synchronously sending messages to their neighbors according to the underlying graph, known as the $\mathcal{LOCAL}$ and $\mathcal{CONGEST}$ models. We amend this by obtaining almost optimal algorithms for Max-Cut on a wide class of graphs in these models. In particular, for any $\epsilon > 0$, we develop randomized approximation algorithms achieving a ratio of $(1-\epsilon)$ to the optimum for Max-Cut on bipartite graphs in the $\mathcal{CONGEST}$ model, and on general graphs in the $\mathcal{LOCAL}$ model. We further present efficient deterministic algorithms, including a $1/3$-approximation for Max-Dicut in our models, thus improving the best known (randomized) ratio of $1/4$. Our algorithms make non-trivial use of the greedy approach of Buchbinder et al. (SIAM Journal on Computing, 2015) for maximizing an unconstrained (non-monotone) submodular function, which may be of independent interest.
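As a point of reference for the greedy approach mentioned in this abstract, here is a minimal sequential sketch of the deterministic double greedy of Buchbinder et al., applied to a directed cut function. It is not the distributed algorithm of the paper. The function names `dicut_value` and `double_greedy` and the example graph are assumptions made for illustration. The deterministic variant is known to return a set worth at least 1/3 of the optimum for any non-negative submodular function, and the directed cut function is one such function.

```python
from typing import Callable, Hashable, List, Sequence, Set, Tuple

Arc = Tuple[Hashable, Hashable]


def dicut_value(arcs: Sequence[Arc], side: Set[Hashable]) -> int:
    """Number of arcs (u, v) leaving `side`: u inside, v outside."""
    return sum(1 for u, v in arcs if u in side and v not in side)


def double_greedy(
    ground: List[Hashable], f: Callable[[Set[Hashable]], float]
) -> Set[Hashable]:
    """Deterministic double greedy for unconstrained submodular maximization."""
    x: Set[Hashable] = set()        # grows from the empty set
    y: Set[Hashable] = set(ground)  # shrinks from the full ground set
    for u in ground:
        gain_add = f(x | {u}) - f(x)     # marginal gain of adding u to x
        gain_remove = f(y - {u}) - f(y)  # marginal gain of dropping u from y
        if gain_add >= gain_remove:
            x.add(u)
        else:
            y.remove(u)
    return x  # x and y coincide once every element has been decided


# Example: a directed 4-cycle 0 -> 1 -> 2 -> 3 -> 0, whose best dicut has value 2.
arcs = [(0, 1), (1, 2), (2, 3), (3, 0)]
side = double_greedy([0, 1, 2, 3], lambda s: dicut_value(arcs, s))
value = dicut_value(arcs, side)
```

On this small cycle the sketch selects `{0, 2}` and cuts two arcs, which happens to be optimal; in general only the 1/3 guarantee holds.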

arXiv.org e-Print Archive

Crossref

Investigating Decision Support Techniques for Automating Cloud Service Selection

Author: Georgakopoulos Dimitrios
Haller Armin
Ranjan Rajiv
Strazdins Peter
Zhang Miranda
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2012

The compass of Cloud infrastructure services advances steadily, leaving users in the agony of choice. To be able to select the best mix of service offerings from an abundance of possibilities, users must consider complex dependencies and heterogeneous sets of criteria. Therefore, we present a PhD thesis proposal on investigating an intelligent decision support system for selecting Cloud-based infrastructure services (e.g. storage, network, CPU). Comment: Accepted by IEEE Cloudcom 2012 - PhD consortium track

arXiv.org e-Print Archive

RMIT Research Repository