Intelligent Management and Efficient Operation of Big Data
This chapter details how Big Data can be used and implemented in networking
and computing infrastructures. Specifically, it addresses three main aspects:
the timely extraction of relevant knowledge from heterogeneous and often
unstructured large data sources; the enhancement of the performance of the
processing and networking (cloud) infrastructures that are the most important
foundational pillars of Big Data applications and services; and novel ways to
efficiently manage network infrastructures with high-level composed policies
that support the transmission of large amounts of data with distinct
requirements (video vs. non-video). A case study involving an intelligent
management solution to route data traffic with diverse requirements in a wide
area Internet Exchange Point is presented, discussed in the context of Big
Data, and evaluated.
Comment: In book Handbook of Research on Trends and Future Directions in Big
Data and Web Intelligence, IGI Global, 201
DPVis: Visual Analytics with Hidden Markov Models for Disease Progression Pathways
Clinical researchers use disease progression models to understand patient
status and characterize progression patterns from longitudinal health records.
One approach for disease progression modeling is to describe patient status
using a small number of states that represent distinctive distributions over a
set of observed measures. Hidden Markov models (HMMs) and their variants are a
class of models that both discover these states and infer health states for
patients. Although these algorithms are well suited to discovering interesting
patterns, it remains challenging for medical experts to interpret model
outputs, understand complex modeling parameters, and make clinical sense of
the patterns. To tackle these problems, we conducted a design study with
clinical scientists, statisticians, and visualization experts, with the goal
of investigating disease progression pathways of chronic
diseases, namely type 1 diabetes (T1D), Huntington's disease, Parkinson's
disease, and chronic obstructive pulmonary disease (COPD). As a result, we
introduce DPVis, which seamlessly integrates model parameters and outcomes of
HMMs into interpretable and interactive visualizations. In this study, we
demonstrate that DPVis is successful in evaluating disease progression models,
visually summarizing disease states, interactively exploring disease
progression patterns, and building, analyzing, and comparing clinically
relevant patient subgroups.
Comment: To appear in IEEE Transactions on Visualization and Computer Graphic
Data Mining and Machine Learning in Astronomy
We review the current state of data mining and machine learning in astronomy.
'Data Mining' can have a somewhat mixed connotation from the point of view of a
researcher in this field. If used correctly, it can be a powerful approach,
holding the potential to fully exploit the exponentially increasing amount of
available data, promising great scientific advance. However, if misused, it can
be little more than the black-box application of complex computing algorithms
that may give little physical insight, and provide questionable results. Here,
we give an overview of the entire data mining process, from data collection
through to the interpretation of results. We cover common machine learning
algorithms, such as artificial neural networks and support vector machines;
applications across a broad range of astronomy, emphasizing those where data
mining techniques directly resulted in improved science; and important current
and future directions, including probability density functions, parallel
algorithms, petascale computing, and the time domain. We conclude that, so long
as one carefully selects an appropriate algorithm, and is guided by the
astronomical problem at hand, data mining can be very much the powerful tool,
and not the questionable black box.
Comment: Published in IJMPD. 61 pages, uses ws-ijmpd.cls. Several extra
figures, some minor additions to the tex
Bi-Objective Nonnegative Matrix Factorization: Linear Versus Kernel-Based Models
Nonnegative matrix factorization (NMF) is a powerful class of feature
extraction techniques that has been successfully applied in many fields, notably
in signal and image processing. Current NMF techniques have been limited to a
single-objective problem in either its linear or nonlinear kernel-based
formulation. In this paper, we propose to revisit NMF as a multi-objective
problem, in particular a bi-objective one, where the objective functions
defined in both input and feature spaces are taken into account. By taking
advantage of the weighted-sum method from the multi-objective optimization
literature, the proposed bi-objective NMF determines a set of nondominated
(Pareto-optimal) solutions instead of a single optimal decomposition. Moreover,
the corresponding Pareto front is studied and approximated. Experimental
results on unmixing real hyperspectral images confirm the efficiency of the
proposed bi-objective NMF compared with the state-of-the-art methods.
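The weighted-sum idea mentioned in the abstract above can be illustrated on a much simpler problem than NMF. The sketch below is a toy under stated assumptions: the two objectives `f1` and `f2`, the candidate grid, and the weight values are all hypothetical stand-ins for the input-space and feature-space objectives. It scalarizes the two objectives with a weight `a`, solves each scalarized problem by exhaustive search, and keeps the nondominated results as an approximation of the Pareto front.

```python
def weighted_sum_front(f1, f2, candidates, weights):
    """For each weight a, pick the candidate minimizing a*f1 + (1-a)*f2."""
    picks = []
    for a in weights:
        best = min(candidates, key=lambda x: a * f1(x) + (1 - a) * f2(x))
        picks.append((f1(best), f2(best)))
    return picks


def nondominated(points):
    """Keep points that no other point matches or beats in both objectives."""
    unique = sorted(set(points))
    return [
        p for p in unique
        if not any(q != p and q[0] <= p[0] and q[1] <= p[1] for q in unique)
    ]


# Hypothetical conflicting objectives: f1 prefers x = 0, f2 prefers x = 2.
def f1(x):
    return x ** 2


def f2(x):
    return (x - 2) ** 2


candidates = [i / 10 for i in range(31)]  # grid over [0, 3]
weights = [0.0, 0.25, 0.5, 0.75, 1.0]

front = nondominated(weighted_sum_front(f1, f2, candidates, weights))
for point in front:
    print(point)
```

Each weight yields one trade-off between the two objectives, so sweeping the weight produces a set of solutions rather than a single one. A known limitation of this scalarization is that it can only reach points on the convex part of a Pareto front.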