AAPOR Report on Big Data
In recent years, society has seen a growing volume of statistics describing different phenomena based on so-called Big Data. As this report explains, the term Big Data covers a variety of data, much of it characterized not just by large volume but also by variety and velocity, by the organic way in which it is created, and by the new types of processes needed to analyze it and make inferences from it. The changes in the nature of these new types of data, in their availability, and in the way they are collected and disseminated are fundamental, and constitute a paradigm shift for survey research. Big Data holds great potential, but some fundamental challenges must be resolved before that potential can be fully realized. In this report we give examples of different types of Big Data and their potential for survey research; we also describe the Big Data process and discuss its main challenges.
Overview of Some Intelligent Control Structures and Dedicated Algorithms
Automatic control refers to the use of a control device to make a controlled object run automatically, or hold a desired state, without human participation. The guiding idea of intelligent control is to model human reasoning and problem-solving ability in order to handle tasks that currently require human intelligence. The complexity of the controlled object can include model uncertainty, high nonlinearity, distributed sensors and actuators, dynamic mutations, multiple time scales, complex information patterns, big-data processing, and strict performance indices; the complexity of the environment manifests itself as uncertainty and as uncertain change. On this basis, research suggests that the main methods of intelligent control include expert control, fuzzy control, neural-network control, hierarchical intelligent control, anthropomorphic (human-like) intelligent control, integrated intelligent control, combined intelligent control, chaos control, and wavelet theory. Since it is difficult to cover all intelligent control methods in a single chapter, this chapter focuses on intelligent control based on fuzzy logic, intelligent control based on neural networks, expert control and human-like intelligent control, and hierarchical intelligent control and learning control, and provides relevant programs for readers to practice with.
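The fuzzy-control approach the chapter covers can be illustrated with a minimal sketch: a single-input, single-output temperature regulator with triangular membership functions, a three-rule base, and Sugeno-style weighted-average defuzzification. The membership ranges, rules, and output values here are illustrative assumptions, not taken from the chapter.

```python
def tri(x, a, b, c):
    """Triangular membership function peaking at b, zero outside (a, c)."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x < b else (c - x) / (c - b)

def fuzzy_heater_power(error):
    """Map a temperature error (setpoint - measured) to heater power in [0, 1]."""
    # Fuzzify: degrees of membership in three error sets.
    neg  = tri(error, -10.0, -5.0, 0.0)   # room too hot
    zero = tri(error,  -5.0,  0.0, 5.0)   # near setpoint
    pos  = tri(error,   0.0,  5.0, 10.0)  # room too cold
    # Rule base (singleton consequents):
    #   if error is neg  -> power 0.0
    #   if error is zero -> power 0.5
    #   if error is pos  -> power 1.0
    weights = [neg, zero, pos]
    outputs = [0.0, 0.5, 1.0]
    total = sum(weights)
    if total == 0.0:
        return 0.0  # error outside the universe of discourse: heater off
    # Defuzzify by weighted average of the rule outputs.
    return sum(w * o for w, o in zip(weights, outputs)) / total
```

Because the membership functions overlap, an intermediate error such as 2.5 fires two rules at once and yields a power level between their outputs, which is the smooth interpolation behavior that distinguishes a fuzzy controller from a bang-bang thermostat.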
Big Data Assurance Evaluation: An SLA-Based Approach.
The Big Data community has started to notice the need to complement Big Data platforms with assurance techniques that prove the correct behavior of Big Data analytics and management. In this paper, we propose a Big Data assurance solution based on Service-Level Agreements (SLAs), focusing on a platform providing Model-based Big Data Analytics-as-a-Service (MBDAaaS).
Big Data on Decision Making in Energetic Management of Copper Mining
Indexed in: Web of Science; Scopus. We propose an analysis of the variables related to energy consumption in the copper concentration process, specifically in ball mills and SAG mills. The methodology considers the analysis of large volumes of data, which makes it possible to identify the variables of interest (tonnage, temperature, and power) and to derive a plan for improving energy efficiency. Correct processing of the large volume of data from the copper milling process, after imputation of null, unreported, and out-of-range values, and its integration into a decision-support system, yields clear, online information for decision making. The results establish a correlation between the energy consumption of the ball and SAG mills and the east and west winding temperatures. However, no correlation is observed between the energy consumption of the ball mills and the SAG mills with respect to the feed tonnage of the SAG mill. From the experimental design, a similarity in behavior was determined between two groups of different mills in the process lines. In addition, a difference in energy consumption was found between mills of the same group. This approach modifies the method presented in [1]. http://www.univagora.ro/jour/index.php/ijccc/article/view/2784/106
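The cleaning-then-correlation step described above can be sketched as follows: impute missing and out-of-range sensor readings with the mean of the valid ones, then compute the Pearson correlation between mill power draw and winding temperature. The variable names, validity ranges, and readings below are hypothetical, not taken from the study.

```python
from math import sqrt

def impute(values, lo, hi):
    """Replace None or out-of-range readings with the mean of the valid ones."""
    valid = [v for v in values if v is not None and lo <= v <= hi]
    mean = sum(valid) / len(valid)
    return [v if v is not None and lo <= v <= hi else mean for v in values]

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical SAG-mill log: power in MW (valid 0-20), winding temperature in C.
power = [11.2, 11.8, None, 12.4, 13.0, 999.0, 13.9]  # None and 999.0 are bad readings
temp  = [61.0, 63.5, 64.0, 66.2, 68.0, 69.5, 71.0]

power_clean = impute(power, 0.0, 20.0)
r = pearson(power_clean, temp)  # positive r: power rises with winding temperature
```

Mean imputation is the simplest choice consistent with the abstract's description; in practice, time-series interpolation would usually be preferred for sensor logs.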
Process-oriented Iterative Multiple Alignment for Medical Process Mining
Adapted from biological sequence alignment, trace alignment is a process mining technique used to visualize and analyze workflow data. Any analysis done with this method, however, is affected by the alignment quality. The best existing trace alignment techniques use progressive guide-trees to heuristically approximate the optimal alignment in O(N²L²) time. These algorithms are heavily dependent on the selected guide-tree metric, often return sum-of-pairs-score-reducing errors that interfere with interpretation, and are computationally intensive for large datasets. To alleviate these issues, we propose process-oriented iterative multiple alignment (PIMA), which contains specialized optimizations to better handle workflow data. We demonstrate that PIMA is a flexible framework capable of achieving a better sum-of-pairs score than existing trace alignment algorithms in only O(NL²) time. We applied PIMA to the analysis of medical workflow data, showing how iterative alignment can better represent the data and facilitate the extraction of insights from data visualization.

Comment: accepted at ICDMW 201
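The pairwise building block underlying trace alignment can be sketched with a Needleman-Wunsch global alignment of two activity traces, using a simple match/mismatch/gap scoring scheme. The scores and the clinical traces below are illustrative assumptions; PIMA itself layers iterative multiple-alignment refinement on top of pairwise steps like this one.

```python
def align(a, b, match=1, mismatch=-1, gap=-1):
    """Return (score, aligned_a, aligned_b) for traces a and b; '-' marks a gap."""
    n, m = len(a), len(b)
    # Dynamic-programming score table, O(n*m) time and space.
    S = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        S[i][0] = i * gap
    for j in range(1, m + 1):
        S[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            S[i][j] = max(S[i - 1][j - 1] + sub, S[i - 1][j] + gap, S[i][j - 1] + gap)
    # Traceback from the bottom-right corner to recover the alignment.
    out_a, out_b, i, j = [], [], n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and S[i][j] == S[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and S[i][j] == S[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append('-'); i -= 1
        else:
            out_a.append('-'); out_b.append(b[j - 1]); j -= 1
    return S[n][m], out_a[::-1], out_b[::-1]

# Two hypothetical clinical workflows differing in one diagnostic step.
score, ra, rb = align(["triage", "xray", "treat", "discharge"],
                      ["triage", "lab", "treat", "discharge"])
```

Aligning N traces of length L with all-pairs steps like this is what drives the quadratic factors in the running times quoted above; the sum-of-pairs score the abstract refers to is the total of such pairwise scores over every pair of rows in the multiple alignment.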
Distributed Graph Clustering using Modularity and Map Equation
We study large-scale, distributed graph clustering. Given an undirected graph, our objective is to partition the nodes into disjoint sets called clusters. A cluster should contain many internal edges while being sparsely connected to other clusters. In the context of a social network, a cluster could be a group of friends. Modularity and map equation are established formalizations of this internally-dense-externally-sparse principle. We present two versions of a simple distributed algorithm to optimize both measures. They are based on Thrill, a distributed big data processing framework that implements an extended MapReduce model. The algorithms for the two measures, DSLM-Mod and DSLM-Map, differ only slightly, and adapting them for similar quality measures is straightforward. We conduct an extensive experimental study on real-world graphs and on synthetic benchmark graphs with up to 68 billion edges. Our algorithms are fast while detecting clusterings similar to those detected by other sequential, parallel, and distributed clustering algorithms. Compared to the distributed GossipMap algorithm, DSLM-Map needs less memory, is up to an order of magnitude faster, and achieves better quality.

Comment: 14 pages, 3 figures; v3: camera-ready for Euro-Par 2018, more details, more results; v2: extended experiments to include comparison with competing algorithms, shortened for submission to Euro-Par 201
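The modularity measure the abstract optimizes can be sketched in a few lines: Q = Σ_c (e_c/m − (d_c/2m)²), where e_c is the number of intra-cluster edges of cluster c, d_c its summed node degree, and m the total edge count. The toy graph and clustering below are illustrative; the abstract's algorithms distribute this computation over Thrill, which this sequential sketch does not attempt.

```python
from collections import defaultdict

def modularity(edges, cluster_of):
    """Modularity Q of an undirected graph, given a node -> cluster mapping."""
    m = len(edges)
    intra = defaultdict(int)   # e_c: edges with both endpoints in cluster c
    degree = defaultdict(int)  # d_c: summed degree of cluster c
    for u, v in edges:
        degree[cluster_of[u]] += 1
        degree[cluster_of[v]] += 1
        if cluster_of[u] == cluster_of[v]:
            intra[cluster_of[u]] += 1
    return sum(intra[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)

# Two triangles joined by a single bridge edge: the natural two-cluster split.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
clusters = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
q = modularity(edges, clusters)  # positive q: split is better than random
```

A positive Q means the clustering captures more internal edges than a random partition with the same degree sequence would, which is exactly the internally-dense-externally-sparse principle the abstract formalizes.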