1,686 research outputs found
Mapping Big Data into Knowledge Space with Cognitive Cyber-Infrastructure
Big data research has attracted great attention in science, technology,
industry and society. It is developing with the evolving scientific paradigm,
the fourth industrial revolution, and the transformational innovation of
technologies. However, its nature and fundamental challenge have not been
recognized, and its own methodology has not been formed. This paper explores
and answers the following questions: What is big data? What are the basic
methods for representing, managing and analyzing big data? What is the
relationship between big data and knowledge? Can we find a mapping from big
data into knowledge space? What kind of infrastructure is required to support
not only big data management and analysis but also knowledge discovery, sharing
and management? What is the relationship between big data and the science paradigm?
What are the nature and fundamental challenge of big data computing? A
multi-dimensional perspective is presented toward a methodology of big data
computing.
Comment: 59 pages
DeepMerge: Deep-Learning-Based Region-Merging for Image Segmentation
Image segmentation aims to partition an image according to the objects in the
scene and is a fundamental step in analysing very high spatial-resolution (VHR)
remote sensing imagery. Current methods struggle to effectively consider land
objects with diverse shapes and sizes. Additionally, the determination of
segmentation scale parameters frequently adheres to a static and empirical
doctrine, posing limitations on the segmentation of large-scale remote sensing
images and yielding algorithms with limited interpretability. To address the
above challenges, we propose a deep-learning-based region merging method dubbed
DeepMerge to handle the segmentation of complete objects in large VHR images by
integrating deep learning and a region adjacency graph (RAG). This is the first
method to use deep learning to learn the similarity and merge similar adjacent
super-pixels in the RAG. We propose a modified binary tree sampling method to
generate shift-scale data, serving as inputs for transformer-based deep
learning networks, a shift-scale attention with 3-dimensional relative position
embedding to learn features across scales, and an embedding to fuse learned
features with hand-crafted features. DeepMerge can achieve high segmentation
accuracy in a supervised manner from large-scale remotely sensed images and
provides an interpretable optimal scale parameter, which is validated using a
remote sensing image of 0.55 m resolution covering an area of 5,660 km^2. The
experimental results show that DeepMerge achieves the highest F value (0.9550)
and the lowest total error TE (0.0895), correctly segmenting objects of
different sizes and outperforming all competing segmentation methods.
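
The region-merging idea in the DeepMerge abstract can be illustrated with a minimal sketch. This is not the DeepMerge method: the abstract says similarity is learned by a transformer network, whereas the sketch below swaps in a plain absolute difference of mean values. The function name `merge_regions`, the `threshold` parameter (standing in for a scale parameter) and the toy inputs are all assumptions made for illustration.

```python
def merge_regions(means, sizes, edges, threshold):
    """Greedy region merging on a region adjacency graph (RAG).

    means: dict region_id -> mean feature value (hand-crafted stand-in
           for the learned similarity described in the abstract)
    sizes: dict region_id -> pixel count
    edges: iterable of (a, b) pairs of adjacent region ids
    threshold: hypothetical scale parameter; merging stops once the most
               similar adjacent pair differs by more than this value
    Returns a dict mapping every original region id to its final region id.
    """
    means, sizes = dict(means), dict(sizes)
    adj = {r: set() for r in means}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    label = {r: r for r in means}

    while True:
        # Find the most similar pair of adjacent regions.
        best = None
        for a in adj:
            for b in adj[a]:
                if a < b:
                    d = abs(means[a] - means[b])
                    if best is None or d < best[0]:
                        best = (d, a, b)
        if best is None or best[0] > threshold:
            break
        _, a, b = best

        # Merge b into a: size-weighted mean, then rewire b's neighbours to a.
        total = sizes[a] + sizes[b]
        means[a] = (means[a] * sizes[a] + means[b] * sizes[b]) / total
        sizes[a] = total
        for n in adj[b]:
            if n != a:
                adj[n].discard(b)
                adj[n].add(a)
                adj[a].add(n)
        adj[a].discard(b)
        del adj[b], means[b], sizes[b]
        for r, current in label.items():
            if current == b:
                label[r] = a
    return label


# Toy chain of four super-pixels: two dark, two bright.
labels = merge_regions(
    means={0: 10.0, 1: 11.0, 2: 50.0, 3: 52.0},
    sizes={0: 1, 1: 1, 2: 1, 3: 1},
    edges=[(0, 1), (1, 2), (2, 3)],
    threshold=5.0,
)
```

In this toy run the two dark super-pixels merge and the two bright ones merge, while the large difference between the groups keeps them apart. Raising `threshold` would eventually merge everything, which is why the choice of scale parameter matters.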
Connectionist-Symbolic Machine Intelligence using Cellular Automata based Reservoir-Hyperdimensional Computing
We introduce a novel framework of reservoir computing that is capable of
both connectionist machine intelligence and symbolic computation. A cellular
automaton is used as the reservoir of dynamical systems. Input is randomly
projected onto the initial conditions of automaton cells and nonlinear
computation is performed on the input via application of a rule in the
automaton for a period of time. The evolution of the automaton creates a
space-time volume of the automaton state space, and it is used as the
reservoir. The proposed framework is capable of long short-term memory and it
requires orders of magnitude less computation compared to Echo State Networks.
We prove that the cellular automaton reservoir holds a distributed representation
of attribute statistics, which provides a more effective computation than local
representation. It is possible to estimate the kernel for linear cellular
automata via metric learning, which enables a much more efficient distance
computation in the support vector machine framework. Also, binary reservoir feature
vectors can be combined using Boolean operations as in hyperdimensional
computing, paving a direct way for concept building and symbolic processing.
Comment: Corrected typos. Responded to some comments on Section 8. Added appendix
for details. Recurrent architecture emphasize
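
The pipeline described in the abstract above (random projection of the input onto automaton cells, evolution under a rule, use of the space-time volume as the reservoir, and Boolean combination of binary feature vectors) can be sketched as follows. This is an illustrative sketch, not the authors' implementation: the choice of elementary rule 90, the periodic boundary, and the names `ca_step` and `ca_reservoir` with their parameters are assumptions.

```python
import numpy as np


def ca_step(state, rule=90):
    """Apply one step of an elementary cellular automaton (periodic boundary)."""
    left = np.roll(state, 1)    # value of each cell's left neighbour
    right = np.roll(state, -1)  # value of each cell's right neighbour
    idx = 4 * left + 2 * state + right      # neighbourhood encoded as 0..7
    table = (rule >> np.arange(8)) & 1      # output bit of the rule per neighbourhood
    return table[idx].astype(np.uint8)


def ca_reservoir(bits, n_cells=64, steps=8, rule=90, seed=0):
    """Map a short binary input to a binary reservoir feature vector.

    The input bits are written to randomly chosen cells (a simple random
    projection), the automaton is evolved for `steps` steps, and all
    intermediate states are concatenated into one feature vector.
    """
    rng = np.random.default_rng(seed)
    positions = rng.permutation(n_cells)[: len(bits)]
    state = np.zeros(n_cells, dtype=np.uint8)
    state[positions] = np.asarray(bits, dtype=np.uint8)
    history = []
    for _ in range(steps):
        state = ca_step(state, rule)
        history.append(state)
    return np.concatenate(history)


a = [1, 0, 1, 1]
b = [0, 1, 1, 0]
fa = ca_reservoir(a)
fb = ca_reservoir(b)

# Combine two binary feature vectors with a Boolean operation (XOR),
# in the spirit of hyperdimensional computing.
bound = np.bitwise_xor(fa, fb)

# Rule 90 is linear over GF(2) (new cell = left XOR right), so the
# features of the XOR-ed input equal the XOR of the features.
f_xor = ca_reservoir([x ^ y for x, y in zip(a, b)])
```

A linear readout (for example a linear classifier) would normally be trained on these feature vectors; that step is omitted here. The linearity check only holds for linear rules such as rule 90, with the same seed used for every input.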
Visual Analysis of High-Dimensional Point Clouds using Topological Abstraction
This thesis is about visualizing a kind of data that is trivial for computers to process but difficult for humans to imagine because nature does not allow for intuition with this type of information: high-dimensional data. Such data often result from representing observations of objects under various aspects or with different properties. In many applications, a typical, laborious task is to find related objects or to group those that are similar to each other. One classic solution for this task is to imagine the data as vectors in a Euclidean space with object variables as dimensions. Utilizing Euclidean distance as a measure of similarity, objects with similar properties and values accumulate into groups, so-called clusters, that are exposed by cluster analysis on the high-dimensional point cloud. Because similar vectors can be thought of as objects that are alike in terms of their attributes, the point cloud's structure and individual cluster properties, like their size or compactness, summarize data categories and their relative importance. The contribution of this thesis is a novel analysis approach for visual exploration of high-dimensional point clouds without suffering from structural occlusion. The work is based on implementing two key concepts: The first idea is to discard those geometric properties that cannot be preserved and, thus, lead to the typical artifacts. Topological concepts are used instead to shift the focus away from a point-centered view on the data to a more structure-centered perspective. The advantage is that topology-driven clustering information can be extracted in the data's original domain and be preserved without loss in low dimensions. The second idea is to split the analysis into a topology-based global overview and a subsequent geometric local refinement.
The occlusion-free overview enables the analyst to identify features and to link them to other visualizations that permit analysis of those properties not captured by the topological abstraction, e.g. cluster shape or value distributions in particular dimensions or subspaces. The advantage of separating structure from data point analysis is that restricting local analysis only to data subsets significantly reduces artifacts and the visual complexity of standard techniques. That is, the additional topological layer enables the analyst to identify structure that was hidden before and to focus on particular features by suppressing irrelevant points during local feature analysis. This thesis addresses the topology-based visual analysis of high-dimensional point clouds for both the time-invariant and the time-varying case. Time-invariant means that the points do not change in their number or positions. That is, the analyst explores the clustering of a fixed and constant set of points. The extension to the time-varying case implies the analysis of a varying clustering, where clusters newly appear, merge or split, or vanish. Especially for high-dimensional data, both tracking, which means relating features over time, and visualizing changing structure are difficult problems to solve.
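
The idea of extracting clustering structure in the data's original domain, independent of any low-dimensional projection, can be illustrated with a small sketch. This is not the thesis's method, which the abstract does not specify in detail. The sketch swaps in one of the simplest topological summaries: the connected components of the graph that links points closer than a radius `eps`. The function name `connected_clusters` and the parameter `eps` are assumptions.

```python
import numpy as np


def connected_clusters(points, eps):
    """Label points by connected component of the eps-neighbourhood graph.

    Two points are linked if their Euclidean distance is at most `eps`.
    Components are computed in the original high-dimensional space with a
    union-find structure, so no projection artifacts can occur.
    Returns a list of cluster ids (0, 1, ...) in order of first appearance.
    """
    points = np.asarray(points, dtype=float)
    n = len(points)
    parent = list(range(n))

    def find(i):
        # Follow parent links to the root, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Pairwise Euclidean distances (fine for small n; O(n^2) memory).
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))

    for i in range(n):
        for j in range(i + 1, n):
            if dist[i, j] <= eps:
                ri, rj = find(i), find(j)
                if ri != rj:
                    parent[rj] = ri

    ids = {}
    return [ids.setdefault(find(i), len(ids)) for i in range(n)]


# Four points in a 5-dimensional space forming two well-separated groups.
cloud = [[0.0] * 5, [0.1] * 5, [5.0] * 5, [5.1] * 5]
cluster_ids = connected_clusters(cloud, eps=1.0)
```

The resulting labels depend only on which points are connected, not on where a projection would place them, which is the sense in which such structural information survives in any low-dimensional view. For the time-varying case, the same computation could be repeated per time step and the components matched across steps, although that matching is not shown here.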
- …