2,221 research outputs found
Laplacian Mixture Modeling for Network Analysis and Unsupervised Learning on Graphs
Laplacian mixture models identify overlapping regions of influence in
unlabeled graph and network data in a scalable and computationally efficient
way, yielding useful low-dimensional representations. By combining Laplacian
eigenspace and finite mixture modeling methods, they provide probabilistic or
fuzzy dimensionality reductions or domain decompositions for a variety of input
data types, including mixture distributions, feature vectors, and graphs or
networks. Provable optimal recovery using the algorithm is analytically shown
for a nontrivial class of cluster graphs. Heuristic approximations for scalable
high-performance implementations are described and empirically tested.
Connections to PageRank and community detection in network analysis demonstrate
the wide applicability of this approach. The origins of fuzzy spectral methods,
beginning with generalized heat or diffusion equations in physics, are reviewed
and summarized. Comparisons to other dimensionality reduction and clustering
methods for challenging unsupervised machine learning problems are also
discussed. Comment: 13 figures, 35 references
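As a minimal illustration of the graph Laplacian that the abstract above builds on, the sketch below computes the unnormalized Laplacian L = D - A from an adjacency matrix. It is not the authors' algorithm: the function name `graph_laplacian` and the toy path graph are assumptions made for illustration, and the mixture-modeling step is not shown.

```python
def graph_laplacian(adjacency):
    """Return the unnormalized Laplacian L = D - A of an undirected graph.

    `adjacency` is a square list of lists with symmetric edge weights.
    (Hypothetical helper for illustration; not taken from the paper.)
    """
    n = len(adjacency)
    # Degree of each node is the sum of its row in the adjacency matrix.
    degrees = [sum(row) for row in adjacency]
    return [
        [(degrees[i] if i == j else 0) - adjacency[i][j] for j in range(n)]
        for i in range(n)
    ]


# Toy example: a path graph on three nodes, 0 - 1 - 2.
path_graph = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]
laplacian = graph_laplacian(path_graph)
```

Each row of the result sums to zero, which is the property that makes the constant vector an eigenvector with eigenvalue 0.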
A Diagram Is Worth A Dozen Images
Diagrams are common tools for representing complex concepts, relationships
and events, often when it would be difficult to portray the same information
with natural images. Understanding natural images has been extensively studied
in computer vision, while diagram understanding has received little attention.
In this paper, we study the problem of diagram interpretation and reasoning,
the challenging task of identifying the structure of a diagram and the
semantics of its constituents and their relationships. We introduce Diagram
Parse Graphs (DPG) as our representation to model the structure of diagrams. We
define syntactic parsing of diagrams as learning to infer DPGs for diagrams and
study semantic interpretation and reasoning of diagrams in the context of
diagram question answering. We devise an LSTM-based method for syntactic
parsing of diagrams and introduce a DPG-based attention model for diagram
question answering. We compile a new dataset of diagrams with exhaustive
annotations of constituents and relationships for over 5,000 diagrams and
15,000 questions and answers. Our results show the significance of our models
for syntactic parsing and question answering in diagrams using DPGs.
How Best to Hunt a Mammoth - Toward Automated Knowledge Extraction From Graphical Research Models
In the Information Systems (IS) discipline, central contributions of research projects are often represented in graphical research models, clearly illustrating constructs and their relationships. Although thousands of such representations exist, methods for extracting this source of knowledge are still in an early stage. We present a method that (1) extracts graphical research models from articles, (2) generates synthetic training data, (3) performs object detection with a neural network, (4) reconstructs the graph, and (5) stores the results in a designated research model format. We trained YOLOv7 on 20,000 generated diagrams and evaluated its performance on 100 manually reconstructed diagrams from the Senior Scholars' Basket. The results for extracting graphical research models show an F1-score of 0.82 for nodes, 0.72 for links, and an accuracy of 0.72 for labels, indicating the applicability for supporting the population of knowledge repositories contributing to knowledge synthesis.
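For readers unfamiliar with the F1-score reported above, the sketch below shows how it is computed as the harmonic mean of precision and recall. The function name `f1_score` and the counts in the usage example are hypothetical; the abstract reports only the final scores, not the underlying counts.

```python
def f1_score(true_positives: int, false_positives: int, false_negatives: int) -> float:
    """Harmonic mean of precision and recall; returns 0.0 when undefined."""
    predicted = true_positives + false_positives
    actual = true_positives + false_negatives
    if predicted == 0 or actual == 0:
        return 0.0
    precision = true_positives / predicted
    recall = true_positives / actual
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts chosen for illustration only (not from the paper):
# 82 correctly detected nodes, 18 spurious detections, 18 missed nodes.
node_f1 = f1_score(true_positives=82, false_positives=18, false_negatives=18)
```

With these illustrative counts, precision and recall are both 0.82, so the F1-score is 0.82 as well.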
Abstract Data Visualisation in Mobile VR Platforms
Data visualisation, as a key tool in data understanding, is widely used in science and everyday life. For data visualisation to be effective, perceptual factors and the characteristics of the display interface play a crucial role. Virtual Reality is nowadays accepted as a valid medium for scientific visualisation, because of its inherent characteristics of real-world emulation and intuitive interaction. However, the use of VR in abstract data visualisation is still limited. In this research, I investigate the use and suitability of mobile phone-based Virtual Reality as a medium for abstract data visualisation. I develop a prototype VR Android application and visualise data using the Scatterplot and Parallel Coordinates methods. After that, I conduct a user study to compare the effectiveness of the mobile VR application with that of a similar screen-based one by implementing some data exploration scenarios. The study results, while not statistically significant, show improved accuracy and speed in the mobile VR visualisation application. The main conclusions are two-fold: Virtual Reality is beneficial for abstract data visualisation, even in the case of limited processing power and display resolution, and mobile VR, an affordable alternative to expensive desktop VR set-ups, can be utilized as a data visualisation platform.
Graph matching with a dual-step EM algorithm
This paper describes a new approach to matching geometric structure in 2D point-sets. The novel feature is to unify the tasks of estimating transformation geometry and identifying point-correspondence matches. Unification is realized by constructing a mixture model over the bipartite graph representing the correspondence match and by effecting optimization using the EM algorithm. According to our EM framework, the probabilities of structural correspondence gate contributions to the expected likelihood function used to estimate maximum likelihood transformation parameters. These gating probabilities measure the consistency of the matched neighborhoods in the graphs. The recovery of transformational geometry and hard correspondence matches are interleaved and are realized by applying coupled update operations to the expected log-likelihood function. In this way, the two processes bootstrap one another. This provides a means of rejecting structural outliers. We evaluate the technique on two real-world problems. The first involves the matching of different perspective views of 3.5-inch floppy discs. The second example is furnished by the matching of a digital map against aerial images that are subject to severe barrel distortion due to a line-scan sampling process. We complement these experiments with a sensitivity study based on synthetic data.
Defending against Sybil Devices in Crowdsourced Mapping Services
Real-time crowdsourced maps such as Waze provide timely updates on traffic,
congestion, accidents and points of interest. In this paper, we demonstrate how
lack of strong location authentication allows creation of software-based
Sybil devices that expose crowdsourced map systems to a variety of security
and privacy attacks. Our experiments show that a single Sybil device with
limited resources can cause havoc on Waze, reporting false congestion and
accidents and automatically rerouting user traffic. More importantly, we
describe techniques to generate Sybil devices at scale, creating armies of
virtual vehicles capable of remotely tracking precise movements for large user
populations while avoiding detection. We propose a new approach to defend
against Sybil devices based on co-location edges, authenticated records
that attest to the one-time physical co-location of a pair of devices. Over
time, co-location edges combine to form large proximity graphs that
attest to physical interactions between devices, allowing scalable detection of
virtual vehicles. We demonstrate the efficacy of this approach using
large-scale simulations, and discuss how it can be used to dramatically
reduce the impact of attacks against crowdsourced mapping services. Comment: Measure and integration
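To make the idea of co-location edges combining into a proximity graph concrete, the sketch below builds an undirected graph from pairwise co-location records and groups devices into connected components. This is only an illustrative data-structure sketch: the function names, the device identifiers, and the use of connected components are assumptions, and the abstract does not specify the detection criterion the authors apply to the graph.

```python
from collections import defaultdict


def build_proximity_graph(co_location_edges):
    """Build an undirected adjacency map from (device_a, device_b) records.

    Each record stands in for one authenticated co-location edge.
    (Hypothetical representation; the paper's record format is not given.)
    """
    graph = defaultdict(set)
    for device_a, device_b in co_location_edges:
        graph[device_a].add(device_b)
        graph[device_b].add(device_a)
    return dict(graph)


def connected_components(graph):
    """Group devices that are linked by chains of co-location edges."""
    seen = set()
    components = []
    for start in graph:
        if start in seen:
            continue
        seen.add(start)
        stack = [start]
        component = set()
        while stack:
            node = stack.pop()
            component.add(node)
            for neighbour in graph[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        components.append(component)
    return components


# Hypothetical device identifiers, for illustration only.
edges = [("phone-a", "phone-b"), ("phone-b", "phone-c"), ("emu-x", "emu-y")]
groups = connected_components(build_proximity_graph(edges))
```

Here the three phones end up in one group and the two remaining devices in another; how such groups are judged to be real or virtual vehicles is left to the paper itself.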