Big Data Visualization Tools
Data visualization is the presentation of data in a pictorial or graphical
format, and a data visualization tool is the software that generates this
presentation. Data visualization provides users with intuitive means to
interactively explore and analyze data, enabling them to effectively identify
interesting patterns, infer correlations and causalities, and support
sense-making activities.
Comment: This article appears in Encyclopedia of Big Data Technologies, Springer, 201
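To make the definition concrete, here is a minimal sketch of what a visualization tool does programmatically; the library choice (matplotlib) and the synthetic dataset are illustrative assumptions, not drawn from the article:

```python
# Minimal sketch: a "visualization tool" turns raw data into a pictorial form.
# The data here is synthetic and purely illustrative.
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=500)                        # one measured variable
y = 0.8 * x + rng.normal(scale=0.5, size=500)   # a correlated second variable

fig, ax = plt.subplots()
ax.scatter(x, y, s=8, alpha=0.5)   # the pictorial encoding of the data
ax.set_xlabel("variable x")
ax.set_ylabel("variable y")
ax.set_title("Scatter plot revealing a correlation pattern")
plt.show()
```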
Exploration and Visualization in the Web of Big Linked Data: A Survey of the State of the Art
Data exploration and visualization systems are of great importance in the Big
Data era. Exploring and visualizing very large datasets has become a major
research challenge, of which scalability is a vital requirement. In this
survey, we describe the major prerequisites and challenges that should be
addressed by the modern exploration and visualization systems. Considering
these challenges, we present how state-of-the-art approaches from the Database
and Information Visualization communities attempt to handle them. Finally, we
survey the systems developed by the Semantic Web community in the context of the
Web of Linked Data, and discuss to what extent they satisfy the contemporary
requirements.
Comment: 6th International Workshop on Linked Web Data Management (LWDM 2016)
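As a concrete taste of exploring the Web of Linked Data, here is a hedged sketch that runs a query against a public SPARQL endpoint via the SPARQLWrapper library; the endpoint, vocabulary, and query are illustrative examples, not taken from the survey:

```python
# Illustrative sketch: exploring Linked Data by querying a public SPARQL
# endpoint. The endpoint and query are examples, not taken from the survey.
from SPARQLWrapper import SPARQLWrapper, JSON

sparql = SPARQLWrapper("https://dbpedia.org/sparql")
sparql.setQuery("""
    SELECT ?city ?population WHERE {
        ?city a dbo:City ;
              dbo:populationTotal ?population .
    }
    ORDER BY DESC(?population)
    LIMIT 5
""")
sparql.setReturnFormat(JSON)

results = sparql.query().convert()
for row in results["results"]["bindings"]:
    print(row["city"]["value"], row["population"]["value"])
```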
Web-based haptic applications for blind people to create virtual graphs
Haptic technology has great potential in many applications. This paper introduces our work on delivering haptic information via the Web. A multimodal tool has been developed to allow blind people to create virtual graphs independently. Multimodal interactions in the process of graph creation and exploration are provided by using a low-cost haptic device, the Logitech WingMan Force Feedback Mouse, and Web audio. The Web-based tool also provides blind people with the convenience of receiving information at home. In this paper, we present the development of the tool and evaluation results. We also discuss issues related to the design of similar Web-based haptic applications.
Graffinity: Visualizing Connectivity In Large Graphs
Multivariate graphs are prolific across many fields, including transportation
and neuroscience. A key task in graph analysis is the exploration of
connectivity, to, for example, analyze how signals flow through neurons, or to
explore how well different cities are connected by flights. While standard
node-link diagrams are helpful in judging connectivity, they do not scale to
large networks. Adjacency matrices also do not scale to large networks and are
only suitable for judging the connectivity of adjacent nodes. A key approach to
realizing scalable graph visualization is querying: instead of displaying the
whole network, only a relevant subset is shown. Query-based techniques for
analyzing connectivity in graphs, however, can still suffer from
clutter if the query result is large. To remedy this, we introduce
techniques that provide an overview of the connectivity and reveal details on
demand. We have two main contributions: (1) two novel visualization techniques
that work in concert for summarizing graph connectivity; and (2) Graffinity, an
open-source implementation of these visualizations supplemented by detail views
to enable a complete analysis workflow. Graffinity was designed in a close
collaboration with neuroscientists and is optimized for connectomics data
analysis, yet the technique is applicable across domains. We validate the
connectivity overview and our open-source tool with illustrative examples using
flight and connectomics data.
Comment: The definitive version is available at http://diglib.eg.org/ and
http://onlinelibrary.wiley.co
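To illustrate the query-based style of connectivity analysis described above, here is a hedged sketch using networkx; the graph, node groups, and path-count summary are invented for illustration and do not reflect Graffinity's implementation:

```python
# Illustrative sketch of query-based connectivity analysis: rather than
# drawing the whole graph, summarize how two queried node groups connect.
# The graph and groups are invented; this is not Graffinity's implementation.
import networkx as nx

G = nx.erdos_renyi_graph(200, 0.03, seed=42)  # stand-in for a large network
sources = set(range(0, 10))                   # query: one group of nodes
targets = set(range(190, 200))                # query: another group of nodes

# Overview: count connecting paths per (source, target) pair up to a cutoff,
# a summary that avoids rendering every edge.
summary = {}
for s in sources:
    for t in targets:
        paths = nx.all_simple_paths(G, s, t, cutoff=3)
        summary[(s, t)] = sum(1 for _ in paths)

connected = {pair: n for pair, n in summary.items() if n > 0}
print(f"{len(connected)} of {len(summary)} pairs connected within 3 hops")
```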
Exploring Restart Distributions
We consider the generic approach of using an experience memory to help
exploration by adapting a restart distribution. That is, given the capacity to
reset the state to one corresponding to the agent's past observations, we
help exploration by promoting faster state-space coverage: restarting the
agent from a more diverse set of initial states, as well as allowing it to
restart in states associated with significant past experiences. This approach
is compatible with both on-policy and off-policy methods. However, a caveat is
that altering the distribution of initial states could change the optimal
policies when searching within a restricted class of policies. To reduce this
unwanted learning bias, we evaluate our approach in deep reinforcement learning,
which benefits from the high representational capacity of deep neural networks.
We instantiate three variants of our approach, each inspired by an idea in the
context of experience replay. Using these variants, we show that performance
gains can be achieved, especially in hard exploration problems.
Comment: RLDM 201
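A minimal sketch of the core idea under stated assumptions: the environment must expose a way to reset into an arbitrary stored state (the `env.restore(state)` call below is hypothetical), and uniform sampling over the memory is just one plausible restart distribution, not the paper's exact method:

```python
# Minimal sketch of adapting a restart distribution from an experience memory.
# Assumes the environment can be reset into an arbitrary stored state via a
# hypothetical env.restore(state); uniform sampling over the memory is one
# plausible restart distribution, not the paper's exact method.
import random

class RestartMemory:
    def __init__(self, capacity=10_000):
        self.capacity = capacity
        self.states = []

    def add(self, state):
        if len(self.states) >= self.capacity:
            # Evict a random element to keep the memory bounded.
            self.states.pop(random.randrange(len(self.states)))
        self.states.append(state)

    def sample_restart(self, env, p_restart=0.5):
        """With probability p_restart, restart from a remembered state;
        otherwise use the environment's default initial-state distribution."""
        if self.states and random.random() < p_restart:
            state = random.choice(self.states)
            env.restore(state)   # hypothetical API: reset env into `state`
            return state
        return env.reset()
```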
A Serverless Tool for Platform Agnostic Computational Experiment Management
Neuroscience has been carried into the domain of big data and high
performance computing (HPC) on the backs of initiatives in data collection and
increasingly compute-intensive tools. While managing HPC experiments
requires considerable technical acumen, platforms and standards have been
developed to ease this burden on scientists. Web portals make resources
widely accessible, and data organization standards such as the Brain Imaging Data Structure
and tool description languages such as Boutiques provide researchers with a
foothold to tackle these problems using their own datasets, pipelines, and
environments. While these standards lower the barrier to adoption of HPC and
cloud systems for neuroscience applications, they still require the
consolidation of disparate domain-specific knowledge. We present Clowdr, a
lightweight tool to launch experiments on HPC systems and clouds, record rich
execution records, and enable the accessible sharing of experimental summaries
and results. Clowdr uniquely sits between web platforms and bare-metal
applications for experiment management by preserving the flexibility of
do-it-yourself solutions while providing a low barrier for developing,
deploying, and disseminating neuroscientific analyses.
Comment: 12 pages, 3 figures, 1 too
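As a rough illustration of what recording rich execution records can mean, here is a hedged sketch of a launcher that wraps a command and saves provenance metadata; the wrapper and its record fields are invented and are not Clowdr's actual interface:

```python
# Illustrative sketch of capturing a rich execution record around a command.
# The record fields and this wrapper are invented for illustration; they are
# not Clowdr's actual interface.
import json
import platform
import subprocess
import time

def run_with_record(cmd, record_path="record.json"):
    start = time.time()
    proc = subprocess.run(cmd, capture_output=True, text=True)
    record = {
        "command": cmd,
        "returncode": proc.returncode,
        "duration_s": round(time.time() - start, 3),
        "hostname": platform.node(),
        "platform": platform.platform(),
        "stdout_tail": proc.stdout[-2000:],
        "stderr_tail": proc.stderr[-2000:],
    }
    with open(record_path, "w") as f:
        json.dump(record, f, indent=2)
    return record

# Example: record = run_with_record(["echo", "hello"])
```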
Coloring Big Graphs with AlphaGoZero
We show that recent innovations in deep reinforcement learning can
effectively color very large graphs -- a well-known NP-hard problem with clear
commercial applications. Because the Monte Carlo Tree Search with Upper
Confidence Bound algorithm used in AlphaGoZero can improve the performance of a
given heuristic, our approach allows deep neural networks trained using high
performance computing (HPC) technologies to transform computation into improved
heuristics with zero prior knowledge. Key to our approach is the introduction
of a novel deep neural network architecture (FastColorNet) that has access to
the full graph context and requires O(V) time and space to color a graph with V
vertices, which enables scaling to very large graphs that arise in real
applications like parallel computing, compilers, numerical solvers, and design
automation, among others. As a result, we are able to learn new
state-of-the-art heuristics for graph coloring.
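For context, here is the standard greedy heuristic for graph coloring, the kind of baseline that search-and-learning approaches like the one above aim to improve; this sketch is generic and is not the paper's FastColorNet method:

```python
# A standard greedy graph-coloring heuristic: visit vertices in some order
# and assign each the smallest color unused by its neighbors. This generic
# baseline is the kind of heuristic learned approaches try to improve; it is
# not the paper's FastColorNet method.
def greedy_coloring(adjacency):
    """adjacency: dict mapping vertex -> iterable of neighbor vertices."""
    colors = {}
    for v in adjacency:                  # vertex order is the heuristic's knob
        taken = {colors[u] for u in adjacency[v] if u in colors}
        c = 0
        while c in taken:                # smallest color not used by neighbors
            c += 1
        colors[v] = c
    return colors

# Example: a 5-cycle (an odd cycle) needs 3 colors with this ordering.
cycle = {0: [1, 4], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 0]}
coloring = greedy_coloring(cycle)
print(coloring, "colors used:", max(coloring.values()) + 1)
```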
Generating and evaluating application-specific hardware extensions
Modern platform-based design involves the application-specific extension of
embedded processors to fit customer requirements. To accomplish this task, the
possibilities offered by recent custom/extensible processors for tuning their
instruction set and microarchitecture to the applications of interest have to
be exploited. A significant factor often determining the success of this
process is the automation available in application analysis and custom
instruction generation.
In this paper we present YARDstick, a design automation tool for custom
processor development flows that focuses on generating and evaluating
application-specific hardware extensions. YARDstick is a building block for
ASIP development, integrating application analysis, custom instruction
generation and selection with user-defined compiler intermediate
representations. In a YARDstick-enabled environment, practical issues in
traditional ASIP design are confronted efficiently; the exploration
infrastructure is liberated from compiler and simulator idiosyncrasies, since
the ASIP designer is empowered with the freedom of specifying the target
architectures of choice and adding new implementations of analyses and custom
instruction generation/selection methods. To illustrate the capabilities of the
YARDstick approach, we present interesting exploration scenarios: quantifying
the effect of machine-dependent compiler optimizations and the selection of the
target architecture in terms of operation set and memory model on custom
instruction generation/selection under different input/output constraints.
Comment: 11 pages, 15 figures, 5 tables. An unpublished journal paper
presenting the YARDstick custom instruction generation environment
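To give a flavor of automated custom instruction generation, here is a hedged sketch that counts recurring producer-consumer operation pairs in a toy intermediate representation and ranks them as fusion candidates; the IR and scoring are invented and unrelated to YARDstick's actual algorithms:

```python
# Illustrative sketch of custom instruction candidate generation: find
# frequently recurring operation pairs in a toy three-address IR and rank
# them as candidates for a fused custom instruction. The IR and scoring are
# invented; this is unrelated to YARDstick's actual algorithms.
from collections import Counter

# Toy basic block: (dest, op, src1, src2)
block = [
    ("t1", "mul", "a", "b"),
    ("t2", "add", "t1", "c"),
    ("t3", "mul", "d", "e"),
    ("t4", "add", "t3", "f"),
    ("t5", "mul", "t2", "t4"),
    ("t6", "add", "t5", "g"),
]

# Count producer->consumer op pairs (a dataflow edge between two operations).
pairs = Counter()
producers = {dest: op for dest, op, _, _ in block}
for _, op, s1, s2 in block:
    for src in (s1, s2):
        if src in producers:
            pairs[(producers[src], op)] += 1

# Frequent pairs are natural fusion candidates: here mul->add occurs most,
# suggesting a multiply-accumulate custom instruction.
for (p, c), count in pairs.most_common():
    print(f"candidate fused op: {p}->{c}, occurrences: {count}")
```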
Open Research Knowledge Graph: Next Generation Infrastructure for Semantic Scholarly Knowledge
Despite improved digital access to scholarly knowledge in recent decades,
scholarly communication remains exclusively document-based. In this form,
scholarly knowledge is hard to process automatically. In this paper, we present
the first steps towards a knowledge-graph-based infrastructure that acquires
scholarly knowledge in machine-actionable form, thus enabling new possibilities
for scholarly knowledge curation, publication, and processing. The primary
contribution is to present, evaluate and discuss multi-modal scholarly
knowledge acquisition, combining crowdsourced and automated techniques. We
present the results of the first user evaluation of the infrastructure with the
participants of a recent international conference. Results suggest that users
were intrigued by the novelty of the proposed infrastructure and by the
possibilities for innovative scholarly knowledge processing it could enable.
Comment: 8 pages
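To make machine-actionable scholarly knowledge concrete, here is a small sketch representing papers' contributions as subject-predicate-object triples that software can query; the vocabulary and statements are invented, not ORKG's actual data model:

```python
# Small sketch: scholarly knowledge as subject-predicate-object triples that
# software can query, instead of prose locked in a document. The vocabulary
# and statements are invented; this is not ORKG's actual data model.
triples = [
    ("paper:123", "hasResearchProblem", "graph coloring"),
    ("paper:123", "usesMethod", "Monte Carlo Tree Search"),
    ("paper:123", "reportsMetric", "number of colors"),
    ("paper:456", "hasResearchProblem", "graph coloring"),
    ("paper:456", "usesMethod", "greedy heuristic"),
]

def query(triples, predicate, obj):
    """Return all subjects connected to `obj` via `predicate`."""
    return [s for s, p, o in triples if p == predicate and o == obj]

# Machine-actionable question: which papers address graph coloring?
print(query(triples, "hasResearchProblem", "graph coloring"))
```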
The role of binding site on the mechanical unfolding mechanism of ubiquitin.
We apply novel atomistic simulations based on potential energy surface exploration to investigate the constant force-induced unfolding of ubiquitin. At the experimentally studied force-clamping level of 100 pN, we find a new unfolding mechanism starting with the detachment between β5 and β3 involving the binding site of ubiquitin, the Ile44 residue. This new unfolding pathway leads to the discovery of new intermediate configurations, which correspond to the end-to-end extensions previously seen experimentally. More importantly, it demonstrates the novel finding that the binding site of ubiquitin can be responsible not only for its biological functions, but also for its unfolding dynamics. We also report, in contrast to previous single-molecule constant-force experiments, that when the clamping force becomes smaller than about 300 pN, the number of intermediate configurations increases dramatically; at 100 pN, almost all unfolding events involve an intermediate configuration. By directly calculating the lifetimes of the intermediate configurations from the height of the barriers crossed on the potential energy surface, we demonstrate that these intermediate states were likely not observed experimentally because their lifetimes are typically about two orders of magnitude smaller than the experimental temporal resolution.
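The lifetime-from-barrier-height calculation mentioned above is commonly done with an Arrhenius-type rate law; the sketch below assumes that form, with made-up barrier and attempt-frequency values purely to show the arithmetic, not the paper's numbers:

```python
# Arrhenius-type estimate of a state's lifetime from the energy barrier it
# must cross: tau = (1 / nu0) * exp(E_barrier / (kB * T)). The barrier height
# and attempt frequency below are made-up illustrative values, not the
# paper's numbers.
import math

kB = 1.380649e-23        # Boltzmann constant, J/K
T = 300.0                # temperature, K
nu0 = 1e9                # assumed attempt frequency, 1/s (illustrative)
E_barrier = 10 * kB * T  # assumed barrier height of 10 kT (illustrative)

tau = (1.0 / nu0) * math.exp(E_barrier / (kB * T))
print(f"estimated lifetime: {tau:.3e} s")   # ~2.2e-5 s for these values
```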