Projectional Editors for JSON-Based DSLs
Augmenting text-based programming with rich structured interactions has been explored in many ways. Among these, projectional editors offer an enticing combination of structure editing and domain-specific program visualization. Yet such tools are typically bespoke and expensive to produce, leaving them inaccessible to many DSL and application designers.
We describe a relatively inexpensive way to build rich projectional editors for a large class of DSLs -- namely, those defined using JSON. Given any such JSON-based DSL, we derive a projectional editor through (i) a language-agnostic mapping from JSON Schemas to structure-editor GUIs and (ii) an API for application designers to implement custom views for the domain-specific types described in a schema. We implement these ideas in a prototype, Prong, which we illustrate with several examples including the Vega and Vega-Lite data visualization DSLs.
Comment: To appear at VL/HCC 202
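To make contribution (i) concrete, here is a minimal sketch of what a language-agnostic mapping from JSON Schema nodes to structure-editor widgets might look like. The widget vocabulary and the `schema_to_widget` helper are hypothetical illustrations, not Prong's actual API.

```python
# Minimal sketch of a language-agnostic JSON Schema -> structure-editor
# mapping, in the spirit of the paper. Widget names and the dispatch below
# are hypothetical illustrations, not Prong's actual API.

def schema_to_widget(schema: dict) -> dict:
    """Derive a generic editor widget description from a JSON Schema node."""
    if "enum" in schema:
        return {"widget": "dropdown", "options": schema["enum"]}
    t = schema.get("type")
    if t == "object":
        return {
            "widget": "record",
            "fields": {
                name: schema_to_widget(sub)
                for name, sub in schema.get("properties", {}).items()
            },
        }
    if t == "array":
        return {"widget": "list", "item": schema_to_widget(schema.get("items", {}))}
    if t == "string":
        return {"widget": "text_input"}
    if t in ("number", "integer"):
        return {"widget": "number_input"}
    if t == "boolean":
        return {"widget": "checkbox"}
    return {"widget": "json_fallback"}  # unconstrained nodes keep a raw JSON editor
```

Contribution (ii) would then let an application designer override the generic widget for a named schema type, for example replacing the plain text input for a Vega-Lite color field with a color picker.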
Debugging visualization tools: a systematic review
Debugging is the task of locating and fixing program bugs. It has been performed in much the same way since the 1960s, when the first symbolic debuggers were introduced. Recently, visualization techniques have been proposed to represent program information during fault localization; however, none of them has been adopted in industrial environments. This article presents a systematic review of visualization techniques for debugging. Despite the increasing number of studies in the area, visual debugging tools are not yet used in practice.
Sociedad Argentina de Informática e Investigación Operativa
Muck: A Build Tool for Data Journalists
Veracity and reproducibility are vital qualities for any data journalism project. As computational investigations become more complex and time consuming, the effort required to maintain correctness of code and conclusions increases dramatically. This report presents Muck, a new tool for organizing and reliably reproducing data computations. Muck is a command line program that plays the role of the build system in traditional software development, except that instead of being used to compile code into executable applications, it runs data processing scripts to produce output documents (e.g., data visualizations or tables of statistical results). In essence, it automates the task of executing a series of computational steps to produce an updated product. The system supports a variety of languages, formats, and tools, and draws upon well-established Unix software conventions.
A great deal of data journalism work can be characterized as a process of deriving data from original sources. Muck models such work as a graph of computational steps and uses this model to update results efficiently whenever the inputs or code change. This algorithmic approach relieves programmers from having to constantly worry about the dependency relationships between various parts of a project. At the same time, Muck encourages programmers to organize their code into modular scripts, which can make the code more readable for a collaborating group. The system relies on a naming convention to connect scripts to their outputs, and automatically infers the dependency graph from these implied relationships. Thus, unlike more traditional build systems, Muck requires no configuration files, which makes altering the structure of a project less onerous.
Muck’s development was motivated by conversations with working data journalists and students. This report describes the rationale for building a new tool, its compelling features, and preliminary experience testing it with several demonstration projects. Muck has proven successful for a variety of use cases, but work remains to be done on documentation, compatibility, and testing. The long-term goal of the project is to provide a simple, language-agnostic tool that allows journalists to better develop and maintain ambitious data projects.
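As a rough sketch of the naming-convention idea, suppose (going beyond what the report summary states) that the script producing an output named `report.txt` is itself named `report.txt.py` and declares its inputs in a header comment. Muck itself infers dependencies automatically and supports many languages; this toy does not reproduce that analysis.

```python
# Toy build-by-naming-convention in the spirit of Muck. Assumptions beyond
# the abstract: the script producing "report.txt" is named "report.txt.py",
# and inputs are declared in a "# inputs:" header comment (Muck infers
# dependencies automatically; this sketch does not).
import os
import subprocess

def script_for(target: str) -> str | None:
    """Return the script implied to build `target`, if one exists."""
    candidate = target + ".py"
    return candidate if os.path.exists(candidate) else None

def declared_inputs(script: str) -> list[str]:
    """Toy stand-in for dependency inference: read a '# inputs:' header."""
    with open(script) as f:
        first = f.readline()
    return first.removeprefix("# inputs:").split() if first.startswith("# inputs:") else []

def is_stale(target: str, sources: list[str]) -> bool:
    """A target is stale if it is missing or older than any of its sources."""
    if not os.path.exists(target):
        return True
    mtime = os.path.getmtime(target)
    return any(os.path.getmtime(s) > mtime for s in sources)

def build(target: str) -> None:
    """Rebuild `target` and, recursively, everything it depends on."""
    script = script_for(target)
    if script is None:
        return  # an original source file: nothing to build
    deps = declared_inputs(script)
    for dep in deps:
        build(dep)
    if is_stale(target, [script, *deps]):
        with open(target, "w") as out:
            subprocess.run(["python", script], stdout=out, check=True)
```

Because the script-to-output relationship is implied by file names, there is no configuration file to keep in sync when the structure of a project changes.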
LabelVizier: Interactive Validation and Relabeling for Technical Text Annotations
With the rapid accumulation of text data produced by data-driven techniques, the task of extracting "data annotations"--concise, high-quality data summaries from unstructured raw text--has become increasingly important. Recent advances in weak supervision and crowd-sourcing techniques provide promising solutions for efficiently creating annotations (labels) for large-scale technical text data. However, such annotations may fail in practice because of changes in annotation requirements, application scenarios, and modeling goals, in which case label validation and relabeling by domain experts are required. To address this issue, we present LabelVizier, a human-in-the-loop workflow that incorporates domain knowledge and user-specific requirements to reveal actionable insights into annotation flaws, then produce better-quality labels for large-scale multi-label datasets. We implement our workflow as an interactive notebook to facilitate flexible error profiling, in-depth annotation validation for three error types, and efficient annotation relabeling at different data scales. We evaluated the efficiency and generalizability of our workflow with two use cases and four expert reviews. The results indicate that LabelVizier is applicable in various application scenarios and assists domain experts with different knowledge backgrounds in efficiently improving technical text annotation quality.
Comment: 10 pages, 5 figures
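The abstract does not detail the error-profiling step, so the following is only a toy illustration of one plausible heuristic: flagging labels where weak-supervision annotations disagree with expert spot checks. All names and data here are hypothetical, not the tool's actual API.

```python
# Toy error profiling for multi-label text annotations, loosely in the
# spirit of LabelVizier's validation step. The disagreement heuristic and
# all names here are hypothetical.
from collections import Counter

def disagreement_profile(records: list[dict]) -> Counter:
    """Count, per label, how often the weak annotation disagrees with an
    expert spot check."""
    flagged = Counter()
    for rec in records:
        weak, expert = set(rec["weak_labels"]), set(rec["expert_labels"])
        for label in weak.symmetric_difference(expert):
            flagged[label] += 1
    return flagged

records = [
    {"text": "pump seal failure", "weak_labels": {"mechanical"},
     "expert_labels": {"mechanical", "seals"}},
    {"text": "sensor drift", "weak_labels": {"mechanical"},
     "expert_labels": {"instrumentation"}},
]
print(disagreement_profile(records).most_common())
# Labels with high disagreement counts become candidates for relabeling.
```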
Doctor of Philosophy dissertation
Ray tracing offers an efficient rendering algorithm for scientific visualization: it integrates with common visualization tools, scales to increasingly large geometry counts, and allows accurate, physically based visualization and analysis, enabling enhanced rendering and new visualization techniques. Interactivity is of great importance for data exploration and analysis in order to gain insight into large-scale data. Increasingly large data sizes are pushing the limits of the brute-force rasterization algorithms present in the most widely used visualization software. Interactive ray tracing presents an alternative rendering solution that scales well on multicore shared-memory machines and multinode distributed systems, handling growing geometry counts through logarithmic acceleration-structure traversals. Ray tracing within existing tools also provides enhanced rendering options over current implementations, giving users additional insight from better depth cues while enabling publication-quality rendering and new models of visualization, such as replicating photographic visualization techniques.
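The logarithmic scaling mentioned above comes from hierarchical acceleration structures such as bounding volume hierarchies (BVHs): a ray skips any subtree whose bounding box it misses. The sketch below illustrates the generic traversal idea only, not the dissertation's implementation.

```python
# Generic BVH traversal sketch: visiting only subtrees whose boxes the ray
# enters gives O(log n) average cost for n primitives, versus O(n) for a
# brute-force test of every primitive. Illustration only.
from dataclasses import dataclass, field

@dataclass
class Node:
    bbox: tuple                                     # ((xmin, ymin, zmin), (xmax, ymax, zmax))
    children: list = field(default_factory=list)    # empty for leaves
    primitives: list = field(default_factory=list)  # populated only at leaves

def ray_hits_box(origin, inv_dir, bbox) -> bool:
    """Slab test; `inv_dir` is the componentwise reciprocal of the ray direction."""
    tmin, tmax = 0.0, float("inf")
    for axis in range(3):
        t1 = (bbox[0][axis] - origin[axis]) * inv_dir[axis]
        t2 = (bbox[1][axis] - origin[axis]) * inv_dir[axis]
        tmin = max(tmin, min(t1, t2))
        tmax = min(tmax, max(t1, t2))
    return tmin <= tmax

def traverse(node: Node, origin, inv_dir, hits: list) -> None:
    """Collect primitives from every leaf whose bounding box the ray enters."""
    if not ray_hits_box(origin, inv_dir, node.bbox):
        return                        # prune the whole subtree
    if not node.children:
        hits.extend(node.primitives)  # exact ray-primitive tests go here
        return
    for child in node.children:
        traverse(child, origin, inv_dir, hits)
```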
Image and video processing using graphics hardware
Graphics Processing Units have in recent years evolved into inexpensive high-performance many-core computing units. Previously these devices were accessible only through graphics APIs, but new hardware architectures and programming tools have made it possible to program them using arbitrary data types and standard languages like C.
This thesis investigates the development process and performance of image and video processing algorithms on graphics processing units, independent of vendor. The tool used for programming the graphics processing units is OpenCL, a relatively new specification for heterogeneous computing. Two image algorithms are investigated: the bilateral filter and the histogram. In addition, an attempt was made to build a template-based solution for generating and auto-optimizing device code, but this approach showed shortcomings that make it not yet usable.
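For reference, the bilateral filter combines a spatial Gaussian with an intensity ("range") Gaussian so that smoothing preserves edges. Below is a plain NumPy sketch of the algorithm, not the thesis's OpenCL kernel; since every output pixel is computed independently, the filter maps naturally onto the many-core GPUs the thesis targets.

```python
# CPU reference sketch of the bilateral filter: each output pixel is a
# weighted average whose weights combine spatial closeness and intensity
# similarity. The thesis implements this as an OpenCL kernel (not shown).
import numpy as np

def bilateral_filter(img: np.ndarray, radius: int,
                     sigma_space: float, sigma_range: float) -> np.ndarray:
    ys, xs = np.mgrid[-radius:radius + 1, -radius:radius + 1]
    spatial = np.exp(-(xs**2 + ys**2) / (2 * sigma_space**2))  # fixed kernel
    padded = np.pad(img.astype(np.float64), radius, mode="edge")
    out = np.empty(img.shape, dtype=np.float64)
    h, w = img.shape
    for i in range(h):
        for j in range(w):
            window = padded[i:i + 2 * radius + 1, j:j + 2 * radius + 1]
            # Range weights depend on the centre pixel, so they vary per pixel.
            rng = np.exp(-(window - img[i, j]) ** 2 / (2 * sigma_range**2))
            weights = spatial * rng
            out[i, j] = (weights * window).sum() / weights.sum()
    return out
```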
Analysis of Distributed Systems Dynamics with Erlang Performance Lab
Modern, highly concurrent, large-scale systems require new methods for design, testing, and monitoring. Their dynamics and scale call for real-time tools that provide a holistic view of the whole system and the ability to show a more detailed view when needed. Such tools can help identify the causes of unwanted states, which is hardly possible with static analysis or a metrics-based approach. In this paper a new tool for the analysis of distributed systems in Erlang is presented. It provides real-time monitoring of system dynamics at different levels of abstraction. The tool has been used to analyze a large-scale urban traffic simulation system running on a cluster of 20 computing nodes.
A Multi-Code Analysis Toolkit for Astrophysical Simulation Data
The analysis of complex multiphysics astrophysical simulations presents a unique and rapidly growing set of challenges: reproducibility, parallelization, and vast increases in data size and complexity chief among them. In order to meet these challenges, and in order to open up new avenues for collaboration between users of multiple simulation platforms, we present yt (available at http://yt.enzotools.org/), an open source, community-developed astrophysical analysis and visualization toolkit. Analysis and visualization with yt are oriented around physically relevant quantities rather than quantities native to astrophysical simulation codes. While originally designed for handling Enzo's structured adaptive mesh refinement (AMR) data, yt has been extended to work with several different simulation methods and simulation codes including Orion, RAMSES, and FLASH. We report on its methods for reading, handling, and visualizing data, including projections, multivariate volume rendering, multi-dimensional histograms, halo finding, light cone generation, and topologically connected isocontour identification. Furthermore, we discuss the underlying algorithms yt uses for processing and visualizing data, and its mechanisms for parallelization of analysis tasks.
Comment: 18 pages, 6 figures, emulateapj format. Resubmitted to Astrophysical Journal Supplement Series with revisions from referee. yt can be found at http://yt.enzotools.org
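Since yt is a Python package, its physically oriented interface is easiest to see in code. The calls below follow yt's modern public API, which postdates and may differ from the version this abstract describes; the dataset path is a placeholder.

```python
# Minimal usage sketch of yt's physically oriented interface. The dataset
# path is a placeholder; API calls follow yt's modern public interface.
import yt

ds = yt.load("Enzo_64/DD0043/data0043")  # hypothetical Enzo AMR output

# Queries are phrased in physical terms ("gas", "density") rather than in
# the simulation code's native on-disk field names.
ad = ds.all_data()
print(ad.quantities.extrema(("gas", "density")))

# A projection integrates a field along an axis, one of the operations
# the abstract describes.
p = yt.ProjectionPlot(ds, "z", ("gas", "density"))
p.save("density_projection.png")
```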