Overlap Removal of Dimensionality Reduction Scatterplot Layouts
Dimensionality Reduction (DR) scatterplot layouts have become a ubiquitous
visualization tool for analyzing multidimensional data items with presence in
different areas. Despite their popularity, scatterplots suffer from occlusion,
especially when markers convey information, making it troublesome for users to
estimate items' groups' sizes and, more importantly, potentially obfuscating
critical items for the analysis under execution. Different strategies have been
devised to address this issue, either producing overlap-free layouts, lacking
the powerful capabilities of contemporary DR techniques in uncovering interesting
data patterns, or eliminating overlaps as a post-processing strategy. Despite
the good results of post-processing techniques, the best methods typically
expand or distort the scatterplot area, thus reducing markers' size (sometimes)
to unreadable dimensions, defeating the purpose of removing overlaps. This
paper presents a novel post-processing strategy to remove DR layouts' overlaps
that faithfully preserves the original layout's characteristics and markers'
sizes. We show that the proposed strategy surpasses the state-of-the-art in
overlap removal through an extensive comparative evaluation considering
multiple different metrics while it is 2 or 3 orders of magnitude faster for
large datasets. Comment: 11 pages and 9 figures
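To make the occlusion problem described in this abstract concrete, the following minimal sketch counts overlapping marker pairs in a 2D layout. It is not the paper's method: the square marker shape, the shared side length `size`, and the function names are all assumptions made for illustration.

```python
from itertools import combinations


def markers_overlap(p, q, size):
    """Assumption: every marker is an axis-aligned square of side `size`
    centered on its point. Two such squares overlap when their centers
    are closer than `size` along both axes."""
    return abs(p[0] - q[0]) < size and abs(p[1] - q[1]) < size


def overlapping_pairs(points, size):
    """Return the index pairs (i, j), i < j, whose markers overlap.
    Brute force over all pairs, so quadratic in the number of points."""
    return [
        (i, j)
        for (i, p), (j, q) in combinations(enumerate(points), 2)
        if markers_overlap(p, q, size)
    ]
```

An overlap-free layout in this sense is one for which `overlapping_pairs` returns an empty list. Markers that merely touch along an edge are not counted as overlapping.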
OpenEssayist: a supply and demand learning analytics tool for drafting academic essays
This paper focuses on the use of a natural language analytics engine to provide feedback to students when preparing an essay for summative assessment. OpenEssayist is a real-time learning analytics tool, which operates through the combination of a linguistic analysis engine that processes the text in the essay, and a web application that uses the output of the linguistic analysis engine to generate the feedback. We outline the system itself and present analysis of observed patterns of activity as a cohort of students engaged with the system for their module assignments. We report a significant positive correlation between the number of drafts submitted to the system and the grades awarded for the first assignment. We can also report that this cohort of students gained significantly higher overall grades than the students in the previous cohort, who had no access to OpenEssayist. As a system that is content free, OpenEssayist can be used to support students working in any domain that requires the writing of essays
OpenEssayist: real-life testing of an automated feedback system for draft essay writing
OpenEssayist is unique in being an automated feedback system that has been developed to offer feedback on students' draft essays, rather than assessment on their finished work. This is therefore a system that offers opportunities for students to engage with and reflect on their work, and to improve their work through understanding of the requirements of academic essay writing. In trialling use of the system in a genuine Open University course, we found that students made use of it to varying degrees, which is perhaps likely with any study resource. Those who took the time to explore system affordances and what they could be used for, however, tended to report more positively on its perceived value. From our analysis we were also able to conclude that a significant positive correlation exists in this sample of students between marks on essay 1 and the number of drafts submitted. We could speculate as to what this may mean for this set of students, or more widely, but it seems clear that use of a system such as OpenEssayist has many potential advantages to students and tutors, which will benefit from further research and exploration
Semantic Search and Visual Exploration of Computational Notebooks
Code search is an important and frequent activity for developers using computational notebooks (e.g., Jupyter). The flexibility of notebooks brings challenges for effective code search, where classic search interfaces for traditional software code may be limited. In this thesis, we propose NBSearch, a novel system that supports semantic code search in notebook collections and interactive visual exploration of search results. NBSearch leverages advanced machine learning models to enable natural language search queries and intuitive visualizations to present complicated intra- and inter-notebook relationships in the returned results. We developed NBSearch through an iterative participatory design process with two experts from a large software company. We evaluated the models with a series of experiments and the whole system with a controlled user study. The results indicate the feasibility of our analytical pipeline and the effectiveness of NBSearch to support code search in large notebook collections. As one important aspect of the future directions, the search quality of NBSearch was further improved by incorporating the impact of markdowns in notebooks, and its performance was evaluated by comparison with the original implementation
Improved Approximation Algorithms for Box Contact Representations
We study the following geometric representation problem: Given a graph whose vertices correspond to axis-aligned rectangles with fixed dimensions, arrange the rectangles without overlaps in the plane such that two rectangles touch if the graph contains an edge between them. This problem is called Contact Representation of Word Networks (Crown) since it formalizes the geometric problem behind drawing word clouds in which semantically related words are close to each other. Crown is known to be NP-hard, and there are approximation algorithms for certain graph classes for the optimization version, Max-Crown, in which realizing each desired adjacency yields a certain profit. We present the first O(1)-approximation algorithm for the general case, when the input is a complete weighted graph, and for the bipartite case. Since the subgraph of realized adjacencies is necessarily planar, we also consider several planar graph classes (namely stars, trees, outerplanar, and planar graphs), improving upon the known results. For some graph classes, we also describe improvements in the unweighted case, where each adjacency yields the same profit. Finally, we show that the problem is APX-complete on bipartite graphs of bounded maximum degree. © 2016, Springer Science+Business Media New York
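The contact condition in this abstract can be illustrated with a short sketch that tests whether two axis-aligned rectangles touch without overlapping and sums the profit of the realized adjacencies, as in Max-Crown. The `Box` type, the function names, and the choice to require a shared boundary segment of positive length (so that corner-only touching does not count) are assumptions for illustration, not definitions taken from the paper.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """Axis-aligned rectangle: lower-left corner (x, y), width w, height h."""
    x: float
    y: float
    w: float
    h: float


def interiors_overlap(a: Box, b: Box) -> bool:
    """True if the open interiors of a and b intersect."""
    return (a.x < b.x + b.w and b.x < a.x + a.w and
            a.y < b.y + b.h and b.y < a.y + a.h)


def in_contact(a: Box, b: Box) -> bool:
    """True if a and b share a boundary segment of positive length
    without overlapping. Assumption: corner-only touching is not contact."""
    x_overlap = min(a.x + a.w, b.x + b.w) - max(a.x, b.x)
    y_overlap = min(a.y + a.h, b.y + b.h) - max(a.y, b.y)
    side_by_side = x_overlap == 0 and y_overlap > 0
    stacked = y_overlap == 0 and x_overlap > 0
    return side_by_side or stacked


def realized_profit(boxes, weights):
    """Sum the profit of every desired adjacency realized by the layout.
    `weights` maps index pairs (i, j) to the profit of that adjacency."""
    return sum(profit for (i, j), profit in weights.items()
               if in_contact(boxes[i], boxes[j]))
```

The exact equality tests are reliable for integer coordinates; with floating-point coordinates a small tolerance would be needed.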
Beyond the Third Dimension: Visualizing High-Dimensional Data with Projections
Multidimensional projections are an increasingly popular technique for visualizing large datasets containing observations having tens or even hundreds of dimensions. Compared to other techniques such as parallel coordinates, tables, and scatterplot matrices, they support tasks such as finding groups of related observations and outliers in simpler, more effective, ways. The authors discuss here the advantages of multidimensional projections, how to compute them, and recent advances that enhance them by visual explanatory techniques, so as to make them efficient and effective instruments that should be part of the toolkit of any scientist interested in high-dimensional data exploration