
    Personalized Sketch-Based Brushing in Scatterplots

    Brushing is at the heart of most modern visual analytics solutions, and effective and efficient brushing is crucial for successful interactive data exploration and analysis. As the user plays a central role in brushing, several data-driven brushing tools have been designed that are based on predicting the user's brushing goal. All of these general brushing models learn the users' average brushing preference, which is not optimal for every single user. In this paper, we propose an innovative framework that offers the user opportunities to improve the brushing technique while using it. We realized this framework with a CNN-based brushing technique, and the results show that with additional data from a particular user, the model can be refined (achieving better accuracy), eventually converging to a personalized model after a moderate amount of retraining.
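
As a rough illustration of the idea of refining a general, pretrained brushing model with one user's own selections, the following PyTorch sketch fine-tunes a small CNN on a handful of user-specific brushing samples. The architecture, tensor shapes, file name, and training settings are illustrative assumptions, not the model described in the paper.

```python
# Sketch: refining a pretrained CNN brushing model with one user's data (PyTorch).
# Architecture, shapes, and names are illustrative assumptions, not the paper's model.
import torch
import torch.nn as nn

class BrushCNN(nn.Module):
    """Maps a rasterized scatterplot + click-and-drag sketch (2 channels)
    to a per-bin selection probability map."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(2, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 1, 1), nn.Sigmoid(),   # selection probability per bin
        )
    def forward(self, x):
        return self.net(x)

model = BrushCNN()
# model.load_state_dict(torch.load("general_brush_model.pt"))  # hypothetical pretrained weights

# Stand-in for a handful of brushing interactions collected from one user:
# channel 0 = binned scatterplot density, channel 1 = rasterized drag sketch.
user_inputs = torch.rand(8, 2, 64, 64)
user_targets = (torch.rand(8, 1, 64, 64) > 0.5).float()  # the user's actual selections

# Fine-tune with a small learning rate and few epochs, so the general model
# drifts toward this user's personal brushing preference instead of being overwritten.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
loss_fn = nn.BCELoss()
for epoch in range(20):
    optimizer.zero_grad()
    loss = loss_fn(model(user_inputs), user_targets)
    loss.backward()
    optimizer.step()
```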

    Improving Interaction in Visual Analytics using Machine Learning

    Interaction is one of the most fundamental components in visual analytics systems, transforming people from mere viewers into active participants in the process of analyzing and understanding data. Therefore, fast and accurate interaction techniques are key to establishing a successful human-computer dialogue, enabling smooth visual data exploration. Machine learning is a branch of artificial intelligence that provides systems the ability to automatically learn and improve from experience without being explicitly programmed. It has been utilized in a wide variety of fields where it is not straightforward to develop a conventional algorithm for effectively performing a task. Inspired by this, we see the opportunity to improve current interactions in visual analytics by using machine learning methods. In this thesis, we address the need for interaction techniques that are both fast, enabling fluid interaction in visual data exploration and analysis, and accurate, i.e., enabling the user to effectively select specific data subsets. First, we present a new, fast and accurate brushing technique for scatterplots, based on the Mahalanobis brush, which we have optimized using data from a user study. Further, we present a new solution for a near-perfect sketch-based brushing technique, where we exploit a convolutional neural network (CNN) for estimating the intended data selection from a fast and simple click-and-drag interaction and from the data distribution in the visualization. Next, we propose an innovative framework which offers the user opportunities to improve the brushing technique while using it. We tested this framework with CNN-based brushing, and the results show that the underlying model can be refined (better performance in terms of accuracy) and personalized with very little retraining time. Moreover, to investigate to what degree the human should be involved in the model design and how good an empirical model can be with more careful design, we extended our Mahalanobis brush (the best current empirical model in terms of accuracy for brushing points in a scatterplot) by further incorporating the data distribution information, captured by kernel density estimation (KDE). Based on this work, we then provide a detailed comparison between empirical modeling and implicit modeling by machine learning (deep learning). Lastly, we introduce a new, machine learning based approach that enables fast and accurate querying of time series data based on a swift sketching interaction. To achieve this, we build upon existing LSTM technology (long short-term memory) to encode both the sketch and the time series data in two networks with shared parameters. All the proposed interaction techniques in this thesis are demonstrated by application examples and evaluated via user studies. The integration of machine learning knowledge into visualization opens further possible research directions.
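
To make the Mahalanobis brush concrete, here is a minimal NumPy sketch of selecting scatterplot points from a simple click-and-drag gesture via a Mahalanobis distance computed from a locally estimated covariance. The seed-neighbourhood radius and distance threshold are illustrative assumptions; the thesis additionally optimizes such parameters with user-study data and extends the approach with kernel density estimation, which is omitted here.

```python
# Sketch: Mahalanobis-distance brushing in a scatterplot (NumPy).
# Parameters are illustrative assumptions, not the thesis's tuned values.
import numpy as np

def mahalanobis_brush(points, click, drag_end, seed_radius=0.1, threshold=2.0):
    """Estimate the intended selection from a click-and-drag interaction.

    points:   (n, 2) array of scatterplot positions (normalized to [0, 1]).
    click, drag_end: 2D endpoints of the simple drag gesture.
    Returns a boolean mask over `points`.
    """
    click, drag_end = np.asarray(click), np.asarray(drag_end)
    center = (click + drag_end) / 2.0

    # Seed set: points close to the midpoint of the drag line.
    seed = points[np.linalg.norm(points - center, axis=1) < seed_radius]
    if len(seed) < 3:                      # not enough structure to estimate
        return np.linalg.norm(points - center, axis=1) < seed_radius

    # The local covariance of the seed set defines the brush's shape and orientation.
    cov_inv = np.linalg.pinv(np.cov(seed, rowvar=False))
    diff = points - seed.mean(axis=0)
    d2 = np.einsum("ij,jk,ik->i", diff, cov_inv, diff)  # squared Mahalanobis distance
    return d2 < threshold ** 2

rng = np.random.default_rng(0)
pts = rng.normal([0.5, 0.5], [0.15, 0.05], size=(500, 2))
mask = mahalanobis_brush(pts, click=(0.5, 0.5), drag_end=(0.6, 0.52))
print(mask.sum(), "of", len(pts), "points selected")
```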

    A Survey on ML4VIS: Applying Machine Learning Advances to Data Visualization

    Inspired by the great success of machine learning (ML), researchers have applied ML techniques to visualizations to achieve a better design, development, and evaluation of visualizations. This branch of studies, known as ML4VIS, is gaining increasing research attention in recent years. To successfully adapt ML techniques for visualizations, a structured understanding of the integration of ML4VIS is needed. In this paper, we systematically survey 88 ML4VIS studies, aiming to answer two motivating questions: "what visualization processes can be assisted by ML?" and "how ML techniques can be used to solve visualization problems?" This survey reveals seven main processes where the employment of ML techniques can benefit visualizations: Data Processing4VIS, Data-VIS Mapping, Insight Communication, Style Imitation, VIS Interaction, VIS Reading, and User Profiling. The seven processes are related to existing visualization theoretical models in an ML4VIS pipeline, aiming to illuminate the role of ML-assisted visualization in general visualizations. Meanwhile, the seven processes are mapped into main learning tasks in ML to align the capabilities of ML with the needs in visualization. Current practices and future opportunities of ML4VIS are discussed in the context of the ML4VIS pipeline and the ML-VIS mapping. While more studies are still needed in the area of ML4VIS, we hope this paper can provide a stepping-stone for future exploration. A web-based interactive browser of this survey is available at https://ml4vis.github.io

    From insights to innovations: data mining, visualization, and user interfaces

    This thesis is about data mining (DM) and visualization methods for gaining insight into multidimensional data. Novel exploratory data analysis tools and adaptive user interfaces are developed by tailoring and combining existing DM and visualization methods in order to advance different applications. The thesis presents new visual data mining (VDM) methods that are also implemented in software toolboxes and applied to industrial and biomedical signals. First, we propose a method that has been applied to investigating industrial process data. The self-organizing map (SOM) is combined with scatterplots using traditional color linking or interactive brushing. The original contribution is to apply color-linked or brushed scatterplots and the SOM to visually survey local dependencies between a pair of attributes in different parts of the SOM. Clusters can be visualized on a SOM with different colors, and we also present how a color coding can be obtained automatically by using a proximity-preserving projection of the SOM model vectors. Second, we present a new method for the (interactive) visualization of cluster structures in a SOM. By using a contraction model, the regular grid of a SOM visualization is smoothly changed toward a presentation that better shows the proximities in the data space. Third, we propose a novel VDM method for investigating the reliability of estimates resulting from a stochastic independent component analysis (ICA) algorithm. The method can also be extended to other problems of a similar kind. As a benchmarking task, we rank independent components estimated on a biomedical data set recorded from the brain and obtain a reasonable result. We also utilize DM and visualization for mobile awareness and personalization. We explore how to infer information about the usage context from features that are derived from sensory signals. The signals originate from a mobile phone with on-board sensors for ambient physical conditions. In previous studies, the signals were transformed into descriptive (fuzzy or binary) context features. In this thesis, we present how the features can be transformed into higher-level patterns, contexts, by rather simple statistical methods: we propose and test using minimum-variance-cost time series segmentation, ICA, and principal component analysis (PCA) for this purpose. Both time-series segmentation and PCA revealed meaningful contexts from the features in a visual data exploration. We also present a novel type of adaptive soft keyboard where the aim is to obtain an ergonomically better, more comfortable keyboard. The method starts from a conventional keypad layout, but gradually shifts the keys into new positions according to the user's grasp and typing pattern. Related to the applications, we present two algorithms that can be used in a general context. First, we describe a binary mixing model for independent binary sources. The model resembles the ordinary ICA model, but the summation is replaced by the Boolean operator OR and the multiplication by AND. We propose a new, heuristic method for estimating the binary mixing matrix and analyze its performance experimentally. The method works for signals that are sparse enough. We also discuss differences in the results when using different objective functions in the FastICA estimation algorithm. Second, we propose "global iterative replacement" (GIR), a novel, greedy variant of a merge-split segmentation method. Its performance compares favorably to that of the traditional top-down binary split segmentation algorithm.
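
The Boolean mixing model mentioned above is easy to illustrate. The NumPy sketch below only generates observations under that model, with multiplication replaced by AND and summation by OR; the dimensions and sparsity level are arbitrary assumptions, and the thesis's heuristic estimator of the binary mixing matrix is not reproduced.

```python
# Sketch: the Boolean OR/AND mixing model for independent binary sources (NumPy).
# Only data generation under the model is shown; the estimation heuristic is omitted.
import numpy as np

rng = np.random.default_rng(1)
n_sources, n_obs, n_samples = 3, 5, 1000

# Sparse, independent binary sources (the method assumes sufficient sparsity).
S = rng.random((n_sources, n_samples)) < 0.05
# Binary mixing matrix: which sources feed into which observed signal.
A = rng.random((n_obs, n_sources)) < 0.5

# Ordinary ICA would compute X = A @ S.  Here, multiplication -> AND, summation -> OR:
# X[i, t] = OR_j ( A[i, j] AND S[j, t] )
X = np.any(A[:, :, None] & S[None, :, :], axis=1)

print("observed mixture shape:", X.shape)                         # (n_obs, n_samples)
print("fraction of active bits per observation:", X.mean(axis=1).round(3))
```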

    Visualisation of Long in Time Dynamic Networks on Large Touch Displays

    Any dataset containing information about relationships between entities can be modelled as a network. This network can be static, where the entities/relationships do not change over time, or dynamic, where the entities/relationships change over time. Network data that changes over time, dynamic network data, is a powerful resource when studying many important phenomena, across wide-ranging fields from travel networks to epidemiology. However, it is very difficult to analyse this data, especially if it covers a long period of time (e.g., one month) with respect to its temporal resolution (e.g., seconds). In this thesis, we address the problem of visualising long in time dynamic networks: networks that may not be particularly large in terms of the number of entities or relationships, but are long in terms of the length of time they cover when compared to their temporal resolution. We first introduce Dynamic Network Plaid, a system for the visualisation and analysis of long in time dynamic networks. We design and build for an 84" touch-screen vertically-mounted display, as existing work reports positive results for the use of such displays in a visualisation context and finds them useful for collaboration. The Plaid integrates multiple views and prioritises the visualisation of interaction provenance. In this system we also introduce a novel method of time exploration called 'interactive timeslicing'. This allows the selection and comparison of points that are far apart in time, a feature not offered by existing visualisation systems. The Plaid is validated through an expert user evaluation with three public health researchers. To confirm observations of the expert user evaluation, we then carry out a formal laboratory study with a large touch-screen display to verify our novel method of time navigation against existing animation and small multiples approaches. From this study, we find that interactive timeslicing outperforms animation and small multiples for complex tasks requiring a comparison between multiple points that are far apart in time. We also find that small multiples is best suited to comparisons of multiple sequential points in time across a time interval. To generalise the results of this experiment, we later run a second formal laboratory study in the same format as the first, but this time using standard-sized displays with indirect mouse input. The second study reaffirms the results of the first, showing that our novel method of time navigation can facilitate the visual comparison of points that are distant in time in a way that existing approaches, small multiples and animation, cannot. The study demonstrates that our previous results generalise across display size and interaction type (touch vs. mouse). In this thesis we introduce novel representations and time interaction techniques to improve the visualisation of long in time dynamic networks, and experimentally show that our novel method of time interaction outperforms other popular methods for some task types.
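
To illustrate the kind of time navigation interactive timeslicing enables, the sketch below extracts two graph snapshots that are far apart in time from a long, finely resolved dynamic edge list (pandas + NetworkX). The edge-list format, window length, and timestamps are illustrative assumptions, not the Plaid's actual data model.

```python
# Sketch: extracting two far-apart timeslices from a long dynamic network for
# side-by-side comparison, in the spirit of interactive timeslicing.
import networkx as nx
import pandas as pd

# A contact network with second-level resolution covering a long time span.
edges = pd.DataFrame({
    "source": ["a", "b", "a",    "c",    "b",      "d"],
    "target": ["b", "c", "c",    "d",    "d",      "a"],
    "time":   [  5,  60, 7200, 7205, 604800, 604805],   # seconds
})

def timeslice(df, start, window):
    """Build the graph of all interactions in [start, start + window)."""
    in_window = df[(df.time >= start) & (df.time < start + window)]
    return nx.from_pandas_edgelist(in_window, "source", "target")

# Interactive timeslicing lets the analyst pick and compare arbitrary,
# non-adjacent points in time, e.g. the first hour vs. one week later.
g_early = timeslice(edges, start=0, window=3600)
g_late = timeslice(edges, start=604800, window=3600)
print("early slice:", g_early.number_of_edges(), "edges;",
      "late slice:", g_late.number_of_edges(), "edges")
```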

    Scaling Up Medical Visualization: Multi-Modal, Multi-Patient, and Multi-Audience Approaches for Medical Data Exploration, Analysis and Communication

    Medical visualization is one of the most application-oriented areas of visualization research. Close collaboration with medical experts is essential for interpreting medical imaging data and creating meaningful visualization techniques and visualization applications. Cancer is one of the most common causes of death, and with increasing average age in developed countries, gynecological malignancy case numbers are rising. Modern imaging techniques are an essential tool in assessing tumors and produce increasing amounts of imaging data that radiologists must interpret. Besides the number of imaging modalities, the number of patients is also rising, leading to visualization solutions that must be scaled up to address the rising complexity of multi-modal and multi-patient data. Furthermore, medical visualization is not only targeted toward medical professionals but also has the goal of informing patients, relatives, and the public about the risks of certain diseases and potential treatments. Therefore, we identify the need to scale medical visualization solutions to cope with multi-audience data. This thesis addresses the scaling of these dimensions in the different contributions we made. First, we present our techniques to scale medical visualizations across multiple modalities. We introduced a visualization technique using small multiples to display the data of multiple modalities within one imaging slice. This allows radiologists to explore the data efficiently without having several juxtaposed windows. In the next step, we developed an analysis platform using radiomic tumor profiling on multiple imaging modalities to analyze cohort data and to find new imaging biomarkers. Imaging biomarkers are indicators based on imaging data that predict variables related to clinical outcome. Radiomic tumor profiling is a technique that generates potential imaging biomarkers based on first- and second-order statistical measurements. The application allows medical experts to analyze the multi-parametric imaging data to find potential correlations between clinical parameters and the radiomic tumor profiling data. This approach scales up in two dimensions, multi-modal and multi-patient. In a later version, we added features to scale the multi-audience dimension by making our application applicable to cervical and prostate cancer data in addition to the endometrial cancer data the application was originally designed for. In a subsequent contribution, we focus on tumor data at another scale and enable the analysis of tumor sub-parts by using the multi-modal imaging data in a hierarchical clustering approach. Our application finds potentially interesting regions that could inform future treatment decisions. In another contribution, the digital probing interaction, we focus on multi-patient data. The imaging data of multiple patients can be compared to find interesting tumor patterns potentially linked to the aggressiveness of the tumors. Lastly, we scale the multi-audience dimension with our similarity visualization, which is applicable to endometrial cancer research, neurological cancer imaging research, and machine learning research on the automatic segmentation of tumor data. In contrast to the previously highlighted contributions, our last contribution, ScrollyVis, focuses primarily on multi-audience communication. We enable the creation of dynamic scientific scrollytelling experiences for specific or general audiences. Such stories can be used in specific use cases such as patient-doctor communication, or to communicate scientific results to the general audience in a digital museum exhibition. Our proposed applications and interaction techniques have been demonstrated in application use cases and evaluated with domain experts and focus groups. As a result, some of our contributions are already in use at other research institutes. We want to evaluate their impact on other scientific fields and the general public in future work.
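
As a rough sketch of the hierarchical-clustering idea for tumor sub-parts, the following SciPy example clusters per-voxel multi-modal intensity profiles into candidate sub-regions. The array shapes, normalization, linkage method, and number of clusters are illustrative assumptions, not the published pipeline.

```python
# Sketch: clustering tumor voxels into sub-regions from multi-modal intensities
# with hierarchical clustering (SciPy).  All settings are illustrative assumptions.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(2)

# Stand-in for co-registered modalities (e.g. T1, T2, ADC) inside a tumor mask:
# one row per voxel, one column per modality.
n_voxels, n_modalities = 300, 3
voxel_features = rng.normal(size=(n_voxels, n_modalities))

# Normalize each modality so no single sequence dominates the distance metric.
voxel_features = (voxel_features - voxel_features.mean(0)) / voxel_features.std(0)

# Agglomerative clustering (Ward linkage) groups voxels with similar
# multi-modal intensity profiles into candidate tumor sub-parts.
Z = linkage(voxel_features, method="ward")
labels = fcluster(Z, t=4, criterion="maxclust")   # ask for 4 sub-regions
print("voxels per sub-region:", np.bincount(labels)[1:])
```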

    Enabling Collaborative Visual Analysis across Heterogeneous Devices

    We are surrounded by novel device technologies emerging at an unprecedented pace. These devices are heterogeneous in nature: in large and small sizes with many input and sensing mechanisms. When many such devices are used by multiple users with a shared goal, they form a heterogeneous device ecosystem. A device ecosystem has great potential in data science to act as a natural medium for multiple analysts to make sense of data using visualization. This is essential, as today's big data problems require more than a single mind or a single machine to solve them. Towards this vision, I introduce the concept of collaborative, cross-device visual analytics (C2-VA) and outline a reference model to develop user interfaces for C2-VA. This dissertation covers interaction models, coordination techniques, and software platforms to enable full-stack support for C2-VA. First, we connected devices to form an ecosystem using software primitives introduced in the early frameworks from this dissertation. To work in a device ecosystem, we designed multi-user interaction for visual analysis in front of large displays by finding a balance between proxemics and mid-air gestures. Extending these techniques, we considered the roles of different devices, large and small, to present a conceptual framework for utilizing multiple devices for visual analytics. When applying this framework, findings from a user study showcase flexibility in the analytic workflow and the potential for generating complex insights in device ecosystems. Beyond this, we supported coordination between multiple users in a device ecosystem by depicting the presence, attention, and data coverage of each analyst within a group. Building on these parts of the C2-VA stack, the culmination of this dissertation is a platform called Vistrates. This platform introduces a component model for the modular creation of user interfaces that work across multiple devices and users. A component is an analytical primitive (a data processing method, a visualization, or an interaction technique) that is reusable, composable, and extensible. Together, components can support a complex analytical activity. On top of the component model, support for collaboration and device ecosystems comes for free in Vistrates. Overall, this enables the exploration of new research ideas within C2-VA.
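
The component model can be illustrated with a small mock-up of reusable, composable analytical primitives chained into a pipeline. Vistrates itself is built on web technology; this Python sketch only conveys the idea of composable components and does not reflect its actual API.

```python
# Sketch: composable analytical components — reusable primitives for data
# processing, visualization, and interaction chained into a pipeline.
# A conceptual mock-up, not Vistrates' real component API.
from typing import Any, Callable, List

class Component:
    """A reusable, composable analytical primitive."""
    def __init__(self, name: str, fn: Callable[[Any], Any]):
        self.name, self.fn = name, fn
    def __call__(self, data: Any) -> Any:
        return self.fn(data)

def compose(components: List[Component]) -> Component:
    """Chain components so the output of one feeds the next."""
    def run(data: Any) -> Any:
        for c in components:
            data = c(data)
        return data
    return Component(" -> ".join(c.name for c in components), run)

# Example: a tiny analytical activity built from three primitives.
load = Component("load", lambda _: [3, 1, 4, 1, 5, 9, 2, 6])
threshold = Component("threshold", lambda xs: [x for x in xs if x > 3])
summarize = Component("summarize", lambda xs: {"n": len(xs), "max": max(xs)})

pipeline = compose([load, threshold, summarize])
print(pipeline.name, "=>", pipeline(None))   # load -> threshold -> summarize => {'n': 4, 'max': 9}
```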

    Interaction for Immersive Analytics

    In this chapter, we briefly review the development of natural user interfaces and discuss their role in providing human-computer interaction that is immersive in various ways. Then we examine some opportunities for how these technologies might be used to better support data analysis tasks. Specifically, we review and suggest some interaction design guidelines for immersive analytics. We also review some hardware setups for data visualization that are already archetypal. Finally, we look at some emerging system designs that suggest future directions.