
    Improving Interaction in Visual Analytics using Machine Learning

    Interaction is one of the most fundamental components of visual analytics systems, transforming people from mere viewers into active participants in the process of analyzing and understanding data. Fast and accurate interaction techniques are therefore key to establishing a successful human-computer dialogue and enabling smooth visual data exploration. Machine learning is a branch of artificial intelligence that gives systems the ability to automatically learn and improve from experience without being explicitly programmed. It has been utilized in a wide variety of fields where it is not straightforward to develop a conventional algorithm that performs a task effectively. Inspired by this, we see an opportunity to improve current interactions in visual analytics by using machine learning methods. In this thesis, we address the need for interaction techniques that are both fast, enabling fluid interaction in visual data exploration and analysis, and accurate, i.e., enabling the user to effectively select specific data subsets. First, we present a new, fast and accurate brushing technique for scatterplots, based on the Mahalanobis brush, which we have optimized using data from a user study. Next, we present a new solution for a near-perfect sketch-based brushing technique, where we exploit a convolutional neural network (CNN) for estimating the intended data selection from a fast and simple click-and-drag interaction and from the data distribution in the visualization. We then propose an innovative framework that offers the user opportunities to improve the brushing technique while using it. We tested this framework with CNN-based brushing, and the results show that the underlying model can be refined (performing better in terms of accuracy) and personalized with very little retraining time. Furthermore, in order to investigate to what degree the human should be involved in the model design, and how good an empirical model can be with a more careful design, we extended our Mahalanobis brush (the most accurate existing empirical model for brushing points in a scatterplot) by further incorporating information about the data distribution, captured by kernel density estimation (KDE). Based on this work, we then provide a detailed comparison between empirical modeling and implicit modeling by machine learning (deep learning). Lastly, we introduce a new, machine learning based approach that enables fast and accurate querying of time series data based on a swift sketching interaction. To achieve this, we build upon existing LSTM technology (long short-term memory) to encode both the sketch and the time series data in two networks with shared parameters. All the interaction techniques proposed in this thesis were demonstrated with application examples and evaluated via user studies. The integration of machine learning knowledge into visualization opens up further research directions.
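    To make the first step above concrete, here is a minimal sketch of Mahalanobis-distance brushing for a scatterplot: points whose Mahalanobis distance to a brushed seed region falls under a threshold are selected. This illustrates the general idea only, not the thesis's optimized model; the function name, the regularization term, and the threshold of 2.0 are assumptions for the example.

```python
import numpy as np

def mahalanobis_brush(points, seed_indices, threshold=2.0):
    """Select scatterplot points near a brushed seed set, using the
    Mahalanobis distance induced by the seed's covariance.

    points:       (n, 2) array of scatterplot coordinates
    seed_indices: indices of points under the initial click-and-drag
    threshold:    illustrative cutoff, in Mahalanobis units
    """
    seed = points[seed_indices]
    center = seed.mean(axis=0)
    cov = np.cov(seed, rowvar=False) + 1e-9 * np.eye(2)  # regularize
    inv_cov = np.linalg.inv(cov)
    diff = points - center
    # Squared Mahalanobis distance of every point to the seed center
    d2 = np.einsum('ij,jk,ik->i', diff, inv_cov, diff)
    return np.where(d2 <= threshold ** 2)[0]

# Usage: brush a small seed set inside a synthetic point cloud
pts = np.random.default_rng(0).normal(size=(1000, 2))
selected = mahalanobis_brush(pts, seed_indices=np.arange(10))
```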

    Interactive Visual Analysis of Process Data

    Data gathered from processes, or process data, has many different aspects that a visualization system should convey: temporal coherence, spatial connectivity, streaming data, and the need for in-situ visualizations, each of which comes with its own challenges. Additionally, as sensors become more affordable and the benefits of measurements become clearer, we are faced with a deluge of data whose size is growing rapidly. With all the aspects that should be supported and the vast increase in the amount of data, the traditional technique of dashboards showing recent data becomes insufficient for practical use. In this thesis we investigate how to extend traditional process visualization techniques by bringing streaming process data into an interactive visual analysis setting. Augmenting process visualization with interactivity enables users to go beyond mere observation, to pose questions about observed phenomena, and to delve into the data to mine for answers. Furthermore, this thesis investigates how to utilize frequency-based, as opposed to item-based, techniques to show such large amounts of data. Using kernel density estimates (KDE), we show how the display of streaming data benefits from non-parametric automatic aggregation, allowing incoming data to be interpreted in the context of historic data.
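    As a sketch of the frequency-based idea, the following class maintains a binned density estimate that is updated as sample batches stream in, so incoming data is automatically aggregated into the historic context. The histogram-plus-smoothing formulation is a common approximation of KDE; the class name and parameters are illustrative assumptions, not the thesis's implementation.

```python
import numpy as np
from scipy.ndimage import gaussian_filter1d

class StreamingKDE:
    """Frequency-based view of a streaming value: new samples are
    binned as they arrive and the density is a smoothed histogram,
    so the cost per frame is independent of the number of items."""

    def __init__(self, lo, hi, bins=256, bandwidth_bins=4.0):
        self.edges = np.linspace(lo, hi, bins + 1)
        self.counts = np.zeros(bins)
        self.bandwidth = bandwidth_bins  # smoothing in bin units

    def update(self, samples):
        hist, _ = np.histogram(samples, bins=self.edges)
        self.counts += hist  # aggregate into the historic context

    def density(self):
        d = gaussian_filter1d(self.counts, self.bandwidth)
        return d / max(d.sum(), 1)  # normalized density estimate

# Usage: sensor batches arrive over time
kde = StreamingKDE(lo=0.0, hi=100.0)
for batch in (np.random.normal(50, 10, 500) for _ in range(20)):
    kde.update(batch)
curve = kde.density()
```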

    Abstraction and cartographic generalization of geographic user-generated content: use-case motivated investigations for mobile users

    On a daily basis, a conventional internet user queries different internet services (available on different platforms) to gather information and make decisions. In most cases, knowingly or not, this user consumes data that has been generated by other internet users about his/her topic of interest (e.g. an ideal holiday destination for a family traveling by van for 10 days). Commercial service providers, such as search engines, travel booking websites, video-on-demand providers, food takeaway mobile apps and the like, have found it useful to rely on the data provided by other users who have commonalities with the querying user, such as demography, location, interests, or internet address. This practice, in place for more than a decade, helps service providers tailor their results based on the collective experience of the contributors. There has also been interest in different research communities (including GIScience) in analyzing and understanding the data generated by internet users. The research focus of this thesis is on finding answers to real-world problems in which a user interacts with geographic information. The interactions can take the form of exploration, querying, zooming and panning, to name but a few. We have aimed our research at investigating the potential of geographic user-generated content to provide new ways of preparing and visualizing these data. Based on different scenarios that fulfill user needs, we have investigated the potential of new visual methods relevant to each scenario. The methods proposed are mainly based on pre-processing and analyzing data offered by data providers (both commercial and non-profit organizations). In all cases, however, the data was contributed by ordinary internet users in an active way (as opposed to passive data collection by sensors). The main contributions of this thesis are proposals for new ways of abstracting geographic information based on user-generated content. Addressing different use-case scenarios, and based on different input parameters, data granularities and, consequently, geographic scales, we have provided proposals for contemporary users (with a focus on users of location-based services, or LBS). The findings are based on methods such as semantic analysis, density analysis and data enrichment. If the findings of this dissertation are put into practice, LBS users will benefit by being able to explore large amounts of geographic information in more abstract and aggregated ways, and to get results based on the contributions of other users. The research outcomes can be classified at the intersection of cartography, LBS and GIScience. Based on our first use case, we have proposed the inclusion of an extended semantic measure directly in the classic map generalization process. In our second use case, we have focused on simplifying the depiction of geographic data by reducing the amount of information using a density-triggered method. Finally, the third use case focused on summarizing and visually representing relatively large amounts of information by depicting geographic objects matched to the salient topics that emerge from the data.
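    The abstract does not detail the density-triggered method of the second use case, but as a rough illustration of density-triggered reduction in general, the sketch below thins crowded user-generated point data by keeping only a few representatives per grid cell. The function name, cell size, and per-cell limit are assumptions for the example.

```python
import numpy as np

def density_thin(points, cell_size, max_per_cell=1):
    """Density-triggered thinning of user-generated point data:
    points are assigned to grid cells and crowded cells keep only
    a few representatives, reducing clutter at a given map scale.

    points: (n, 2) array of projected coordinates (e.g., meters)
    """
    cells = np.floor(points / cell_size).astype(np.int64)
    keep, seen = [], {}
    for i, c in enumerate(map(tuple, cells)):
        if seen.get(c, 0) < max_per_cell:
            keep.append(i)
            seen[c] = seen.get(c, 0) + 1
    return points[keep]

# Usage: thin a dense point set for display at a small map scale
pts = np.random.default_rng(1).uniform(0, 1000, size=(5000, 2))
thinned = density_thin(pts, cell_size=50.0)
```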

    Density estimation and adaptive bandwidths: A primer for public health practitioners

    Background: Geographic information systems have advanced the ability to both visualize and analyze point data. While point-based maps can be aggregated to differing areal units and examined at varying resolutions, two problems arise: 1) the modifiable areal unit problem, and 2) any corresponding data must be available both at the scale of analysis and in the same geographic units. Kernel density estimation (KDE) produces a smooth, continuous surface where each location in the study area is assigned a density value irrespective of arbitrary administrative boundaries. We review KDE and introduce the technique of utilizing an adaptive bandwidth to address the underlying heterogeneous population distributions common in public health research.
    Results: The density of occurrences should not be interpreted without knowledge of the underlying population distribution. When the effect of the background population is successfully accounted for, differences in point patterns in similarly populated areas are more discernible, and it is generally these variations that are of most interest. A static-bandwidth KDE does not distinguish the spatial extents of interesting areas, nor does it expose patterns above and beyond those due to geographic variations in the density of the underlying population. An adaptive-bandwidth method uses background population data to calculate a kernel of varying size for each individual case. This limits the influence of a single case to a small spatial extent where the population density is high, since the bandwidth there is small. If the primary concern is distance, a static bandwidth is preferable, because it may be better to define the "neighborhood" or exposure risk based on distance. If the primary concern is differences in exposure across the population, a bandwidth adapting to the population is preferred.
    Conclusions: Kernel density estimation is a useful way to consider exposure at any point within a spatial frame, irrespective of administrative boundaries. Utilizing an adaptive bandwidth may be particularly useful when comparing two similarly populated areas in studies of health disparities or other issues comparing populations in public health.
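    A minimal sketch of adaptive-bandwidth KDE follows, using the well-known Abramson-style variant in which each kernel's width scales inversely with a pilot density. Note the simplification: the paper adapts bandwidths to the background population, whereas this sketch uses the pilot density of the cases themselves as a stand-in. The function name, the base-bandwidth heuristic, and alpha = 0.5 are assumptions.

```python
import numpy as np
from scipy.stats import gaussian_kde

def adaptive_kde(cases, grid, pilot_bw=None, alpha=0.5):
    """Adaptive-bandwidth KDE (Abramson-style): each case gets a
    kernel whose width shrinks where the pilot density is high, so
    dense areas keep fine detail while sparse areas stay smooth.

    cases: (n, 2) case locations; grid: (m, 2) evaluation points.
    """
    pilot = gaussian_kde(cases.T, bw_method=pilot_bw)
    f = pilot(cases.T)                       # pilot density at cases
    lam = (f / np.exp(np.log(f).mean())) ** (-alpha)  # local factors
    h = pilot.factor * cases.std(axis=0).mean()  # rough base bandwidth
    dens = np.zeros(len(grid))
    for x, l in zip(cases, lam):             # sum per-case kernels
        d2 = ((grid - x) ** 2).sum(axis=1)
        dens += np.exp(-d2 / (2 * (h * l) ** 2)) / (2 * np.pi * (h * l) ** 2)
    return dens / len(cases)

# Usage: density of simulated cases on a coarse grid
cases = np.random.default_rng(4).normal(0, 1, (300, 2))
gx, gy = np.mgrid[-3:3:60j, -3:3:60j]
grid = np.column_stack([gx.ravel(), gy.ravel()])
surface = adaptive_kde(cases, grid).reshape(gx.shape)
```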

    Digital representation of park use and visual analysis of visitor activities

    Urban public parks can serve an important function by contributing to urban citizens' quality of life. At the same time, they can be sites of displacement and exclusion. Despite this ambiguous role, little is known about actual park use patterns. To learn more about park use in three parks in Zurich, Switzerland, extensive data on visitor activities was collected using a new method based on direct recording via a portable GIS solution. The data was then analyzed using qualitative and quantitative methods. This paper examines whether geographic visualization of these data can help domain experts such as landscape designers and park managers assess park use. To maximize accessibility, the visualizations are made available through the web interface of a common, off-the-shelf GIS. The technical limitations imposed by this choice are critically assessed, before the available visualization techniques are evaluated with respect to the needs and tasks of practitioners with limited knowledge of spatial analysis and GIS. Key criteria are each technique's level of abstraction and graphical complexity. The utility and suitability of the visualization techniques are characterized for the distinct phases of exploration, analysis and synthesis. The findings suggest that, for a target user group of practitioners, a combination of dot maps showing the raw data and surface maps showing derived density values for several attributes serves the purpose of knowledge generation best.
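    As an illustration of the recommended combination, the sketch below draws the same synthetic visitor activity data twice: once as a dot map of the raw observations and once as a surface map of derived density values. The data and figure layout are made up for the example.

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import gaussian_kde

# Hypothetical recorded visitor activity locations within a park
rng = np.random.default_rng(2)
xy = np.concatenate([rng.normal((30, 40), 5, (300, 2)),
                     rng.normal((70, 60), 8, (200, 2))])

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))

# Dot map: the raw observations, one dot per recorded activity
ax1.scatter(xy[:, 0], xy[:, 1], s=4, alpha=0.5)
ax1.set_title('Dot map (raw data)')

# Surface map: derived density values from the same observations
gx, gy = np.mgrid[0:100:200j, 0:100:200j]
dens = gaussian_kde(xy.T)(np.vstack([gx.ravel(), gy.ravel()]))
ax2.contourf(gx, gy, dens.reshape(gx.shape), levels=12)
ax2.set_title('Surface map (density)')
plt.show()
```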

    Deploying, Improving and Evaluating Edge Bundling Methods for Visualizing Large Graphs

    A tremendous increase in the scale of graphs has been witnessed in a wide range of fields, which demands efficient and effective visualization techniques to help users better understand large graphs. Conventional node-link diagrams are often used to visualize graphs, but excessive edge crossings can easily incur severe visual clutter in the node-link diagram of a large graph. Edge bundling can effectively remedy visual clutter and reveal high-level graph structures. Although significant efforts have been devoted to developing edge bundling, three challenging problems remain. First, edge bundling techniques are often computationally expensive and are not easy to deploy for web-based applications. State-of-the-art edge bundling methods often require special system support, such as high-end GPU acceleration, for large graphs, which makes them less portable, especially on ubiquitous mobile devices. Second, the quantitative quality of edge bundling results is barely assessed in the literature; comparisons of edge bundling methods currently focus on computational performance and perceptual results. Third, although the family of edge bundling techniques offers a rich set of bundling layouts, there is no generic method for generating different styles of edge bundling. In this research, I aim to address these problems and have made the following contributions. First, I provide an efficient framework for deploying edge bundling on web-based platforms by exploiting standard graphics hardware functions and libraries. My framework can generate high-quality edge bundling results on web-based platforms, and achieves a 50X speedup over the previous state-of-the-art edge bundling method on a graph with half a million edges. Second, I propose a new approach based on moving least squares that lowers the algorithmic complexity of edge bundling. In addition, my approach generates better bundling results than other methods according to a quality metric. Third, I provide an information-theoretic metric to evaluate edge bundling methods, with which domain users can choose appropriate edge bundling methods and parameters for their applications. Last but not least, I present a deep learning framework for edge bundling visualizations. Through a training process that learns the results of a specific edge bundling method, my deep learning framework can infer the final layout of that method. It is a generic framework that can generate the corresponding results of different edge bundling methods. Adviser: Hongfeng Yu
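    For readers unfamiliar with edge bundling, here is a deliberately compact force-directed sketch, simplified from FDEB-style methods: edges are subdivided into control points that are attracted toward the corresponding points of directionally compatible edges. It is not the dissertation's GPU, moving-least-squares, or deep learning approach, and the compatibility test, iteration count, and step size are illustrative assumptions.

```python
import numpy as np

def bundle_edges(endpoints, subdivisions=16, iters=50,
                 step=0.02, compat_threshold=0.7):
    """Pull directionally similar edges into shared bundles by
    iteratively moving interior control points toward the mean of
    compatible edges' control points, plus a smoothing spring term.

    endpoints: (n, 2, 2) array of [source, target] positions.
    """
    t = np.linspace(0, 1, subdivisions + 2)[None, :, None]
    # (n, k, 2) control points, initially along straight lines
    pts = endpoints[:, 0:1, :] * (1 - t) + endpoints[:, 1:2, :] * t
    dirs = endpoints[:, 1] - endpoints[:, 0]
    dirs = dirs / (np.linalg.norm(dirs, axis=1, keepdims=True) + 1e-12)
    compat = np.abs(dirs @ dirs.T) > compat_threshold  # angle test only
    np.fill_diagonal(compat, False)

    for _ in range(iters):
        # Attraction toward the mean of compatible edges' points
        attract = np.zeros_like(pts)
        for i in range(len(pts)):
            if compat[i].any():
                attract[i] = pts[compat[i]].mean(axis=0) - pts[i]
        # Spring force smooths each polyline (second differences)
        spring = np.zeros_like(pts)
        spring[:, 1:-1] = pts[:, :-2] + pts[:, 2:] - 2 * pts[:, 1:-1]
        pts[:, 1:-1] += step * (attract + spring)[:, 1:-1]  # fix endpoints
    return pts

# Usage: three nearly parallel edges pull into one bundle
edges = np.array([[[0, 0], [10, 0.5]],
                  [[0, 1], [10, 1.0]],
                  [[0, 2], [10, 1.5]]], dtype=float)
curves = bundle_edges(edges)
```

    Real bundling methods use richer compatibility measures (position, scale, visibility) and adaptive subdivision; this sketch keeps only the angle test for brevity.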

    Interactive Spatiotemporal Analysis of Oil Spills Using Comap in North Dakota

    The aim of this study is to analyze oil spill patterns from various types of incidents and contaminants, to determine the extent to which incident data can be used as a baseline to prevent hazardous material releases and improve response activities at the state level. The study addresses the importance of collecting and sharing oil spill incident data, as well as of analytics using these data. Temporal, spatial and spatiotemporal analysis techniques are employed for the oil-spill-related environmental incidents observed in the state of North Dakota, United States of America, from 2000 to 2014, as a result of the oil boom. Specifically, spatiotemporal methods are used to examine how the patterns of environmental incidents in North Dakota vary with the time of day, the day, the month, and the season. Results indicate critical spatial and temporal variations in the distribution of environmental incidents. The application of a spatiotemporal interaction visualization technique called the comap has the potential to help planners and decision makers formulate policy to mitigate the risks associated with environmental incidents, improve safety, and allocate resources.
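    A comap conditions a spatial display on a temporal variable, drawing one map panel per time interval. The sketch below builds a minimal comap over synthetic incident records; real comaps often use overlapping intervals with roughly equal counts, and all data and names here are illustrative assumptions.

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical incident records: x, y (projected) and hour of day
rng = np.random.default_rng(3)
x, y = rng.uniform(0, 100, 500), rng.uniform(0, 100, 500)
hour = rng.integers(0, 24, 500)

# Comap: condition the spatial view on a temporal variable by
# drawing one panel of incident intensity per time interval
bins = [(0, 6), (6, 12), (12, 18), (18, 24)]
fig, axes = plt.subplots(1, 4, figsize=(14, 3.5),
                         sharex=True, sharey=True)
for ax, (lo, hi) in zip(axes, bins):
    m = (hour >= lo) & (hour < hi)
    ax.hexbin(x[m], y[m], gridsize=15, mincnt=1)
    ax.set_title(f'{lo:02d}:00-{hi:02d}:00')
fig.suptitle('Comap: incident intensity conditioned on time of day')
plt.show()
```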

    A Spatial Approach to Surveying Crime‐problematic Areas at the Street Level

    Papers, communications and posters presented at the 17th AGILE Conference on Geographic Information Science "Connecting a Digital Europe through Location and Place", held at the Universitat Jaume I from 3 to 6 June 2014.
    Reaching far beyond the realm of geography and its related disciplines, spatial analysis and visualization tools now actively support the decision-making processes of law enforcement agencies. Interactive mapping of crime outperforms the previously manual and laborious querying of crime databases. Using burglary and robbery events reported in the city of Manchester, England, we illustrate the utility of graphical methods for the interactive analysis and visualization of event data. These novel surveillance techniques provide insight into offending characteristics and changes in the offending process in ways that cannot be replicated by traditional crime investigation methods. We present a step-wise methodology for computing the intensity of aggregated crime events, which can potentially accelerate law enforcers' decision-making processes by mapping concentrations of crime in near real time.
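    The abstract does not spell out the step-wise intensity methodology, but one common way to compute street-level crime intensity is to aggregate events onto their nearest street segments, as in the sketch below. The function name and the brute-force nearest-segment search are assumptions for the example, not the paper's method.

```python
import numpy as np

def segment_intensity(events, segments):
    """Aggregate point events onto street segments: each event is
    assigned to its nearest segment (by point-to-segment distance)
    and intensity is the per-segment count (normalizable by length).

    events:   (n, 2) event coordinates
    segments: (m, 2, 2) street segments as [start, end] points
    """
    a, b = segments[:, 0], segments[:, 1]          # (m, 2) each
    ab = b - a
    ab_len2 = (ab ** 2).sum(axis=1) + 1e-12
    counts = np.zeros(len(segments), dtype=int)
    for p in events:
        # Project p onto every segment, clamped to the segment ends
        tproj = np.clip(((p - a) * ab).sum(axis=1) / ab_len2, 0, 1)
        closest = a + tproj[:, None] * ab
        d2 = ((closest - p) ** 2).sum(axis=1)
        counts[np.argmin(d2)] += 1
    return counts

# Usage: two streets; two events fall nearer the first one
segs = np.array([[[0, 0], [10, 0]], [[0, 5], [10, 5]]], dtype=float)
evts = np.array([[2.0, 0.4], [7.0, 1.0], [5.0, 4.8]])
print(segment_intensity(evts, segs))   # -> [2 1]
```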