
    Topological Machine Learning with Persistence Indicator Functions

    Techniques from computational topology, in particular persistent homology, are becoming increasingly relevant for data analysis. Their stable metrics permit the use of many distance-based data analysis methods, such as multidimensional scaling, while providing a firm theoretical ground. Many modern machine learning algorithms, however, are based on kernels. This paper presents persistence indicator functions (PIFs), which summarize persistence diagrams, i.e., feature descriptors in topological data analysis. PIFs can be calculated and compared in linear time and have many beneficial properties, such as the availability of a kernel-based similarity measure. We demonstrate their usage in common data analysis scenarios, such as confidence set estimation and classification of complex structured data. Comment: Topology-based Methods in Visualization 201
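The idea behind a PIF can be sketched in a few lines: given a persistence diagram, i.e., a set of (birth, death) pairs, the indicator function counts how many intervals are active at each point, yielding a step function that two diagrams can be compared through. The sketch below, under that reading of the abstract, builds the step function, computes an L1 distance between two PIFs, and derives a Gaussian-type kernel from it; the exact kernel construction in the paper may differ.

```python
from math import exp

def pif(diagram):
    """Persistence indicator function: a step function counting how many
    (birth, death) intervals of the diagram are active at each point."""
    events = sorted([(b, +1) for b, d in diagram] + [(d, -1) for b, d in diagram])
    steps, count = [], 0
    for t, delta in events:
        count += delta
        if steps and steps[-1][0] == t:
            steps[-1] = (t, count)          # merge coincident events
        else:
            steps.append((t, count))
    return steps  # list of (breakpoint, value on [breakpoint, next breakpoint))

def value_at(steps, t):
    """Evaluate a step function at t (0 before the first breakpoint)."""
    v = 0
    for x, c in steps:
        if x <= t:
            v = c
        else:
            break
    return v

def l1_distance(f, g):
    """L1 distance between two PIFs (piecewise constant, compact support),
    integrated exactly over the merged breakpoints."""
    xs = sorted({x for x, _ in f} | {x for x, _ in g})
    total = 0.0
    for a, b in zip(xs, xs[1:]):
        total += abs(value_at(f, a) - value_at(g, a)) * (b - a)
    return total

def pif_kernel(d1, d2, sigma=1.0):
    """Gaussian-type similarity measure derived from the L1 distance."""
    return exp(-l1_distance(pif(d1), pif(d2)) / sigma)
```

Once the event list is sorted, building and comparing the step functions is a linear-time merge, which matches the linear-time claim in the abstract.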

    Visual Analytics Of Sports Data

    In this dissertation, we discuss the analysis and visualization of performance anxiety in tennis matches, along with confidence and momentum. We also discuss the micro-level analysis and visualization of tennis shot patterns with fractal tables and tactical rings, followed by a discussion of mapping a tennis player's style of play with a visual analysis technique called tennis fingerprinting. According to sports psychology, anxiety, confidence, and momentum have a big impact on an athlete's performance in a sporting event. Although much work has been done in sports data analysis and visualization, analysis of anxiety, confidence, and momentum has rarely been included in recent literature. We propose a method to analyze a tennis player's anxiety, confidence, and momentum levels during a tennis match. This method is based on psychological theories of anxiety and a database of over 4,000 professional tennis matches. Since sports data analysis and visualization can be a useful tool for gaining insights into the games, we present new techniques to analyze and visualize the shot patterns in tennis matches via our Fractal Tables and Tactical Rings. Tennis is a complicated game that involves a rich set of tactics and strategies. Current tennis analyses are usually conducted at a high level, which often fails to show the useful patterns and nuances embedded in low-level data. Based on a very detailed database of professional tennis matches, however, we have developed a system to analyze serve and shot patterns so that a user can explore questions such as: What are the favorite patterns of this player? What are the most effective patterns for this player? This can help tennis experts and fans gain a deeper insight into and appreciation of the sport that is not usually obvious just by watching a match. Further, we present a new visual analytics technique called Tennis Fingerprinting to analyze tennis players' tactical patterns and styles of play. In tennis, style is a complicated and often abstract concept that cannot be easily described or analyzed. The proposed visualization method is an attempt to provide a concrete and visual representation of a tennis player's style.

    Power system security boundary visualization using intelligent techniques

    In the open access environment, one of the challenges for utilities is that typical operating conditions tend to be much closer to security boundaries. Consequently, security levels for the transmission network must be accurately assessed and easily identified on-line by system operators. Security assessment through boundary visualization provides the operator with knowledge of system security levels in terms of easily monitorable pre-contingency operating parameters. The traditional boundary visualization approach results in a two-dimensional graph called a nomogram. However, intensive labor involvement, inaccurate boundary representation, and little flexibility in integration with the energy management system greatly restrict the use of nomograms in a competitive utility environment. Motivated by the new operating environment and based on the traditional nomogram development procedure, an automatic security boundary visualization methodology has been developed using neural networks with feature selection. This methodology provides a new security assessment tool for power system operations. The main steps of this methodology are data generation, feature selection, neural network training, and boundary visualization. In data generation, a systematic approach has been developed to generate high-quality data. Several data analysis techniques have been used to analyze the data before neural network training. In feature selection, genetic algorithm based methods have been used to select the most predictive pre-contingency operating parameters. Following neural network training, a confidence interval calculation method to measure the reliability of the neural network output has been derived. Sensitivity analysis of the neural network output with respect to the input parameters has also been derived. In boundary visualization, a composite security boundary visualization algorithm has been proposed to present accurate boundaries to operators in two-dimensional diagrams for any type of security problem. This methodology has been applied to thermal overload and voltage instability problems on a sample system.
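The pipeline of the abstract (label operating points, train a network, trace the boundary in two pre-contingency parameters) can be sketched minimally. Everything here is a stand-in: the "security rule", the parameter names (load, flow), and the single-neuron logistic model replacing the multilayer networks and genetic-algorithm feature selection of the actual methodology.

```python
import math, random

random.seed(0)

# Hypothetical pre-contingency operating parameters (load, flow), labeled 1
# (secure) by a stand-in linear rule -- NOT a real security criterion.
def secure(load, flow):
    return 1 if 0.8 * load + flow < 1.0 else 0

data = [(random.random(), random.random()) for _ in range(300)]
labels = [secure(x, y) for x, y in data]

# One-neuron "network" (logistic regression) trained by batch gradient
# descent, as a minimal stand-in for the trained neural network.
w, b, lr = [0.0, 0.0], 0.0, 0.5
for _ in range(1500):
    gw, gb = [0.0, 0.0], 0.0
    for (x, y), t in zip(data, labels):
        p = 1.0 / (1.0 + math.exp(-(w[0] * x + w[1] * y + b)))
        gw[0] += (p - t) * x
        gw[1] += (p - t) * y
        gb += p - t
    n = len(data)
    w[0] -= lr * gw[0] / n
    w[1] -= lr * gw[1] / n
    b -= lr * gb / n

def boundary_flow(load):
    """Boundary visualization step: for a given load, the flow at which the
    predicted security probability crosses 0.5 (w0*load + w1*flow + b = 0)."""
    return -(w[0] * load + b) / w[1]
```

Sweeping `boundary_flow` over the load axis yields the two-dimensional security boundary that would be drawn for the operator.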

    How to apply the novel dynamic ARDL simulations (dynardl) and Kernel-based regularized least squares (krls)

    The application of dynamic Autoregressive Distributed Lag (dynardl) simulations and Kernel-based Regularized Least Squares (krls) to time series data is gradually gaining recognition in energy, environmental, and health economics. The Kernel-based Regularized Least Squares technique is a simplified machine-learning algorithm whose strengths lie in its interpretability and its accounting for heterogeneity, additivity, and nonlinear effects. The novel dynamic ARDL simulations algorithm is useful for testing cointegration and long- and short-run equilibrium relationships in both levels and differences. Advantageously, the novel dynamic ARDL simulations algorithm has a visualization interface for examining the possible counterfactual change in the desired variable based on the notion of ceteris paribus. Thus, the novel dynamic ARDL simulations and Kernel-based Regularized Least Squares techniques are useful and improved time series techniques for policy formulation.
    • We customize ARDL and dynamic simulated ARDL by adding plot estimates with confidence intervals.
    • A step-by-step procedure for applying ARDL, dynamic ARDL simulations, and Kernel-based Regularized Least Squares is provided.
    • All techniques are applied to examine the economic effect of denuclearization in Switzerland by 2034.
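The core of KRLS is plain kernel ridge regression: solve (K + λI)c = y for coefficients c over a Gaussian kernel matrix K, then predict as a kernel-weighted sum. This is a self-contained sketch of that core (the `krls` Stata/R packages add standard errors and marginal-effect machinery not shown here); the regularizer λ and bandwidth σ are illustrative defaults.

```python
import math

def gauss_kernel(x, z, sigma=1.0):
    return math.exp(-((x - z) ** 2) / (2.0 * sigma ** 2))

def solve(A, b):
    """Gaussian elimination with partial pivoting (small systems only)."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def krls_fit(xs, ys, lam=0.1, sigma=1.0):
    """Coefficients c solving (K + lam*I) c = y for a Gaussian kernel matrix."""
    n = len(xs)
    K = [[gauss_kernel(xs[i], xs[j], sigma) + (lam if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    return solve(K, ys)

def krls_predict(xs, coef, x, sigma=1.0):
    """Prediction as a kernel-weighted sum over the training points."""
    return sum(c * gauss_kernel(xi, x, sigma) for xi, c in zip(xs, coef))
```

Fitting this to a nonlinear target (e.g., y = x²) illustrates why the method captures nonlinear effects without the user specifying a functional form.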

    Rapid Sampling for Visualizations with Ordering Guarantees

    Visualizations are frequently used as a means to understand trends and gather insights from datasets, but often take a long time to generate. In this paper, we focus on the problem of rapidly generating approximate visualizations while preserving crucial visual properties of interest to analysts. Our primary focus is on sampling algorithms that preserve the visual property of ordering; our techniques also apply to some other visual properties. For instance, our algorithms can be used to very rapidly generate an approximate visualization of a bar chart in which the comparisons between any two bars are correct. We formally show that our sampling algorithms are generally applicable and provably optimal in theory, in that they do not take more samples than necessary to generate visualizations with ordering guarantees. They also work well in practice, correctly ordering output groups while taking orders of magnitude fewer samples and much less time than conventional sampling schemes. Comment: Tech Report. 17 pages. Condensed version to appear in VLDB Vol. 8 No.
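One simple way to realize such an ordering guarantee is to keep sampling each bar until all pairwise confidence intervals are disjoint, at which point the empirical ordering is correct with high probability. The sketch below does this with Hoeffding intervals over hypothetical Bernoulli "bars"; it illustrates the idea of the paper, not its exact (optimal) algorithm.

```python
import math, random

random.seed(42)

def hoeffding_halfwidth(n, delta):
    """Hoeffding confidence-interval half-width for a [0,1]-valued mean."""
    return math.sqrt(math.log(2.0 / delta) / (2.0 * n))

def order_groups(samplers, delta=0.05, batch=50, max_samples=100_000):
    """Sample each group in rounds until all pairwise confidence intervals
    are disjoint, then return group indices ordered by estimated mean."""
    k = len(samplers)
    sums, counts = [0.0] * k, [0] * k
    while max(counts) < max_samples:
        for i, s in enumerate(samplers):
            for _ in range(batch):
                sums[i] += s()
                counts[i] += 1
        means = [sums[i] / counts[i] for i in range(k)]
        hw = [hoeffding_halfwidth(counts[i], delta) for i in range(k)]
        # Stop once every pair of intervals is separated: the displayed
        # ordering of the bars is then correct with probability >= 1 - delta.
        if all(abs(means[i] - means[j]) > hw[i] + hw[j]
               for i in range(k) for j in range(i + 1, k)):
            return sorted(range(k), key=lambda i: means[i])
    return sorted(range(k), key=lambda i: sums[i] / counts[i])

# Hypothetical bar-chart groups: Bernoulli draws with true means 0.1, 0.9, 0.5.
bars = [lambda: float(random.random() < 0.1),
        lambda: float(random.random() < 0.9),
        lambda: float(random.random() < 0.5)]
```

Because the stopping rule depends on the gaps between means rather than on absolute precision, well-separated bars are ordered after very few samples, which is the source of the speedups the paper reports.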

    Combining Clustering techniques and Formal Concept Analysis to characterize Interestingness Measures

    Formal Concept Analysis (FCA) is a data analysis method that enables the discovery of hidden knowledge in data. One kind of hidden knowledge extracted from data is association rules. Different quality measures have been reported in the literature for extracting only relevant association rules. Given a dataset, the choice of a good quality measure remains a challenging task for a user. Given a matrix evaluating quality measures according to semantic properties, this paper describes how FCA can highlight quality measures with similar behavior in order to help the user make this choice. The aim of this article is the discovery of clusters of interestingness measures (IM), validating those found with the agglomerative hierarchical clustering (AHC) and k-means partitioning methods. Then, based on a theoretical study, proposed in recent work, of sixty-one interestingness measures according to nineteen properties, FCA identifies several groups of measures. Comment: 13 pages, 2 figure
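In FCA terms, the paper's measures-by-properties matrix is a formal context, and the "groups of measures" are the extents of its formal concepts. The brute-force sketch below enumerates all concepts of a tiny context (objects = measures, attributes = properties, both hypothetical names); real FCA tools use closure-based algorithms such as NextClosure instead of this exponential scan.

```python
from itertools import combinations

def extent(context, attrs):
    """Objects possessing every attribute in attrs."""
    return frozenset(o for o, oa in context.items() if attrs <= oa)

def intent(context, objs):
    """Attributes shared by every object in objs."""
    all_attrs = set().union(*context.values())
    return frozenset(a for a in all_attrs if all(a in context[o] for o in objs))

def concepts(context):
    """All formal concepts (extent, intent) of a small context, found by
    closing every attribute subset -- exponential, fine for toy contexts."""
    all_attrs = sorted(set().union(*context.values()))
    found = set()
    for r in range(len(all_attrs) + 1):
        for combo in combinations(all_attrs, r):
            e = extent(context, frozenset(combo))
            found.add((e, intent(context, e)))
    return found
```

Measures landing in the same extent behave identically on the chosen properties, which is exactly the similarity signal the paper cross-checks against the AHC and k-means clusters.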

    Fuzzy Fibers: Uncertainty in dMRI Tractography

    Fiber tracking based on diffusion weighted Magnetic Resonance Imaging (dMRI) allows for noninvasive reconstruction of fiber bundles in the human brain. In this chapter, we discuss sources of error and uncertainty in this technique, and review strategies that afford a more reliable interpretation of the results. This includes methods for computing and rendering probabilistic tractograms, which estimate precision in the face of measurement noise and artifacts. However, we also address aspects that have received less attention so far, such as model selection, partial voluming, and the impact of parameters, both in preprocessing and in fiber tracking itself. We conclude by suggesting directions for future research.
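The probabilistic-tractogram idea mentioned above can be illustrated in miniature: repeatedly integrate streamlines through a direction field while perturbing the local direction with angular noise, then record how often each cell is visited. Everything below is a toy stand-in (a synthetic 2D direction field, Gaussian angular noise, fixed step size), not a real dMRI model.

```python
import math, random

random.seed(1)

def principal_direction(x, y):
    """Toy stand-in for the principal diffusion direction at (x, y):
    fibers run rightward and undulate gently."""
    return math.atan2(0.2 * math.sin(x), 1.0)

def track(x, y, noise=0.15, step=0.1, n_steps=100):
    """One streamline: follow the local direction, perturbed at each step by
    Gaussian angular noise to model measurement uncertainty."""
    pts = [(x, y)]
    for _ in range(n_steps):
        theta = principal_direction(x, y) + random.gauss(0.0, noise)
        x += step * math.cos(theta)
        y += step * math.sin(theta)
        pts.append((x, y))
    return pts

def visit_map(n_streamlines=200, res=1.0):
    """Probabilistic tractogram: per-cell visit counts over many noisy
    streamlines seeded at the same voxel."""
    counts = {}
    for _ in range(n_streamlines):
        for x, y in track(0.0, 0.0):
            cell = (int(x // res), int(y // res))
            counts[cell] = counts.get(cell, 0) + 1
    return counts
```

Rendering `visit_map` as a heat map gives the familiar picture: high counts near the seed where all streamlines agree, fanning out and fading where the noise lets trajectories diverge, which is how such tractograms communicate precision rather than a single "true" fiber.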