Topological Machine Learning with Persistence Indicator Functions
Techniques from computational topology, in particular persistent homology,
are becoming increasingly relevant for data analysis. Their stable metrics
permit the use of many distance-based data analysis methods, such as
multidimensional scaling, while providing a firm theoretical ground. Many
modern machine learning algorithms, however, are based on kernels. This paper
presents persistence indicator functions (PIFs), which summarize persistence
diagrams, i.e., feature descriptors in topological data analysis. PIFs can be
calculated and compared in linear time and have many beneficial properties,
such as the availability of a kernel-based similarity measure. We demonstrate
their usage in common data analysis scenarios, such as confidence set
estimation and classification of complex structured data. Comment: Topology-based Methods in Visualization 201
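
A minimal sketch of the idea behind a persistence indicator function, assuming it counts how many intervals of a persistence diagram are active at a given threshold. The function names, the half-open interval convention `[birth, death)`, and the L1 distance shown here are assumptions made for illustration, not the paper's implementation.

```python
from bisect import bisect_right

def persistence_indicator_function(diagram):
    """Build a step function from a list of finite (birth, death) pairs.

    Returns (xs, values): the function equals values[i] on [xs[i], xs[i+1])
    and 0 before xs[0]. Half-open intervals are an assumption of this sketch.
    """
    events = {}
    for birth, death in diagram:
        events[birth] = events.get(birth, 0) + 1   # interval becomes active
        events[death] = events.get(death, 0) - 1   # interval stops being active
    xs = sorted(events)
    values, active = [], 0
    for x in xs:
        active += events[x]
        values.append(active)
    return xs, values

def evaluate(pif, eps):
    """Number of intervals active at threshold eps."""
    xs, values = pif
    i = bisect_right(xs, eps) - 1
    return values[i] if i >= 0 else 0

def l1_distance(pif_a, pif_b):
    """Integral of |f - g| for two step functions (hypothetical helper)."""
    xs = sorted(set(pif_a[0]) | set(pif_b[0]))
    return sum(
        abs(evaluate(pif_a, left) - evaluate(pif_b, left)) * (right - left)
        for left, right in zip(xs, xs[1:])
    )

pif = persistence_indicator_function([(0, 2), (1, 3)])
other = persistence_indicator_function([(0, 2)])
```

Because both functions are piecewise constant, the distance reduces to a sum over the merged breakpoints.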
Visual Analytics Of Sports Data
In this dissertation, we discuss analysis and visualization of performance anxiety in tennis matches along with confidence and momentum. We also discuss the micro-level analysis and visualization of tennis shot patterns with fractal tables and tactical rings, followed by discussion about mapping a tennis player's style of play with a visual analysis technique called tennis fingerprinting.
According to sports psychology, anxiety, confidence and momentum have a big impact on an athlete's performance in a sporting event. Although much work has been done in sports data analysis and visualization, analysis of anxiety, confidence and momentum has rarely been included in recent literature. We propose a method to analyze a tennis player's anxiety, confidence and momentum levels during a tennis match. This method is based on the psychological theories of anxiety and a database of over 4,000 professional tennis matches. Since sports data analysis and visualization can be a useful tool for gaining insights into the games, we present new techniques to analyze and visualize the shot patterns in tennis matches via our Fractal Tables and Tactical Rings. Tennis is a complicated game that involves a rich set of tactics and strategies. Current tennis analyses are usually conducted at a high level, which often fails to show the useful patterns and nuances embedded in low-level data. However, based on a very detailed database of professional tennis matches, we have developed a system to analyze the serve and shot patterns so that a user can explore questions such as: What are the favorite patterns of this player? What are the most effective patterns for this player? This can help tennis experts and fans gain deeper insight into and appreciation of the sport that is not usually obvious just by watching the match. Further, we present a new visual analytics technique called Tennis Fingerprinting to analyze tennis players' tactical patterns and styles of play. In tennis, style is a complicated and often abstract concept that cannot be easily described or analyzed. The proposed visualization method is an attempt to provide a concrete and visual representation of a tennis player's style
Power system security boundary visualization using intelligent techniques
In the open access environment, one of the challenges for utilities is that typical operating conditions tend to be much closer to security boundaries. Consequently, security levels for the transmission network must be accurately assessed and easily identified on-line by system operators. Security assessment through boundary visualization provides the operator with knowledge of system security levels in terms of easily monitorable pre-contingency operating parameters. The traditional boundary visualization approach results in a two-dimensional graph called a nomogram. However, intensive labor involvement, inaccurate boundary representation, and little flexibility in integrating with the energy management system greatly restrict the use of nomograms in a competitive utility environment. Motivated by the new operating environment and based on the traditional nomogram development procedure, an automatic security boundary visualization methodology has been developed using neural networks with feature selection. This methodology provides a new security assessment tool for power system operations. The main steps of this methodology are data generation, feature selection, neural network training, and boundary visualization. In data generation, a systematic approach has been developed to generate high-quality data. Several data analysis techniques have been used to analyze the data before neural network training. In feature selection, genetic-algorithm-based methods have been used to select the most predictive pre-contingency operating parameters. Following neural network training, a confidence interval calculation method to measure the reliability of the neural network output has been derived. Sensitivity analysis of the neural network output with respect to input parameters has also been derived.
In boundary visualization, a composite security boundary visualization algorithm has been proposed to present accurate boundaries in two-dimensional diagrams to operators for any type of security problem. This methodology has been applied to thermal overload and voltage instability problems for a sample system
How to apply the novel dynamic ARDL simulations (dynardl) and Kernel-based regularized least squares (krls)
The application of dynamic Autoregressive Distributed Lag (dynardl) simulations and Kernel-based Regularized Least Squares (krls) to time series data is gradually gaining recognition in energy, environmental and health economics. The Kernel-based Regularized Least Squares technique is a simplified machine learning-based algorithm whose strengths are its interpretability and its ability to account for heterogeneity, additivity and nonlinear effects. The novel dynamic ARDL Simulations algorithm is useful for testing cointegration and long- and short-run equilibrium relationships in both levels and differences. Advantageously, the novel dynamic ARDL Simulations algorithm has a visualization interface for examining the possible counterfactual change in the desired variable based on the notion of ceteris paribus. Thus, the novel dynamic ARDL Simulations and Kernel-based Regularized Least Squares techniques are useful and improved time series techniques for policy formulation.
• We customize ARDL and dynamic simulated ARDL by adding plot estimates with confidence intervals.
• A step-by-step procedure of applying ARDL, dynamic ARDL Simulations and Kernel-based Regularized Least Squares is provided.
• All techniques are applied to examine the economic effect of denuclearization in Switzerland by 2034.
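
To make the kernel-based regularized least squares idea concrete, here is a minimal sketch that assumes a Gaussian kernel and the closed-form ridge solution c = (K + λI)⁻¹ y. The function names, the default bandwidth (the number of predictors), and the value of `lam` are illustrative assumptions. This is not the krls package, and it skips preprocessing such as standardization.

```python
import math

def gaussian_kernel(a, b, sigma2):
    """Gaussian kernel exp(-||a - b||^2 / sigma2)."""
    return math.exp(-sum((p - q) ** 2 for p, q in zip(a, b)) / sigma2)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x

def krls_fit(X, y, lam=0.1, sigma2=None):
    """Return coefficients c solving (K + lam * I) c = y, plus the bandwidth."""
    if sigma2 is None:
        sigma2 = float(len(X[0]))  # assumed default: number of predictors
    n = len(X)
    A = [[gaussian_kernel(X[i], X[j], sigma2) + (lam if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    return solve(A, y), sigma2

def krls_predict(X_train, coeffs, sigma2, x_new):
    """Prediction is a kernel-weighted sum over the training points."""
    return sum(c * gaussian_kernel(x, x_new, sigma2)
               for c, x in zip(coeffs, X_train))

X = [[0.0], [1.0], [2.0], [3.0]]
y = [0.0, 1.0, 4.0, 9.0]
coeffs, sigma2 = krls_fit(X, y, lam=1e-6)
fitted = [krls_predict(X, coeffs, sigma2, x) for x in X]
```

With a very small `lam` the fit nearly interpolates the training data; larger values trade fit for smoothness.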
Rapid Sampling for Visualizations with Ordering Guarantees
Visualizations are frequently used as a means to understand trends and gather
insights from datasets, but often take a long time to generate. In this paper,
we focus on the problem of rapidly generating approximate visualizations while
preserving crucial visual properties of interest to analysts. Our primary
focus will be on sampling algorithms that preserve the visual property of
ordering; our techniques will also apply to some other visual properties. For
instance, our algorithms can be used to generate an approximate visualization
of a bar chart very rapidly, where the comparisons between any two bars are
correct. We formally show that our sampling algorithms are generally applicable
and provably optimal in theory, in that they do not take more samples than
necessary to generate the visualizations with ordering guarantees. They also
work well in practice, correctly ordering output groups while taking orders of
magnitude fewer samples and much less time than conventional sampling schemes. Comment: Tech Report. 17 pages. Condensed version to appear in VLDB Vol. 8 No.
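
The ordering guarantee can be illustrated with a simplified sketch: keep sampling each bar's group until its confidence interval no longer overlaps any other group's interval. Several choices below are assumptions made for illustration and are not taken from the paper: the Hoeffding-style radius, values bounded in [0, 1], the crude union bound, and the name `sample_until_ordered`.

```python
import math
import random

def sample_until_ordered(groups, delta=0.05, max_rounds=20000, seed=0):
    """Estimate per-group means until their ordering is resolved.

    groups: list of lists of values assumed to lie in [0, 1].
    Returns (order, means, counts), where order lists group indices
    from smallest to largest estimated mean.
    """
    rng = random.Random(seed)
    k = len(groups)
    sums = [0.0] * k
    counts = [0] * k
    means = [0.0] * k
    active = set(range(k))
    # Crude union bound over groups and rounds (assumption of this sketch).
    log_term = math.log(2 * k * max_rounds / delta)
    for _ in range(max_rounds):
        for i in active:
            sums[i] += rng.choice(groups[i])  # sample with replacement
            counts[i] += 1
            means[i] = sums[i] / counts[i]
        radius = [math.sqrt(log_term / (2 * counts[i])) for i in range(k)]
        # A group stays active while its interval overlaps another group's.
        active = {
            i for i in active
            if any(j != i and abs(means[i] - means[j]) <= radius[i] + radius[j]
                   for j in range(k))
        }
        if not active:
            break
    order = sorted(range(k), key=lambda i: means[i])
    return order, means, counts

order, means, counts = sample_until_ordered([[0.7, 0.8, 0.9], [0.1, 0.2, 0.3]])
```

Groups with well-separated means stop being sampled early, while close groups keep drawing samples, which is where the savings over uniform sampling would come from.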
Combining Clustering techniques and Formal Concept Analysis to characterize Interestingness Measures
Formal Concept Analysis "FCA" is a data analysis method that enables the
discovery of hidden knowledge in data. One kind of hidden knowledge
extracted from data is association rules. Different quality measures were
reported in the literature to extract only relevant association rules. Given a
dataset, the choice of a good quality measure remains a challenging task for a
user. Given a quality measures evaluation matrix according to semantic
properties, this paper describes how FCA can highlight quality measures with
similar behavior in order to help the user make this choice. The aim of this
article is to discover clusters of Interestingness Measures "IM" that can
validate those found by the hierarchical and partitioning clustering
methods "AHC" and "k-means". Then, based on the theoretical study of sixty-one
interestingness measures according to nineteen properties, proposed in a recent
study, "FCA" describes several groups of measures. Comment: 13 pages, 2 figure
Fuzzy Fibers: Uncertainty in dMRI Tractography
Fiber tracking based on diffusion weighted Magnetic Resonance Imaging (dMRI)
allows for noninvasive reconstruction of fiber bundles in the human brain. In
this chapter, we discuss sources of error and uncertainty in this technique,
and review strategies that afford a more reliable interpretation of the
results. This includes methods for computing and rendering probabilistic
tractograms, which estimate precision in the face of measurement noise and
artifacts. However, we also address aspects that have received less attention
so far, such as model selection, partial voluming, and the impact of
parameters, both in preprocessing and in fiber tracking itself. We conclude by
giving impulses for future research