SUPPORT EFFECTIVE DISCOVERY MANAGEMENT IN VISUAL ANALYTICS
Visual analytics promises to supply analysts with the means necessary to analyze complex datasets and make effective decisions in a timely manner. Although significant progress has been made towards effective data exploration in existing visual analytics systems, few of them provide systematic solutions for managing the vast amounts of discoveries generated in data exploration processes. Analysts have to use offline tools to manually annotate, browse, retrieve, organize, and connect their discoveries. In addition, they have no convenient access to the important discoveries captured by collaborators. As a consequence, the lack of effective discovery management approaches severely hinders the analysts from utilizing the discoveries to make effective decisions.
In response to this challenge, this dissertation aims to support effective discovery management in visual analytics. It contributes a general discovery management framework which achieves its effectiveness surrounding the concept of patterns, namely the results of users’ low-level analytic tasks. Patterns permit construction of discoveries together with users’ mental models and evaluation. Different from the mental models, the categories of patterns that can be discovered from data are predictable and application-independent. In addition, the same set of information is often used to annotate patterns in the same category. Therefore, visual analytics systems can semi-automatically annotate patterns in a formalized format by predicting what should be recorded for patterns in popular categories. Using the formalized annotations, the framework also enhances the automation and efficiency of a variety of discovery management activities such as discovery browsing, retrieval, organization, association, and sharing. The framework seamlessly integrates them with the visual interactive explorations to support effective decision making.
Guided by the discovery management framework, our second contribution lies in proposing a variety of novel discovery management techniques for facilitating the discovery management activities. The proposed techniques and framework are implemented in a prototype system, ManyInsights, to facilitate discovery management in multidimensional data exploration. To evaluate the prototype system, two long-term case studies are presented. They investigated how the discovery management techniques worked together to benefit exploratory data analysis and collaborative analysis. The studies allowed us to understand the advantages, the limitations, and design implications of ManyInsights and its underlying framework.
TriSig: Assessing the statistical significance of triclusters
Tensor data analysis allows researchers to uncover novel patterns and
relationships that cannot be obtained from matrix data alone. The information
inferred from the patterns provides valuable insights into disease progression,
bioproduction processes, weather fluctuations, and group dynamics. However,
spurious and redundant patterns hamper this process. This work aims at
proposing a statistical frame to assess the probability of patterns in tensor
data to deviate from null expectations, extending well-established principles
for assessing the statistical significance of patterns in matrix data. A
comprehensive discussion on binomial testing for false positive discoveries is
provided in light of variable dependencies, temporal dependencies and
misalignments, and p-value corrections under the Benjamini-Hochberg
procedure. Results gathered from the application of state-of-the-art
triclustering algorithms over distinct real-world case studies in biochemical
and biotechnological domains confer validity to the proposed statistical frame
while revealing vulnerabilities of some triclustering searches. The proposed
assessment can be incorporated into existing triclustering algorithms to
mitigate false positive/spurious discoveries and further prune the search
space, reducing their computational complexity.
Availability: The code is freely available at
https://github.com/JupitersMight/TriSig under the MIT license
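The binomial testing and Benjamini-Hochberg correction mentioned in this abstract can be illustrated with a minimal sketch. It is not the TriSig implementation: it assumes each pattern comes with a known null probability `p` of occurring in a single observation, ignores the variable and temporal dependencies the abstract discusses, and uses hypothetical function names and example numbers.

```python
from math import comb


def binomial_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p): chance of seeing the pattern in at
    least k of n observations if it occurs with null probability p in each."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def benjamini_hochberg(p_values, alpha=0.05):
    """Return one boolean per p-value: True if rejected at FDR level alpha."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Step-up rule: find the largest rank r (1-based) with p_(r) <= r/m * alpha.
    cutoff = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            cutoff = rank
    rejected = [False] * m
    for idx in order[:cutoff]:
        rejected[idx] = True
    return rejected


# Hypothetical patterns: (observations n, supporting observations k, null prob p)
patterns = [(100, 12, 0.02), (100, 4, 0.02), (100, 3, 0.02)]
p_vals = [binomial_tail(n, k, p) for n, k, p in patterns]
significant = benjamini_hochberg(p_vals, alpha=0.05)
```

Patterns whose entry in `significant` is `False` would be treated as likely spurious under these simplifying assumptions.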
Predicting Human Lifespan Limits
Recent discoveries show steady improvements in life expectancy during modern
decades. Does this imply that humans will continue to live longer in the future? We
recently put forward the maximum survival tendency, as found in survival curves
of industrialized countries, which is described by an extended Weibull model with
an age-dependent stretched exponent. The maximum survival tendency suggests that
human survival dynamics may possess an intrinsic limit, beyond which survival
is inevitably forbidden. Based on this tendency, we develop the model and
explore the patterns in the maximum lifespan limits of industrialized
countries over the past three decades. This analysis strategy is simple and
useful for interpreting the complicated human survival dynamics. Comment: 11 pages, 3 figures, 2 tables; Natural Science (in press)
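As an illustration of an extended Weibull model with an age-dependent stretched exponent, the sketch below assumes the survival function takes the form S(x) = exp(-(x/alpha)^beta(x)), where alpha is a characteristic life and beta(x) varies with age. The abstract does not state the formula, so this form, the function names, and the numeric values are assumptions for illustration only.

```python
import math


def survival(age, alpha, beta):
    """Assumed extended Weibull form: S(x) = exp(-(x / alpha) ** beta(x)).

    `beta` is a callable returning the stretched exponent at a given age.
    """
    return math.exp(-((age / alpha) ** beta(age)))


def stretched_exponent(age, surv, alpha):
    """Invert the assumed form: beta(x) = ln(-ln S(x)) / ln(x / alpha).

    Undefined at age == alpha, where ln(x / alpha) is zero.
    """
    return math.log(-math.log(surv)) / math.log(age / alpha)


# Hypothetical parameters: characteristic life of 80 years and an exponent
# that changes linearly with age (illustrative values, not fitted to data).
alpha = 80.0
beta = lambda x: 4.0 + 0.05 * x

s60 = survival(60.0, alpha, beta)
recovered = stretched_exponent(60.0, s60, alpha)  # equals beta(60.0) up to rounding
```

At `age == alpha` the assumed survival probability is exp(-1) regardless of the exponent, which is why alpha is read as a characteristic life in this form.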
Intellectual Capital and the Birth of U.S. Biotechnology Enterprises
We examine the relationship between the intellectual capital of scientists making frontier discoveries, the presence of great university bioscience programs, the presence of venture capital firms, other economic variables, and the founding of U.S. biotechnology enterprises during 1976-1989. Using a linked cross-section/time-series panel data set, we find that the timing and location of the birth of biotech enterprises is determined primarily by intellectual capital measures, particularly the local number of highly productive 'star' scientists actively publishing genetic sequence discoveries. Great universities are likely to grow and recruit star scientists, but their effect is separable from the universities. When the intellectual capital measures are included in our Poisson regressions, the number of venture capital firms in an area reduces the probability of foundings. At least early in the process, star scientists appear to be the scarce, immobile factors of production. Our focus on intellectual capital is related to knowledge spillovers, but in this case 'natural excludability' permits capture of supranormal returns by scientists. Given this reward structure, technology transfer was vigorous without any special intermediating structures. We believe biotechnology may be prototypical of the birth patterns in other innovative industries.