23,430 research outputs found
Mining Closed Itemsets for Coherent Rules: An Inference Analysis Approach
Past observations have shown that a frequent item set mining algorithm are alleged to mine the closed ones because the finish offers a compact and a whole progress set and higher potency. Anyhow, the most recent closed item set mining algorithms works with candidate maintenance combined with check paradigm that is dear in runtime likewise as area usage when support threshold is a smaller amount or the item sets gets long. Here, we show, PEPP with inference analysis that could be a capable approach used for mining closed sequences for coherent rules while not candidate. It implements a unique sequence closure checking format with inference analysis that based mostly on Sequence Graph protruding by an approach labeled Parallel Edge projection and pruning in brief will refer as PEPP. We describe a novel inference analysis approach to prune patterns that tends to derive coherent rules. A whole observation having sparse and dense real-life information sets proved that PEPP with inference analysis performs larger compared to older algorithms because it takes low memory and is quicker than any algorithms those cited in literature frequently
Exploratory topic modeling with distributional semantics
As we continue to collect and store textual data in a multitude of domains,
we are regularly confronted with material whose largely unknown thematic
structure we want to uncover. With unsupervised, exploratory analysis, no prior
knowledge about the content is required and highly open-ended tasks can be
supported. In the past few years, probabilistic topic modeling has emerged as a
popular approach to this problem. Nevertheless, the representation of the
latent topics as aggregations of semi-coherent terms limits their
interpretability and level of detail.
This paper presents an alternative approach to topic modeling that maps
topics as a network for exploration, based on distributional semantics using
learned word vectors. From the granular level of terms and their semantic
similarity relations global topic structures emerge as clustered regions and
gradients of concepts. Moreover, the paper discusses the visual interactive
representation of the topic map, which plays an important role in supporting
its exploration.Comment: Conference: The Fourteenth International Symposium on Intelligent
Data Analysis (IDA 2015
Mining Heterogeneous Multivariate Time-Series for Learning Meaningful Patterns: Application to Home Health Telecare
For the last years, time-series mining has become a challenging issue for
researchers. An important application lies in most monitoring purposes, which
require analyzing large sets of time-series for learning usual patterns. Any
deviation from this learned profile is then considered as an unexpected
situation. Moreover, complex applications may involve the temporal study of
several heterogeneous parameters. In that paper, we propose a method for mining
heterogeneous multivariate time-series for learning meaningful patterns. The
proposed approach allows for mixed time-series -- containing both pattern and
non-pattern data -- such as for imprecise matches, outliers, stretching and
global translating of patterns instances in time. We present the early results
of our approach in the context of monitoring the health status of a person at
home. The purpose is to build a behavioral profile of a person by analyzing the
time variations of several quantitative or qualitative parameters recorded
through a provision of sensors installed in the home
Subjectively Interesting Subgroup Discovery on Real-valued Targets
Deriving insights from high-dimensional data is one of the core problems in
data mining. The difficulty mainly stems from the fact that there are
exponentially many variable combinations to potentially consider, and there are
infinitely many if we consider weighted combinations, even for linear
combinations. Hence, an obvious question is whether we can automate the search
for interesting patterns and visualizations. In this paper, we consider the
setting where a user wants to learn as efficiently as possible about
real-valued attributes. For example, to understand the distribution of crime
rates in different geographic areas in terms of other (numerical, ordinal
and/or categorical) variables that describe the areas. We introduce a method to
find subgroups in the data that are maximally informative (in the formal
Information Theoretic sense) with respect to a single or set of real-valued
target attributes. The subgroup descriptions are in terms of a succinct set of
arbitrarily-typed other attributes. The approach is based on the Subjective
Interestingness framework FORSIED to enable the use of prior knowledge when
finding most informative non-redundant patterns, and hence the method also
supports iterative data mining.Comment: 12 pages, 10 figures, 2 tables, conference submissio
Intrinsically Dynamic Network Communities
Community finding algorithms for networks have recently been extended to
dynamic data. Most of these recent methods aim at exhibiting community
partitions from successive graph snapshots and thereafter connecting or
smoothing these partitions using clever time-dependent features and sampling
techniques. These approaches are nonetheless achieving longitudinal rather than
dynamic community detection. We assume that communities are fundamentally
defined by the repetition of interactions among a set of nodes over time.
According to this definition, analyzing the data by considering successive
snapshots induces a significant loss of information: we suggest that it blurs
essentially dynamic phenomena - such as communities based on repeated
inter-temporal interactions, nodes switching from a community to another across
time, or the possibility that a community survives while its members are being
integrally replaced over a longer time period. We propose a formalism which
aims at tackling this issue in the context of time-directed datasets (such as
citation networks), and present several illustrations on both empirical and
synthetic dynamic networks. We eventually introduce intrinsically dynamic
metrics to qualify temporal community structure and emphasize their possible
role as an estimator of the quality of the community detection - taking into
account the fact that various empirical contexts may call for distinct
`community' definitions and detection criteria.Comment: 27 pages, 11 figure
Integration of decision support systems to improve decision support performance
Decision support system (DSS) is a well-established research and development area. Traditional isolated, stand-alone DSS has been recently facing new challenges. In order to improve the performance of DSS to meet the challenges, research has been actively carried out to develop integrated decision support systems (IDSS). This paper reviews the current research efforts with regard to the development of IDSS. The focus of the paper is on the integration aspect for IDSS through multiple perspectives, and the technologies that support this integration. More than 100 papers and software systems are discussed. Current research efforts and the development status of IDSS are explained, compared and classified. In addition, future trends and challenges in integration are outlined. The paper concludes that by addressing integration, better support will be provided to decision makers, with the expectation of both better decisions and improved decision making processes
- …