On mining complex sequential data by means of FCA and pattern structures
Nowadays, data sets are available in very complex and heterogeneous forms.
Mining of such data collections is essential to support many real-world
applications ranging from healthcare to marketing. In this work, we focus on
the analysis of "complex" sequential data by means of interesting sequential
patterns. We approach the problem using the elegant mathematical framework of
Formal Concept Analysis (FCA) and its extension based on "pattern structures".
Pattern structures are used for mining complex data (such as sequences or
graphs) and are based on a subsumption operation, which in our case is defined
with respect to the partial order on sequences. We show how pattern structures,
along with projections (i.e., a data reduction of sequential structures), are
able to enumerate more meaningful patterns and increase the computing
efficiency of the approach. Finally, we show the applicability of the presented
method for discovering and analyzing interesting patient patterns from a French
healthcare data set on cancer. The quantitative and qualitative results (with
annotations and analysis from a physician) are reported in this use case, which
is the main motivation for this work.
Keywords: data mining; formal concept analysis; pattern structures;
projections; sequences; sequential data.
Comment: Accepted for publication in the International Journal of General Systems.
The paper was written in the wake of the conference on Concept Lattices and
Their Applications (CLA'2013). 27 pages, 9 figures, 3 tables
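To make the subsumption idea above concrete, the following Python sketch checks whether one sequence of itemsets embeds into another while preserving order. It is illustrative only: the names `is_subsequence` and `seq`, the toy care events, and the choice to allow gaps between matched elements are assumptions, and the paper's own partial order on sequences may be defined differently.

```python
from typing import FrozenSet, Iterable, List, Sequence

Itemset = FrozenSet[str]


def is_subsequence(small: Sequence[Itemset], big: Sequence[Itemset]) -> bool:
    """Return True if `small` embeds into `big` with order preserved.

    Each element of `small` must be a subset of a distinct element of `big`,
    and the matched elements of `big` must appear in the same order.
    Gaps between matched elements are allowed (an assumption of this sketch).
    """
    pos = 0
    for element in small:
        # Advance to the earliest remaining element of `big` containing `element`.
        while pos < len(big) and not element <= big[pos]:
            pos += 1
        if pos == len(big):
            return False
        pos += 1  # the next element of `small` must match strictly later
    return True


def seq(*elements: Iterable[str]) -> List[Itemset]:
    """Helper (hypothetical) that builds a sequence of frozensets."""
    return [frozenset(e) for e in elements]


# Hypothetical care trajectory and candidate pattern.
patient = seq({"consultation"}, {"surgery", "chemotherapy"}, {"radiotherapy"})
pattern = seq({"surgery"}, {"radiotherapy"})
print(is_subsequence(pattern, patient))
```

Under this ordering, a pattern subsumes every patient trajectory it embeds into, which is the relation a pattern structure on sequences would build on.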
Trajectory Clustering and an Application to Airspace Monitoring
This paper presents a framework aimed at monitoring the behavior of aircraft
in a given airspace. Nominal trajectories are determined and learned using
data-driven methods. Standard procedures are used by air traffic controllers (ATC)
to guide aircraft, ensure the safety of the airspace, and maximize the
runway occupancy. Even though standard procedures are used by ATC, the control
of the aircraft remains with the pilots, leading to a large variability in the
flight patterns observed. Two methods to identify typical operations and their
variability from recorded radar tracks are presented. This knowledge base is
then used to monitor the conformance of current operations against operations
previously identified as standard. A tool called AirTrajectoryMiner is
presented, aiming at monitoring the instantaneous health of the airspace, in
real time. The airspace is "healthy" when all aircraft are flying according to
the nominal procedures. A measure of complexity is introduced, measuring the
conformance of current flights to nominal flight patterns. When an aircraft does
not conform, the complexity increases as more attention from ATC is required to
ensure a safe separation between aircraft.
Comment: 15 pages, 20 figures
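As a rough illustration of the conformance idea, the Python sketch below counts aircraft whose current position lies farther than a threshold from every nominal track. The abstract does not define the complexity measure, so the distance rule, the threshold, and the function names are all assumptions rather than the method implemented in AirTrajectoryMiner.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def distance_to_track(point: Point, track: Sequence[Point]) -> float:
    """Distance from `point` to the nearest sampled point of a nominal track."""
    return min(math.hypot(point[0] - x, point[1] - y) for x, y in track)


def nonconforming_count(
    positions: Sequence[Point],
    nominal_tracks: Sequence[Sequence[Point]],
    threshold: float,
) -> int:
    """Count aircraft farther than `threshold` from every nominal track.

    This count is only a stand-in for a complexity score (an assumption).
    """
    count = 0
    for position in positions:
        nearest = min(distance_to_track(position, t) for t in nominal_tracks)
        if nearest > threshold:
            count += 1
    return count


# Hypothetical data: one straight nominal track and two aircraft.
nominal: List[List[Point]] = [[(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]]
aircraft: List[Point] = [(1.0, 0.1), (1.0, 3.0)]
print(nonconforming_count(aircraft, nominal, threshold=0.5))
```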
NEARBY Platform: Algorithm for Automated Asteroids Detection in Astronomical Images
In the past two decades an increasing interest in discovering Near Earth
Objects has been noted in the astronomical community. Dedicated surveys have
been operated for data acquisition and processing, resulting in the
discovery to date of over 18,000 objects that come within 30 million miles of
Earth. Nevertheless, recent events have shown that there are still many
undiscovered asteroids that could be on a collision course with Earth. This article
presents an original NEO detection algorithm developed in the NEARBY research
project, which has been integrated into an automated MOPS processing pipeline
aimed at identifying moving space objects based on the blink method. The proposed
solution can be considered an approach to Big Data processing and analysis,
implementing visual analytics techniques for rapid human data validation.
Comment: IEEE 14th International Conference on Intelligent Computer
Communication and Processing (ICCP), Sep 6-8, 2018, Cluj-Napoca, Romania
Analysing Human Mobility Patterns of Hiking Activities through Complex Network Theory
The exploitation of high volumes of geolocalized data from social sport
tracking applications for outdoor activities can be useful for natural resource
planning and for understanding human mobility patterns during leisure
activities. This geolocalized data represents the selection of hiking activities
according to subjective and objective factors such as personal goals, personal
abilities, trail conditions or weather conditions. In our approach, human
mobility patterns are analysed from trajectories which are generated by hikers.
We propose the generation of the trail network identifying special points in
the overlap of trajectories. Trail crossings and trailheads define our network
and shape topological features. We analyse the trail network of the Balearic
Islands as a case study, using complex weighted network theory. The
analysis is divided into the four seasons of the year to observe the impact of
weather conditions on the network topology. The number of visited places does
not decrease despite the large difference in the number of samples between the
seasons with the highest and lowest activity. The most significant variation in
the frequency and location of activities, from inland regions to coastal areas,
occurs in summer. Finally, we compare our model
with other related studies where the network serves a different purpose. One
finding of our approach is the detection of regions of particular importance
where landscape interventions can be applied according to the communities.
Comment: 20 pages, 9 figures, accepte
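The network construction described above can be sketched in Python as follows, assuming each trajectory has already been reduced to an ordered list of special points (trail crossings and trailheads). The point labels and function names are hypothetical, and edge weights here simply count traversals, which may not match the weighting used in the study.

```python
from collections import Counter
from typing import FrozenSet, Iterable, Sequence


def build_trail_network(trajectories: Iterable[Sequence[str]]) -> Counter:
    """Build an undirected weighted edge list from trajectories.

    Each trajectory is an ordered list of special-point labels; the weight of
    an edge is the number of times consecutive points were traversed.
    """
    edges: Counter = Counter()
    for path in trajectories:
        for a, b in zip(path, path[1:]):
            if a != b:  # ignore repeated points
                edges[frozenset((a, b))] += 1
    return edges


def node_strength(edges: Counter) -> Counter:
    """Sum of incident edge weights per node (weighted degree)."""
    strength: Counter = Counter()
    for edge, weight in edges.items():
        for node in edge:
            strength[node] += weight
    return strength


# Hypothetical hikes over four special points.
hikes = [
    ["trailhead_A", "crossing_1", "summit"],
    ["summit", "crossing_1", "trailhead_B"],
    ["trailhead_A", "crossing_1"],
]
network = build_trail_network(hikes)
print(node_strength(network).most_common(1))
```

Splitting the input trajectories by season and rebuilding the network for each subset would give one weighted graph per season to compare.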
Mining Heterogeneous Multivariate Time-Series for Learning Meaningful Patterns: Application to Home Health Telecare
In recent years, time-series mining has become a challenging issue for
researchers. An important application lies in monitoring, which
requires analyzing large sets of time series to learn usual patterns. Any
deviation from this learned profile is then considered as an unexpected
situation. Moreover, complex applications may involve the temporal study of
several heterogeneous parameters. In this paper, we propose a method for mining
heterogeneous multivariate time series to learn meaningful patterns. The
proposed approach allows for mixed time series (containing both pattern and
non-pattern data), as well as for imprecise matches, outliers, and stretching and
global translation of pattern instances in time. We present the early results
of our approach in the context of monitoring the health status of a person at
home. The purpose is to build a behavioral profile of a person by analyzing the
time variations of several quantitative or qualitative parameters recorded
through a set of sensors installed in the home.
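One simple way to picture "deviation from a learned profile" is a per-time-slot mean and standard deviation with a threshold test, sketched below in Python. This is a deliberately simpler stand-in, not the pattern-mining method the paper proposes; the function names, the factor `k`, and the floor `min_sd` are assumptions.

```python
from statistics import mean, stdev
from typing import List, Sequence, Tuple


def learn_profile(history: Sequence[Sequence[float]]) -> List[Tuple[float, float]]:
    """Learn (mean, standard deviation) per time slot from past days.

    `history` holds at least two days, each with the same number of slots.
    """
    slots = zip(*history)  # regroup values by time slot
    return [(mean(values), stdev(values)) for values in slots]


def deviations(
    day: Sequence[float],
    profile: Sequence[Tuple[float, float]],
    k: float = 3.0,
    min_sd: float = 0.5,
) -> List[int]:
    """Return indices of slots where `day` departs from the profile.

    A slot is flagged when |value - mean| > k * max(sd, min_sd); the floor
    `min_sd` avoids flagging tiny changes in slots that never varied.
    """
    flagged = []
    for i, (value, (mu, sd)) in enumerate(zip(day, profile)):
        if abs(value - mu) > k * max(sd, min_sd):
            flagged.append(i)
    return flagged


# Hypothetical sensor readings: four past days, three time slots each.
past_days = [[1, 5, 2], [1, 6, 2], [1, 5, 2], [1, 6, 2]]
profile = learn_profile(past_days)
print(deviations([1, 5.5, 9], profile))
```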