NEW METHODS FOR MINING SEQUENTIAL AND TIME SERIES DATA

Author: Al-Naymat Ghazi
Publication venue: Faculty of Engineering and Information Technologies, School of Information Technologies
Publication date: 01/01/2009

Data mining is the process of extracting knowledge from large amounts of data. It covers a variety of techniques aimed at discovering diverse types of patterns on the basis of the requirements of the domain. These techniques include association rule mining, classification, cluster analysis and outlier detection. The availability of applications that produce massive amounts of spatial, spatio-temporal (ST) and time series data (TSD) is the rationale for developing specialized techniques to mine such data. In spatial data mining, the spatial co-location rule problem is different from the association rule problem, since there is no natural notion of transactions in spatial datasets that are embedded in continuous geographic space. Therefore, we have proposed an efficient algorithm (GridClique) to mine interesting spatial co-location patterns (maximal cliques). These patterns are used as the raw transactions for an association rule mining technique to discover complex co-location rules. Our proposal includes certain types of complex relationships, especially negative relationships, in the patterns. The relationships can be obtained from only the maximal clique patterns, which have never been used until now. Our approach is applied to a well-known astronomy dataset obtained from the Sloan Digital Sky Survey (SDSS). ST data is continuously collected and made accessible in the public domain. We present an approach to mine and query large ST data with the aim of finding interesting patterns and understanding the underlying process of data generation. An important class of queries is based on the flock pattern. A flock is a large subset of objects moving along paths close to each other for a predefined time.
One approach to processing a “flock query” is to map ST data into high-dimensional space and to reduce the query to a sequence of standard range queries that can be answered using a spatial indexing structure; however, the performance of spatial indexing structures rapidly deteriorates in high-dimensional space. This thesis sets out a preprocessing strategy that uses a random projection to reduce the dimensionality of the transformed space. We use probabilistic arguments to prove the accuracy of the projection and present experimental results that show the possibility of managing the curse of dimensionality in an ST setting by combining random projections with traditional data structures. In time series data mining, we devised a new space-efficient algorithm (SparseDTW) to compute the dynamic time warping (DTW) distance between two time series, which always yields the optimal result. This is in contrast to other approaches, which typically sacrifice optimality to attain space efficiency. The main idea behind our approach is to dynamically exploit the existence of similarity and/or correlation between the time series: the greater the similarity between the time series, the less space is required to compute the DTW between them. Other techniques for speeding up DTW impose a priori constraints and do not exploit similarity characteristics that may be present in the data. Our experiments demonstrate that SparseDTW outperforms these approaches. We discover an interesting pattern by applying the SparseDTW algorithm: “pairs trading” in a large stock-market dataset of daily index prices from the Australian Stock Exchange (ASX) from 1980 to 2002.
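For readers unfamiliar with the DTW distance mentioned in this abstract, the following is a minimal sketch of the classic full-matrix dynamic programming formulation. It is not the thesis's SparseDTW algorithm, whose details are not given here; the function name `dtw_distance` and the squared-difference local cost are assumptions made for illustration only.

```python
from math import inf


def dtw_distance(x, y):
    """Classic full-matrix DTW (illustrative only, NOT SparseDTW).

    Assumption: the local cost is the squared difference of two samples.
    Uses O(len(x) * len(y)) space, which is the cost that space-efficient
    variants aim to reduce.
    """
    n, m = len(x), len(y)
    # cost[i][j] = minimal cumulative cost of aligning x[:i] with y[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = (x[i - 1] - y[j - 1]) ** 2
            # Extend the cheapest of the three admissible predecessor cells.
            cost[i][j] = local + min(
                cost[i - 1][j],      # x advances, y repeats its sample
                cost[i][j - 1],      # y advances, x repeats its sample
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]
```

Because warping lets one sample align with several, a series and a time-stretched copy of it, such as `[1, 2, 3]` and `[1, 2, 2, 3]`, are at distance zero under this sketch.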

CiteSeerX

Sydney eScholarship

IEEE Access Special Section Editorial: Big Data Technology and Applications in Intelligent Transportation

Author: Arabnia Hamid R.
Kim Tai-Hoon
Mohammed Sabah
Qu Xiaobo
Zhang Dalin
Zhao Jiandong
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2020

During the last few years, the information technology and transportation industries, along with automotive manufacturers and academia, have been focusing on leveraging intelligent transportation systems (ITS) to improve services related to driver experience, connected cars, Internet data plans for vehicles, traffic infrastructure, urban transportation systems, collaborative traffic management, road traffic accident analysis, road traffic flow prediction, public transportation service plans, personal travel route plans, and the development of an effective ecosystem for vehicles, drivers, traffic controllers, city planners, and transportation applications. Moreover, the emerging technologies of the Internet of Things (IoT) and cloud computing have provided unprecedented opportunities for the development and realization of innovative intelligent transportation systems in which sensors and mobile devices gather information and cloud computing enables knowledge discovery, information sharing, and supported decision making. However, the development of such data-driven ITS requires the integration, processing, and analysis of plentiful information obtained from millions of vehicles, traffic infrastructures, smartphones, and other collaborative systems such as weather stations and road safety and early warning systems. The huge amount of data generated by ITS devices is only of value if it is utilized in data analytics for decision-making, such as accident prevention and detection, controlling road risks, reducing traffic carbon emissions, and other applications that bring big data analytics into the picture.

Chalmers Research

Review of Top Quark Physics Results

Author: A. KUMAR
Abachi S.
Abazov V.
Abulencia A.
Acosta D.
Akessan T.
Albajar C.
Behrend H.
Berends F. A.
Breiman L.
Cacciari M.
Catani S.
Corcella G.
Dimoupoulos S.
Elsen E.
Haestier J.
Heinemeyer S.
M. NARAIN
Maltoni F.
Mangano M.
Mrenna S.
Quigg C.
R. KEHOE
Salam A.
Stelzer T.
Publication venue: 'World Scientific Pub Co Pte Lt'
Publication date: 01/12/2007

As the heaviest known fundamental particle, the top quark has taken a central role in the study of fundamental interactions. Production of top quarks in pairs provides an important probe of strong interactions. The top quark mass is a key fundamental parameter which places a valuable constraint on the Higgs boson mass and electroweak symmetry breaking. Observations of the relative rates and kinematics of top quark final states constrain potential new physics. In many cases, the tests available with study of the top quark are both critical and unique. Large increases in data samples from the Fermilab Tevatron have been coupled with major improvements in experimental techniques to produce many new precision measurements of the top quark. The first direct evidence for electroweak production of top quarks has been obtained, with a resulting direct determination of $V_{tb}$. Several of the properties of the top quark have been measured. Progress has also been made in obtaining improved limits on potential anomalous production and decay mechanisms. This review presents an overview of recent theoretical and experimental developments in this field. We also provide a brief discussion of the implications for further efforts. Comment: 119 pages, 55 figures

arXiv.org e-Print Archive

Crossref

UNT Digital Library

An introduction to the Baum and EM algorithms for maximum likelihood estimation

Author: Kamp Yves
Publication venue: Instituut voor Perceptie Onderzoek (IPO)
Publication date: 23/10/1991

Pure OAI Repository

Integration of Synthesis and Operational Design of Batch Processes

Author: Papaoikonomou Eirini
Publication venue: Technical University of Denmark
Publication date: 01/03/2006

Online Research Database In Technology

Fouille de séquences temporelles pour la maintenance prédictive : application aux données de véhicules traceurs ferroviaires [Temporal sequence mining for predictive maintenance: application to railway floating train data]

Author: SAMMOURI Wissam
Publication venue: HAL CCSD
Publication date: 20/06/2014

In order to meet mounting social and economic demands, railway operators and manufacturers are striving for higher availability and better reliability of railway transportation systems. Commercial trains are being equipped with state-of-the-art onboard intelligent sensors monitoring various subsystems all over the train. These sensors provide a real-time flow of data, called floating train data, consisting of georeferenced events along with their spatial and temporal coordinates. Once ordered with respect to time, these events can be considered as long temporal sequences which can be mined for possible relationships. This has created a necessity for sequential data mining techniques in order to derive meaningful association rules or classification models from these data. Once discovered, these rules and models can then be used to perform an online analysis of the incoming event stream in order to predict the occurrence of target events, i.e., severe failures that require immediate corrective maintenance actions. The work in this thesis tackles the above-mentioned data mining task. We aim to investigate and develop various methodologies to discover association rules and classification models which can help predict rare tilt and traction failures in sequences using past events that are less critical. The investigated techniques constitute two major axes: association analysis, which is temporal, and classification techniques, which are not temporal. The main challenges confronting the data mining task and increasing its complexity are the rarity of the target events to be predicted, the heavy redundancy of some events and the frequent occurrence of data bursts. The results obtained on real datasets collected from a fleet of trains highlight the effectiveness of the approaches and methodologies used.

Nowadays, to meet economic and social demands, railway transportation systems need to be operated with a high level of safety and reliability. In particular, there is a growing need for monitoring and maintenance-support tools that anticipate failures of rolling stock components. To develop such tools, commercial trains are equipped with intelligent sensors that send real-time information on the state of various subsystems. This information takes the form of long temporal sequences made up of a succession of events. Developing tools for the automatic analysis of these sequences will make it possible to identify significant associations between events, with the aim of predicting events that signal the onset of a severe failure. This thesis addresses the problem of mining temporal sequences for the prediction of rare events and is part of a broader effort to develop decision-support tools. We aim to study and develop various methods to discover association rules between events on the one hand, and to build classification models on the other. These rules and/or classifiers can then be used to analyze an incoming event stream online in order to predict the occurrence of target events corresponding to failures. Two methodologies are considered in this thesis: the first is based on the search for association rules, which is a temporal approach; the second is a pattern-recognition-based approach. The main challenges facing this work relate to the rarity of the target events to be predicted, the heavy redundancy of certain events, and the very frequent presence of "bursts". The results obtained on real data collected by sensors onboard a fleet of commercial trains highlight the effectiveness of the proposed approaches.

Thèses en Ligne

Theses.fr

HAL - UPEC / UPEM

On the observability of electrical cardiac sources

Author: Damen A.A.H.
Publication venue: Technische Hogeschool Eindhoven
Publication date: 01/01/1980

Repository TU/e

Pure OAI Repository

Multivariate Correlation Discovery in Streaming Data

Author: d'Hondt Jens
Publication date: 22/09/2021

Pure OAI Repository

Self organizing distributed state estimators

Author: Papp Z.
Sijs J.
Publication venue: CRC Press
Publication date: 01/01/2012

Distributed solutions for signal processing techniques are important for establishing large-scale monitoring and control applications. They enable the deployment of scalable sensor networks for particular application areas. Typically, such networks consist of a large number of vulnerable components connected via unreliable communication links and are sometimes deployed in harsh environments. Therefore, the dependability of sensor networks is a challenging problem. An efficient and cost-effective answer to this challenge is provided by employing runtime reconfiguration techniques that assure the integrity of the desired signal processing functionalities. Runtime reconfigurability has a profound impact on system design, implementation, testing/validation and deployment. The presented research focuses on the widespread signal processing method known as state estimation, with Kalman filtering in particular. To that end, a number of distributed state estimation solutions that are suitable for networked systems in general are overviewed, after which the robustness of the system is improved according to various runtime reconfiguration techniques.
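As background for the state estimation method named in this abstract, the sketch below shows a single-node, one-dimensional Kalman filter. It does not reproduce the chapter's distributed or reconfigurable estimators; the function name `kalman_1d`, the constant-state model and the default noise variances `q` and `r` are all assumptions chosen for illustration.

```python
def kalman_1d(measurements, q=1e-3, r=0.1, x0=0.0, p0=1.0):
    """Scalar Kalman filter for a (nearly) constant state.

    Assumptions (illustrative only): the state follows a random walk with
    process noise variance q, and each measurement equals the state plus
    noise of variance r. x0 and p0 are the initial estimate and variance.
    """
    x, p = x0, p0
    estimates = []
    for z in measurements:
        # Predict: the state is modelled as constant, so only the
        # uncertainty grows by the process noise.
        p = p + q
        # Update: blend prediction and measurement via the Kalman gain.
        k = p / (p + r)
        x = x + k * (z - x)
        p = (1.0 - k) * p
        estimates.append(x)
    return estimates
```

Calling `kalman_1d([5.0] * 50)` starts from the deliberately wrong initial estimate `x0 = 0.0` and moves toward 5.0 as measurements accumulate, which is the basic behaviour that distributed variants extend across several nodes.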

Repository TU/e

Pure OAI Repository