Mining Association Rules Events over Data Streams
Data streams have gained considerable attention in the data analysis and data mining communities because of the emergence of new classes of applications, such as monitoring, supply chain execution, sensor networks, oilfield and pipeline operations, financial marketing and health data industries. Telecommunication advancements have provided us with easy access to stream data produced by various applications. Data in streams differ from static data stored in data warehouses or databases: data streams are continuous, arrive at high speeds and change over time. Traditional data mining algorithms assume that data reside in conventional storage, where mining is performed centrally with the luxury of accessing the data multiple times, using powerful processors, and producing offline output with no time constraints. Such algorithms are not suitable for dynamic data streams. Stream data need to be mined promptly, as it might not be feasible to store such a volume of data. In addition, streams reflect the live status of the environment generating them, so prompt analysis may provide early detection of faults, delays, performance measurements, trend analysis and other diagnostics. This thesis focuses on developing a data stream association rule mining algorithm among co-occurring events. The proposed algorithm mines association rules over data streams incrementally in a centralized setting. We are interested in association rules that meet a provided minimum confidence threshold and have a lift value greater than 1. We refer to such association rules as strong rules. Experiments on several datasets demonstrate that the proposed algorithm is efficient and effective in extracting association rules from data streams, with a faster processing time and better memory management.
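The "strong rule" criterion above (confidence at or above a minimum threshold, lift greater than 1) can be illustrated with a small sketch. This is not the thesis's algorithm: it is a simplified illustration that assumes rules between single events only, keeps exact counts without any memory bound, and uses hypothetical names (`PairRuleCounter`, `update`, `strong_rules`).

```python
from collections import Counter
from itertools import combinations


class PairRuleCounter:
    """Incrementally counts single events and co-occurring event pairs.

    Hypothetical helper for illustration; counts grow without bound,
    which a real stream miner would have to control.
    """

    def __init__(self):
        self.n = 0                    # transactions seen so far
        self.item_counts = Counter()  # occurrences of each event
        self.pair_counts = Counter()  # co-occurrences of each event pair

    def update(self, transaction):
        """Absorb one transaction (a collection of co-occurring events)."""
        items = sorted(set(transaction))
        self.n += 1
        self.item_counts.update(items)
        self.pair_counts.update(combinations(items, 2))

    def strong_rules(self, min_conf):
        """Return (x, y, confidence, lift) for rules x -> y that are strong."""
        rules = []
        for (a, b), n_ab in self.pair_counts.items():
            for x, y in ((a, b), (b, a)):
                conf = n_ab / self.item_counts[x]             # P(y | x)
                lift = conf / (self.item_counts[y] / self.n)  # P(y | x) / P(y)
                if conf >= min_conf and lift > 1:
                    rules.append((x, y, conf, lift))
        return rules


counter = PairRuleCounter()
for events in (["a", "b"], ["a", "b"], ["a", "c"], ["b"], ["c"]):
    counter.update(events)
rules = counter.strong_rules(min_conf=0.6)
```

Because the counts are updated one transaction at a time, the rules can be re-evaluated at any point in the stream without revisiting earlier data.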
Incremental Sparse-PCA Feature Extraction For Data Streams
Intruders attempt to penetrate commercial systems daily and cause considerable financial losses for individuals and organizations. Intrusion detection systems monitor network events to detect computer security threats. An extensive amount of network data is devoted to detecting malicious activities.
Storing, processing, and analyzing this massive volume of data is costly and indicates the need for efficient network data reduction methods that do not require the data to be captured and stored first. A better approach extracts useful variables from data streams in real time and in a single pass. Removing irrelevant attributes reduces the data fed to the intrusion detection system (IDS) and shortens the analysis time while improving the classification accuracy. This dissertation introduces an online, real-time data processing method for knowledge extraction.
This incremental feature extraction is based on two approaches. First, Chunk Incremental Principal Component Analysis (CIPCA) detects intrusion in data streams. Then, two novel incremental feature extraction methods, Incremental Structured Sparse PCA (ISSPCA) and Incremental Generalized Power Method Sparse PCA (IGSPCA), find malicious elements. Metrics helped compare the performance of all methods.
IGSPCA was found to perform as well as or better than CIPCA overall in terms of dimensionality reduction, classification accuracy, and learning time. ISSPCA yielded better results for higher chunk values and greater accumulation ratio thresholds. CIPCA and IGSPCA reduced the IDS dataset to 10 principal components, as opposed to 14 eigenvectors for ISSPCA. ISSPCA is more expensive in terms of learning time than the other techniques.
This dissertation presents new methods that perform feature extraction from continuous data streams to find the small number of features necessary to express the most data variance. Data subsets derived from a few important variables render their interpretation easier.
Another goal of this dissertation was to propose incremental sparse PCA algorithms capable of processing data with concept drift and concept shift. Experiments using the WaveForm and WaveFormNoise datasets confirmed this ability. Similar to CIPCA, ISSPCA and IGSPCA updated the eigen-axes as a function of the accumulation ratio value, forming an informative eigenspace with few eigenvectors.
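The role of the accumulation ratio threshold can be sketched as follows. The sketch assumes the usual definition, the share of total variance captured by the leading eigenvalues, which may differ in detail from the one used in the dissertation; the function name `components_for_ratio` is hypothetical.

```python
def components_for_ratio(eigenvalues, threshold):
    """Return the smallest k whose top-k eigenvalues reach `threshold`
    of the total variance (assumed definition of the accumulation ratio)."""
    if not 0 < threshold <= 1:
        raise ValueError("threshold must be in (0, 1]")
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    if total <= 0:
        raise ValueError("eigenvalues must have a positive sum")
    running = 0.0
    for k, value in enumerate(ordered, start=1):
        running += value
        if running / total >= threshold:
            return k
    return len(ordered)  # guard against floating-point shortfall


# Example: four eigen-axes, keep enough of them to explain 80% of the variance.
k = components_for_ratio([4.0, 3.0, 2.0, 1.0], threshold=0.8)
```

Under this definition, a higher threshold retains more eigenvectors, so the threshold trades compactness of the eigenspace against the variance it preserves.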
Analyzing frequent patterns in data streams using a dynamic compact stream pattern algorithm
As a result of modern technology and advances in communication, large numbers of data streams are continually generated from various online applications, devices and sources. Mining frequent patterns from these streams is now an important research topic in the field of data mining and knowledge discovery. Traditional data mining approaches may not be appropriate for a data stream environment, where the data volume is large and unbounded, and they are limited in their ability to extract recent changes in knowledge from the stream adaptively. Many algorithms and models have been developed to address the challenging task of mining an infinite influx of data generated from various points over the internet. The objective of this thesis is to introduce the Dynamic Compact Pattern Stream tree (DCPS-tree) algorithm for mining recent data from a continuous data stream. Our DCPS-tree dynamically achieves a frequency-descending prefix tree structure with only a single pass over the data by applying tree restructuring techniques such as the Branch Sort Method (BSM). This causes low-frequency patterns to be maintained at the leaf level and high-frequency components at higher levels. As a result, there will be a considerable mining time reduction on the datase
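The idea of a frequency-descending prefix tree can be sketched as below. This is not the DCPS-tree or the Branch Sort Method: it is a simplified illustration that restructures by collecting every stored path and re-inserting it in the new item order, and all names (`Node`, `PrefixTree`, `restructure`) are hypothetical.

```python
from collections import Counter


class Node:
    def __init__(self, item=None):
        self.item = item
        self.count = 0
        self.children = {}


class PrefixTree:
    """Prefix tree that can be reordered so frequent items sit near the root."""

    def __init__(self):
        self.root = Node()
        self.freq = Counter()  # item frequencies seen so far
        self.rank = {}         # current item order used for insertion

    def _insert_path(self, path, count=1):
        node = self.root
        for item in path:
            node = node.children.setdefault(item, Node(item))
            node.count += count

    def insert(self, transaction):
        """Single-pass insertion using the current item order."""
        items = set(transaction)
        self.freq.update(items)
        for item in sorted(items):
            self.rank.setdefault(item, len(self.rank))  # new items go last
        self._insert_path(sorted(items, key=self.rank.__getitem__))

    def _paths(self, node, prefix):
        # Transactions ending exactly here = own count minus children's counts.
        ending = node.count - sum(c.count for c in node.children.values())
        if prefix and ending > 0:
            yield prefix, ending
        for child in node.children.values():
            yield from self._paths(child, prefix + [child.item])

    def restructure(self):
        """Rebuild the tree in frequency-descending order (ties by item)."""
        paths = list(self._paths(self.root, []))
        ordered = sorted(self.freq, key=lambda i: (-self.freq[i], i))
        self.rank = {item: r for r, item in enumerate(ordered)}
        self.root = Node()
        for path, count in paths:
            self._insert_path(sorted(path, key=self.rank.__getitem__), count)


tree = PrefixTree()
for t in (["a", "b"], ["b", "c"], ["a", "b", "c"], ["b"]):
    tree.insert(t)
tree.restructure()  # "b", the most frequent item, becomes the only root child
```

In this small example the most frequent item moves to the top and the transactions share more of their prefixes, which illustrates the compaction that a frequency-descending order aims for.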
Patient Classification for Intelligent Wheelchair Adaptation (Classificação de pacientes para adaptação de cadeira de rodas inteligente)
Doctorate in Informatics Engineering
The importance and concern given to the autonomy and independence of
elderly people and patients suffering from some kind of disability has been
growing significantly in the last few decades. Intelligent wheelchairs (IW) are
technologies that can increase the autonomy and independence of this kind of
population and are nowadays a very active research area. However, the
adaptations to users’ specificities and experiments with real users are topics
that lack deeper studies.
The intelligent wheelchair, developed in the context of the IntellWheels project,
is controlled at a high-level through a flexible multimodal interface, using voice
commands, facial expressions, head movements and joystick as its main input
modalities. This work aimed to develop a system that automatically adapts the
previously developed intelligent wheelchair to the characteristics of its
user.
A methodology was created enabling the creation of a user model. The
research was based on the development of a data gathering system, enabling
the collection and storage of data from voice commands, facial expressions,
head and body movements from several patients with distinct disabilities such
as Cerebral Palsy. The wheelchair can be used in different situations in real
and simulated environments and a serious game was developed where
different tasks may be performed by users.
Data was analysed using knowledge discovery methods in order to create an
automatic patient classification system. Based on the classification system, a
methodology was developed to select the best wheelchair interface
and command language for each patient.
Evaluation was performed in the context of Project FCT/RIPD/ADA/109636/2009 -
“IntellWheels - Intelligent Wheelchair with Flexible Multimodal
Interface”. Experiments were conducted, using a large set of patients suffering
from severe physical constraints in close collaboration with Escola Superior de
Tecnologia de Saúde do Porto and Associação do Porto de Paralisia Cerebral.
The experiments using the intelligent wheelchair were followed by user
questionnaires. The results were statistically analysed in order to prove the
effectiveness and usability of the adaptation of the Intelligent Wheelchair
multimodal interface to the user characteristics. The results obtained in a
simulated environment showed a score of 67 on the System Usability Scale, based
on the opinions of a sample of cerebral palsy patients at levels IV and V, the
most severe, of the Gross Motor Function Scale. It was also statistically
demonstrated that the data analysis system advised the use of an adapted
interface with higher evaluation than the one suggested by the occupational
therapists, showing the usefulness of defining a command language adapted to
each user. Experiments conducted with distinct control modes revealed the
users' preference for a shared control with an aid level taking into account the
level of constraint of the patient. In conclusion, this work demonstrates that it is
possible to adapt an intelligent wheelchair to the user with clear usability and
safety benefits.