Survey of data mining approaches to user modeling for adaptive hypermedia
The ability of an adaptive hypermedia system to create tailored environments depends mainly on the amount and accuracy of information stored in each user model. Among the difficulties user modeling faces are the amount of data available to create user models, the adequacy of that data, the noise within it, and the need to capture the imprecise nature of human behavior. Data mining and machine learning techniques can handle large amounts of data and process uncertainty. These characteristics make them suitable for the automatic generation of user models that simulate human decision making. This paper surveys data mining techniques that can be used to capture user behavior efficiently and accurately. It also presents guidelines indicating which techniques may be used most effectively according to the task implemented by the application.
14-08 Big Data Analytics to Aid Developing Livable Communities
In transportation, the ubiquitous deployment of low-cost sensors combined with powerful computer hardware and high-speed networks makes big data available. USDOT defines big data research in transportation as a number of advanced techniques applied to the capture, management, and analysis of very large and diverse volumes of data. Transportation data are usually well organized into tables and are characterized by relatively low dimensionality yet huge numbers of records. Big data research in transportation therefore faces unique challenges in how to effectively process huge numbers of data records and data streams. The purpose of this study is to investigate the problems caused by large data volumes and data streams and to develop applications for data analysis in transportation. To process large numbers of records efficiently, we propose aggregating the data at multiple resolutions and exploring the data at various resolutions to balance accuracy and speed. Techniques and algorithms for statistical analysis and data visualization have been developed for efficient data analytics using multiresolution data aggregation. The results will help establish a first step toward a rigorous framework for general analytical processing of big data in transportation.
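The multiresolution aggregation idea above can be sketched in a few lines. The sensor stream, column name, and resolution levels below are illustrative assumptions, not the study's actual implementation:

```python
import numpy as np
import pandas as pd

# Hypothetical one-day stream of per-second speed readings from a road sensor.
rng = np.random.default_rng(42)
idx = pd.date_range("2024-01-01", periods=86_400, freq="s")
speeds = pd.Series(50 + 10 * rng.standard_normal(86_400), index=idx, name="speed_mph")

# Pre-aggregate the stream at several resolutions: coarse levels answer
# wide-range queries quickly, fine levels preserve accuracy when zooming in.
resolutions = {
    "1min": speeds.resample("1min").mean(),
    "15min": speeds.resample("15min").mean(),
    "1h": speeds.resample("1h").mean(),
}

def query_mean(start, end, level="1h"):
    """Answer a range query from a pre-aggregated resolution level."""
    return resolutions[level][start:end].mean()

# Coarse query over the morning peak, served from the 15-minute level.
avg_morning = query_mean("2024-01-01 06:00", "2024-01-01 09:00", level="15min")
```

Choosing the level per query is the accuracy/speed trade-off the abstract describes: the 1-hour level scans 24 rows for a full-day question instead of 86,400 raw records.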
Applications of high-frequency telematics for driving behavior analysis
A thesis submitted in partial fulfillment of the requirements for the degree of Doctor in Information Management, specialization in Statistics and Econometrics.
Processing driving data and investigating driving behavior have received increasing interest in recent decades, with applications ranging from car insurance pricing to policy-making. A popular way of analyzing driving behavior is to focus on maneuvers, as they give useful information about the driver performing them.
Previous research on maneuver detection can be divided into two strategies: 1) using fixed thresholds on inertial measurements to define the start and end of specific maneuvers, or 2) using features extracted from rolling windows of sensor data in a supervised learning model to detect maneuvers. The first strategy is not adaptable and requires fine-tuning, while the second needs a labeled dataset (which is time-consuming to build) and cannot identify maneuvers of different durations.
To tackle these shortcomings, we investigate a new way of identifying maneuvers from vehicle telematics data: motif detection in time series. Using a publicly available naturalistic driving dataset (the UAH-DriveSet), we conclude that motif detection algorithms are capable of extracting not only simple maneuvers such as accelerations, brakes, and turns, but also more complex maneuvers, such as lane changes and overtaking, thus validating motif discovery as a worthwhile line for future research in driving behavior.
We also propose TripMD, a system that extracts the most relevant driving patterns from sensor recordings (such as acceleration) and provides a visualization that allows for easy investigation. We test TripMD on the same UAH-DriveSet dataset and show that (1) our system can extract a rich set of driving patterns from a single driver that are meaningful for understanding driving behavior, and (2) our system can be used to identify the driving behavior of an unknown driver from a set of drivers whose behavior we know.
Dataflow Programming and Acceleration of Computationally-Intensive Algorithms
The volume of unstructured textual information continues to grow due to recent technological advancements. This has resulted in exponential growth of information generated in various formats, including blogs, posts, social networks, and enterprise documents. Numerous Enterprise Architecture (EA) documents are also created daily, such as reports, contracts, agreements, frameworks, architecture requirements, designs, and operational guides. Processing and computing this massive amount of unstructured information requires substantial computing capability and the implementation of new techniques. It is critical to manage this unstructured information through a centralized knowledge management platform. Knowledge management is the process of managing information within an organization: creating, collecting, organizing, and storing information in a way that makes it easily accessible and usable. The research involved the development of a textual knowledge management system, and two use cases were considered for extracting textual knowledge from documents. The first case study focused on the safety-critical documents of a railway enterprise. Safety is of paramount importance in the railway industry, and several EA documents, including manuals, operational procedures, and technical guidelines, contain critical information. Digitizing these documents is essential for analysing the vast amount of textual knowledge they contain in order to improve the safety and security of railway operations. A case study was conducted between the University of Huddersfield and the Rail Safety and Standards Board (RSSB) to analyse EA safety documents using natural language processing (NLP). A graphical user interface was developed that includes document processing features such as semantic search, document mapping, text summarization, and visualization of key trends.
For the second case study, open-source data was utilized and textual knowledge extracted from it. Several features were also developed, including kernel distribution, analysis of key trends, and sentiment analysis of words (such as unique, positive, and negative words) within the documents. Additionally, a heterogeneous framework was designed using CPUs/GPUs and FPGAs to analyse the computational performance of document mapping.
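As a minimal sketch of what a semantic-search feature over such documents might look like, the example below ranks documents against a query by TF-IDF cosine similarity. This is an assumed retrieval method for illustration only (the abstract does not specify the system's actual technique), and the documents are invented stand-ins, not RSSB material:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Invented stand-ins for safety documents.
docs = [
    "Track workers must wear high-visibility clothing at all times.",
    "Signal maintenance procedures require a possession of the line.",
    "High-visibility clothing is mandatory for all staff near the track.",
]
query = ["What clothing rules apply to staff near the track?"]

# Build TF-IDF vectors over the document collection, then project the
# query into the same vector space.
vec = TfidfVectorizer(stop_words="english")
doc_matrix = vec.fit_transform(docs)
query_vec = vec.transform(query)

# Cosine similarity between the query and every document.
scores = cosine_similarity(query_vec, doc_matrix)[0]
best = scores.argmax()          # index of the most relevant document
```

Production systems typically replace TF-IDF with dense embeddings for true semantic matching, but the retrieve-and-rank shape of the pipeline is the same.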
Introducing a method for modeling knowledge bases in expert systems using the example of large software development projects
The goal of this paper is to develop a meta-model that provides the basis for developing highly scalable artificial intelligence systems able to make decisions autonomously based on different dynamic and specific influences. An artificial neural network builds the entry point for developing a multi-layered, human-readable model that serves as a knowledge base and can be used for further investigations in deductive and inductive reasoning. A graph-theoretical consideration gives a detailed view of the model structure. In addition, the model is introduced using the example of large software development projects. The integration of Constraints and Deductive Reasoning Element Pruning, which are required for executing deductive reasoning efficiently, is illustrated.
Data Mining
Data mining is a branch of computer science used to automatically extract meaningful, useful knowledge and previously unknown, hidden, interesting patterns from large amounts of data to support decision-making. This book presents recent theoretical and practical advances in the field of data mining. It discusses a number of data mining methods, including classification, clustering, and association rule mining, and brings together successful data mining studies from areas such as health, banking, education, software engineering, animal science, and the environment.
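Of the methods listed, association rule mining is the easiest to show concretely: it reduces to counting itemset co-occurrences. A minimal sketch with invented market-basket transactions:

```python
# Invented toy transactions (market-basket style).
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk", "butter"},
]

def support(itemset):
    """Fraction of transactions containing every item in the itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(antecedent, consequent):
    """P(consequent | antecedent), estimated from the transactions."""
    return support(antecedent | consequent) / support(antecedent)

# Rule {bread} -> {milk}: both appear together in 2 of 4 baskets,
# and bread appears in 3, so support = 0.5 and confidence = 2/3.
s = support({"bread", "milk"})
c = confidence({"bread"}, {"milk"})
```

Algorithms such as Apriori make this tractable at scale by pruning itemsets whose subsets already fall below a minimum support threshold, rather than enumerating all itemsets as this sketch implicitly would.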