Data mining and classification for traffic systems using genetic network programming
Degree system: new; Report number: Kō No. 3271; Degree type: Doctor of Engineering; Date conferred: 2011/3/15; Waseda University diploma number: New 557
MapReduce network enabled algorithms for classification based on association rules
This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University. There is growing evidence that integrating classification and association rule mining can produce more efficient and accurate classifiers than traditional techniques. This thesis introduces a new MapReduce-based association rule miner for extracting strong rules from large datasets. This miner is later used to develop a new large-scale classifier. A new MapReduce simulator was also developed to evaluate the scalability of the proposed algorithms on MapReduce clusters.
The developed association rule miner inherits MapReduce's scalability to huge datasets and to thousands of processing nodes. To find frequent itemsets, it uses a hybrid approach that combines miners that use counting methods on horizontal datasets with miners that use set intersections on datasets in vertical format. The new miner generates the same rules that are usually generated by Apriori-like algorithms because it uses the same definitions of the confidence and support thresholds.
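The two ways of computing support mentioned above can be illustrated with a minimal single-machine sketch. It is not the thesis's MapReduce miner: the toy transactions and all function names are hypothetical, and no distribution across nodes is attempted. It only shows that counting over a horizontal layout and intersecting transaction-id sets over a vertical layout give the same support, and how confidence follows from the counts.

```python
# Hypothetical toy data: one transaction (a set of items) per row (horizontal format).
transactions = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "c"},
    {"b", "c"},
    {"a", "b", "c"},
]

def count_horizontal(itemset, rows):
    """Count by scanning: test every transaction for containment of the itemset."""
    return sum(1 for row in rows if itemset <= row)

def to_vertical(rows):
    """Vertical format: map each item to the set of transaction ids that contain it."""
    tidsets = {}
    for tid, row in enumerate(rows):
        for item in row:
            tidsets.setdefault(item, set()).add(tid)
    return tidsets

def count_vertical(itemset, tidsets):
    """Count by intersecting the transaction-id sets of the items (itemset must be non-empty)."""
    return len(set.intersection(*(tidsets[item] for item in itemset)))

def support(itemset, rows):
    """Fraction of transactions that contain the itemset."""
    return count_horizontal(itemset, rows) / len(rows)

def confidence(antecedent, consequent, rows):
    """Confidence of the rule antecedent -> consequent."""
    both = count_horizontal(antecedent | consequent, rows)
    return both / count_horizontal(antecedent, rows)

tidsets = to_vertical(transactions)
same_count = count_horizontal({"a", "b"}, transactions) == count_vertical({"a", "b"}, tidsets)
```

A rule would be kept only if its support and confidence both meet user-chosen thresholds; the threshold values themselves are not given in the abstract.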
In the last few years, a number of associative classification algorithms have been proposed, e.g. CPAR, CMAR, MCAR, MMAC and others. This thesis also introduces a new MapReduce classifier that is based on MapReduce association rule mining. This algorithm employs different approaches to rule discovery, rule ranking, rule pruning, rule prediction and rule evaluation. The new classifier works on multi-class datasets and is able to produce multi-label predictions with probabilities for each predicted label. To evaluate the classifier, 20 different datasets from the UCI data collection were used. Results show that the proposed approach is an accurate and effective classification technique, highly competitive and scalable compared with other traditional and associative classification approaches.
A MapReduce simulator was also developed to measure the scalability of MapReduce-based applications easily and quickly, and to capture the behaviour of algorithms in cluster environments. It also allows the configuration of MapReduce clusters to be optimized for better execution times and hardware utilization.
A framework for smart traffic management using heterogeneous data sources
A thesis submitted in partial fulfilment of the requirements of the University of Wolverhampton for the degree of Doctor of Philosophy. Traffic congestion constitutes a social, economic and environmental issue for modern cities, as it can negatively impact travel times, fuel consumption and carbon emissions. Traffic forecasting and incident detection systems are fundamental areas of Intelligent Transportation Systems (ITS) that have been widely researched in the last decade. These systems provide real-time information about traffic congestion and other unexpected incidents that can help traffic management agencies activate strategies and notify users accordingly. However, existing techniques suffer from high false alarm rates and incorrect traffic measurements. In recent years, there has been increasing interest in integrating different types of data sources to achieve higher precision in traffic forecasting and incident detection techniques. In fact, a considerable amount of literature has grown around the influence of integrating data from heterogeneous data sources into existing traffic management systems.
This thesis presents a Smart Traffic Management framework for future cities. The proposed framework fuses different data sources and technologies to improve traffic prediction and incident detection systems. It is composed of two components: a social media component and a simulator component. The social media component consists of a text classification algorithm to identify traffic-related tweets. These traffic messages are then geolocated using Natural Language Processing (NLP) techniques. Finally, to further analyse user emotions within the tweets, stress and relaxation strength detection is performed. The proposed text classification algorithm outperformed similar studies in the literature and proved more accurate than other machine learning algorithms on the same dataset. The stress and relaxation analysis detected a significant amount of stress in 40% of the tweets, while the remaining tweets did not show any associated emotions. This information can potentially be used for policy making in transportation, to understand the users' perception of the transportation network. The simulator component proposes an optimisation procedure for determining missing flow distributions on roundabouts and urban roads using constrained optimisation. Existing imputation methodologies have been developed on straight sections of highways, and their applicability to more complex networks has not been validated. This task presented a solution for the unavailability of roadway sensors in specific parts of the network and was able to predict the missing values with very low percentage error. The proposed imputation methodology can serve as an aid for existing traffic forecasting and incident detection methodologies, as well as for the development of more realistic simulation networks.
A Comprehensive Survey on Rare Event Prediction
Rare event prediction involves identifying and forecasting events with a low
probability using machine learning and data analysis. Due to the imbalanced
data distributions, where the frequency of common events vastly outweighs that
of rare events, it requires using specialized methods within each step of the
machine learning pipeline, i.e., from data processing to algorithms to
evaluation protocols. Predicting the occurrences of rare events is important
for real-world applications, such as Industry 4.0, and is an active research
area in statistical and machine learning. This paper comprehensively reviews
the current approaches for rare event prediction along four dimensions: rare
event data, data processing, algorithmic approaches, and evaluation approaches.
Specifically, we consider 73 datasets from different modalities (i.e.,
numerical, image, text, and audio), four major categories of data processing,
five major algorithmic groupings, and two broader evaluation approaches. This
paper aims to identify gaps in the current literature and highlight the
challenges of predicting rare events. It also suggests potential research
directions, which can help guide practitioners and researchers. Comment: 44 pages
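Why imbalanced data calls for specialized evaluation protocols can be seen in a small sketch. The labels below are hypothetical (5 rare events among 100 samples) and the function names are illustrative, not taken from the survey. A classifier that never predicts the rare class scores high accuracy while detecting nothing, which precision, recall and F1 make visible.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn), treating `positive` as the rare class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p != positive)
    return tp, fp, fn, tn

def metrics(y_true, y_pred, positive=1):
    """Accuracy plus precision, recall and F1 for the rare class."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    accuracy = (tp + tn) / len(y_true)
    # Guard against division by zero when the rare class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

# Hypothetical data: 5 rare events (label 1) among 100 samples.
y_true = [1] * 5 + [0] * 95
always_common = [0] * 100  # a classifier that never predicts the rare event

baseline = metrics(y_true, always_common)
```

Here `baseline["accuracy"]` is high even though `baseline["recall"]` is zero, which is the reason accuracy alone is usually avoided for rare event prediction.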
Web usage mining for click fraud detection
Internship carried out at AuditMark and supervised by Eng. Pedro Fortuna. Integrated master's thesis. Informatics and Computing Engineering. Faculdade de Engenharia. Universidade do Porto. 201
Crash/Near-Crash: Impact of Secondary Tasks and Real-Time Detection of Distracted Driving
The main goal of this dissertation is to investigate the problem of distracted driving from two perspectives. The first is the identification of possible sources of distraction and their associated crash/near-crash risk, which can help government officials make more informed decisions and allocate available resources more effectively to reduce roadway crashes and improve traffic safety. The second is actively counteracting distracted driving through quantitative evaluation of eye glance patterns.
This dissertation research consists of two parts. The first part provides an in-depth analysis of the increased crash/near-crash risk associated with different secondary task activities using the largest real-world naturalistic driving dataset (SHRP2 Naturalistic Driving Study). Several statistical and data mining techniques are developed to analyze distracted driving and crash risk. More specifically, two models were employed to quantify the increased risk associated with each secondary task: a baseline-category logit model and an association rule mining model. The baseline-category logit model identified the increased risk in terms of odds ratios, while the Apriori association algorithm detected the associated risks in terms of rules. Each rule was then evaluated based on the lift index. The two models succeeded in efficiently ranking all the secondary task activities according to the associated increased crash/near-crash risk.
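The two risk measures named above can be illustrated on a single 2x2 table. This is a simplified sketch with invented counts, not SHRP2 data, and the odds ratio here is computed directly from the table rather than from a fitted baseline-category logit model as in the dissertation. The function and variable names are hypothetical.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio for a 2x2 table.

    a: task present, crash/near-crash    b: task present, no event
    c: task absent,  crash/near-crash    d: task absent,  no event
    """
    return (a * d) / (b * c)

def lift(a, b, c, d):
    """Lift of the rule 'task present -> crash/near-crash'.

    lift = P(task and event) / (P(task) * P(event)); values above 1 mean the
    event is more frequent when the task is present than overall.
    """
    n = a + b + c + d
    return (a * n) / ((a + b) * (a + c))

# Hypothetical counts for one secondary task (illustration only).
a, b, c, d = 30, 70, 10, 190

task_odds_ratio = odds_ratio(a, b, c, d)
task_lift = lift(a, b, c, d)
```

Ranking secondary tasks would then amount to sorting them by either measure, with each task evaluated on its own table.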
To actively counteract the distracted driving phenomenon, a new approach was developed to analyze eye glance patterns and quantify distracted driving behavior during both Safety Critical Events (SCEs) and non-SCEs. This approach is then applied to the Naturalistic Engagement in Secondary Tasks (NEST) dataset to investigate how drivers allocate their attention while driving, especially while distracted. The analysis revealed that distracted driving behavior can be well characterized using two new distraction risk indicators. Additional statistical analyses showed that the two indicators increase significantly for SCEs compared to normal driving events. Consequently, an artificial neural network (ANN) model was developed to test how well SCEs can be predicted when the two new indicators are taken into account. The ANN model was able to predict the SCEs with an overall accuracy of 96.1%. This outcome can help build reliable algorithms for in-vehicle driving assistance systems to alert drivers before SCEs.
Advances in Robotics, Automation and Control
The book presents an excellent overview of recent developments in the different areas of Robotics, Automation and Control. Through its 24 chapters, this book presents topics related to control and robot design; it also introduces new mathematical tools and techniques devoted to improving system modeling and control. An important point is the use of rational agents and heuristic techniques to cope with the computational complexity required for controlling complex systems. The book also covers navigation and vision algorithms, automatic handwriting comprehension and speech recognition systems that will be included in the next generation of productive systems developed by man.
Road Traffic Congestion Analysis Via Connected Vehicles
Road traffic congestion is a particular state of mobility in which travel times increase and more and more time is spent in vehicles. Apart from being a very stressful experience for drivers, congestion also has a negative impact on the environment and the economy. In this context, there is pressure on the authorities to take decisive action to improve traffic flow on the road network. By improving network flow, congestion is reduced and the total travel time of vehicles decreases. Congestion can be classified as recurrent or non-recurrent (NRC). Recurrent congestion refers to congestion that happens on a regular basis. Non-recurrent congestion in an urban network is mainly caused by incidents, work zones, special events and adverse weather. Infrastructure operators monitor traffic on the network while using the fewest possible resources. Thus, the traffic state cannot be directly measured everywhere on the road network: the locations where traffic flow needs to be improved vary widely, and deploying highly sophisticated equipment to ensure accurate estimation of traffic flows and timely detection of events everywhere on the network is not feasible. Also, many studies have been devoted to highways rather than to highly congested urban regions, which are intricate, complex networks and far more likely to be monitored by the traffic authorities. Moreover, current traffic data collection systems cannot register detailed information on the events occurring on the road, such as vehicle crashes and adverse weather. Operators require external data sources to retrieve this information in real time. Current methods only detect congestion, but that is not enough: we should be able to better characterize the event causing it. Agencies need to understand which cause is affecting flow variability on their facilities, and to what degree, so that they can take appropriate action to mitigate congestion.
Predictive Techniques for Scene Understanding by using Deep Learning in Autonomous Driving
Autonomous driving is considered one of the greatest technological challenges of our time. When autonomous cars conquer our roads, accidents will be reduced markedly, almost to the point of disappearing, since the technology will be tested and will not break traffic rules, among other social and economic benefits. One of the most critical aspects of developing an autonomous vehicle is perceiving and understanding the surrounding scene. This task must be as accurate and efficient as possible in order to subsequently predict the future of that scene and support decision-making. In this way, the actions taken by the vehicle will guarantee the safety of the vehicle itself and its occupants as well as that of surrounding obstacles, such as pedestrians, other vehicles or road infrastructure. Accordingly, this doctoral thesis focuses on the study and development of different predictive techniques for scene understanding in the context of autonomous driving. Throughout the thesis, deep learning techniques are progressively incorporated into the proposed algorithms to improve reasoning about what is happening in the traffic scenario, as well as to model the complex interactions between the social information (the different participants or agents in the scenario, such as vehicles, cyclists or pedestrians) and the physical information (that is, the geometric, semantic and topological information of the high-definition map) present in the scene. The perception layer of an autonomous vehicle is divided modularly into three stages: Detection, Tracking, and Prediction. To begin the study of the tracking and prediction stages, a Multi-Object Tracking algorithm based on classical motion estimation and association techniques is proposed and validated on the KITTI dataset, where it obtains state-of-the-art metrics.
In addition, the use of a smart filter based on contextual map information is proposed, whose goal is to monitor the most relevant agents in the scene over time; these filtered agents serve as the preliminary input for unimodal predictions based on a kinematic model. To validate this smart filter, CARLA (CAR Learning to Act), one of the most promising hyper-realistic simulators for autonomous driving at present, is used, showing that contextual map information can markedly reduce the inference time of a tracking and prediction algorithm based on physical methods by paying attention to the agents that are truly relevant in the traffic scenario. After observing the limitations of a kinematics-based prediction model for the long-term prediction of an agent, the remaining algorithms of the thesis focus on the prediction module, using the Argoverse 1 and Argoverse 2 datasets, where the agents provided in each traffic scenario are assumed to have already been tracked for a certain number of observations. First, a model is introduced that is based on recurrent neural networks (specifically LSTM, Long Short-Term Memory, networks) and an attention mechanism to encode the agents' past trajectories, together with a simplified representation of the map in the form of potential end positions on the road, to compute unimodal future trajectories, all wrapped in a GAN (Generative Adversarial Network) framework, obtaining metrics similar to the state of the art in the unimodal case.
Once the previous model has been validated on Argoverse 1, several accurate and efficient baseline models built on it are proposed (social only, map-aware, and a final improvement based on a Transformer encoder, 1D convolutional networks and a cross-attention mechanism for feature fusion), introducing two new concepts. On the one hand, graph neural networks (specifically GCN, Graph Convolutional Network) are used to encode agent interactions powerfully. On the other hand, preliminary trajectories are preprocessed from the map using a heuristic method. Thanks to these inputs and a more powerful encoding architecture, the baseline models are able to predict multiple multimodal future trajectories, that is, covering different possible futures for the agent of interest. The proposed baseline models obtain state-of-the-art regression metrics in both the multimodal and unimodal cases while maintaining a clear efficiency trade-off with respect to other proposals. The final model of the thesis, inspired by the previous models and validated on the most recent dataset for prediction algorithms in autonomous driving (Argoverse 2), introduces several improvements to better understand the traffic scenario and decode the information accurately and efficiently. It proposes incorporating topological and semantic information about the preliminary future lanes obtained with the aforementioned heuristic method, deep-learning-based map encoding with GCN networks, a fusion cycle for physical and social features, estimation of end positions on the road and aggregation of their surroundings with deep learning, and finally a refinement module to improve the quality of the final multimodal predictions in an elegant and efficient way.
Compared with the state of the art, our method achieves prediction metrics on par with the top-ranked methods on the Argoverse 2 Leaderboard while notably reducing the number of parameters and floating-point operations per second. Finally, the final model of the thesis has been validated in simulation in different autonomous driving applications. First, the model is integrated to provide predictions to a reinforcement-learning-based decision-making algorithm in the SMARTS (Scalable Multi-Agent Reinforcement Learning Training School) simulator, with the studies showing that the vehicle is able to make better decisions if it knows the future behaviour of the scene and not only its current or past state. Second, a successful domain adaptation study has been carried out in the hyper-realistic CARLA simulator in various challenging scenarios where scene understanding and prediction of the environment are essential, such as a highway or roundabout with dense traffic or the sudden appearance of a vulnerable road user. To that end, the prediction model has been integrated with the rest of the layers of the autonomous navigation architecture of the research group where the thesis is carried out, as a step prior to its deployment in a real autonomous vehicle.