4 research outputs found

    FARSEC: A Reproducible Framework for Automatic Real-Time Vehicle Speed Estimation Using Traffic Cameras

    Full text link
    Estimating the speed of vehicles using traffic cameras is a crucial task for traffic surveillance and management, enabling smoother traffic flow, improved road safety, and lower environmental impact. Transportation-dependent systems, such as navigation and logistics, stand to benefit greatly from reliable speed estimation. While prior research in this area reports competitive accuracy levels, the solutions lack reproducibility and robustness across different datasets. To address this, we provide a novel framework for automatic real-time vehicle speed calculation, which copes with more diverse data from publicly available traffic cameras to achieve greater robustness. Our model employs novel techniques to estimate the length of road segments via depth map prediction. Additionally, our framework automatically handles realistic conditions such as camera movement and different video stream inputs. We compare our model to three well-known models in the field using their benchmark datasets. While our model does not set a new state of the art in prediction performance, the results are competitive on realistic CCTV videos. At the same time, our end-to-end pipeline offers more consistent results, an easier implementation, and better compatibility. Its modular structure facilitates reproducibility and future improvements.
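
    The computation such a pipeline ultimately rests on is distance over time: a depth-derived road-segment length divided by the time a tracked vehicle takes to traverse it. A minimal sketch of that idea follows (illustrative only, not the FARSEC code; the function name and its inputs are assumptions):

        # Toy sketch: speed from a depth-derived segment length and frame timing.
        # Not the FARSEC implementation; names and inputs are hypothetical.
        def estimate_speed_kmh(depth_entry_m, depth_exit_m, frame_entry, frame_exit, fps):
            """Speed of a vehicle crossing a road segment whose endpoint distances
            from the camera (in meters) come from a predicted depth map."""
            segment_length_m = abs(depth_exit_m - depth_entry_m)  # length via depth prediction
            elapsed_s = (frame_exit - frame_entry) / fps          # traversal time from frame count
            return (segment_length_m / elapsed_s) * 3.6           # m/s -> km/h

        # Example: 25 m covered in 30 frames at 30 fps -> 90 km/h.
        print(estimate_speed_kmh(5.0, 30.0, 100, 130, fps=30.0))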

    Enhancing accuracy in brain stroke detection: Multi-layer perceptron with Adadelta, RMSProp and AdaMax optimizers

    Get PDF
    The human brain is an extremely intricate and fascinating organ, made up of the cerebrum, cerebellum, and brainstem and protected by the skull. Brain stroke is a potentially fatal condition caused by an obstruction in the arteries supplying the brain. Early prognosis can reduce or control the severity of a stroke, lowering the mortality rate and improving outcomes. This paper proposes a technique to predict brain strokes with high accuracy. The model was constructed using data related to brain strokes. The aim of this work is to use a Multi-Layer Perceptron (MLP) as the classification technique for stroke data, evaluated with multiple optimizers: Adaptive Moment Estimation with Maximum (AdaMax), Root Mean Squared Propagation (RMSProp), and the Adaptive Learning Rate method (Adadelta). The experiments show that the RMSProp optimizer performs best, with a training accuracy of 95.8% and a testing accuracy of 94.9%. The novelty of this work is incorporating multiple optimizers alongside the MLP classifier, which offers a comprehensive approach to stroke prediction and provides a more robust and accurate solution. The results underscore the effectiveness of the proposed methodology in enhancing the accuracy of brain stroke detection, paving the way for potential advancements in medical diagnosis and treatment.
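
    A hedged sketch of the optimizer comparison described above, using Keras with a synthetic stand-in dataset; the layer sizes, epoch count, and data are illustrative assumptions, not the paper's reported configuration:

        import tensorflow as tf
        from sklearn.datasets import make_classification
        from sklearn.model_selection import train_test_split

        # Synthetic stand-in for the stroke dataset (scaled features, binary labels).
        X, y = make_classification(n_samples=1000, n_features=10, random_state=0)
        X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

        def build_mlp(n_features):
            # A small MLP; layer sizes are assumptions for the example.
            return tf.keras.Sequential([
                tf.keras.layers.Input(shape=(n_features,)),
                tf.keras.layers.Dense(64, activation="relu"),
                tf.keras.layers.Dense(32, activation="relu"),
                tf.keras.layers.Dense(1, activation="sigmoid"),  # stroke / no stroke
            ])

        # Train one identical model per optimizer and compare test accuracy.
        for name, opt in {
            "AdaMax": tf.keras.optimizers.Adamax(),
            "RMSProp": tf.keras.optimizers.RMSprop(),
            "Adadelta": tf.keras.optimizers.Adadelta(),
        }.items():
            model = build_mlp(X_train.shape[1])
            model.compile(optimizer=opt, loss="binary_crossentropy", metrics=["accuracy"])
            model.fit(X_train, y_train, epochs=30, batch_size=32, verbose=0)
            _, acc = model.evaluate(X_test, y_test, verbose=0)
            print(f"{name}: test accuracy {acc:.3f}")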

    Agrupamiento espacio-temporal de secuencias de vídeo mediante caracterización por la respuesta de redes convolucionales

    Full text link
    This undergraduate thesis studies the feasibility of spatio-temporal clustering of the frames of a video captured from a moving vehicle or a walking person, using visual features extracted after training convolutional neural networks. The ultimate goal is to divide the trajectory followed by the vehicle into groups according to its changes in orientation or position (block changes), based exclusively on visual analysis of the images captured by the on-board camera. To this end, a database is built using images from the Google Street View repository, composed of videos captured by the vehicle while following trajectories in different places around the world. Once the database is obtained, images that are very close to their predecessor are identified and removed, yielding consistent trajectories whose images do not contribute redundant information during training. With the database created, a network pre-trained for image classification, GoogleNet, is loaded. To fit this network to our database, tests are run sweeping different hyperparameter values to establish the best network configuration. Once these are found, several trainings are performed for automatic video classification, varying how open each layer is to learning in order to reuse part of the features learned in previous trainings. Having trained the networks, features are extracted for each video frame by truncating the network at different levels. With these features, distances are computed between all frames of each video, producing distance matrices. Finally, using the distance matrices, multiple tests are run with different clustering algorithms, distance evaluation metrics, and criteria for obtaining clusters. In these tests, the feasibility of the clusters obtained for each trajectory is assessed through prospective case studies. Preliminary results suggest that the designed method can obtain spatio-temporal clusterings of the video frames that roughly match the trajectory followed by the vehicle.
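
    A minimal sketch of the extract-features-then-cluster pipeline described above, using PyTorch; the truncation point (dropping only the final classifier), the distance metric, and the cluster count are assumptions, not the thesis's configuration:

        import torch
        import torchvision
        from scipy.spatial.distance import pdist, squareform
        from sklearn.cluster import AgglomerativeClustering

        # Pretrained GoogLeNet truncated before the classifier so it emits
        # 1024-dim per-frame descriptors.
        net = torchvision.models.googlenet(weights="IMAGENET1K_V1")
        net.fc = torch.nn.Identity()
        net.eval()

        # Stand-in batch of video frames (N, 3, 224, 224); real frames would be
        # decoded from the video and normalized with ImageNet statistics.
        frames = torch.rand(32, 3, 224, 224)

        with torch.no_grad():
            feats = net(frames).numpy()                      # (N, 1024) features

        dist = squareform(pdist(feats, metric="euclidean"))  # (N, N) distance matrix

        # Cluster frames on the precomputed distances; the number of trajectory
        # segments (4) is an assumption for the example.
        labels = AgglomerativeClustering(
            n_clusters=4, metric="precomputed", linkage="average"
        ).fit_predict(dist)
        print(labels)                                        # one segment label per frame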

    MOR-UAV: A Benchmark Dataset and Baselines for Moving Object Recognition in UAV Videos

    Full text link
    Visual data collected from Unmanned Aerial Vehicles (UAVs) has opened a new frontier of computer vision that requires automated analysis of aerial images and videos. However, existing UAV datasets primarily focus on object detection, and an object detector does not differentiate between moving and non-moving objects. Given a real-time UAV video stream, how can we both localize and classify the moving objects, i.e., perform moving object recognition (MOR)? MOR is one of the essential tasks supporting various UAV vision-based applications, including aerial surveillance, search and rescue, event recognition, and urban and rural scene understanding. To the best of our knowledge, no labeled dataset is available for MOR evaluation in UAV videos. Therefore, in this paper, we introduce MOR-UAV, a large-scale video dataset for MOR in aerial videos. We achieve this by labeling axis-aligned bounding boxes for moving objects, which requires fewer computational resources than producing pixel-level estimates. We annotate 89,783 moving object instances collected from 30 UAV videos, comprising 10,948 frames across varied scenarios such as different weather conditions, occlusion, changing flight altitude, and multiple camera views. Labels are assigned for two vehicle categories: car and heavy vehicle. Furthermore, we propose a deep unified framework, MOR-UAVNet, for MOR in UAV videos. Since this is the first attempt at MOR in UAV videos, we present 16 baseline results based on the proposed framework over the MOR-UAV dataset through quantitative and qualitative experiments. We also analyze the motion-salient regions in the network through multiple layer visualizations. MOR-UAVNet works online at inference, as it requires only a few past frames, and it does not require predefined target initialization from the user. Experiments also demonstrate that the MOR-UAV dataset is quite challenging.
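
    To make the MOR task concrete, here is a toy baseline that separates moving from static detections by frame differencing inside each axis-aligned box. This is not MOR-UAVNet; the detector, function names, and threshold are all assumptions for illustration:

        import cv2
        import numpy as np

        def filter_moving(prev_gray, cur_gray, detections, motion_thresh=8.0):
            """Keep only axis-aligned boxes whose region changed between frames.

            detections: list of (x, y, w, h, label) boxes from any object
            detector run on the current frame."""
            diff = cv2.absdiff(cur_gray, prev_gray)
            moving = []
            for (x, y, w, h, label) in detections:
                region = diff[y:y + h, x:x + w]
                if region.size and region.mean() > motion_thresh:
                    moving.append((x, y, w, h, label))  # enough temporal change -> moving
            return moving

        # Example with synthetic frames and one hypothetical "car" detection.
        prev = np.zeros((480, 640), np.uint8)
        cur = prev.copy()
        cur[100:150, 200:260] = 255  # simulated displaced object
        print(filter_moving(prev, cur, [(200, 100, 60, 50, "car")]))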