Search CORE

165 research outputs found

Robust object detection in the wild via cascaded DCGAN

Author: Dinakaran Ranjith Kumar
Publication venue
Publication date
Field of study

This research deals with the challenges of object detection at a distance or low resolution in the wild. The main intention of this research is to exploit and cascade state-of-the-art models and propose a new framework for enabling successful deployment for diverse applications. Specifically, the proposed deep learning framework uses state-of-the-art deep networks, such as Deep Convolutional Generative Adversarial Network (DCGAN) and Single Shot Detector (SSD). It combines the above two deep learning models to generate a new framework, namely DCGAN-SSD. The proposed model can deal with object detection and recognition in the wild with various image resolutions and scaling differences. To deal with multiple object detection tasks, the training of this network model in this research has been conducted using different cross-domain datasets for various applications. The efficiency of the proposed model can further be determined by the validation of diverse applications such as visual surveillance in the wild in intelligent cities, underwater object detection for crewless underwater vehicles, and on-street in-vehicle object detection for driverless vehicle technologies. The results produced by DCGAN-SSD indicate that the proposed method in this research, along with Particle Swarm Optimization (PSO), outperforms every other application concerning object detection and demonstrates its great superiority in improving object detection performance in diverse testing cases. The DCGAN-SSD model is equipped with PSO, which helps select the hyperparameter for the object detector. Most object detectors struggle in this regard, as they require manual effort in selecting the hyperparameters to obtain better object detection. This research encountered the problem of hyperparameter selection through the integration of PSO with SSD. The main reason the research conducted with deep learning models was the traditional machine learning models lag in accuracy and performance. The advantage of this research and it is achieved with the integration of DCGAN-SSD has been accommodated under a single pipeline

Northumbria Research Link

Towards a Common Software/Hardware Methodology for Future Advanced Driver Assistance Systems

Author
Publication venue: 'Informa UK Limited'
Publication date: 28/11/2022
Field of study

The European research project DESERVE (DEvelopment platform for Safe and Efficient dRiVE, 2012-2015) had the aim of designing and developing a platform tool to cope with the continuously increasing complexity and the simultaneous need to reduce cost for future embedded Advanced Driver Assistance Systems (ADAS). For this purpose, the DESERVE platform profits from cross-domain software reuse, standardization of automotive software component interfaces, and easy but safety-compliant integration of heterogeneous modules. This enables the development of a new generation of ADAS applications, which challengingly combine different functions, sensors, actuators, hardware platforms, and Human Machine Interfaces (HMI). This book presents the different results of the DESERVE project concerning the ADAS development platform, test case functions, and validation and evaluation of different approaches. The reader is invited to substantiate the content of this book with the deliverables published during the DESERVE project. Technical topics discussed in this book include:Modern ADAS development platforms;Design space exploration;Driving modelling;Video-based and Radar-based ADAS functions;HMI for ADAS;Vehicle-hardware-in-the-loop validation system

Directory of Open Access Books (DOAB)

Towards a Common Software/Hardware Methodology for Future Advanced Driver Assistance Systems

Author
Publication venue: 'Informa UK Limited'
Publication date
Field of study

OAPEN Library

Automatic vision based fault detection on electricity transmission components using very highresolution

Author: Igwe Chukwuemeka Fortune
Publication venue
Publication date: 26/02/2021
Field of study

Dissertation submitted in partial fulfilment of the requirements for the Degree of Master of Science in Geospatial TechnologiesElectricity is indispensable to modern-day governments and citizenry’s day-to-day operations. Fault identification is one of the most significant bottlenecks faced by Electricity transmission and distribution utilities in developing countries to deliver credible services to customers and ensure proper asset audit and management for network optimization and load forecasting. This is due to data scarcity, asset inaccessibility and insecurity, ground-surveys complexity, untimeliness, and general human cost. In this context, we exploit the use of oblique drone imagery with a high spatial resolution to monitor four major Electric power transmission network (EPTN) components condition through a fine-tuned deep learning approach, i.e., Convolutional Neural Networks (CNNs). This study explored the capability of the Single Shot Multibox Detector (SSD), a onestage object detection model on the electric transmission power line imagery to localize, classify and inspect faults present. The components fault considered include the broken insulator plate, missing insulator plate, missing knob, and rusty clamp. The adopted network used a CNN based on a multiscale layer feature pyramid network (FPN) using aerial image patches and ground truth to localise and detect faults via a one-phase procedure. The SSD Rest50 architecture variation performed the best with a mean Average Precision of 89.61%. All the developed SSD based models achieve a high precision rate and low recall rate in detecting the faulty components, thus achieving acceptable balance levels F1-score and representation. Finally, comparable to other works of literature within this same domain, deep-learning will boost timeliness of EPTN inspection and their component fault mapping in the long - run if these deep learning architectures are widely understood, adequate training samples exist to represent multiple fault characteristics; and the effects of augmenting available datasets, balancing intra-class heterogeneity, and small-scale datasets are clearly understood

Repositório da Universidade Nova de Lisboa

Advances in Sensors, Big Data and Machine Learning in Intelligent Animal Farming

Author
Publication venue: 'MDPI AG'
Publication date: 21/06/2022
Field of study

Animal production (e.g., milk, meat, and eggs) provides valuable protein production for human beings and animals. However, animal production is facing several challenges worldwide such as environmental impacts and animal welfare/health concerns. In animal farming operations, accurate and efficient monitoring of animal information and behavior can help analyze the health and welfare status of animals and identify sick or abnormal individuals at an early stage to reduce economic losses and protect animal welfare. In recent years, there has been growing interest in animal welfare. At present, sensors, big data, machine learning, and artificial intelligence are used to improve management efficiency, reduce production costs, and enhance animal welfare. Although these technologies still have challenges and limitations, the application and exploration of these technologies in animal farms will greatly promote the intelligent management of farms. Therefore, this Special Issue will collect original papers with novel contributions based on technologies such as sensors, big data, machine learning, and artificial intelligence to study animal behavior monitoring and recognition, environmental monitoring, health evaluation, etc., to promote intelligent and accurate animal farm management

Directory of Open Access Books (DOAB)

Multi-Object Tracking System based on LiDAR and RADAR for Intelligent Vehicles applications

Author: Montiel Marín Santiago
Publication venue
Publication date: 01/01/2021
Field of study

El presente Trabajo Fin de Grado tiene como objetivo el desarrollo de un Sistema de Detección y Multi-Object Tracking 3D basado en la fusión sensorial de LiDAR y RADAR para aplicaciones de conducción autónoma basándose en algoritmos tradicionales de Machine Learning. La implementación realizada está basada en Python, ROS y cumple requerimientos de tiempo real. En la etapa de detección de objetos se utiliza el algoritmo de segmentación del plano RANSAC, para una posterior extracción de Bounding Boxes mediante DBSCAN. Una Late Sensor Fusion mediante Intersection over Union 3D y un sistema de tracking BEV-SORT completan la arquitectura propuesta.This Final Degree Project aims to develop a 3D Multi-Object Tracking and Detection System based on the Sensor Fusion of LiDAR and RADAR for autonomous driving applications based on traditional Machine Learning algorithms. The implementation is based on Python, ROS and complies with real-time requirements. In the Object Detection stage, the RANSAC plane segmentation algorithm is used, for a subsequent extraction of Bounding Boxes using DBSCAN. A Late Sensor Fusion using Intersection over Union 3D and a BEV-SORT tracking system complete the proposed architecture.Grado en Ingeniería en Electrónica y Automática Industria

e_Buah - Biblioteca Digital de la Universidad de Alcalá

Artificial Intelligence for the Edge Computing Paradigm.

Author: DAHER ALI
Publication venue: Università degli studi di Genova
Publication date: 17/03/2023
Field of study

With modern technologies moving towards the internet of things where seemingly every financial, private, commercial and medical transaction being carried out by portable and intelligent devices; Machine Learning has found its way into every smart device and application possible. However, Machine Learning cannot be used on the edge directly due to the limited capabilities of small and battery-powered modules. Therefore, this thesis aims to provide light-weight automated Machine Learning models which are applied on a standard edge device, the Raspberry Pi, where one framework aims to limit parameter tuning while automating feature extraction and a second which can perform Machine Learning classification on the edge traditionally, and can be used additionally for image-based explainable Artificial Intelligence. Also, a commercial Artificial Intelligence software have been ported to work in a client/server setups on the Raspberry Pi board where it was incorporated in all of the Machine Learning frameworks which will be presented in this thesis. This dissertation also introduces multiple algorithms that can convert images into Time-series for classification and explainability but also introduces novel Time-series feature extraction algorithms that are applied to biomedical data while introducing the concept of the Activation Engine, which is a post-processing block that tunes Neural Networks without the need of particular experience in Machine Leaning. Also, a tree-based method for multiclass classification has been introduced which outperforms the One-to-Many approach while being less complex that the One-to-One method.\par The results presented in this thesis exhibit high accuracy when compared with the literature, while remaining efficient in terms of power consumption and the time of inference. Additionally the concepts, methods or algorithms that were introduced are particularly novel technically, where they include: • Feature extraction of professionally annotated, and poorly annotated time-series. • The introduction of the Activation Engine post-processing block. • A model for global image explainability with inference on the edge. • A tree-based algorithm for multiclass classification

Archivio istituzionale della ricerca - Università di Genova

Text-detection and -recognition from natural images

Author: Hanaa Mahmood (1249353)
Publication venue
Publication date: 10/02/2020
Field of study

Text detection and recognition from images could have numerous functional applications for document analysis, such as assistance for visually impaired people; recognition of vehicle license plates; evaluation of articles containing tables, street signs, maps, and diagrams; keyword-based image exploration; document retrieval; recognition of parts within industrial automation; content-based extraction; object recognition; address block location; and text-based video indexing. This research exploited the advantages of artificial intelligence (AI) to detect and recognise text from natural images. Machine learning and deep learning were used to accomplish this task.In this research, we conducted an in-depth literature review on the current detection and recognition methods used by researchers to identify the existing challenges, wherein the differences in text resulting from disparity in alignment, style, size, and orientation combined with low image contrast and a complex background make automatic text extraction a considerably challenging and problematic task. Therefore, the state-of-the-art suggested approaches obtain low detection rates (often less than 80%) and recognition rates (often less than 60%). This has led to the development of new approaches. The aim of the study was to develop a robust text detection and recognition method from natural images with high accuracy and recall, which would be used as the target of the experiments. This method could detect all the text in the scene images, despite certain specific features associated with the text pattern. Furthermore, we aimed to find a solution to the two main problems concerning arbitrarily shaped text (horizontal, multi-oriented, and curved text) detection and recognition in a low-resolution scene and with various scales and of different sizes.In this research, we propose a methodology to handle the problem of text detection by using novel combination and selection features to deal with the classification algorithms of the text/non-text regions. The text-region candidates were extracted from the grey-scale images by using the MSER technique. A machine learning-based method was then applied to refine and validate the initial detection. The effectiveness of the features based on the aspect ratio, GLCM, LBP, and HOG descriptors was investigated. The text-region classifiers of MLP, SVM, and RF were trained using selections of these features and their combinations. The publicly available datasets ICDAR 2003 and ICDAR 2011 were used to evaluate the proposed method. This method achieved the state-of-the-art performance by using machine learning methodologies on both databases, and the improvements were significant in terms of Precision, Recall, and F-measure. The F-measure for ICDAR 2003 and ICDAR 2011 was 81% and 84%, respectively. The results showed that the use of a suitable feature combination and selection approach could significantly increase the accuracy of the algorithms.A new dataset has been proposed to fill the gap of character-level annotation and the availability of text in different orientations and of curved text. The proposed dataset was created particularly for deep learning methods which require a massive completed and varying range of training data. The proposed dataset includes 2,100 images annotated at the character and word levels to obtain 38,500 samples of English characters and 12,500 words. Furthermore, an augmentation tool has been proposed to support the proposed dataset. The missing of object detection augmentation tool encroach to proposed tool which has the ability to update the position of bounding boxes after applying transformations on images. This technique helps to increase the number of samples in the dataset and reduce the time of annotations where no annotation is required. The final part of the thesis presents a novel approach for text spotting, which is a new framework for an end-to-end character detection and recognition system designed using an improved SSD convolutional neural network, wherein layers are added to the SSD networks and the aspect ratio of the characters is considered because it is different from that of the other objects. Compared with the other methods considered, the proposed method could detect and recognise characters by training the end-to-end model completely. The performance of the proposed method was better on the proposed dataset; it was 90.34. Furthermore, the F-measure of the method’s accuracy on ICDAR 2015, ICDAR 2013, and SVT was 84.5, 91.9, and 54.8, respectively. On ICDAR13, the method achieved the second-best accuracy. The proposed method could spot text in arbitrarily shaped (horizontal, oriented, and curved) scene text.</div

Loughborough University Institutional Repository

Proceedings of the 2021 Joint Workshop of Fraunhofer IOSB and Institute for Anthropomatics, Vision and Fusion Laboratory

Author: Beyerer Jürgen
Zander Tim
Publication venue: KIT Scientific Publishing, Karlsruhe
Publication date: 01/01/2022
Field of study

2021, the annual joint workshop of the Fraunhofer IOSB and KIT IES was hosted at the IOSB in Karlsruhe. For a week from the 2nd to the 6th July the doctoral students extensive reports on the status of their research. The results and ideas presented at the workshop are collected in this book in the form of detailed technical reports

KITopen