
233 research outputs found

Automatic CNN channel selection and effective detection on face and rotated aerial objects

Author: Fang Zhenyu.
Publication venue
Publication date: 01/01/2020
Field of study

Balancing accuracy and computational cost is a challenging task in computer vision. This is especially true for convolutional neural networks (CNNs), which require far more processing power than traditional learning algorithms. This thesis aims to develop new CNN structures and loss functions to tackle the accuracy-efficiency imbalance in image classification and object detection, two fundamental yet challenging tasks in computer vision. For a CNN-based object detector, the main computational cost comes from the feature extractor (backbone), which was originally designed for image classification. Optimising the structure of a CNN for image classification therefore also brings benefits when it is applied to object detection. Although the outputs of detectors may vary across detection tasks, the challenges and design principles are similar across detectors. This thesis therefore starts with face detection (i.e. a single-object detection task), which is a significant branch of object detection and is widely used in real life. After that, object detection in aerial images, a more challenging detection task, is investigated.
Specifically, the objectives of this thesis are: 1. optimising CNN structures for image classification; 2. developing a face detector that enables a trade-off between computational cost and accuracy; and 3. proposing an object detector for aerial images that suppresses background noise without damaging inference efficiency.
For the first objective, this thesis aims to automatically optimise the topology of CNNs to generate the structure of fixed-length models, in which unnecessary convolutional kernels are removed. Experimental results demonstrate that the optimised model achieves accuracy comparable to state-of-the-art models across a broad range of datasets, whilst significantly reducing the number of parameters.
To tackle the accuracy-efficiency imbalance in face detection, a novel context-enhanced approach is proposed that improves the face detector in terms of both loss function and structure. For loss function optimisation, a hierarchical loss, referred to as 'triple loss' in this thesis, is introduced to optimise the feature pyramid network (FPN) based face detector. For structural optimisation, this thesis proposes a context-sensitive structure to increase the capacity of the network prediction. Experimental results indicate that the proposed method achieves a good balance between the accuracy and computational cost of face detection.
To suppress background noise in aerial image object detection, this thesis presents a two-stage detector named 'SAFDet'. More specifically, a rotation anchor-free branch (RAFB) is proposed to regress the precise rectangle boundary. As the RAFB is anchor free, the computational cost is negligible during training. Meanwhile, a centre prediction module (CPM) is introduced to enhance target localisation and the suppression of background noise. As the CPM is only deployed during training, it does not increase the computational cost of inference. Experimental results indicate that the proposed method achieves a good balance between accuracy and computational cost while effectively suppressing background noise.
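The abstract above mentions removing unnecessary convolutional kernels but does not say how they are selected. As a rough illustration only, the sketch below uses a common stand-in criterion (ranking kernels by the L1 norm of their weights), which is an assumption and not the thesis's automatic selection method. The names `l1_norm`, `select_kernels` and `keep_ratio` are hypothetical.

```python
import math
from typing import List, Sequence


def l1_norm(kernel) -> float:
    """Sum of absolute weights of one kernel given as (nested) lists of numbers."""
    if isinstance(kernel, (int, float)):
        return abs(float(kernel))
    return sum(l1_norm(part) for part in kernel)


def select_kernels(kernels: Sequence, keep_ratio: float) -> List[int]:
    """Return the sorted indices of the kernels with the largest L1 norms.

    Assumption: a small L1 norm marks a kernel as 'unnecessary'. This is a
    generic magnitude-based heuristic, not the criterion used in the thesis.
    """
    if not 0.0 < keep_ratio <= 1.0:
        raise ValueError("keep_ratio must be in (0, 1]")
    n_keep = max(1, math.ceil(len(kernels) * keep_ratio))
    # Rank kernel indices from largest to smallest norm (ties keep input order).
    ranked = sorted(range(len(kernels)),
                    key=lambda i: l1_norm(kernels[i]),
                    reverse=True)
    return sorted(ranked[:n_keep])


# Four toy 2x2 kernels; two of them carry almost no weight.
toy_kernels = [
    [[0.0, 0.1], [0.0, 0.0]],
    [[1.0, -2.0], [0.5, 0.5]],
    [[0.01, 0.0], [0.0, 0.0]],
    [[-3.0, 0.0], [0.0, 1.0]],
]
kept = select_kernels(toy_kernels, keep_ratio=0.5)
```

In a real network the kept indices would be used to slice the layer's weight tensor and the matching input channels of the following layer.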

STAX (Strathclyde Repository)

Application of deep learning methods in materials microscopy for the quality assessment of lithium-ion batteries and sintered NdFeB magnets

Author: Badmos Olatomiwa
Publication venue: KIT-Bibliothek, Karlsruhe
Publication date: 19/02/2023
Field of study

Quality control focuses on detecting product defects and monitoring activities to verify that products meet the desired quality standard. Many approaches to quality control use specialised image-processing software based on hand-crafted features designed by experts to detect objects and analyse images. However, these models are tedious and costly to develop and hard to maintain, while the resulting solution is often brittle and requires substantial adaptation for even slightly different use cases. For these reasons, quality control in industry is still frequently performed manually, which is time-consuming and error-prone. We therefore propose a more general data-driven approach that builds on recent advances in computer vision technology and uses convolutional neural networks to learn representative features directly from the data. While conventional methods use hand-crafted features to detect individual objects, deep learning approaches learn generalisable features directly from the training samples to detect a variety of objects. In this dissertation, models and techniques are developed for the automated detection of defects in light-microscopy images of materialographically prepared sections. We develop defect detection models that can be broadly divided into supervised and unsupervised deep learning techniques. In particular, several supervised deep learning models are developed for detecting defects in the microstructure of lithium-ion batteries, ranging from binary classification models based on a sliding-window approach with limited training data to complex defect detection and localisation models based on one- and two-stage detectors.
Our final model can detect and localise multiple classes of defects in large microscopy images with high accuracy and in near real time. However, successfully training supervised deep learning models usually requires a sufficiently large number of labelled training examples, which are often not readily available and can be very costly to obtain. We therefore propose two approaches based on unsupervised deep learning for detecting anomalies in the microstructure of sintered NdFeB magnets without the need for labelled training data. The models are able to detect defects by learning, from the training data, indicative features of only "normal" microstructure patterns. We present experimental results for the proposed defect detection systems by performing a quality assessment on commercial samples of lithium-ion batteries and sintered NdFeB magnets.
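The sliding-window classification mentioned in the abstract above can be sketched as follows. This is a minimal illustration under stated assumptions, not the dissertation's implementation: the image is a plain nested list of pixel values, `is_defective` is a hypothetical stand-in for a trained binary classifier, and the window size and stride are arbitrary.

```python
from typing import Callable, Iterator, List, Sequence, Tuple

Box = Tuple[int, int, int, int]  # (top, left, bottom, right), end-exclusive


def window_starts(size: int, window: int, stride: int) -> List[int]:
    """Start offsets along one axis; the last window is shifted to touch the border."""
    if window <= 0 or stride <= 0:
        raise ValueError("window and stride must be positive")
    if window > size:
        raise ValueError("window is larger than the image")
    starts = list(range(0, size - window + 1, stride))
    if starts[-1] != size - window:
        starts.append(size - window)
    return starts


def sliding_windows(height: int, width: int, window: int, stride: int) -> Iterator[Box]:
    """Yield square boxes that together cover the whole image."""
    for top in window_starts(height, window, stride):
        for left in window_starts(width, window, stride):
            yield (top, left, top + window, left + window)


def flag_windows(image: Sequence[Sequence[float]],
                 is_defective: Callable[[List[Sequence[float]]], bool],
                 window: int, stride: int) -> List[Box]:
    """Run the (hypothetical) binary classifier on every patch and keep the hits."""
    height, width = len(image), len(image[0])
    flagged = []
    for top, left, bottom, right in sliding_windows(height, width, window, stride):
        patch = [row[left:right] for row in image[top:bottom]]
        if is_defective(patch):
            flagged.append((top, left, bottom, right))
    return flagged


# Toy 4x4 image with one bright "defect" pixel in the lower-right corner.
toy_image = [[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 9]]
hits = flag_windows(toy_image, lambda p: any(v > 5 for row in p for v in row),
                    window=2, stride=2)
```

A stride smaller than the window gives overlapping patches, which trades extra classifier calls for finer localisation.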

KITopen

Machine Learning Algorithms for Robotic Navigation and Perception and Embedded Implementation Techniques

Author: SALVETTI FRANCESCO
Publication venue: country:Italy
Publication date: 25/09/2023
Field of study

The abstract is in the attachment.

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

Advances in Sensors, Big Data and Machine Learning in Intelligent Animal Farming

Author
Publication venue: 'MDPI AG'
Publication date: 21/06/2022
Field of study

Animal production (e.g., milk, meat, and eggs) is a valuable source of protein for human beings and animals. However, animal production faces several challenges worldwide, such as environmental impacts and animal welfare and health concerns. In animal farming operations, accurate and efficient monitoring of animal information and behavior can help analyze the health and welfare status of animals and identify sick or abnormal individuals at an early stage, reducing economic losses and protecting animal welfare. In recent years, there has been growing interest in animal welfare. At present, sensors, big data, machine learning, and artificial intelligence are used to improve management efficiency, reduce production costs, and enhance animal welfare. Although these technologies still have challenges and limitations, applying and exploring them on animal farms will greatly promote intelligent farm management. Therefore, this Special Issue will collect original papers with novel contributions based on technologies such as sensors, big data, machine learning, and artificial intelligence that study animal behavior monitoring and recognition, environmental monitoring, health evaluation, etc., to promote intelligent and accurate animal farm management.

Directory of Open Access Books (DOAB)

A survey of the application of soft computing to investment and financial trading

Author: Tan Clarence
Vanstone Bruce J
Publication venue: The Australian Pattern Recognition Society
Publication date: 01/01/2003
Field of study

Bond University Research Portal

Development of Machine Learning Based Analytical Tools for Pavement Performance Assessment and Crack Detection

Author: Huyan Ju
Publication venue: 'University of Waterloo'
Publication date: 02/12/1941
Field of study

Pavement Management System (PMS) analytical tools mainly consist of pavement condition investigation and evaluation tools, pavement condition rating and assessment tools, pavement performance prediction tools, and treatment prioritization and implementation tools. The effectiveness of a PMS depends heavily on the efficiency and reliability of its pavement condition evaluation tools. Traditionally, pavement condition investigation and evaluation rely on manual distress surveys and performance level assessments, which have been criticized for low efficiency and low reliability. Such manual surveys are labor intensive and unsafe because of their proximity to live traffic. Their accuracy can also suffer from the subjective judgment of the evaluators. For these reasons, semi-automated and automated pavement condition evaluation tools have been under development for several years. In recent years, advanced computerized technologies have undoubtedly been applied successfully in diverse engineering fields. If these techniques can be incorporated into pavement condition evaluation and distress detection, the resulting analytical tools can improve the performance of existing PMSs. Hence, this research aims to bridge the gap between advanced Machine Learning Techniques (MLTs) and the analytical tools of current PMSs. The research outputs are intended to provide pavement condition evaluation tools that meet the requirements of high efficiency, accuracy, and reliability. To achieve these objectives, six pavement damage condition and performance evaluation methodologies are developed. The roughness of the pavement surface directly influences riding quality for users. The International Roughness Index (IRI) is used worldwide by research institutions and by pavement condition evaluation and management agencies to evaluate pavement roughness.
IRI is a time-dependent variable that generally increases over the pavement service life. With this in mind, an IRI prediction model based on multi-granularity fuzzy time series analysis is developed. Particle Swarm Optimization (PSO) is used to optimize the model and obtain satisfactory IRI prediction results. Historical IRI data extracted from the InfoPave website are used for training and testing the model. Experimental results demonstrate the effectiveness of this method. Automated pavement condition evaluation tools can provide overall performance indices, which can then be used for treatment planning. Calculating those performance indices requires evaluations of both surface distress level and roughness condition. However, pavement surface roughness is hard to obtain from surface image indicators. With this in mind, tools are developed that predict pavement roughness and overall performance from image indicators. The state-of-the-art machine learning technique XGBoost is used as the main method for model training, validation, and testing. To find the dominant image indicators that influence pavement roughness and overall performance, the comprehensive pavement performance evaluation data collected by ARAN 900 are analyzed. A Back Propagation Neural Network (BPNN) is used to develop the performance prediction models. On this basis, the mean important values (MIVs) of each input factor are calculated to evaluate the contributions of the input indicators. The indicators of wheel path cracking are observed to have the highest MIVs, which emphasizes the importance of cracking-focused maintenance treatments. A further issue is that current automated pavement condition evaluation systems analyze only pavement surface distresses, without considering the structural capacity of the actual pavement.
Hence, pavement performance prediction tools based on structural performance analysis are developed using Support Vector Machines (SVMs). To guarantee the overall performance of the proposed methodologies, heuristic methods including the Genetic Algorithm (GA) and Particle Swarm Optimization (PSO) are selected to optimize the model. The experimental results show a promising future for machine learning based pavement structural performance prediction. Automated pavement condition analyzers usually detect pavement surface distress from collected pavement surface images. Distress types, severities, quantities, and other parameters are then calculated for the overall performance index. Cracks are among the most important pavement surface distresses to quantify. Traditional approaches are less accurate and efficient in locating, counting, and quantifying the various types of cracks initiated on the pavement surface. An integrated Crack Deep Net (CrackDN) is developed based on deep learning technologies. Model training, validation, and testing show that CrackDN can detect pavement surface cracks against complex backgrounds with high accuracy. Moreover, combining box-level crack locating with pixel-level crack calculation can achieve comprehensive crack analysis, so that more effective maintenance treatments can be assigned. Hence, a pixel-level crack detection methodology called CrackU-net is proposed. CrackU-net is composed of several convolutional, max-pooling, and up-convolutional layers. The model builds on innovations in deep learning-based segmentation. Pavement crack data are collected by multiple devices, including automated pavement condition survey vehicles, smartphones, and action cameras. The proposed CrackU-net is tested on a separate crack image set that was not used for training the model. The results demonstrate promising potential for use in PMSs.
Finally, the proposed toolboxes are validated through comparative experiments in terms of accuracy (precision, recall, and F-measure) and error levels. The accuracies of all the models are higher than 0.9 and the errors are lower than 0.05. The findings of this research also suggest that wheel path cracking should be a priority in maintenance activity planning. With advanced machine learning technologies, predicting pavement roughness and overall performance levels from extracted image indicators is promising. Moreover, deep learning methods can achieve both box-level and pixel-level pavement crack detection with satisfactory performance. Therefore, it is suggested that these state-of-the-art toolboxes be integrated into current PMSs to upgrade their service levels.
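For readers unfamiliar with the accuracy measures named in the abstract above (precision, recall, and F-measure), the following sketch shows the standard definitions computed from detection counts. The function name and the example counts are illustrative assumptions and are not taken from the thesis.

```python
from typing import Tuple


def precision_recall_f(tp: int, fp: int, fn: int,
                       beta: float = 1.0) -> Tuple[float, float, float]:
    """Standard precision, recall and F-beta from true/false positive and false negative counts.

    precision = tp / (tp + fp)
    recall    = tp / (tp + fn)
    F_beta    = (1 + beta^2) * precision * recall / (beta^2 * precision + recall)
    Undefined ratios (zero denominators) are reported as 0.0 by convention here.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision == 0.0 and recall == 0.0:
        return precision, recall, 0.0
    b2 = beta * beta
    f_measure = (1 + b2) * precision * recall / (b2 * precision + recall)
    return precision, recall, f_measure


# Made-up counts: 90 cracks found correctly, 10 false alarms, 10 cracks missed.
p, r, f1 = precision_recall_f(tp=90, fp=10, fn=10)
```

With `beta=1.0` the F-measure is the harmonic mean of precision and recall; a larger `beta` weights recall more heavily.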

University of Waterloo's Institutional Repository

University of Central Florida (UCF): STARS (Showcase of Text, Archives, Research & Scholarship)

Development of Machine Learning Based Analytical Tools for Pavement Performance Assessment and Crack Detection

Author: Huyan Ju
Publication venue: 'University of Waterloo'
Publication date: 07/08/2019
Field of study

University of Waterloo's Institutional Repository

Gaining Insight into Determinants of Physical Activity using Bayesian Network Learning

Author: Bemelmans R.
Bolman C.
Cao L.
Hommersom A.J.
Lechner L.
Tummers S.
Publication venue: 'Leiden University Library - OAPEN'
Publication date: 01/01/2020
Field of study

Contains fulltext: 228326pre.pdf (preprint version) (Open Access). Contains fulltext: 228326pub.pdf (publisher's version) (Open Access). BNAIC/BeneLearn 202

Open University of the Netherlands Research Portal

Radboud Repository