Recent Advances in Transfer Learning for Cross-Dataset Visual Recognition: A Problem-Oriented Perspective
This paper takes a problem-oriented perspective and presents a comprehensive
review of transfer learning methods, both shallow and deep, for cross-dataset
visual recognition. Specifically, it categorises cross-dataset recognition
into seventeen problems based on a set of carefully chosen data and label
attributes. This problem-oriented taxonomy allows us to examine how
different transfer learning approaches tackle each problem and how well each
problem has been researched to date. The review not only reveals the
challenges of transfer learning for visual recognition, but also identifies
the problems (eight of the seventeen) that have been scarcely studied. This
survey thus offers researchers an up-to-date technical review, and gives
machine learning practitioners a systematic way to categorise a real problem
and look up a possible solution.
SaliencyGAN: Deep Learning Semisupervised Salient Object Detection in the Fog of IoT
In the modern Internet of Things (IoT), visual analysis and prediction are often performed by deep learning models. Salient object detection (SOD) is a fundamental preprocessing step for these applications. Executing SOD on fog devices is challenging due to the diversity of data and fog devices. To deploy convolutional neural networks (CNNs) on fog-cloud infrastructures for SOD-based applications, this article introduces a semisupervised adversarial learning method. The proposed model, named SaliencyGAN, is built on a novel concatenated generative adversarial network (GAN) framework with partially shared parameters. The backbone CNN can be chosen flexibly to suit specific devices and applications. Meanwhile, the method uses both labeled and unlabeled data from different problem domains for training. Using multiple popular benchmark datasets, we compared state-of-the-art baseline methods to SaliencyGAN trained with 10-100% labeled data. SaliencyGAN achieved performance comparable to the supervised baselines once the proportion of labeled data reached 30%, and outperformed the weakly supervised and unsupervised baselines. Furthermore, our ablation study shows that SaliencyGAN was more robust to the common "mode missing" (or "mode collapse") issue than the selected popular GAN models, and the visualized ablation results show that it learned a better estimate of the data distributions. To the best of our knowledge, this is the first IoT-oriented semisupervised SOD method.
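The semisupervised objective described above, a supervised term on the labeled subset plus an adversarial term on all data, can be sketched as a toy combined loss. The function name, the mean-squared supervised term, and the weighting `lam` below are illustrative assumptions, not the paper's actual formulation:

```python
import numpy as np

def semisupervised_loss(pred, target, labeled_mask, d_real_score, lam=0.1):
    """Toy semisupervised GAN objective: a supervised loss computed only on
    labeled samples, plus an adversarial (generator-fooling) term on all
    samples. All names and weights here are illustrative."""
    # Supervised term: MSE restricted to the labeled subset.
    sup = np.mean((pred[labeled_mask] - target[labeled_mask]) ** 2)
    # Adversarial term: generator wants the discriminator to score its
    # outputs as real (score close to 1).
    adv = -np.mean(np.log(d_real_score + 1e-8))
    return sup + lam * adv
```

Unlabeled samples still contribute through the adversarial term, which is what lets training exploit data without saliency annotations.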
Simultaneously encoding movement and sEMG-based stiffness for robotic skill learning
Transferring human stiffness regulation strategies to robots enables them to effectively and efficiently acquire adaptive impedance control policies for dealing with uncertainties during physical contact tasks in unstructured environments. In this work, we develop such a physical human-robot interaction (pHRI) system, which allows robots to learn variable impedance skills from human demonstrations. Specifically, biological signals, i.e., surface electromyography (sEMG), are used to extract human arm stiffness features during task demonstration. The estimated human arm stiffness is then mapped into a robot impedance controller. The dynamics of both movement and stiffness are modeled simultaneously by combining a hidden semi-Markov model (HSMM) with Gaussian mixture regression (GMR). More importantly, the correlation between movement information and stiffness information is encoded in a systematic manner. This approach captures uncertainties over time and space and allows the robot to satisfy both position and stiffness requirements in a task by modulating the impedance controller. An experimental study validated the proposed approach.
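The GMR half of such a model boils down to conditioning a joint Gaussian over concatenated movement and stiffness variables on the observed movement dimensions. A minimal single-component sketch (index names and dimensions are illustrative; the paper mixes several components weighted by the HSMM):

```python
import numpy as np

def gmr_conditional_mean(x, mu, sigma, in_idx, out_idx):
    """Conditional mean of a joint Gaussian, the core step of Gaussian
    mixture regression for one component: predict the output dimensions
    (e.g. stiffness) from the input dimensions (e.g. movement/time).

    mu_out|in = mu_o + Sigma_oi Sigma_ii^{-1} (x - mu_i)
    """
    mu_i, mu_o = mu[in_idx], mu[out_idx]
    s_ii = sigma[np.ix_(in_idx, in_idx)]   # input-input covariance block
    s_oi = sigma[np.ix_(out_idx, in_idx)]  # output-input cross-covariance
    return mu_o + s_oi @ np.linalg.solve(s_ii, x - mu_i)
```

The cross-covariance block `s_oi` is exactly where the movement-stiffness correlation mentioned in the abstract is encoded.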
Data-Driven Classification Methods for Craniosynostosis Using 3D Surface Scans
This thesis addresses the radiation-free classification of craniosynostosis,
with an additional focus on data augmentation and on the use of synthetic
data as a substitute for clinical data.
Motivation: Craniosynostosis is a condition affecting infants
that leads to head deformities. Diagnosis using radiation-free 3D surface
scans is a promising alternative to traditional computed tomography imaging.
Due to the low prevalence of the condition and the difficulty of anonymising
the data, clinical data are scarce. This thesis addresses these challenges
by proposing new classification algorithms, creating synthetic data for the
research community, and showing that clinical data can be replaced entirely
by synthetic data without impairing classification performance.
Methods: A statistical shape model (SSM) of craniosynostosis
patients is built and made publicly available. A 3D-to-2D conversion from
the 3D mesh geometry to a 2D image is proposed, enabling the use of
convolutional neural networks (CNNs) and image-based data augmentation.
Three classification approaches (based on cephalometric measurements, on
the SSM, and on the 2D images with a CNN) for distinguishing between three
pathologies and a control group are proposed and evaluated. Finally, the
clinical training data are replaced entirely by synthetic data from an SSM
and a generative adversarial network (GAN).
Results: The proposed CNN classification outperformed competing
approaches on a clinical dataset of 496 subjects, achieving an F1 score of
0.964. Data augmentation increased the F1 score to 0.975. Attributions of
the classification decisions showed high amplitudes at the parts of the
head associated with craniosynostosis. Replacing the clinical data with
synthetic data created with an SSM and a GAN still yielded an F1 score
above 0.95, without the model having seen a single clinical subject.
Conclusion: The proposed conversion of 3D geometry into a
2D-encoded image improved the performance of existing classifiers and
enabled data augmentation during training. Using an SSM and a GAN, clinical
training data could be replaced by synthetic data. This work improves
existing diagnostic approaches on radiation-free recordings and
demonstrates the usability of synthetic data, making clinical applications
more objective, more interpretable, and less costly.
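One plausible way to realise a 3D-to-2D conversion of the kind described is to map each mesh vertex to spherical coordinates and rasterise the radius as pixel intensity, so that standard 2D CNNs and image augmentation apply. The resolution and mapping below are assumptions illustrating the general idea, not the thesis' exact encoding:

```python
import numpy as np

def mesh_to_image(vertices, h=64, w=128):
    """Toy 3D-to-2D conversion: project vertices (N, 3) to spherical
    coordinates (azimuth, elevation) and write the radius into the
    corresponding pixel. Illustrative, not the thesis' exact method."""
    x, y, z = vertices.T
    r = np.sqrt(x**2 + y**2 + z**2)
    az = np.arctan2(y, x)                                     # [-pi, pi]
    el = np.arcsin(np.clip(z / np.maximum(r, 1e-9), -1, 1))   # [-pi/2, pi/2]
    # Quantise angles to pixel coordinates.
    u = ((az + np.pi) / (2 * np.pi) * (w - 1)).astype(int)
    v = ((el + np.pi / 2) / np.pi * (h - 1)).astype(int)
    img = np.zeros((h, w))
    img[v, u] = r  # last vertex per pixel wins in this simple sketch
    return img
```

Once head shape is encoded this way, ordinary image augmentations (shifts, noise, brightness) become cheap surrogates for geometric variation.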
Annotate and retrieve in vivo images using hybrid self-organizing map
Multimodal retrieval has gained much attention lately owing to its effectiveness over unimodal retrieval. For instance, visual features often under-constrain the description of an image in content-based retrieval; however, another modality, such as collateral text, can be introduced to bridge the semantic gap and make retrieval more efficient. This article proposes cross-modal fusion and retrieval on real in vivo gastrointestinal images and linguistic cues, since visual features alone are insufficient to describe an image and to assist gastroenterologists. A cross-modal information retrieval approach is therefore proposed to retrieve related images given text, and vice versa, while handling the heterogeneity gap among the modalities. The technique comprises two stages: (1) individual modality feature learning; and (2) fusion of the two trained networks. In the first stage, two self-organizing maps (SOMs) are trained separately on images and texts, which are clustered in the respective SOMs by similarity. In the second (fusion) stage, the trained SOMs are integrated via an associative network to enable cross-modal retrieval. The learning rules underlying the associative network are Hebbian learning and Oja's rule (an improved, normalized form of Hebbian learning). The framework can annotate images with keywords and illustrate keywords with images, and it can be extended to incorporate more diverse modalities. Extensive experiments were performed on real gastrointestinal images, each with collateral keywords, obtained from an experienced gastroenterologist. The results demonstrate the efficacy of the algorithm and its value in supporting quick and pertinent decision making by gastroenterologists.
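Oja's rule, the normalized Hebbian update used in such associative networks, fits in a few lines. The weight and activity names below are generic, not taken from the article:

```python
import numpy as np

def oja_update(w, x, lr=0.1):
    """One step of Oja's rule: Hebbian strengthening y*x with a decay
    term -y^2*w that keeps the weight norm bounded, avoiding the
    unbounded growth of plain Hebbian learning."""
    y = w @ x                      # post-synaptic activity
    return w + lr * y * (x - y * w)
```

Iterating this update over co-activated units of the two SOMs strengthens associations between image clusters and text clusters that fire together, which is what enables retrieval in either direction.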
Image Quality Assessment for Population Cardiac MRI: From Detection to Synthesis
Cardiac magnetic resonance (CMR) images play a growing role in diagnostic imaging of cardiovascular diseases. Left ventricular (LV) cardiac anatomy and function are widely used in cardiology for diagnosis, for monitoring disease progression, and for assessing a patient's response to cardiac surgery and interventional procedures. For population imaging studies, CMR is arguably the most comprehensive modality for non-invasive and non-ionising imaging of the heart and great vessels and, hence, the best suited for population imaging cohorts. Due to insufficient radiographer experience in planning a scan, natural cardiac muscle contraction, breathing motion, and imperfect triggering, CMR can display incomplete LV coverage, which hampers quantitative LV characterisation and diagnostic accuracy.
To tackle this limitation and enhance the accuracy and robustness of automated cardiac volume and functional assessment, this thesis focuses on the development and application of state-of-the-art deep learning (DL) techniques in cardiac imaging. Specifically, we propose new image feature representations that are learnt with DL models and aimed at highlighting CMR image quality across datasets. These representations are also intended to estimate CMR image quality for better interpretation and analysis. Moreover, we investigate how quantitative analysis can benefit when these learnt representations are used in image synthesis.
Specifically, a 3D Fisher discriminative representation is introduced to identify CMR image quality in the UK Biobank cardiac data. Additionally, a novel adversarial learning (AL) framework is introduced for cross-dataset CMR image quality assessment, and we show that the common representations learnt by AL are useful and informative for cross-dataset CMR image analysis. Moreover, we utilise dataset-invariant (DI) representations for CMR volume interpolation by introducing a novel generative adversarial network (GAN)-based image synthesis framework, which enhances CMR image quality across datasets.
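A Fisher discriminative representation rests on the Fisher criterion: between-class separation relative to within-class scatter. A minimal 1-D sketch (heavily simplified from the thesis' 3D setting; class and variable names are illustrative):

```python
import numpy as np

def fisher_ratio(feats_good, feats_bad):
    """Fisher discriminant criterion for 1-D features: squared distance
    between class means divided by the summed within-class variances.
    A large ratio means the feature separates the two quality classes
    well; a toy stand-in for the thesis' 3D representation."""
    m_g, m_b = feats_good.mean(), feats_bad.mean()
    v_g, v_b = feats_good.var(), feats_bad.var()
    return (m_g - m_b) ** 2 / (v_g + v_b + 1e-9)
```

Features with a high Fisher ratio between full-coverage and incomplete-coverage scans are the ones worth keeping for a quality classifier.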
- …