533 research outputs found

    Object Recognition

    Vision-based object recognition tasks are a familiar part of everyday activities, such as keeping a car in the correct lane while driving. We perform these tasks effortlessly and in real time. In recent decades, with advances in computer technology, researchers and application developers have been trying to mimic the human capability for visual recognition. Such a capability would allow machines to free humans from boring or dangerous jobs.

    A novel diffusion tensor imaging-based computer-aided diagnostic system for early diagnosis of autism.

    Autism spectrum disorders (ASDs) represent a significant and growing public health concern. Currently, one in 68 children in the United States has been diagnosed with an ASD, and most children are diagnosed after the age of four, even though ASDs can be identified as early as age two. The ultimate goal of this thesis is to develop a computer-aided diagnosis (CAD) system for the accurate and early diagnosis of ASDs using diffusion tensor imaging (DTI). This CAD system consists of three main steps. First, the brain tissues are segmented based on three image descriptors: a visual appearance model that can model a high-dimensional feature space, a shape model that is adapted during the segmentation process using first- and second-order visual appearance features, and a spatially invariant second-order homogeneity descriptor. Second, discriminatory features are extracted from the segmented brains: cortex shape variability is assessed using shape construction methods, and white matter integrity is further examined through connectivity analysis. Finally, the diagnostic capabilities of these extracted features are investigated. The accuracy of the presented CAD system has been tested on 25 infants at high risk of developing ASDs. The preliminary diagnostic results are promising for distinguishing autistic from control subjects.

    Quantitative electron microscopy for microstructural characterisation

    Development of materials for high-performance applications requires accurate and useful analysis tools. In parallel with advances in electron microscopy hardware, we require analysis approaches to better understand microstructural behaviour. Such improvements in characterisation capability permit informed alloy design. New approaches to the characterisation of metallic materials are presented, primarily using signals collected from electron microscopy experiments. Electron backscatter diffraction is regularly used to investigate crystallography in the scanning electron microscope, and is combined with energy-dispersive X-ray spectroscopy to simultaneously investigate chemistry. New algorithms and analysis pipelines are developed to permit accurate and routine microstructural evaluation, leveraging a variety of machine learning approaches. This thesis investigates the structure and behaviour of Co/Ni-base superalloys derived from V208C. Use of the presently developed techniques permits informed development of a new generation of advanced gas turbine engine materials.

    Mineral identification using data-mining in hyperspectral infrared imagery

    The geological applications of hyperspectral infrared imagery mainly consist of mineral identification, mapping, and core logging, with acquisitions usually made in situ using airborne sensors or portable instruments. Finding indicator minerals offers considerable benefits for mineralogy and mineral exploration, which usually relies on portable instruments and core logging. Faster, more automated systems would increase the precision with which indicator minerals are identified and reduce misclassification. The objective of this thesis was therefore to create a tool that uses hyperspectral infrared imagery, processed through image analysis and machine learning methods, to identify small mineral grains used as mineral indicators. Such a system could assist geological analysis and mineral exploration in a range of settings. The experiments were conducted under laboratory conditions in the long-wave infrared (LWIR, 7.7 µm to 11.8 µm), with an LWIR macro lens (to improve spatial resolution), an Infragold plate, and a heating source. The process began with a method for continuum removal: Non-negative Matrix Factorization (NMF) is applied to extract a rank-1 factorization and estimate the down-welling radiance, and the result is then compared with other conventional methods. The results indicate successful suppression of the continuum from the spectra, which allows the spectra to be compared with spectral libraries.
Next, to automate the system, supervised and unsupervised approaches were tested for identifying pyrope, olivine, and quartz grains. The unsupervised approach proved more suitable because it does not depend on a training stage. Two algorithms were then tested for creating False Colour Composites (FCC) using a clustering approach. This comparison indicated significant computational efficiency (more than 20 times faster) and promising performance for mineral identification. Finally, the reliability of automated LWIR hyperspectral mineral identification was tested, confirming the difficulty of identifying irregular grain surfaces and mineral aggregates. The results were compared quantitatively against two ground truths (GT): rigid-GT (manual labelling of regions) and observed-GT (manual labelling of pixels). Accuracy was up to 1.5 times higher with observed-GT than with rigid-GT. The samples were also examined by micro X-ray fluorescence (XRF) and scanning electron microscopy (SEM) to retrieve information on the mineral aggregates and grain surfaces (biotite, epidote, goethite, diopside, smithsonite, tourmaline, kyanite, scheelite, pyrope, olivine, and quartz). The XRF imagery was compared with automatic mineral identification techniques using ArcGIS; it showed promising performance for automatic identification and was used for GT validation. Overall, the four methods in this thesis (1. continuum removal; 2. classification or clustering for mineral identification; 3. two algorithms for clustering mineral spectra; 4. reliability verification) are useful methodologies for identifying minerals. They are non-destructive, relatively accurate, and computationally inexpensive, and could be used to identify and assess mineral grains in the laboratory or in the field.
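
The rank-1 NMF step mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not give the thesis's actual procedure, so everything here is an assumption: the data layout (rows as pixels, columns as spectral bands), the function name `rank1_nmf`, and the use of alternating updates. How the resulting factor is mapped to down-welling radiance is not described in the abstract and is not attempted here.

```python
def rank1_nmf(V, iters=100, eps=1e-12):
    """Approximate a non-negative matrix V by the outer product w * h^T.

    V is a list of rows (hypothetically: one row per pixel, one column
    per spectral band). For rank 1, the multiplicative NMF updates reduce
    to the alternating updates below, which keep w and h non-negative
    whenever V is non-negative.
    """
    n_rows, n_cols = len(V), len(V[0])
    w = [1.0] * n_rows  # per-row (per-pixel) scale
    h = [1.0] * n_cols  # shared spectral shape
    for _ in range(iters):
        ww = sum(x * x for x in w) + eps
        h = [sum(w[i] * V[i][j] for i in range(n_rows)) / ww
             for j in range(n_cols)]
        hh = sum(x * x for x in h) + eps
        w = [sum(V[i][j] * h[j] for j in range(n_cols)) / hh
             for i in range(n_rows)]
    return w, h


# Toy example: three "pixels" sharing one spectral shape at different scales.
shape = [2.0, 0.5, 4.0]
scales = [1.0, 2.0, 3.0]
V = [[s * b for b in shape] for s in scales]
w, h = rank1_nmf(V)
```

In this toy case the matrix is exactly rank 1, so the product of `w` and `h` reproduces `V` up to numerical tolerance; real spectra would leave a residual that carries the band-specific features.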

    Deep Transfer Learning for Automatic Speech Recognition: Towards Better Generalization

    Automatic speech recognition (ASR) has recently become an important challenge for deep learning (DL). It requires large-scale training datasets and substantial computational and storage resources. Moreover, DL techniques, and machine learning (ML) approaches in general, assume that training and testing data come from the same domain, with the same input feature space and data distribution characteristics. This assumption, however, does not hold in some real-world artificial intelligence (AI) applications. There are also situations where real data are challenging or expensive to gather, or occur only rarely, and so cannot meet the data requirements of DL models. Deep transfer learning (DTL) has been introduced to overcome these issues; it helps develop high-performing models using real datasets that are small, or slightly different from but related to the training data. This paper presents a comprehensive survey of DTL-based ASR frameworks to shed light on the latest developments and to help academics and professionals understand current challenges. Specifically, after presenting the DTL background, a well-designed taxonomy is adopted to describe the state of the art. A critical analysis is then conducted to identify the limitations and advantages of each framework. Finally, a comparative study is introduced to highlight the current challenges before deriving opportunities for future research.

    Geometry- and Accuracy-Preserving Random Forest Proximities with Applications

    Many machine learning algorithms use calculated distances or similarities between data observations to make predictions, cluster similar data, visualize patterns, or generally explore the data. Most distances or similarity measures do not incorporate known data labels and are thus considered unsupervised. Supervised methods for measuring distance exist which incorporate data labels and thereby exaggerate separation between data points of different classes. This approach tends to distort the natural structure of the data. Instead of following similar approaches, we leverage a popular algorithm used for making data-driven predictions, known as random forests, to naturally incorporate data labels into similarity measures known as random forest proximities. In this dissertation, we explore previously defined random forest proximities and demonstrate their weaknesses in popular proximity-based applications. Additionally, we develop a new proximity definition that can be used to recreate the random forest's predictions. We call these Random Forest-Geometry- and Accuracy-Preserving proximities, or RF-GAP. We show by proof and empirical demonstration that RF-GAP proximities can be used to perfectly reconstruct the random forest's predictions and, as a result, we argue that they provide a truer representation of the random forest's learning when used in proximity-based applications. We provide evidence to suggest that RF-GAP proximities improve applications including imputing missing data, detecting outliers, and visualizing the data. We also introduce a new random forest proximity-based technique for generating 2- or 3-dimensional data representations that can be used as a tool to visually explore the data. We show that this method does well at portraying the relationship between data variables and the data labels, and we show quantitatively and qualitatively that it surpasses other existing methods for this task.
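
For context, the "previously defined" proximities that this abstract contrasts with RF-GAP can be sketched as follows. This is the classic leaf co-occurrence proximity, not the RF-GAP definition, which the abstract does not spell out. The function name `leaf_proximities` and the input layout (one row of leaf indices per sample, one column per tree, for example as returned by a forest's leaf-assignment routine) are assumptions made for illustration.

```python
def leaf_proximities(leaves):
    """Classic random forest proximity from leaf assignments.

    leaves[i][t] is the index of the leaf that sample i falls into in
    tree t. The proximity of two samples is the fraction of trees in
    which they share a leaf.
    """
    n = len(leaves)
    n_trees = len(leaves[0])
    prox = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            shared = sum(1 for a, b in zip(leaves[i], leaves[j]) if a == b)
            prox[i][j] = prox[j][i] = shared / n_trees
    return prox


# Hypothetical leaf assignments for 3 samples in a 3-tree forest.
leaves = [
    [0, 1, 2],
    [0, 1, 3],
    [4, 5, 6],
]
prox = leaf_proximities(leaves)
```

Here the first two samples share a leaf in two of the three trees, while the third sample never shares a leaf with either. Because labels influence where the trees split, samples of the same class tend to co-occur in leaves, which is how the labels enter the similarity measure.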

    Trennung und Schätzung der Anzahl von Audiosignalquellen mit Zeit- und Frequenzüberlappung (Separation and Count Estimation of Audio Signal Sources with Time and Frequency Overlap)

    Everyday audio recordings involve mixture signals: music contains a mixture of instruments; in a meeting or conference, there is a mixture of human voices. For these mixtures, automatically separating the sources or estimating their number is a challenging task. A common assumption when processing mixtures in the time-frequency domain is that sources do not fully overlap. In this work, however, we consider cases where the overlap is severe, for instance when instruments play the same note (unison) or when many people speak concurrently (the "cocktail party" scenario), which highlights the need for new representations and more powerful models. To address the problems of source separation and count estimation, we use conventional signal processing techniques as well as deep neural networks (DNNs). We first address the source separation problem for unison instrument mixtures, studying the distinct spectro-temporal modulations caused by vibrato. To exploit these modulations, we developed a method based on time warping, informed by an estimate of the fundamental frequency. For cases where such estimates are not available, we present an unsupervised model inspired by the way humans group time-varying sources (common fate). This contribution comes with a novel representation that improves separation for overlapped and modulated sources in unison mixtures and also improves vocal and accompaniment separation when used as input to a DNN model. We then focus on estimating the number of sources in a mixture, which is important for real-world scenarios. Our work on count estimation was motivated by a study of how humans address this task, which led us to conduct listening experiments confirming that humans can correctly estimate the number of sources only up to four. To answer the question of whether machines can perform similarly, we present a DNN architecture trained to estimate the number of concurrent speakers.
Our results show improvements compared to other methods, and the model even outperformed humans on the same task. In both the source separation and source count estimation tasks, the key contribution of this thesis is the concept of "modulation", which is important for computationally mimicking human performance. Our proposed Common Fate Transform is an adequate representation for disentangling overlapping signals for separation, and an inspection of our DNN count estimation model revealed that it also finds modulation-like intermediate features.