
139 research outputs found

Finding Semantically Related Videos in Closed Collections

Author: A Argyriou, CGM Snoek, Christos Tzelepis, DG Lowe, F Markatopoulou, G Csurka, Herbert Bay, Jia Deng, KEA Sande Van de, LE Sucar, M Baumgartner, MF Weng, Nikiforos Pittaras, O Russakovsky, P Dollár, P. Sidiropoulos, V Ferrari, X Wang, X Zhao, Y Wei, Y Yang, Zhanpeng Zhang
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2019

Modern newsroom tools offer advanced functionality for automatic and semi-automatic collection of content from the web and social media to accompany news stories. However, content collected in this way is often unstructured and may include irrelevant items. An important step in the verification process is to organize this content, both with respect to what it shows and with respect to its origin. This chapter presents our efforts in this direction, which resulted in two components. The first detects semantic concepts in video shots, to help annotate and organize content collections. We implement a system based on deep learning, featuring a number of advances and adaptations of existing algorithms that increase performance on this task. The second component detects logos in videos in order to identify their provenance. We present our progress from a keypoint-based detection system to a system based on deep learning.

Crossref

Queen Mary Research Online

Confocal Laser Endomicroscopy Image Analysis with Deep Convolutional Neural Networks

Publication date: 01/01/2019

Rapid intraoperative diagnosis of brain tumors is of great importance for planning treatment and guiding the surgeon about the extent of resection. Currently, the standard for preliminary intraoperative tissue analysis is frozen section biopsy, which has major limitations such as tissue freezing and cutting artifacts, sampling errors, lack of immediate interaction between the pathologist and the surgeon, and being time consuming. Handheld, portable confocal laser endomicroscopy (CLE) is being explored in neurosurgery for its ability to image histopathological features of tissue at cellular resolution in real time during brain tumor surgery. Over the course of examining the surgical tumor resection, hundreds to thousands of images may be collected. This high number of images demands significant time and storage for subsequent review, which has motivated several research groups to employ deep convolutional neural networks (DCNNs) to improve the utility of CLE during surgery. DCNNs have proven useful in natural and medical image analysis tasks such as classification, object detection, and image segmentation. This thesis proposes using DCNNs for analyzing CLE images of brain tumors. In particular, it explores the practicality of DCNNs in three main tasks. First, off-the-shelf DCNNs were used to classify images as diagnostic or non-diagnostic. Further experiments showed that both ensemble modeling and transfer learning improved the classifier's accuracy in evaluating the diagnostic quality of new images at the test stage. Second, a weakly-supervised learning pipeline was developed for localizing key features of diagnostic CLE images from gliomas. Third, image style transfer was used to improve the diagnostic quality of CLE images from glioma tumors by transforming the histology patterns in CLE images of fluorescein sodium-stained tissue into those seen in conventional hematoxylin and eosin-stained tissue slides.
These studies suggest that DCNNs are well suited to the analysis of CLE images. They may assist surgeons by sorting out non-diagnostic images, highlighting key regions, and enhancing their appearance through pattern transformation in real time. With recent advances in deep learning such as generative adversarial networks and semi-supervised learning, new research directions need to be followed to uncover further potential of DCNNs in CLE image analysis.
Dissertation/Thesis: Doctoral Dissertation, Neuroscience 201

ASU Digital Repository

Understanding the Computational Processes of the Human Brain through the Interpretation of Machine Learning Models. A Data-Driven Approach to Computational Neuroscience

Author: Kuzovkin Ilya
Publication date: 09/07/2020

Building a model of a complex phenomenon is an ancient way of gaining knowledge and understanding of the reality around us. Models of planetary motion, gravity, and particle physics are examples of the success of this approach. In neuroscience, there are two ways of coming up with explanations of reality: a traditional hypothesis-driven approach, where a model is first formulated and then tested using the data, and a more recent data-driven approach, which relies on machine learning to generate models automatically. The hypothesis-driven approach provides a full understanding of the model, but is time-consuming, as each model has to be conceived and tested manually. The data-driven approach requires only data and computational resources to sift through potential models, saving time, but leaves the resulting model a black box. Given the growing amount of neural data, we argue in favor of a more widespread adoption of the data-driven approach, reallocating part of the human effort from manual modeling to interpreting how the models work. The thesis is based on three examples of how the interpretation of machine-learned models leads to neuroscientific insights on three different levels of neural organization. Our first interpretable model is used to characterize the neural dynamics of localized neural activity during the task of visual perceptual categorization. Next, we compare the activity of the human visual system with the activity of a convolutional neural network, revealing explanations about the functional organization of the human visual cortex. Lastly, we use dimensionality reduction and visualization techniques to understand the relative organization of mental concepts within a subject's mental state space and apply it in the context of brain-computer interfaces. Recent results in neuroscience and AI show similarities between the mechanisms of both systems.
This fact endorses the relevance of our approach: interpreting the mechanisms employed by machine learning models can shed light on the mechanisms employed by our brain.
https://www.ester.ee/record=b536057

DSpace at Tartu University Library

Locomotion Traces Data Mining for Supporting Frail People with Cognitive Impairment

Author: ZOLFAGHARI SAMANEH
Publication venue: Università degli Studi di Cagliari
Publication date: 28/04/2023

The rapid increase in the senior population is posing serious challenges to national healthcare systems. Hence, innovative tools are needed to detect health issues early, including cognitive decline. Several clinical studies show that it is possible to identify cognitive impairment based on the locomotion patterns of older people. Thus, this thesis first provides a systematic literature review of locomotion data mining systems for supporting the diagnosis of Neuro-Degenerative Diseases (NDD), identifying locomotion anomaly indicators and movement patterns for discovering low-level locomotion indicators, sensor data acquisition and processing methods, and NDD detection algorithms, considering their pros and cons. Then, we investigate the use of sensor data and Deep Learning (DL) to recognize abnormal movement patterns in instrumented smart-homes. To remove the noise introduced by indoor constraints and activity execution, we introduce novel visual feature extraction methods for locomotion data. Our solutions rely on segmentation of locomotion traces, image-based extraction of salient features from locomotion segments, and vision-based DL. Furthermore, we propose a data augmentation strategy to increase the volume of collected data and to generalize the solution to smart-homes with different layouts. We carried out extensive experiments with a large real-world dataset acquired in a smart-home test-bed from older people, including people with cognitive diseases. Experimental comparisons show that our system outperforms state-of-the-art methods.

Archivio istituzionale della ricerca - Università di Cagliari

VISUAL SALIENCY ANALYSIS, PREDICTION, AND VISUALIZATION: A DEEP LEARNING PERSPECTIVE

Author: Mahdi Ali Majeed
Publication venue: OpenSIUC
Publication date: 01/08/2019

In recent years, great progress has been made in predicting human eye fixations. Several studies have employed deep learning to predict human eye fixations with high accuracy. These studies rely on deep networks pre-trained for object classification. They exploit deep learning either as a transfer-learning problem or by using the weights of the pre-trained network as the initialization for learning a saliency model. Such pre-trained neural networks are used because the datasets of human fixations available for training a deep learning model are relatively small. Another, less prioritized problem is that the amount of computation such deep learning models require calls for expensive hardware. In this dissertation, two approaches are proposed to tackle the abovementioned problems. The first approach, codenamed DeepFeat, incorporates the deep features of convolutional neural networks pre-trained for object and scene classification. It is the first approach that uses deep features without further learning. The performance of the DeepFeat model is extensively evaluated over a variety of datasets using a variety of implementations. The second approach is a deep learning saliency model, codenamed ClassNet. Two main differences separate ClassNet from other deep learning saliency models. First, ClassNet is the only deep learning saliency model that learns its weights from scratch. Second, ClassNet treats the prediction of human fixations as a classification problem, while other deep learning saliency models treat it as a regression problem or as a classification of a regression problem.

OpenSIUC

Science of Facial Attractiveness

Author: Ishizu T
Publication date: 01/09/2019

UCL Discovery

Varieties of Attractiveness and their Brain Responses

Author: Ishizu T
Publication date: 01/09/2019

UCL Discovery

Deep Learning Architectures for Heterogeneous Face Recognition

Author: Iranmanesh Seyed Mehdi
Publication venue: The Research Repository @ WVU
Publication date: 01/01/2021

Face recognition has been one of the most challenging areas of research in biometrics and computer vision. Many face recognition algorithms are designed to address illumination and pose problems for visible face images. In recent years, there has been a significant amount of research in Heterogeneous Face Recognition (HFR). The large modality gap between faces captured in different spectra, as well as the lack of training data, makes HFR quite a challenging problem. In this work, we present different deep learning frameworks to address the problem of matching non-visible face photos against a gallery of visible faces. Algorithms for thermal-to-visible face recognition can be categorized as cross-spectrum feature-based methods or cross-spectrum image synthesis methods. In cross-spectrum feature-based face recognition, a thermal probe is matched in a feature subspace against a gallery of visible faces, which corresponds to the real-world scenario. The second category synthesizes a visible-like image from a thermal image, which can then be used by any commercial visible-spectrum face recognition system. These methods are also beneficial in the sense that the synthesized visible face image can be directly utilized by existing face recognition systems that operate only on visible face imagery. Therefore, using this approach one can leverage existing commercial-off-the-shelf (COTS) and government-off-the-shelf (GOTS) solutions. In addition, the synthesized images can be used by human examiners for different purposes. There are some informative traits, such as age, gender, ethnicity, race, and hair color, which are not distinctive enough for recognition on their own, but can still act as complementary information to other primary information, such as face and fingerprint. These traits, which are known as soft biometrics, can improve recognition algorithms while being much cheaper and faster to acquire.
They can be directly used in a unimodal system for some applications. Usually, soft biometric traits have been utilized jointly with hard biometrics (face photo) for different tasks, in the sense that they are considered to be available during both the training and testing phases. In our approaches we look at this problem in a different way. We consider the case in which soft biometric information is not available during the testing phase, and our method can predict it directly in a multi-tasking paradigm. There are situations in which training data might come equipped with additional information that can be modeled as an auxiliary view of the data, and that unfortunately is not available during testing. This is the LUPI scenario. We introduce a novel framework based on deep learning techniques that leverages the auxiliary view to improve the performance of the recognition system. We do so by introducing a formulation that is general, in the sense that it can be used with any visual classifier. Every use of auxiliary information has been validated extensively using publicly available benchmark datasets, and several new state-of-the-art accuracy values have been set. Examples of application domains include visual object recognition from RGB images and from depth data, handwritten digit recognition, and gesture recognition from video. We also design a novel aggregation framework that optimizes the landmark locations directly using only one image, without requiring any extra prior, which leads to robust alignment given arbitrary face deformations. Three different approaches are employed to generate the manipulated faces, and two of them perform the manipulation via adversarial attacks to fool a face recognizer. This step can be decoupled from our framework and potentially be used to enhance other landmark detectors. Aggregation of the manipulated faces in different branches of the proposed method leads to robust landmark detection.
Finally, we focus on generative adversarial networks, which are a very powerful tool for synthesizing visible-like images from non-visible images. The main goal of a generative model is to approximate the true data distribution, which is not known. In general, the choice of how to model the density function is challenging. Explicit models have the advantage of explicitly calculating the probability densities. There are two well-known implicit approaches, namely the Generative Adversarial Network (GAN) and the Variational AutoEncoder (VAE), which try to model the data distribution implicitly. VAEs try to maximize a lower bound on the data likelihood, while a GAN performs a minimax game between two players during its optimization. GANs overlook the explicit data density characteristics, which leads to undesirable quantitative evaluations and mode collapse. This causes the generator to create similar-looking images with poor sample diversity. In the last chapter of the thesis, we focus on addressing this issue in the GAN framework.
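For reference, the two training objectives contrasted in this abstract are usually written as below. This is the standard textbook formulation and not necessarily the exact variant used in the thesis; the symbols (generator G, discriminator D, encoder q_phi, decoder p_theta, prior p(z)) are assumed notation that does not appear in the abstract itself.

```latex
% GAN: minimax game between generator G and discriminator D
\min_{G}\,\max_{D}\; V(D, G) =
  \mathbb{E}_{x \sim p_{\mathrm{data}}(x)}\bigl[\log D(x)\bigr]
  + \mathbb{E}_{z \sim p_{z}(z)}\bigl[\log\bigl(1 - D(G(z))\bigr)\bigr]

% VAE: maximize a lower bound on the data log-likelihood
\log p_{\theta}(x) \;\ge\;
  \mathbb{E}_{q_{\phi}(z \mid x)}\bigl[\log p_{\theta}(x \mid z)\bigr]
  - \mathrm{KL}\bigl(q_{\phi}(z \mid x)\,\big\|\,p(z)\bigr)
```

The first expression never evaluates a density of the generated data, which is the sense in which the abstract says GANs overlook explicit density characteristics.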

The Research Repository @ WVU (West Virginia University)

Efficient Data Driven Multi Source Fusion

Author: Islam Muhammad Aminul
Publication venue: Scholars Junction
Publication date: 10/08/2018

Data/information fusion is an integral component of many existing and emerging applications, e.g., remote sensing, smart cars, Internet of Things (IoT), and Big Data, to name a few. While fusion aims to achieve better results than any one individual input can provide, the challenge is often to determine the underlying mathematics for aggregation suitable for an application. In this dissertation, I focus on the following three aspects of aggregation: (i) efficient data-driven learning and optimization, (ii) extensions and new aggregation methods, and (iii) feature and decision level fusion for machine learning with applications to signal and image processing. The Choquet integral (ChI), a powerful nonlinear aggregation operator, is a parametric way (with respect to the fuzzy measure (FM)) to generate a wealth of aggregation operators. The FM has 2^N variables and N(2^(N − 1)) constraints for N inputs. As a result, learning the ChI parameters from data quickly becomes impractical for most applications. Herein, I propose a scalable learning procedure (which is linear with respect to training sample size) for the ChI that identifies and optimizes only data-supported variables. As such, the computational complexity of the learning algorithm is proportional to the complexity of the solver used. This method also includes an imputation framework to obtain scalar values for data-unsupported (aka missing) variables and a compression algorithm (lossy or lossless) for the learned variables. I also propose a genetic algorithm (GA) to optimize the ChI for non-convex, multi-modal, and/or analytical objective functions. This algorithm introduces two operators that automatically preserve the constraints; therefore, there is no need to explicitly enforce the constraints, as is required by traditional GAs. In addition, this algorithm provides an efficient representation of the search space with the minimal set of vertices.
Furthermore, I study different strategies for extending the fuzzy integral to missing data, and I propose a goal programming framework to aggregate inputs from heterogeneous sources for ChI learning. Last, my work in remote sensing involves band group selection based on visual clustering and feature level fusion based on Lp-norm multiple kernel learning in hyperspectral image processing to enhance pixel level classification.
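To make the aggregation operator in this abstract concrete, the discrete Choquet integral can be sketched as follows. This is a minimal sketch, assuming the fuzzy measure is stored as a dictionary keyed by subsets of source names; the function names `choquet_integral` and `monotonicity_pairs`, the source labels, and the example values are hypothetical and not taken from the dissertation, and the scalable learning procedure it proposes is not reproduced here.

```python
from itertools import combinations


def choquet_integral(inputs, measure):
    """Discrete Choquet integral of `inputs` w.r.t. a fuzzy measure.

    inputs  : dict mapping source name -> observed value
    measure : dict mapping frozenset of source names -> g(A), where
              g(empty set) = 0 and g(all sources) = 1 (assumed layout)
    """
    # Visit sources from the largest input value to the smallest.
    order = sorted(inputs, key=inputs.get, reverse=True)
    total, previous_g, visited = 0.0, 0.0, set()
    for name in order:
        visited.add(name)
        g = measure[frozenset(visited)]
        # Each input is weighted by the measure gained when it joins the set.
        total += inputs[name] * (g - previous_g)
        previous_g = g
    return total


def monotonicity_pairs(sources):
    """All pairs (A, A plus one extra source); each needs g(A) <= g(A + s)."""
    pairs = []
    for size in range(len(sources)):
        for subset in combinations(sources, size):
            a = frozenset(subset)
            for s in sources:
                if s not in a:
                    pairs.append((a, a | {s}))
    return pairs


# Hypothetical example with three sources and an additive measure g(A) = |A| / 3.
sources = ["s1", "s2", "s3"]
all_subsets = [frozenset(c) for r in range(4) for c in combinations(sources, r)]
additive = {a: len(a) / 3 for a in all_subsets}
values = {"s1": 0.9, "s2": 0.6, "s3": 0.3}
result = choquet_integral(values, additive)
```

With the additive measure above the integral reduces to the arithmetic mean of the inputs; a measure that is 1 on every non-empty subset gives the maximum, and one that is 1 only on the full set gives the minimum. Counting the pairs returned by `monotonicity_pairs` shows how quickly the constraint set grows with the number of inputs, which is the scaling problem the abstract describes.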

Mississippi State University Libraries ETD database

Scholars Junction - Mississippi State University Institutional Repository