13 research outputs found

A Comprehensive Study of ImageNet Pre-Training for Historical Document Image Analysis

Author: Alberti Michele, Fischer Andreas, Goktepe Pinar, Ingold Rolf, Kolonko Thomas, Liwicki Marcus, Pondenkandath Vinaychandran, Studer Linda
Publication date: 22/05/2019

Automatic analysis of scanned historical documents comprises a wide range of image analysis tasks, which are often challenging for machine learning due to a lack of human-annotated learning samples. With the advent of deep neural networks, a promising way to cope with the lack of training data is to pre-train models on images from a different domain and then fine-tune them on historical documents. In the current research, a typical example of such cross-domain transfer learning is the use of neural networks that have been pre-trained on the ImageNet database for object recognition. It remains a mostly open question whether or not this pre-training helps to analyse historical documents, which have fundamentally different image properties when compared with ImageNet. In this paper, we present a comprehensive empirical survey on the effect of ImageNet pre-training for diverse historical document analysis tasks, including character recognition, style classification, manuscript dating, semantic segmentation, and content-based retrieval. While we obtain mixed results for semantic segmentation at pixel-level, we observe a clear trend across different network architectures that ImageNet pre-training has a positive effect on classification as well as content-based retrieval.

arXiv.org e-Print Archive

Crossref

Hes-so: ArODES Open Archive (University of Applied Sciences and Arts Western Switzerland / Haute école spécialisée de Suisse occidentale / FH Westschweiz)

Attention-Based Deep Learning Framework for Human Activity Recognition with User Adaptation

Author: Buffelli Davide
Vandin Fabio
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2021

Sensor-based human activity recognition (HAR) requires predicting the action of a person based on sensor-generated time series data. HAR has attracted major interest in the past few years, thanks to the large number of applications enabled by modern ubiquitous computing devices. While several techniques based on hand-crafted feature engineering have been proposed, the current state-of-the-art is represented by deep learning architectures that automatically obtain high-level representations and that use recurrent neural networks (RNNs) to extract temporal dependencies in the input. RNNs have several limitations, in particular in dealing with long-term dependencies. We propose a novel deep learning framework based on a purely attention-based mechanism that overcomes the limitations of the state-of-the-art. We show that our proposed attention-based architecture is considerably more powerful than previous approaches, with an average increment of more than 7% on the F1 score over the previous best performing model. Furthermore, we consider the problem of personalizing HAR deep learning models, which is of great importance in several applications. We propose a simple and effective transfer-learning based strategy to adapt a model to a specific user, providing an average increment of 6% on the F1 score on the predictions for that user. Our extensive experimental evaluation proves the significantly superior capabilities of our proposed framework over the current state-of-the-art and the effectiveness of our user adaptation technique. Comment: Accepted for publication in the IEEE Sensors Journal

arXiv.org e-Print Archive

Archivio istituzionale della ricerca - Università di Padova

Cross-modal data retrieval and generation using deep neural networks

Author: Udaiyar Premkumar
Publication venue: RIT Scholar Works
Publication date: 01/02/2020

The exponential growth of deep learning has helped solve problems across different fields of study. Convolutional neural networks have become a go-to tool for extracting features from images. Similarly, variations of recurrent neural networks such as Long Short-Term Memory and Gated Recurrent Unit architectures do a good job of extracting useful information from temporal data such as text and time series data. Although these networks are good at extracting features for a particular modality, learning features across multiple modalities is still a challenging task. In this work, we develop a generative common vector space model in which similar concepts from different modalities are brought closer in a common latent space representation while dissimilar concepts are pushed far apart in this same space. The developed model not only aims at solving the cross-modal retrieval problem but also uses the vector generated by the common vector space model to generate real-looking data. This work mainly focuses on image and text modalities. However, it can be extended to other modalities as well. We train and evaluate the performance of the model on the Caltech CUB and Oxford-102 datasets.

RIT Scholar Works

Outlier Detection in Wearable Sensor Data for Human Activity Recognition (HAR) Based on DRNNs

Author: Muñoz Organero Mario
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 05/06/2019

Wearable sensors provide a user-friendly and non-intrusive mechanism to extract user-related data that paves the way to the development of personalized applications. Within those applications, human activity recognition (HAR) plays an important role in the characterization of the user context. Outlier detection methods focus on finding anomalous data samples that are likely to have been generated by a different mechanism. This paper combines outlier detection and HAR by introducing a novel algorithm that is able both to detect information from secondary activities inside the main activity and to extract data segments of a particular sub-activity from a different activity. Several machine learning algorithms have been previously used in the area of HAR based on the analysis of the time sequences generated by wearable sensors. Deep recurrent neural networks (DRNNs) have proven to be optimally adapted to the sequential characteristics of wearable sensor data in previous studies. A DRNN-based algorithm is proposed in this paper for outlier detection in HAR. The results are validated both for intra- and inter-subject cases and both for outlier detection and sub-activity recognition using two different datasets. A first dataset comprising 4 major activities (walking, running, climbing up, and down) from 15 users is used to train and validate the proposal. Intra-subject outlier detection is able to detect all major outliers in the walking activity in this dataset, while inter-subject outlier detection only fails for one participant executing the activity in a peculiar way. Sub-activity detection has been validated by finding out and extracting walking segments present in the other three activities in this dataset. A second dataset using four different users, a different setting and different sensor devices is used to assess the generalization of results. This work was supported by the “ANALYTICS USING SENSOR DATA FOR FLATCITY” Project (MINECO/ERDF, EU) funded in part by the Spanish Agencia Estatal de Investigación (AEI) under Grant TIN2016-77158-C4-1-R and in part by the European Regional Development Fund (ERDF).

Universidad Carlos III de Madrid e-Archivo

Uma abordagem de aprendizagem profunda para identificação de pessoas através do som dos passos [A deep learning approach to identifying people through the sound of their footsteps]

Author: Costa Leonardo Rezende
Publication venue: FAI-UFSCar
Publication date: 01/01/2018

With the advent of pervasive computing, technology has become part of daily human life. Thus, in the construction of “intelligent” environments, systems are used that are linked to the routine of the individuals who live there. One of the main needs is therefore the recognition of the individual who lives in the space. The objective of this work is to identify individuals in an intelligent environment through information obtained from the sound of their steps. To this end, different configurations of deep learning neural networks were developed and tested in order to classify the people who participated in an experiment conducted in Carvalho and Rosa (2010). The proposed neural network architecture is composed of a chain of neural networks, and its results reached up to 98.57% accuracy.

Repositório Institucional da Universidade Federal do Tocantins

Transforming sensor data to the image domain for deep learning — An application to footstep detection

Publication venue: Institute of Electrical and Electronics Engineers (IEEE)

Crossref