A Deep Learning Approach to Video Processing for Scene Recognition in Smart Office Environments
The field of computer vision, whose goal is to enable computer systems to interpret and understand image data, has seen great advances in recent years with the emergence of deep learning. Deep learning, a technique inspired by the information processing of the human brain, has come close to solving the problem of object recognition in image data. One of the next big challenges in computer vision is to allow computers to recognize not only objects but also activities. This study explores the capabilities of deep learning for the specific problem of activity recognition in office environments. The study used a re-labeled subset of the AMI Meeting Corpus video data set to comparatively evaluate the performance of different neural network models on this problem, and then evaluated the best-performing model on a novel data set of office activities captured in a research lab at Malmö University. The results showed that the best-performing model was a 3D convolutional neural network (3DCNN) that encodes temporal information in its third dimension; however, a recurrent convolutional network (RCNN) that uses a pre-trained VGG16 model to extract features and feeds them into a recurrent neural network with a unidirectional Long Short-Term Memory (LSTM) layer performed almost as well with the right configuration. An analysis of the results suggests that a 3DCNN's performance depends on the camera angle, specifically on how well movement is spatially distributed between the people in frame.
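The temporal third dimension that distinguishes the 3DCNN can be illustrated with a minimal sketch: a single "valid" 3D convolution over a (time, height, width) clip, so that each output value mixes information from several consecutive frames. This is a toy NumPy illustration, not the thesis's actual architecture; all names and sizes are hypothetical.

```python
import numpy as np

def conv3d_valid(clip, kernel):
    """Naive 'valid' 3D convolution over a (time, height, width) clip.

    Illustrative only: a real 3DCNN stacks many such kernels with
    non-linearities and pooling; this shows just the temporal mixing.
    """
    T, H, W = clip.shape
    t, h, w = kernel.shape
    out = np.zeros((T - t + 1, H - h + 1, W - w + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            for k in range(out.shape[2]):
                out[i, j, k] = np.sum(clip[i:i + t, j:j + h, k:k + w] * kernel)
    return out

# A 16-frame, 32x32 grayscale clip and a 3x3x3 kernel: the kernel spans
# three consecutive frames, so each output value mixes motion over time.
clip = np.random.rand(16, 32, 32)
kernel = np.ones((3, 3, 3)) / 27.0
features = conv3d_valid(clip, kernel)
print(features.shape)  # (14, 30, 30)
```

Because the kernel extends across frames, motion patterns (rather than single-frame appearance alone) can drive the learned features, which is what the abstract's "temporal information in the third dimension" refers to.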
PiEye in the Wild: Exploring Eye Contact Detection for Small Inexpensive Hardware
Eye contact detection sensors have the possibility of inferring user attention, which can be
utilized by a system in a multitude of different ways, including supporting human-computer
interaction and measuring human attention patterns. In this thesis we attempt to build
a versatile eye contact sensor using a Raspberry Pi that is suited for real-world practical
usage. In order to ensure practicality, we constructed a set of criteria for the system based
on previous implementations. To meet these criteria, we opted to use an appearance-based
machine learning method where we train a classifier with training images in order to infer
if users look at the camera or not. Our aim was to investigate how well we could detect
eye contact on the Raspberry Pi in terms of accuracy, speed, and range. After extensive
testing on combinations of four different feature extraction methods, we found that Linear
Discriminant Analysis compression of pixel data provided the best overall accuracy, but
Principal Component Analysis compression performed the best when tested on images
from the same dataset as the training data. When investigating the speed of the system,
we found that down-scaling input images had a huge effect on the speed, but also lowered
the accuracy and range. While we managed to mitigate the effect that scaling had on the
accuracy, the range of the system is still tied to the scale of the input images and, by
extension, to the speed.
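The appearance-based pipeline described above, compressing flattened pixel data and then classifying in the reduced space, can be sketched minimally. This NumPy illustration uses PCA compression and a nearest-centroid classifier on synthetic data; the thesis's actual feature extraction methods, classifier, and data are not reproduced here, and all names are hypothetical.

```python
import numpy as np

def pca_fit(X, n_components):
    """Fit PCA via SVD on mean-centred data; returns (mean, components)."""
    mean = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:n_components]

def pca_transform(X, mean, components):
    return (X - mean) @ components.T

# Synthetic stand-in for flattened face crops: two classes ("eye contact"
# vs. "no eye contact"), separated along a single pixel direction.
rng = np.random.default_rng(0)
n, d = 200, 64 * 64
X0 = rng.normal(0.0, 1.0, (n, d))
X1 = rng.normal(0.0, 1.0, (n, d))
X1[:, 0] += 5.0
X = np.vstack([X0, X1])
y = np.array([0] * n + [1] * n)

mean, comps = pca_fit(X, n_components=20)
Z = pca_transform(X, mean, comps)

# Nearest-centroid classification in the 20-dimensional compressed space.
centroids = np.array([Z[y == c].mean(axis=0) for c in (0, 1)])
pred = np.argmin(((Z[:, None, :] - centroids[None]) ** 2).sum(-1), axis=1)
print((pred == y).mean())
```

The point of the compression step is exactly the speed/accuracy trade-off the abstract discusses: classifying in 20 dimensions instead of 4096 is far cheaper on a Raspberry Pi, at the cost of discarding pixel information.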
An investigation of transfer learning for deep architectures in group activity recognition
Pervasive technologies permeating our immediate surroundings provide a wide variety of means for sensing and actuating in our environment, and have great potential to impact not only the way we live but also how we work. In this paper, we address the problem of activity recognition in office environments as a means of inferring contextual information in order to automatically and proactively assist people in their daily activities. To this end, we employ state-of-the-art image processing techniques and evaluate their capabilities in a real-world setup. Traditional machine learning assumes that the training and test data share the same distribution; when this is not the case, the performance of the learned model deteriorates. However, data is often expensive or difficult to collect and label, so it is important to develop techniques that make the best possible use of existing data sets from domains related to the target domain. To this end, we further investigate transfer learning techniques in deep learning architectures for the task of activity recognition in office settings. We provide a solution model that attains 94% accuracy under the right conditions.
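The core idea of transfer learning described above, reusing representations learned on a source domain and retraining only a small head on the scarce target-domain data, can be sketched minimally. Here the "pretrained backbone" is faked as a frozen random projection with a ReLU and the data is synthetic; in practice the backbone would be something like a VGG16 network trained on ImageNet. Nothing in this sketch is the paper's actual model.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical stand-in for a frozen, pretrained feature extractor: a fixed
# random projection plus ReLU. Its weights are never updated below.
d_in, d_feat = 4096, 128
W_frozen = rng.normal(size=(d_in, d_feat)) / np.sqrt(d_in)

def extract_features(x):
    return np.maximum(x @ W_frozen, 0.0)  # frozen backbone output

# Small labelled target-domain set (e.g. flattened office-activity frames).
n = 120
X = rng.normal(size=(n, d_in))
y = (rng.random(n) < 0.5).astype(float)
X[y == 1] += 0.3  # weak class-dependent signal in the raw input

# Transfer learning: keep the extractor fixed, train only a logistic head.
F = extract_features(X)
w = np.zeros(d_feat)
b = 0.0
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(F @ w + b)))  # predicted P(class 1)
    g = p - y                               # logistic-loss gradient signal
    w -= 0.1 * (F.T @ g) / n
    b -= 0.1 * g.mean()

acc = ((p > 0.5) == (y == 1)).mean()
print(acc)  # training accuracy of the retrained head
```

Training only the head means the target-domain data has to fit far fewer parameters than retraining the whole network, which is why the approach helps when labelled data in the target domain is expensive or difficult to collect.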