Complex Event Recognition from Images with Few Training Examples
We propose to leverage concept-level representations for complex event
recognition in photographs given limited training examples. We introduce a
novel framework to discover event concept attributes from the web and use that
to extract semantic features from images and classify them into social event
categories with few training examples. Discovered concepts include a variety of
objects, scenes, actions and event sub-types, leading to a discriminative and
compact representation for event images. Web images are obtained for each
discovered event concept and we use (pretrained) CNN features to train concept
classifiers. Extensive experiments on challenging event datasets demonstrate
that our proposed method outperforms several baselines using deep CNN features
directly in classifying images into events with limited training examples. We
also demonstrate that our method achieves the best overall accuracy on a
dataset with unseen event categories using a single training example.
Comment: Accepted to the IEEE Winter Conference on Applications of Computer Vision (WACV '17).
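The concept-based representation described above can be sketched roughly as follows. The concept names, feature dimension, and centroid-based concept scorers are illustrative assumptions, not the paper's actual classifiers (the paper trains concept classifiers on pretrained-CNN features of web images):

```python
import numpy as np

# Hypothetical sketch: map a pretrained-CNN image feature to a compact vector
# of concept scores, one per discovered event concept. Concept names and the
# feature dimension are illustrative, not taken from the paper.

rng = np.random.default_rng(0)
FEAT_DIM = 128
CONCEPTS = ["cake", "stadium", "podium"]  # discovered event concepts (illustrative)

# One centroid per concept, standing in for the per-concept classifiers
# trained on web images in the paper.
centroids = {c: rng.normal(size=FEAT_DIM) for c in CONCEPTS}

def concept_scores(cnn_feature):
    """Cosine similarity of one image feature against every concept centroid."""
    scores = []
    for c in CONCEPTS:
        w = centroids[c]
        scores.append(float(cnn_feature @ w /
                            (np.linalg.norm(cnn_feature) * np.linalg.norm(w))))
    return np.array(scores)

image_feat = rng.normal(size=FEAT_DIM)
rep = concept_scores(image_feat)  # compact concept-level representation
print(rep.shape)
```

The resulting low-dimensional concept vector, rather than the raw CNN feature, is what makes few-shot event classification tractable in this style of approach.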
Outfit Recommender System
The online apparel retail market in the United States is worth about seventy-two billion US dollars. Recommendation systems on retail websites generate a significant share of this revenue, so improving them can increase that revenue. Traditional clothing recommendations relied on lexical methods, but visual-based recommendations have gained popularity over the past few years. These involve processing a multitude of images with different image-processing techniques. To handle such a vast quantity of images, deep neural networks have been used extensively; with the help of fast Graphics Processing Units, these networks deliver highly accurate results in a short amount of time. However, there are still ways in which clothing recommendations can be improved. We propose an event-based clothing recommendation system that uses object detection. We train one model to identify nine events/scenarios that a user might attend: White Wedding, Indian Wedding, Conference, Funeral, Red Carpet, Pool Party, Birthday, Graduation, and Workout. We train another model to detect the clothes worn at the event across fifty-three clothing categories. Object detection achieves a mAP of 84.01. Nearest neighbors of the detected clothes are recommended to the user.
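The final recommendation step, retrieving nearest neighbors of a detected garment, can be sketched as below. The catalog, embedding dimension, and Euclidean metric are assumptions for illustration, not the paper's exact setup:

```python
import numpy as np

# Illustrative sketch: given an embedding of a detected clothing item,
# recommend the closest items from a catalog of garment embeddings.
# Catalog contents and the distance metric are assumed, not from the paper.

rng = np.random.default_rng(1)
catalog = {f"item_{i}": rng.normal(size=64) for i in range(100)}

def recommend(detected_embedding, k=3):
    """Return the k catalog items nearest (Euclidean) to the detected garment."""
    dists = {name: float(np.linalg.norm(vec - detected_embedding))
             for name, vec in catalog.items()}
    return sorted(dists, key=dists.get)[:k]

query = rng.normal(size=64)
top3 = recommend(query)
print(top3)
```

In practice such embeddings would come from the clothing-detection network's feature layer, so that visually similar garments land close together in the embedding space.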
Smartphone picture organization: a hierarchical approach
We live in a society where the large majority of the population has a camera-equipped smartphone. In addition, hard drives and cloud storage are getting cheaper and cheaper, leading to tremendous growth in stored personal photos. Unlike photo collections captured by a digital camera, which are typically pre-processed by the user, who organizes them into event-related folders, smartphone pictures are automatically stored in the cloud. As a consequence, photo collections captured by a smartphone are highly unstructured, and because smartphones are ubiquitous, they present larger variability than pictures captured by a digital camera. To address the need to organize large smartphone photo collections automatically, we propose a new methodology for hierarchical photo organization into topics and topic-related categories. Our approach successfully estimates latent topics in the pictures by applying probabilistic Latent Semantic Analysis, and automatically assigns a name to each topic by relying on a lexical database. Topic-related categories are then estimated using a set of topic-specific Convolutional Neural Networks. To validate our approach, we assemble and make public a large dataset of more than 8,000 smartphone pictures from 40 persons. Experimental results demonstrate higher user satisfaction compared to state-of-the-art solutions in terms of organization. Peer reviewed preprint.
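The topic-estimation step can be illustrated with a minimal probabilistic Latent Semantic Analysis EM loop. The toy count matrix (pictures described by visual-word counts), sizes, and number of topics are assumptions for illustration, not the paper's data:

```python
import numpy as np

# Minimal pLSA sketch: EM updates for P(z|d) and P(w|z) on a toy
# picture-by-visual-word count matrix. Dimensions are illustrative.

rng = np.random.default_rng(2)
D, W, Z = 8, 20, 3                       # pictures, visual words, latent topics
n = rng.integers(0, 5, size=(D, W))      # toy count matrix

p_z_d = rng.random((D, Z)); p_z_d /= p_z_d.sum(1, keepdims=True)  # P(z|d)
p_w_z = rng.random((Z, W)); p_w_z /= p_w_z.sum(1, keepdims=True)  # P(w|z)

for _ in range(30):
    # E-step: P(z|d,w) proportional to P(z|d) * P(w|z)
    joint = p_z_d[:, :, None] * p_w_z[None, :, :]      # shape (D, Z, W)
    p_z_dw = joint / joint.sum(1, keepdims=True)
    # M-step: re-estimate from expected counts
    nz = n[:, None, :] * p_z_dw                        # expected counts (D, Z, W)
    p_w_z = nz.sum(0); p_w_z /= p_w_z.sum(1, keepdims=True)
    p_z_d = nz.sum(2); p_z_d /= p_z_d.sum(1, keepdims=True)

topics = p_z_d.argmax(1)  # dominant latent topic per picture
print(topics)
```

Each picture's dominant topic would then be named via a lexical database and passed to a topic-specific CNN for the finer-grained category, following the hierarchy the abstract describes.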
MDRED: Multi-Modal Multi-Task Distributed Recognition for Event Detection
Title from PDF of title page (viewed September 28, 2018). Thesis advisor: Yugyung Lee. Includes vita and bibliographical references (pages 63-67). Thesis (M.S.)--School of Computing and Engineering, University of Missouri--Kansas City, 2018.
Understanding users' context is essential in emerging mobile sensing applications, such
as Metal Detector, Glint Finder, Facefirst. Over the last decade, Machine Learning (ML)
techniques have evolved dramatically for real-world applications. Specifically, Deep Learning
(DL) has attracted tremendous attention for diverse applications, including speech recognition
and computer vision. However, ML requires extensive computing resources, so ML applications
are not well suited to devices with limited computing capabilities. Furthermore, customizing ML
applications for users' context is not easy. Such a situation presents real challenges to
mobile-based ML applications. We are motivated to address this problem by designing a distributed and
collaborative computing framework for ML edge computing and applications.
In this thesis, we propose the Multi-Modal Multi-Task Distributed Recognition for
Event Detection (MDRED) framework for complex event recognition with images. The MDRED
framework is based on a hybrid ML model that is composed of Deep Learning (DL) and Shallow
Learning (SL). The lower level of the MDRED framework is based on the DL models for (1)
object detection, (2) color recognition, (3) emotion recognition, (4) face detection, and (5) text
detection on event images. The higher level applies SL-based fusion techniques for
event detection based on the outcomes of the lower-level DL models. The fusion model
is designed as a weighted feature vector generated by a modified Term Frequency and Inverse
Document Frequency (TF-IDF) algorithm, considering common and unique multi-modal
features that are recognized for event detection. The prototype of the MDRED framework has
been implemented: A master-slave architecture was designed for coordinating the distributed
computing among multiple mobile devices at the edge while connecting the edge devices to
the cloud ML servers. The MDRED model has been evaluated with the benchmark event
datasets and compared with state-of-the-art event detection models. MDRED achieved
accuracies of 90.5%, 98.8%, and 78% on the SocEID, UIUC Sports, and RED Events datasets, respectively,
outperforming the baseline models AlexNet-fc7, WEBLY-fc7, WIDER-fc7, and Event concepts.
We also demonstrate the MDRED application running on Android devices for real-time
event detection.
Contents: Introduction -- Background and related work -- Proposed work -- Results and evaluation -- Conclusion and future work
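The TF-IDF-style fusion idea in MDRED, pooling labels detected by the per-modality models into a per-image "document" and weighting them, can be sketched as follows. The label names and the plain smoothed TF-IDF weighting are illustrative assumptions; the thesis uses a modified TF-IDF variant whose exact form is not reproduced here:

```python
import math

# Hedged sketch: labels from several per-modality detectors (objects, colors,
# emotions, text) form a "document" per image; TF-IDF weights down labels that
# occur in most images and highlights event-discriminative ones.

images = {
    "img1": ["cap", "gown", "crowd", "stage"],   # multi-modal labels (illustrative)
    "img2": ["cap", "gown", "diploma"],
    "img3": ["ball", "crowd", "grass"],
}

def tfidf(labels, corpus):
    """Weight each label of one image by term frequency x smoothed inverse
    document frequency over the whole image corpus."""
    n_docs = len(corpus)
    weights = {}
    for lab in set(labels):
        tf = labels.count(lab) / len(labels)
        df = sum(1 for doc in corpus.values() if lab in doc)
        weights[lab] = tf * math.log((1 + n_docs) / (1 + df))
    return weights

vec = tfidf(images["img1"], images)
print(sorted(vec, key=vec.get, reverse=True))
```

The resulting weighted feature vector would then feed a shallow classifier for the final event label, matching the DL-then-SL layering the abstract describes.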