Complex Event Recognition from Images with Few Training Examples
We propose to leverage concept-level representations for complex event
recognition in photographs given limited training examples. We introduce a
novel framework to discover event concept attributes from the web and use that
to extract semantic features from images and classify them into social event
categories with few training examples. Discovered concepts include a variety of
objects, scenes, actions and event sub-types, leading to a discriminative and
compact representation for event images. Web images are obtained for each
discovered event concept and we use (pretrained) CNN features to train concept
classifiers. Extensive experiments on challenging event datasets demonstrate
that our proposed method outperforms several baselines using deep CNN features
directly in classifying images into events with limited training examples. We
also demonstrate that our method achieves the best overall accuracy on a
dataset with unseen event categories using a single training example.
Comment: Accepted to the IEEE Winter Conference on Applications of Computer Vision (WACV '17).
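The concept-based representation described above can be sketched roughly as follows. The concept names, feature dimension, and centroid-based concept scorers are illustrative assumptions, not the paper's actual classifiers (the paper trains concept classifiers on pretrained-CNN features of web images):

```python
import numpy as np

# Hypothetical sketch: map a pretrained-CNN image feature to a compact vector
# of concept scores, one per discovered event concept. Concept names and the
# feature dimension are illustrative, not taken from the paper.

rng = np.random.default_rng(0)
FEAT_DIM = 128
CONCEPTS = ["cake", "stadium", "podium"]  # discovered event concepts (illustrative)

# One centroid per concept, standing in for the per-concept classifiers
# trained on web images in the paper.
centroids = {c: rng.normal(size=FEAT_DIM) for c in CONCEPTS}

def concept_scores(cnn_feature):
    """Cosine similarity of one image feature against every concept centroid."""
    scores = []
    for c in CONCEPTS:
        w = centroids[c]
        scores.append(float(cnn_feature @ w /
                            (np.linalg.norm(cnn_feature) * np.linalg.norm(w))))
    return np.array(scores)

image_feat = rng.normal(size=FEAT_DIM)
rep = concept_scores(image_feat)  # compact concept-level representation
print(rep.shape)
```

The resulting low-dimensional concept vector, rather than the raw CNN feature, is what makes few-shot event classification tractable in this style of approach.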
Outfit Recommender System
The online apparel retail market in the United States is worth about seventy-two billion US dollars. Recommendation systems on retail websites generate a significant share of this revenue, so improving them can increase that revenue. Traditional clothing recommendations relied on lexical methods, but visual-based recommendations have gained popularity over the past few years. These involve processing a multitude of images with different image-processing techniques. To handle such a vast quantity of images, deep neural networks have been used extensively; with the help of fast Graphics Processing Units, these networks deliver highly accurate results in a short amount of time. However, there are still ways in which clothing recommendations can be improved. We propose an event-based clothing recommendation system that uses object detection. We train one model to identify nine events/scenarios that a user might attend: White Wedding, Indian Wedding, Conference, Funeral, Red Carpet, Pool Party, Birthday, Graduation, and Workout. We train another model to detect the clothes worn at the event across fifty-three clothing categories. Object detection achieves a mAP of 84.01. Nearest neighbors of the detected clothes are recommended to the user.
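The final recommendation step, retrieving nearest neighbors of a detected garment, can be sketched as below. The catalog, embedding dimension, and Euclidean metric are assumptions for illustration, not the paper's exact setup:

```python
import numpy as np

# Illustrative sketch: given an embedding of a detected clothing item,
# recommend the closest items from a catalog of garment embeddings.
# Catalog contents and the distance metric are assumed, not from the paper.

rng = np.random.default_rng(1)
catalog = {f"item_{i}": rng.normal(size=64) for i in range(100)}

def recommend(detected_embedding, k=3):
    """Return the k catalog items nearest (Euclidean) to the detected garment."""
    dists = {name: float(np.linalg.norm(vec - detected_embedding))
             for name, vec in catalog.items()}
    return sorted(dists, key=dists.get)[:k]

query = rng.normal(size=64)
top3 = recommend(query)
print(top3)
```

In practice such embeddings would come from the clothing-detection network's feature layer, so that visually similar garments land close together in the embedding space.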
Smartphone picture organization: a hierarchical approach
We live in a society where the large majority of the population has a camera-equipped smartphone. In addition, hard drives and cloud storage are getting cheaper and cheaper, leading to tremendous growth in stored personal photos. Unlike photo collections captured by a digital camera, which are typically pre-processed by the user, who organizes them into event-related folders, smartphone pictures are automatically stored in the cloud. As a consequence, photo collections captured by a smartphone are highly unstructured, and because smartphones are ubiquitous, they present larger variability than pictures captured by a digital camera. To address the need to organize large smartphone photo collections automatically, we propose a new methodology for hierarchical photo organization into topics and topic-related categories. Our approach successfully estimates latent topics in the pictures by applying probabilistic Latent Semantic Analysis, and automatically assigns a name to each topic by relying on a lexical database. Topic-related categories are then estimated using a set of topic-specific Convolutional Neural Networks. To validate our approach, we assemble and make public a large dataset of more than 8,000 smartphone pictures from 40 persons. Experimental results demonstrate higher user satisfaction compared to state-of-the-art solutions in terms of organization. Peer reviewed preprint.
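The topic-estimation step can be illustrated with a minimal probabilistic Latent Semantic Analysis EM loop. The toy count matrix (pictures described by visual-word counts), sizes, and number of topics are assumptions for illustration, not the paper's data:

```python
import numpy as np

# Minimal pLSA sketch: EM updates for P(z|d) and P(w|z) on a toy
# picture-by-visual-word count matrix. Dimensions are illustrative.

rng = np.random.default_rng(2)
D, W, Z = 8, 20, 3                       # pictures, visual words, latent topics
n = rng.integers(0, 5, size=(D, W))      # toy count matrix

p_z_d = rng.random((D, Z)); p_z_d /= p_z_d.sum(1, keepdims=True)  # P(z|d)
p_w_z = rng.random((Z, W)); p_w_z /= p_w_z.sum(1, keepdims=True)  # P(w|z)

for _ in range(30):
    # E-step: P(z|d,w) proportional to P(z|d) * P(w|z)
    joint = p_z_d[:, :, None] * p_w_z[None, :, :]      # shape (D, Z, W)
    p_z_dw = joint / joint.sum(1, keepdims=True)
    # M-step: re-estimate from expected counts
    nz = n[:, None, :] * p_z_dw                        # expected counts (D, Z, W)
    p_w_z = nz.sum(0); p_w_z /= p_w_z.sum(1, keepdims=True)
    p_z_d = nz.sum(2); p_z_d /= p_z_d.sum(1, keepdims=True)

topics = p_z_d.argmax(1)  # dominant latent topic per picture
print(topics)
```

Each picture's dominant topic would then be named via a lexical database and passed to a topic-specific CNN for the finer-grained category, following the hierarchy the abstract describes.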
MDRED: Multi-Modal Multi-Task Distributed Recognition for Event Detection
Title from PDF of title page (viewed September 28, 2018). Thesis advisor: Yugyung Lee. Includes vita and bibliographical references (pages 63-67). Thesis (M.S.)--School of Computing and Engineering, University of Missouri--Kansas City, 2018.
Understanding users' context is essential in emerging mobile sensing applications, such
as Metal Detector, Glint Finder, Facefirst. Over the last decade, Machine Learning (ML)
techniques have evolved dramatically for real-world applications. Specifically, Deep Learning
(DL) has attracted tremendous attention for diverse applications, including speech recognition
and computer vision. However, ML requires extensive computing resources, so ML applications
are not well suited to devices with limited computing capabilities. Furthermore, customizing ML
applications for users' context is not easy. Such a situation presents real challenges to
mobile-based ML applications. We are motivated to address this problem by designing a distributed and
collaborative computing framework for ML edge computing and applications.
In this thesis, we propose the Multi-Modal Multi-Task Distributed Recognition for
Event Detection (MDRED) framework for complex event recognition with images. The MDRED
framework is based on a hybrid ML model that is composed of Deep Learning (DL) and Shallow
Learning (SL). The lower level of the MDRED framework is based on the DL models for (1)
object detection, (2) color recognition, (3) emotion recognition, (4) face detection, and (5) text
detection on event images. The higher level applies SL-based fusion techniques for
event detection based on the outcomes of the lower-level DL models. The fusion model
is designed as a weighted feature vector generated by a modified Term Frequency and Inverse
Document Frequency (TF-IDF) algorithm, considering common and unique multi-modal
features that are recognized for event detection. The prototype of the MDRED framework has
been implemented: A master-slave architecture was designed for coordinating the distributed
computing among multiple mobile devices at the edge while connecting the edge devices to
the cloud ML servers. The MDRED model has been evaluated with the benchmark event
datasets and compared with state-of-the-art event detection models. MDRED achieved
accuracies of 90.5%, 98.8%, and 78% on the SocEID, UIUC Sports, and RED Events datasets, respectively,
outperforming the baseline models AlexNet-fc7, WEBLY-fc7, WIDER-fc7, and Event concepts.
We also demonstrate the MDRED application running on Android devices for real-time
event detection.
Contents: Introduction -- Background and related work -- Proposed work -- Results and evaluation -- Conclusion and future work
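The TF-IDF-style fusion idea in MDRED, pooling labels detected by the per-modality models into a per-image "document" and weighting them, can be sketched as follows. The label names and the plain smoothed TF-IDF weighting are illustrative assumptions; the thesis uses a modified TF-IDF variant whose exact form is not reproduced here:

```python
import math

# Hedged sketch: labels from several per-modality detectors (objects, colors,
# emotions, text) form a "document" per image; TF-IDF weights down labels that
# occur in most images and highlights event-discriminative ones.

images = {
    "img1": ["cap", "gown", "crowd", "stage"],   # multi-modal labels (illustrative)
    "img2": ["cap", "gown", "diploma"],
    "img3": ["ball", "crowd", "grass"],
}

def tfidf(labels, corpus):
    """Weight each label of one image by term frequency x smoothed inverse
    document frequency over the whole image corpus."""
    n_docs = len(corpus)
    weights = {}
    for lab in set(labels):
        tf = labels.count(lab) / len(labels)
        df = sum(1 for doc in corpus.values() if lab in doc)
        weights[lab] = tf * math.log((1 + n_docs) / (1 + df))
    return weights

vec = tfidf(images["img1"], images)
print(sorted(vec, key=vec.get, reverse=True))
```

The resulting weighted feature vector would then feed a shallow classifier for the final event label, matching the DL-then-SL layering the abstract describes.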