Temporal Localization of Fine-Grained Actions in Videos by Domain Transfer from Web Images
We address the problem of fine-grained action localization from temporally
untrimmed web videos. We assume that only weak video-level annotations are
available for training. The goal is to use these weak labels to identify
temporal segments corresponding to the actions, and learn models that
generalize to unconstrained web videos. We find that web images queried by
action names serve as well-localized highlights for many actions, but are
noisily labeled. To solve this problem, we propose a simple yet effective
method that takes weak video labels and noisy image labels as input, and
generates localized action frames as output. This is achieved by cross-domain
transfer between video frames and web images, using pre-trained deep
convolutional neural networks. We then use the localized action frames to train
action recognition models with long short-term memory networks. We collect a
fine-grained sports action data set FGA-240 of more than 130,000 YouTube
videos. It has 240 fine-grained actions under 85 sports activities. Convincing
results are shown on the FGA-240 data set, as well as the THUMOS 2014
localization data set with untrimmed training videos.
Comment: Camera ready version for ACM Multimedia 201
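The core idea of this abstract, that web images queried by an action name act as well-localized highlights, can be illustrated with a minimal sketch: score each video frame by its similarity to CNN features of the web images, and keep the highest-scoring frames as candidate action frames. This is only a simplified stand-in for the paper's cross-domain transfer method; the function name and cosine-similarity scoring are illustrative assumptions, not the authors' exact pipeline.

```python
import numpy as np

def localize_action_frames(frame_feats, image_feats, top_k=5):
    """Rank video frames by mean cosine similarity to web-image features.

    frame_feats: (n_frames, d) CNN features of video frames
    image_feats: (n_images, d) CNN features of web images for the action
    Returns indices of the top_k frames most similar to the web-image set.
    """
    # L2-normalize so dot products become cosine similarities
    f = frame_feats / np.linalg.norm(frame_feats, axis=1, keepdims=True)
    g = image_feats / np.linalg.norm(image_feats, axis=1, keepdims=True)
    frame_scores = (f @ g.T).mean(axis=1)   # mean similarity per frame
    return np.argsort(frame_scores)[::-1][:top_k]
```

Averaging over the (noisily labeled) image set makes the score robust to individual bad images; the selected frames could then feed a sequence model such as the LSTM the abstract describes.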
Evaluating Two-Stream CNN for Video Classification
Videos contain very rich semantic information. Traditional hand-crafted
features are known to be inadequate in analyzing complex video semantics.
Inspired by the huge success of deep learning methods in analyzing image,
audio and text data, significant efforts have recently been devoted to the
design of deep nets for video analytics. Among the many practical needs,
classifying videos (or video clips) based on their major semantic categories
(e.g., "skiing") is useful in many applications. In this paper, we conduct an
in-depth study to investigate important implementation options that may affect
the performance of deep nets on video classification. Our evaluations are
conducted on top of a recent two-stream convolutional neural network (CNN)
pipeline, which uses both static frames and motion optical flows, and has
demonstrated competitive performance against the state-of-the-art methods. In
order to gain insights and to arrive at a practical guideline, many important
options are studied, including network architectures, model fusion, learning
parameters and the final prediction methods. Based on the evaluations, very
competitive results are attained on two popular video classification
benchmarks. We hope that the discussions and conclusions from this work can
help researchers in related fields to quickly set up a good basis for further
investigations along this very promising direction.
Comment: ACM ICMR'1
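One of the implementation options the abstract studies is the final prediction method, i.e. how the static-frame stream and the optical-flow stream are combined. A common choice in two-stream pipelines is weighted late fusion of the per-stream class scores; the sketch below shows that idea, with the stream weights as illustrative assumptions rather than the values evaluated in the paper.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))  # stabilized softmax
    return e / e.sum(axis=axis, keepdims=True)

def two_stream_fusion(spatial_logits, temporal_logits,
                      w_spatial=1.0, w_temporal=1.5):
    """Weighted late fusion of the two streams' class probabilities.

    Weighting the temporal (optical-flow) stream higher is a common
    heuristic in two-stream work; the exact weights are hypothetical here.
    """
    fused = (w_spatial * softmax(spatial_logits)
             + w_temporal * softmax(temporal_logits))
    return fused.argmax(axis=-1)
```

Late fusion keeps the two networks independent at training time, which is one reason it is a convenient baseline when comparing architectures and learning parameters.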
Assessment of Workers' Level of Exposure to Work-Related Musculoskeletal Discomfort in Dewatered Cassava Mash Sieving Process
This study was undertaken to assess processors' level of exposure to work-related musculoskeletal disorders when using the locally developed traditional sieve in the sieving process. A quick ergonomic checklist (QEC), involving both the researcher's and the processors' assessments using the risk assessment checklist, was applied, and data were obtained from a sample of one hundred and eight (108) processors randomly selected from three senatorial districts of Rivers State. Thirty-six processors from each zone, comprising 14 males and 22 females, were selected and assessed on the basis of their back, shoulder/arm, wrist/hand and neck postures and frequency of movement during the traditional sieving process. The results showed that the highest risk of discomfort occurred in the wrist/hand region, followed by the back, shoulder/arm, and neck. The posture used in the sieving process not only exposed the processors to discomfort and pain but also put them at high risk of musculoskeletal disorder, as indicated by a high percentage exposure of 66% on the QEC rating. The results indicate a need for immediate attention and a change to an improved method that will reduce the discomfort in the body parts assessed.
Efficient On-the-fly Category Retrieval using ConvNets and GPUs
We investigate the gains in precision and speed that can be obtained by
using Convolutional Networks (ConvNets) for on-the-fly retrieval, where
classifiers are learnt at run time for a textual query from downloaded images,
and used to rank large image or video datasets.
We make three contributions: (i) we present an evaluation of state-of-the-art
image representations for object category retrieval over standard benchmark
datasets containing 1M+ images; (ii) we show that ConvNets can be used to
obtain features which are incredibly performant, and yet much lower dimensional
than previous state-of-the-art image representations, and that their
dimensionality can be reduced further without loss in performance by
compression using product quantization or binarization. Consequently, features
with the state-of-the-art performance on large-scale datasets of millions of
images can fit in the memory of even a commodity GPU card; (iii) we show that
an SVM classifier can be learnt within a ConvNet framework on a GPU in parallel
with downloading the new training images, allowing for a continuous refinement
of the model as more images become available, and simultaneous training and
ranking. The outcome is an on-the-fly system that significantly outperforms its
predecessors in terms of: precision of retrieval, memory requirements, and
speed, facilitating accurate on-the-fly learning and ranking in under a second
on a single GPU.
Comment: Published in proceedings of ACCV 201
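Contribution (ii) mentions compressing ConvNet features by product quantization or binarization so that a million-image index fits in GPU memory. The binarization variant is easy to sketch: keep one sign bit per dimension and rank by Hamming distance. This is a minimal illustration of the general technique, not the paper's implementation; the function names are assumptions.

```python
import numpy as np

def binarize(feats):
    """Sign-binarize features: one bit per dimension (stored as uint8)."""
    return (feats > 0).astype(np.uint8)

def hamming_rank(query_bits, db_bits):
    """Rank database items by Hamming distance to the binarized query."""
    dists = np.count_nonzero(db_bits != query_bits, axis=1)
    return np.argsort(dists)  # smallest distance first
```

A d-dimensional float feature shrinks from 4d bytes to d bits, a 32x reduction, and Hamming distances are cheap bitwise operations, which is what makes commodity-GPU-resident indexes of millions of images plausible.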
The age of data-driven proteomics : how machine learning enables novel workflows
A lot of energy in the field of proteomics is dedicated to the application of challenging experimental workflows, which include metaproteomics, proteogenomics, data independent acquisition (DIA), non-specific proteolysis, immunopeptidomics, and open modification searches. These workflows are all challenging because of ambiguity in the identification stage; they either expand the search space and thus increase the ambiguity of identifications, or, in the case of DIA, they generate data that is inherently more ambiguous. In this context, machine learning-based predictive models are now generating considerable excitement in the field of proteomics because these predictive models hold great potential to drastically reduce the ambiguity in the identification process of the above-mentioned workflows. Indeed, the field has already produced classical machine learning and deep learning models to predict almost every aspect of a liquid chromatography-mass spectrometry (LC-MS) experiment. Yet despite all the excitement, thorough integration of predictive models in these challenging LC-MS workflows is still limited, and further improvements to the modeling and validation procedures can still be made. In this viewpoint, we therefore point out highly promising recent machine learning developments in proteomics, alongside some of the remaining challenges.
Towards Bottom-Up Analysis of Social Food
in ACM Digital Health Conference 201
An early resource characterization of deep learning on wearables, smartphones and internet-of-things devices
Detecting and reacting to user behavior and ambient context are core elements of many emerging mobile sensing and Internet-of-Things (IoT) applications. However, extracting accurate inferences from raw sensor data is challenging within the noisy and complex environments where these systems are deployed. Deep learning is one of the most promising approaches for overcoming this challenge, and achieving more robust and reliable inference. Techniques developed within this rapidly evolving area of machine learning are now state-of-the-art for many inference tasks (such as audio sensing and computer vision) commonly needed by IoT and wearable applications. But currently deep learning algorithms are seldom used in mobile/IoT class hardware because they often impose debilitating levels of system overhead (e.g., memory, computation and energy). Efforts to address this barrier to deep learning adoption are slowed by our lack of a systematic understanding of how these algorithms behave at inference time on resource-constrained hardware. In this paper, we present the first, albeit preliminary, measurement study of common deep learning models (such as Convolutional Neural Networks and Deep Neural Networks) on representative mobile and embedded platforms. The aim of this investigation is to begin to build knowledge of the performance characteristics, resource requirements and the execution bottlenecks for deep learning models when being used to recognize categories of behavior and context. The results and insights of this study lay an empirical foundation for the development of optimization methods and execution environments that enable deep learning to be more readily integrated into next-generation IoT, smartphone and wearable systems.
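The kind of inference-time measurement this abstract describes can be sketched with a simple timing harness: warm the model up, then time repeated forward passes and report mean and worst-case latency. The harness below is a hypothetical illustration (the `measure_inference` and `tiny_net` names are assumptions, and a two-layer matrix multiply stands in for a real deep net); the paper's actual study uses real models on mobile and embedded hardware.

```python
import time
import numpy as np

def measure_inference(model_fn, input_shape, warmup=3, runs=20):
    """Time repeated forward passes; report mean and worst-case latency in ms."""
    x = np.random.rand(*input_shape).astype(np.float32)
    for _ in range(warmup):          # warm caches/allocators before measuring
        model_fn(x)
    times = []
    for _ in range(runs):
        t0 = time.perf_counter()
        model_fn(x)
        times.append((time.perf_counter() - t0) * 1e3)
    return {"mean_ms": float(np.mean(times)), "max_ms": float(np.max(times))}

# Stand-in "deep net": two dense layers as plain matrix multiplies with ReLU
W1 = np.random.rand(128, 256).astype(np.float32)
W2 = np.random.rand(256, 10).astype(np.float32)
def tiny_net(x):
    return np.maximum(x @ W1, 0) @ W2
```

Reporting worst-case as well as mean latency matters on constrained devices, where occasional slow passes (e.g., from memory pressure) can break real-time sensing pipelines.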
Wildlife surveillance using deep learning methods
Wildlife conservation and the management of human-wildlife conflicts require cost-effective methods of monitoring wild animal behavior. Still and video camera surveillance can generate enormous quantities of data, which is laborious and expensive to screen for the species of interest. In the present study, we describe a state-of-the-art, deep learning approach for automatically identifying and isolating species-specific activity from still images and video data.
We used a dataset consisting of 8,368 images of wild and domestic animals in farm buildings, and we developed an approach firstly to distinguish badgers from other species (binary classification) and secondly to distinguish each of six animal species (multiclassification). We focused on binary classification of badgers first because such a tool would be relevant to efforts to manage Mycobacterium bovis (the cause of bovine tuberculosis) transmission between badgers and cattle.
We used two deep learning frameworks for automatic image recognition. They achieved high accuracies, in the order of 98.05% for binary classification and 90.32% for multiclassification. Based on the deep learning framework, a detection process was also developed for identifying animals of interest in video footage, which to our knowledge is the first application for this purpose.
The algorithms developed here have wide applications in wildlife monitoring, where large quantities of visual data require screening for certain species.
Bose-Einstein Condensation of Helium and Hydrogen inside Bundles of Carbon Nanotubes
Helium atoms or hydrogen molecules are believed to be strongly bound within
the interstitial channels (between three carbon nanotubes) within a bundle of
many nanotubes. The effects on adsorption of a nonuniform distribution of tubes
are evaluated. The energy of a single particle state is the sum of a discrete
transverse energy Et (that depends on the radii of neighboring tubes) and a
quasicontinuous energy Ez of relatively free motion parallel to the axis of the
tubes. At low temperature, the particles occupy the lowest energy states, the
focus of this study. The transverse energy attains a global minimum value
(Et=Emin) for radii near Rmin=9.95 Ang. for H2 and 8.48 Ang. for He-4. The
density of states N(E) near the lowest energy is found to vary linearly above
this threshold value, i.e. N(E) is proportional to (E-Emin). As a result, there
occurs a Bose-Einstein condensation of the molecules into the channel with the
lowest transverse energy. The transition is characterized approximately as that
of a four dimensional gas, neglecting the interactions between the adsorbed
particles. The phenomenon is observable, in principle, from a singular heat
capacity. The existence of this transition depends on the sample having a
relatively broad distribution of radii values that include some near Rmin.
Comment: 21 pages, 9 figures
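The link the abstract draws between a linear density of states and four-dimensional behavior follows from the standard textbook result for an ideal Bose gas; a brief sketch of that reasoning, consistent with the quantities defined above:

```latex
% Single-particle energy: discrete transverse part plus free axial motion,
%   E = E_t + E_z , \qquad E_z = \frac{\hbar^2 k_z^2}{2m} .
%
% For an ideal Bose gas in d dimensions, N(E) \propto E^{d/2 - 1}.
% The linear threshold behaviour found here,
%   N(E) \propto (E - E_{\min}) ,
% matches d/2 - 1 = 1, i.e. an effective dimensionality d = 4.
%
% BEC at T > 0 requires the excited-state population
%   N_{\mathrm{exc}}(T) = \int_{E_{\min}}^{\infty}
%       \frac{N(E)\, \mathrm{d}E}{e^{(E - \mu)/k_B T} - 1}
% to remain finite as \mu \to E_{\min}; this integral converges whenever
% N(E) vanishes at threshold (effective d > 2), so the channel gas
% condenses like a four-dimensional ideal Bose gas.
```

This is why the uniform-bundle case (free motion only along z, effectively d = 1) shows no transition, while the radius disorder described in the abstract restores one.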