Gait recognition and understanding based on hierarchical temporal memory using 3D gait semantic folding
Gait recognition and understanding systems have shown wide-ranging application prospects. However, their reliance on unstructured data from images and video has limited their performance; for example, they are easily affected by multiple views, occlusion, clothing, and object-carrying conditions. This paper addresses these problems using realistic 3-dimensional (3D) human structural data and a sequential pattern learning framework with a top-down attention modulating mechanism based on Hierarchical Temporal Memory (HTM). First, an accurate 2-dimensional (2D) to 3D human body pose and shape semantic parameter estimation method is proposed, which exploits the advantages of an instance-level body parsing model and a virtual dressing method. Second, by using gait semantic folding, the estimated body parameters are encoded as a sparse 2D matrix to construct the structural gait semantic image. To achieve time-based gait recognition, an HTM network is constructed to obtain the sequence-level gait sparse distribution representations (SL-GSDRs). A top-down attention mechanism is introduced to handle various conditions, including multiple views, by refining the SL-GSDRs according to prior knowledge. The proposed gait learning model not only helps gait recognition overcome the difficulties of real application scenarios but also provides structured gait semantic images for visual cognition. Experimental analyses on the CMU MoBo, CASIA B, TUM-IITKGP, and KY4D datasets show a significant performance gain in terms of accuracy and robustness.
Recurrent Scene Parsing with Perspective Understanding in the Loop
Objects may appear at arbitrary scales in perspective images of a scene,
posing a challenge for recognition systems that process images at a fixed
resolution. We propose a depth-aware gating module that adaptively selects the
pooling field size in a convolutional network architecture according to the
object scale (inversely proportional to the depth) so that small details are
preserved for distant objects while larger receptive fields are used for those
nearby. The depth gating signal is provided by stereo disparity or estimated
directly from monocular input. We integrate this depth-aware gating into a
recurrent convolutional neural network to perform semantic segmentation. Our
recurrent module iteratively refines the segmentation results, leveraging the
depth and semantic predictions from the previous iterations.
Through extensive experiments on four popular large-scale RGB-D datasets, we
demonstrate this approach achieves competitive semantic segmentation
performance with a model which is substantially more compact. We carry out
extensive analysis of this architecture including variants that operate on
monocular RGB but use depth as side-information during training, unsupervised
gating as a generic attentional mechanism, and multi-resolution gating. We find
that gated pooling for joint semantic segmentation and depth yields
state-of-the-art results for quantitative monocular depth estimation.
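The depth-aware gating idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' module: the abstract describes gating inside a convolutional network, whereas this example uses a hard, per-pixel choice among a few fixed average-pooling windows. The function names (`avg_pool`, `depth_gated_pool`), the window sizes, and the linear depth binning are all assumptions made for illustration.

```python
def avg_pool(image, k):
    """Average-pool a 2D list with a k x k window (k odd), stride 1.

    Windows are clipped at the image border, so edge pixels average
    over fewer values.
    """
    h, w = len(image), len(image[0])
    r = k // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            vals = [image[j][i]
                    for j in range(max(0, y - r), min(h, y + r + 1))
                    for i in range(max(0, x - r), min(w, x + r + 1))]
            out[y][x] = sum(vals) / len(vals)
    return out


def depth_gated_pool(image, depth, window_sizes=(5, 3, 1)):
    """Pick a pooling window per pixel from its depth (hypothetical helper).

    window_sizes is ordered from the nearest depth bin to the farthest:
    nearby pixels get the largest window, distant pixels the smallest,
    so small details of distant objects are preserved.
    """
    d_min = min(min(row) for row in depth)
    d_max = max(max(row) for row in depth)
    span = (d_max - d_min) or 1.0  # avoid division by zero on flat depth
    pooled = [avg_pool(image, k) for k in window_sizes]
    n = len(window_sizes)
    out = []
    for y, row in enumerate(depth):
        out_row = []
        for x, d in enumerate(row):
            # Linear binning of depth into n bins (an assumption).
            bin_idx = min(int((d - d_min) / span * n), n - 1)
            out_row.append(pooled[bin_idx][y][x])
        out.append(out_row)
    return out


image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
depth = [[1, 1, 10], [1, 1, 10], [1, 1, 10]]  # right column is far away
result = depth_gated_pool(image, depth)
```

In this toy input the far (right) column is pooled with a 1 x 1 window and so keeps its original values, while the near pixels are averaged over the largest window.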
Usability testing for improving interactive geovisualization techniques
Usability describes a product’s fitness for use according to a set of predefined criteria.
Whatever the aim of the product, it should facilitate users’ tasks or enhance their performance
by providing appropriate analysis tools. In both cases, the main interest is to satisfy users in
terms of providing relevant functionality which they find fit for purpose. “Testing usability
means making sure that people can find and work with [a product’s] functions to meet their
needs” (Dumas and Redish, 1999: 4). It is therefore concerned with establishing whether
people can use a product to complete their tasks with ease and whether it helps them
do their jobs more effectively.
This document describes the findings of a usability study carried out on DecisionSite Map
Interaction Services (Map IS). DecisionSite, a product of Spotfire, Inc., is an interactive
system for the visual and dynamic exploration of data designed for supporting decision-making.
The system was coupled to ArcExplorer (forming DecisionSite Map IS) to provide
limited GIS functionality (simple user interface, basic tools, and data management) and
support users of spatial data. Hence, this study set out to test the suitability of the coupling
between the two software components (DecisionSite and ArcExplorer) for the purpose of
exploring spatial data. The first section briefly discusses DecisionSite’s visualization
functionality. The second section describes the test's goals and design, the participants, and
the data used. The following section concentrates on the analysis of results, while the final
section discusses future areas of research and possible development.
Transcribing Content from Structural Images with Spotlight Mechanism
Transcribing content from structural images, e.g., writing notes from music
scores, is a challenging task as not only the content objects should be
recognized, but the internal structure should also be preserved. Existing image
recognition methods mainly work on images with simple content (e.g., text lines
with characters), but are not capable of identifying those with more complex
content (e.g., structured symbols), which often follow a fine-grained grammar.
To this end, in this paper, we propose a hierarchical Spotlight Transcribing
Network (STN) framework that follows a two-stage "where-to-what" solution.
Specifically, we first decide "where-to-look" through a novel spotlight
mechanism to focus on different areas of the original image following its
structure. Then, we decide "what-to-write" by developing a GRU-based network
with the spotlight areas for transcribing the content accordingly. Moreover, we
propose two implementations on the basis of STN, i.e., STNM and STNR, where the
spotlight movement follows the Markov property and Recurrent modeling,
respectively. We also design a reinforcement method to refine the framework by
self-improving the spotlight mechanism. We conduct extensive experiments on
many structural image datasets, where the results clearly demonstrate the
effectiveness of the STN framework.
Comment: Accepted by KDD2018 Research Track. In proceedings of the 24th ACM
SIGKDD International Conference on Knowledge Discovery and Data Mining
(KDD'18
On the User Perception of Configurable Reference Process Models - Initial Insights
Enterprise Systems potentially lead to significant efficiency gains but require a well-conducted configuration process. A configurable reference modelling language based on the widely used EPC notation, which can be used to specify Configurable EPCs (C-EPCs), has been developed to support the task of Enterprise Systems configuration. This paper presents a laboratory experiment on C-EPCs and discusses empirical data on the comparison of C-EPCs to regular EPCs. Using the Method Adoption Model, we report on modellers’ perceptions of the usefulness and ease of use of C-EPCs, concluding that C-EPCs provide sufficient yet improvable conceptual support for reference model configuration.
ATMSeer: Increasing Transparency and Controllability in Automated Machine Learning
To relieve the pain of manually selecting machine learning algorithms and
tuning hyperparameters, automated machine learning (AutoML) methods have been
developed to automatically search for good models. Due to the huge model search
space, it is impossible to try all models. Users tend to distrust automatic
results and increase the search budget as much as they can, thereby undermining
the efficiency of AutoML. To address these issues, we design and implement
ATMSeer, an interactive visualization tool that supports users in refining the
search space of AutoML and analyzing the results. To guide the design of
ATMSeer, we derive a workflow of using AutoML based on interviews with machine
learning experts. A multi-granularity visualization is proposed to enable users
to monitor the AutoML process, analyze the searched models, and refine the
search space in real time. We demonstrate the utility and usability of ATMSeer
through two case studies, expert interviews, and a user study with 13 end
users.
Comment: Published in the ACM Conference on Human Factors in Computing Systems
(CHI), 2019, Glasgow, Scotland U