Information extraction from multimedia web documents: an open-source platform and testbed
The LivingKnowledge project aimed to enhance the current state of the art in search, retrieval and knowledge management on the web by advancing the use of sentiment and opinion analysis within multimedia applications. To achieve this aim, a diverse set of novel and complementary analysis techniques has been integrated into a single but extensible software platform on which such applications can be built. The platform combines state-of-the-art techniques for extracting facts, opinions and sentiment from multimedia documents, and unlike earlier platforms, it exploits both visual and textual techniques to support multimedia information retrieval. Foreseeing the usefulness of this software in the wider community, the platform has been made generally available as an open-source project. This paper describes the platform design, gives an overview of the analysis algorithms integrated into the system and describes two applications that utilise the system for multimedia information retrieval.
Good Features to Correlate for Visual Tracking
In recent years, correlation filters have shown dominant and
spectacular results for visual object tracking. The types of features that
are employed in this family of trackers significantly affect the performance
of visual tracking. The ultimate goal is to utilize robust features invariant
to any kind of appearance change of the object, while predicting the object
location as accurately as in the case of no appearance change. As the deep
learning based methods have emerged, the study of learning features for
specific tasks has accelerated. For instance, discriminative visual tracking
methods based on deep architectures have been studied with promising
performance. Nevertheless, correlation filter based (CFB) trackers confine
themselves to using pre-trained networks that are trained for the object
classification problem. To this end, in this manuscript the problem of learning
deep fully convolutional features for the CFB visual tracking is formulated. In
order to learn the proposed model, a novel and efficient backpropagation
algorithm is presented based on the loss function of the network. The proposed
learning framework enables the network model to be flexible for a custom
design. Moreover, it alleviates the dependency on the network trained for
classification. Extensive performance analysis shows the efficacy of the
proposed custom design in the CFB tracking framework. By fine-tuning the
convolutional parts of a state-of-the-art network and integrating this model into
a CFB tracker, which is the top-performing one in VOT2016, an 18% increase is
achieved in terms of expected average overlap, and tracking failures are
decreased by 25%, while maintaining superiority over the state-of-the-art
methods on the OTB-2013 and OTB-2015 tracking datasets.
Comment: Accepted version of IEEE Transactions on Image Processing
ProSLAM: Graph SLAM from a Programmer's Perspective
In this paper we present ProSLAM, a lightweight stereo visual SLAM system
designed with simplicity in mind. Our work stems from the experience gathered
by the authors while teaching SLAM to students and aims at providing a highly
modular system that can be easily implemented and understood. Rather than
focusing on the well known mathematical aspects of Stereo Visual SLAM, in this
work we highlight the data structures and the algorithmic aspects that one
needs to tackle during the design of such a system. We implemented ProSLAM
using the C++ programming language in combination with a minimal set of
commonly used external libraries. In addition to an open source implementation, we
provide several code snippets that address the core aspects of our approach
directly in this paper. The results of a thorough validation performed on
standard benchmark datasets show that our approach achieves accuracy comparable
to state-of-the-art methods, while requiring substantially fewer computational
resources.
Comment: 8 pages, 8 figures
Categorisation of visualisation methods to support the design of Human-Computer Interaction systems
During the design of Human-Computer Interaction (HCI) systems, the creation of visual artefacts forms an important part of design. On the one hand, producing a visual artefact has a number of advantages: it helps designers to externalise their thoughts and acts as a common language between different stakeholders. On the other hand, if an inappropriate visualisation method is employed, it could hinder the design process. To support the design of HCI systems, this paper reviews the categorisation of visualisation methods used in HCI. A keyword search is conducted to identify a) current HCI design methods and b) approaches for selecting these methods. The resulting design methods are filtered to create a list of just visualisation methods. These are then categorised using the approaches identified in (b). As a result, 23 HCI visualisation methods are identified and categorised in 5 selection approaches (The Recipient, Primary Purpose, Visual Archetype, Interaction Type, and The Design Process).
Innovate UK, EPSRC, Airbus Group Innovation
ViZDoom Competitions: Playing Doom from Pixels
This paper presents the first two editions of the Visual Doom AI Competition,
held in 2016 and 2017. The challenge was to create bots that compete in a
multi-player deathmatch in a first-person shooter (FPS) game, Doom. The bots
had to make their decisions based solely on visual information, i.e., a raw
screen buffer. To play well, the bots needed to understand their surroundings,
navigate, explore, and handle the opponents at the same time. These aspects,
together with the competitive multi-agent aspect of the game, make the
competition a unique platform for evaluating state-of-the-art reinforcement
learning algorithms. The paper discusses the rules, solutions, results, and
statistics that give insight into the agents' behaviors. Best-performing agents
are described in more detail. The results of the competition lead to the
conclusion that, although reinforcement learning can produce capable Doom bots,
they are not yet able to successfully compete against humans in this
game. The paper also revisits the ViZDoom environment, which is a flexible,
easy to use, and efficient 3D platform for research on vision-based
reinforcement learning, based on the well-recognized first-person perspective
game Doom.
Medical Image Classification via SVM using LBP Features from Saliency-Based Folded Data
Good results on image classification and retrieval using support vector
machines (SVM) with local binary patterns (LBPs) as features have been
extensively reported in the literature where an entire image is retrieved or
classified. In contrast, in medical imaging, not all parts of the image may be
equally significant or relevant to the image retrieval application at hand. For
instance, in a lung x-ray image, the lung region may contain a tumour, hence
being highly significant, whereas the surrounding area does not contain
significant information from a medical diagnosis perspective. In this paper, we
propose to detect salient regions of images during training and fold the data
to reduce the effect of irrelevant regions. As a result, smaller image areas
will be used for LBP feature calculation and consequently classification by
SVM. We use the IRMA 2009 dataset with 14,410 x-ray images to verify the
performance of the proposed approach. The results demonstrate the benefits of
the saliency-based folding approach, which delivers classification
accuracies comparable with the state of the art but exhibits lower computational cost and
storage requirements, factors highly important for big data analytics.
Comment: To appear in proceedings of The 14th International Conference on
Machine Learning and Applications (IEEE ICMLA 2015), Miami, Florida, USA,
2015
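The LBP feature step mentioned in this abstract can be illustrated with a minimal Python sketch. It computes the basic 8-neighbour local binary pattern histogram over an optional rectangular region, so that a smaller image area yields fewer pixels to process. This is not the authors' method: the saliency detection and folding steps are not reproduced, the `region` parameter is a hypothetical stand-in for a salient area, and the function names are assumptions made for illustration.

```python
def lbp_code(img, r, c):
    """Return the basic 8-neighbour LBP code of the interior pixel (r, c)."""
    center = img[r][c]
    # Neighbours visited clockwise, starting at the top-left pixel.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        # A neighbour at least as bright as the centre sets its bit.
        if img[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(img, region=None):
    """256-bin histogram of LBP codes.

    img is a grayscale image given as a list of rows.
    region is a hypothetical (top, left, bottom, right) box with exclusive
    bottom/right bounds; when omitted, the whole image is used.
    Border pixels are skipped because they lack a full neighbourhood.
    """
    rows, cols = len(img), len(img[0])
    top, left, bottom, right = region if region else (0, 0, rows, cols)
    hist = [0] * 256
    for r in range(max(top, 1), min(bottom, rows - 1)):
        for c in range(max(left, 1), min(right, cols - 1)):
            hist[lbp_code(img, r, c)] += 1
    return hist
```

The resulting histogram is the kind of fixed-length feature vector that could then be passed to an SVM classifier; restricting `region` reduces the number of pixels visited.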
How can heat maps of indexing vocabularies be utilized for information seeking purposes?
The ability to browse an information space in a structured way by exploiting
similarities and dissimilarities between information objects is crucial for
knowledge discovery. Knowledge maps use visualizations to gain insights into
the structure of large-scale information spaces, but are still far away from
being applicable for searching. The paper proposes a use case for enhancing
search term recommendations by heat map visualizations of co-word
relationships taken from indexing vocabulary. By contrasting areas of
different "heat" the user is enabled to indicate mainstream areas of the field
in question more easily.
Comment: URL workshop proceedings: http://ceur-ws.org/Vol-1311
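As a rough illustration of the co-word relationships behind such a heat map, the following Python sketch counts how often pairs of index terms are assigned to the same document and arranges the counts as a symmetric matrix. The input format, the function names and the use of raw co-occurrence counts as "heat" are assumptions for illustration; the abstract does not specify how the paper computes its values.

```python
from collections import Counter
from itertools import combinations


def coword_counts(documents):
    """Count co-occurrences of index terms.

    documents is a list of term lists, one per document (assumed format).
    Keys are alphabetically ordered term pairs.
    """
    counts = Counter()
    for terms in documents:
        # Deduplicate and sort so each unordered pair has one canonical key.
        for a, b in combinations(sorted(set(terms)), 2):
            counts[(a, b)] += 1
    return counts


def heat_matrix(counts, vocabulary):
    """Symmetric matrix of co-occurrence counts in the order of vocabulary."""
    return [
        [0 if a == b else counts.get((min(a, b), max(a, b)), 0)
         for b in vocabulary]
        for a in vocabulary
    ]
```

Cells with high counts would correspond to the "hot" areas of the map, that is, term pairs that are frequently used together in indexing.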