8,046 research outputs found
Temporal Localization of Fine-Grained Actions in Videos by Domain Transfer from Web Images
We address the problem of fine-grained action localization from temporally
untrimmed web videos. We assume that only weak video-level annotations are
available for training. The goal is to use these weak labels to identify
temporal segments corresponding to the actions, and learn models that
generalize to unconstrained web videos. We find that web images queried by
action names serve as well-localized highlights for many actions, but are
noisily labeled. To solve this problem, we propose a simple yet effective
method that takes weak video labels and noisy image labels as input, and
generates localized action frames as output. This is achieved by cross-domain
transfer between video frames and web images, using pre-trained deep
convolutional neural networks. We then use the localized action frames to train
action recognition models with long short-term memory networks. We collect a
fine-grained sports action data set FGA-240 of more than 130,000 YouTube
videos. It has 240 fine-grained actions under 85 sports activities. Convincing
results are shown on the FGA-240 data set, as well as the THUMOS 2014
localization data set with untrimmed training videos.Comment: Camera ready version for ACM Multimedia 201
CHORUS Deliverable 2.2: Second report - identification of multi-disciplinary key issues for gap analysis toward EU multimedia search engines roadmap
After addressing the state-of-the-art during the first year of Chorus and establishing the existing landscape in
multimedia search engines, we have identified and analyzed gaps within European research effort during our second year.
In this period we focused on three directions, notably technological issues, user-centred issues and use-cases and socio-
economic and legal aspects. These were assessed by two central studies: firstly, a concerted vision of functional breakdown
of generic multimedia search engine, and secondly, a representative use-cases descriptions with the related discussion on
requirement for technological challenges. Both studies have been carried out in cooperation and consultation with the
community at large through EC concertation meetings (multimedia search engines cluster), several meetings with our
Think-Tank, presentations in international conferences, and surveys addressed to EU projects coordinators as well as
National initiatives coordinators. Based on the obtained feedback we identified two types of gaps, namely core
technological gaps that involve research challenges, and “enablers”, which are not necessarily technical research
challenges, but have impact on innovation progress. New socio-economic trends are presented as well as emerging legal
challenges
TagBook: A Semantic Video Representation without Supervision for Event Detection
We consider the problem of event detection in video for scenarios where only
few, or even zero examples are available for training. For this challenging
setting, the prevailing solutions in the literature rely on a semantic video
representation obtained from thousands of pre-trained concept detectors.
Different from existing work, we propose a new semantic video representation
that is based on freely available social tagged videos only, without the need
for training any intermediate concept detectors. We introduce a simple
algorithm that propagates tags from a video's nearest neighbors, similar in
spirit to the ones used for image retrieval, but redesign it for video event
detection by including video source set refinement and varying the video tag
assignment. We call our approach TagBook and study its construction,
descriptiveness and detection performance on the TRECVID 2013 and 2014
multimedia event detection datasets and the Columbia Consumer Video dataset.
Despite its simple nature, the proposed TagBook video representation is
remarkably effective for few-example and zero-example event detection, even
outperforming very recent state-of-the-art alternatives building on supervised
representations.Comment: accepted for publication as a regular paper in the IEEE Transactions
on Multimedi
Recommended from our members
Models for Learning (Mod4L) Final Report: Representing Learning Designs
The Mod4L Models of Practice project is part of the JISC-funded Design for Learning Programme. It ran from 1 May – 31 December 2006. The philosophy underlying the project was that a general split is evident in the e-learning community between development of e-learning tools, services and standards, and research into how teachers can use these most effectively, and is impeding uptake of new tools and methods by teachers. To help overcome this barrier and bridge the gap, a need is felt for practitioner-focused resources which describe a range of learning designs and offer guidance on how these may be chosen and applied, how they can support effective practice in design for learning, and how they can support the development of effective tools, standards and systems with a learning design capability (see, for example, Griffiths and Blat 2005, JISC 2006). Practice models, it was suggested, were such a resource.
The aim of the project was to: develop a range of practice models that could be used by practitioners in real life contexts and have a high impact on improving teaching and learning practice.
We worked with two definitions of practice models. Practice models are:
1. generic approaches to the structuring and orchestration of learning activities. They express elements of pedagogic principle and allow practitioners to make informed choices (JISC 2006)
However, however effective a learning design may be, it can only be shared with others through a representation. The issue of representation of learning designs is, then, central to the concept of sharing and reuse at the heart of JISC’s Design for Learning programme. Thus practice models should be both representations of effective practice, and effective representations of practice. Hence we arrived at the project working definition of practice models as:
2. Common, but decontextualised, learning designs that are represented in a way that is usable by practitioners (teachers, managers, etc).(Mod4L working definition, Falconer & Littlejohn 2006).
A learning design is defined as the outcome of the process of designing, planning and orchestrating learning activities as part of a learning session or programme (JISC 2006).
Practice models have many potential uses: they describe a range of learning designs that are found to be effective, and offer guidance on their use; they support sharing, reuse and adaptation of learning designs by teachers, and also the development of tools, standards and systems for planning, editing and running the designs.
The project took a practitioner-centred approach, working in close collaboration with a focus group of 12 teachers recruited across a range of disciplines and from both FE and HE. Focus group members are listed in Appendix 1. Information was gathered from the focus group through two face to face workshops, and through their contributions to discussions on the project wiki. This was supplemented by an activity at a JISC pedagogy experts meeting in October 2006, and a part workshop at ALT-C in September 2006. The project interim report of August 2006 contained the outcomes of the first workshop (Falconer and Littlejohn, 2006).
The current report refines the discussion of issues of representing learning designs for sharing and reuse evidenced in the interim report and highlights problems with the concept of practice models (section 2), characterises the requirements teachers have of effective representations (section 3), evaluates a number of types of representation against these requirements (section 4), explores the more technically focused role of sequencing representations and controlled vocabularies (sections 5 & 6), documents some generic learning designs (section 8.2) and suggests ways forward for bridging the gap between teachers and developers (section 2.6).
All quotations are taken from the Mod4L wiki unless otherwise stated
Recommended from our members
STELLAR (Semantic Technologies Enhancing the Lifecycle of Learning Resources): Jisc Final Report
[Project Summary]
As one of the earliest distance learning providers The Open University (OU) has a rich heritage of archived learning materials. An ever increasing amount of that is in digital form and is being deposited with the University Archive. This growth has been driven by digitisation activity from projects such as AVA (Access to Video Assets) and the Fedora-based Open University Digital Library ‘a place to discover digital and digitised archival content from the OU Library, from videos and images to digitised documents’. Other digital content is being captured from web archiving activities, such as work to preserve Moodle Virtual Learning Environment course websites. An evidence based understanding is required to inform digital preservation policies, curation strategy and investment in digital library development.
Following the Pre-enhancement, Enhancement and Post-enhancement methodology set out by Jisc, STELLAR adopted the model of a balanced scorecard to ascertain the value ascribed to the non-current learning materials. Four aspects were considered: Personal and professional perspectives of value; Value to the Higher Educational and academic communities; Value to internal processes and cultures; Financial perspectives of value. The outcomes of the survey indicated that stakeholders place a high value on the materials, and that they perceived them to have value in all areas evaluated.
Three OU courses were chosen from the digital library for the transformation stage. These materials were enhanced and transformed into RDF, a process that required more extensive metadata expertise and effort than was expected. Following enhancement the RDF was accessed through a tool called DiscOU, created by a member of the project team from the OU’s Knowledge Media Institute. DiscOU uses both linked data and a semantic meaning engine to analyse the meaning of the text in a search query. This is matched against the meaning of the content derived from an index of the full-text of the digital library content.
In the final stage stakeholders were asked through a survey and series of workshops to use the DiscOU proof-of-concept tool to assess their perception of the value of this transformation. This has revealed that overall, academics and other stakeholders in the university do believe that the value of the selected materials was positively impacted by the application of semantic technologies
Recommended from our members
When users generate music playlists: When words leave off, music begins?
Music systems that generate playlists are gaining increasing popularity, yet ways to select songs to be acceptable to users is still elusive. We present the results of an explorative study that focused on the language of musically untrained end users for playlist choices, in a variety of listening contexts. Our results indicate that there are a number of opportunities for playlist recommendation or retrieval systems, particularly by taking context into account
Assistive technologies for severe and profound hearing loss: beyond hearing aids and implants
Assistive technologies offer capabilities that were previously inaccessible to individuals with severe and profound hearing loss who have no or limited access to hearing aids and implants. This literature review aims to explore existing assistive technologies and identify what still needs to be done. It is found that there is a lack of focus on the overall objectives of assistive technologies. In addition, several other issues are identified i.e. only a very small number of assistive technologies developed within a research context have led to commercial devices, there is a predisposition to use the latest expensive technologies and a tendency to avoid designing products universally. Finally, the further development of plug-ins that translate the text content of a website to various sign languages is needed to make information on the internet more accessible
- …