User-centred video abstraction
This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University London.

The rapid growth of digital video content in recent years has created a need for technologies that can efficiently produce condensed but semantically rich versions of an input video stream. Consequently, the topic of Video Summarisation has become increasingly popular in the multimedia community, and numerous video abstraction approaches have been proposed. These techniques can be divided into two major categories, automatic and semi-automatic, according to the level of human intervention required in the summarisation process. Fully automated methods mainly rely on low-level visual, aural and textual features, combined with mathematical and statistical algorithms, to extract the most significant segments of the original video. However, the effectiveness of such techniques is restricted by a number of factors, including domain dependency, computational expense and the inability to infer the semantics of a video from low-level features. The second category of techniques attempts to improve the quality of summaries by involving humans in the abstraction process to bridge the semantic gap. Nonetheless, a single user's subjectivity and other external factors, such as distraction, can deteriorate the performance of these approaches. Accordingly, in this thesis we focus on the development of three user-centred video summarisation techniques that can be applied to different video categories and generate satisfactory results. In our first proposed approach, a novel mechanism for user-centred video summarisation is presented for scenarios in which multiple annotators take part in the summarisation process, in order to minimise the negative effects of relying on a single user.
Based on our proposed algorithm, the video frames were initially scored by a group of video annotators 'on the fly'. These scores were then averaged to generate a single saliency score for each video frame, and finally the highest-scored frames, alongside the corresponding audio and textual content, were extracted into the final summary. The effectiveness of our approach was assessed by comparing the summaries it generated against the results obtained from three existing automatic summarisation tools that adopt different modalities. The experimental results indicated that our method delivers strong results in terms of Overall Satisfaction and Precision, with an acceptable Recall rate, demonstrating the usefulness of involving user input in the video summarisation process. In an attempt to provide a better user experience, we then proposed a personalised video summarisation method with the ability to customise the generated summaries according to viewers' preferences. The end-user's priority levels for different video scenes were captured and used to update the average scores previously assigned by the video annotators; our earlier summarisation method was then adopted to extract the most significant audio-visual content of the video. Experimental results indicated that this approach delivers superior outcomes compared with our previously proposed method and the three other automatic summarisation tools. Finally, we attempted to reduce the level of audience involvement required for personalisation by proposing a new method for producing personalised video summaries, in which SIFT visual features were adopted to identify the semantic categories of video scenes.
By fusing this information with pre-built user profiles, personalised video abstracts can be created. Experimental results showed the effectiveness of this method in delivering superior outcomes compared with our previously proposed algorithm and the three other automatic summarisation techniques.
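The scoring-and-selection step described above (averaging per-frame annotator scores and keeping the highest-scored frames) can be sketched as follows. This is a minimal illustration, not the thesis's actual implementation; the function name and data shapes are assumptions:

```python
from collections import defaultdict

def summarise(annotator_scores, top_k):
    """Average per-frame scores from multiple annotators and
    return the indices of the top_k highest-scoring frames."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for scores in annotator_scores:          # one {frame: score} dict per annotator
        for frame, s in scores.items():
            totals[frame] += s
            counts[frame] += 1
    # Singular saliency score per frame = average of assigned scores.
    saliency = {f: totals[f] / counts[f] for f in totals}
    # Highest average saliency first; keep the top_k frames.
    return sorted(saliency, key=saliency.get, reverse=True)[:top_k]

# Example: two annotators score four frames.
a1 = {0: 0.9, 1: 0.2, 2: 0.7, 3: 0.1}
a2 = {0: 0.8, 1: 0.4, 2: 0.9, 3: 0.3}
print(summarise([a1, a2], 2))  # → [0, 2]
```

In the thesis, the selected frames would then be joined with their corresponding audio and textual content to form the summary.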
Dynamic Scene Creation from Text
Visual information is an integral part of our daily life, and typically conveys more than plain text. A visual depiction of a textual story, as an animation or video, provides a more engaging and realistic experience and can be used in many applications, including education, advertising, crime scene investigation, forensic analysis of a crime, and the treatment of different types of mental and psychological disorders. Manual 3D scene creation is a time-consuming process that requires the expertise of individuals familiar with the content creation environment. Automatic scene generation from a textual description and a library of developed components offers a quick and easy alternative for building scene representations and proof-of-concept ideas. In this thesis, we propose a scheme for extracting objects of interest and their spatial relationships from a user-provided textual description to create a 3D dynamic scene, with animation to make it more realistic.
Semantic Sort: A Supervised Approach to Personalized Semantic Relatedness
We propose and study a novel supervised approach to learning statistical
semantic relatedness models from subjectively annotated training examples. The
proposed semantic model consists of parameterized co-occurrence statistics
associated with textual units of a large background knowledge corpus. We
present an efficient algorithm for learning such semantic models from a
training sample of relatedness preferences. Our method is corpus independent
and can essentially rely on any sufficiently large (unstructured) collection of
coherent texts. Moreover, the approach facilitates the fitting of semantic
models for specific users or groups of users. We present the results of
extensive range of experiments from small to large scale, indicating that the
proposed method is effective and competitive with the state-of-the-art.
Comment: 37 pages, 8 figures. A short version of this paper was already published at ECML/PKDD 201
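As a rough illustration of learning from relatedness preferences (the paper's actual model over corpus co-occurrence statistics is more elaborate), a perceptron-style update over preference triples might look like the sketch below. The `features` function and the triple format are assumptions made for the example, not the paper's definitions:

```python
def train_preferences(prefs, features, epochs=10, lr=0.1):
    """Learn feature weights from preference triples (q, a, b),
    each meaning 'a is more related to q than b is'."""
    w = {}

    def score(q, x):
        # Weighted sum over the (assumed) feature representation.
        return sum(w.get(k, 0.0) * v for k, v in features(q, x).items())

    for _ in range(epochs):
        for q, a, b in prefs:
            if score(q, a) <= score(q, b):   # preference violated: update
                for k, v in features(q, a).items():
                    w[k] = w.get(k, 0.0) + lr * v
                for k, v in features(q, b).items():
                    w[k] = w.get(k, 0.0) - lr * v
    return w

# Toy example: one indicator feature per term, one stated preference.
feats = lambda q, x: {x: 1.0}
w = train_preferences([("rome", "italy", "banana")], feats)
# After training, w["italy"] > w["banana"].
```

Fitting per-user models, as the paper proposes, would amount to training a separate weight vector from each user's own preference sample.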
CHORUS Deliverable 2.2: Second report - identification of multi-disciplinary key issues for gap analysis toward EU multimedia search engines roadmap
After addressing the state-of-the-art during the first year of Chorus and establishing the existing landscape in
multimedia search engines, we have identified and analyzed gaps within European research effort during our second year.
In this period we focused on three directions, notably technological issues, user-centred issues and use-cases and socio-
economic and legal aspects. These were assessed through two central studies: firstly, a concerted vision of the functional breakdown
of a generic multimedia search engine, and secondly, representative use-case descriptions with a related discussion of the
requirements arising from technological challenges. Both studies were carried out in cooperation and consultation with the
community at large through EC concertation meetings (multimedia search engines cluster), several meetings with our
Think-Tank, presentations in international conferences, and surveys addressed to EU projects coordinators as well as
National initiatives coordinators. Based on the obtained feedback we identified two types of gaps, namely core
technological gaps that involve research challenges, and “enablers”, which are not necessarily technical research
challenges, but have an impact on innovation progress. New socio-economic trends are presented, as well as emerging legal
challenges.
Media streams--representing video for retrieval and repurposing
Thesis (Ph.D.), Massachusetts Institute of Technology, Program in Media Arts & Sciences, 1995. Includes bibliographical references (p. 325-344). By Marc Eliot Davis, Ph.D.
Smart PIN: performance and cost-oriented context-aware personal information network
The next generation of networks will involve the interconnection of heterogeneous individual
networks such as WPAN, WLAN, WMAN and cellular networks, adopting IP as the common infrastructural protocol and providing a virtually always-connected network. Furthermore,
many devices enable easy acquisition and storage of information such as pictures, movies and emails. The resulting information overload, together with divergent content
characteristics, makes it difficult for users to manage their data manually. Consequently, there is a need for personalised automatic services that enable data exchange across heterogeneous networks and devices. To support these personalised services, user-centric approaches
to data delivery across the heterogeneous network are also required.
In this context, this thesis proposes Smart PIN - a novel performance and cost-oriented context-aware Personal Information Network. Smart PIN's architecture is detailed including its network, service and management components. Within the service component, two novel schemes for efficient delivery of context and content data are proposed:
Multimedia Data Replication Scheme (MDRS) and Quality-oriented Algorithm for Multiple-source Multimedia Delivery (QAMMD).
MDRS supports efficient data accessibility among distributed devices using data replication based on a utility function and a minimum data set. QAMMD employs a buffer underflow avoidance scheme for streaming, which achieves high multimedia quality without adapting content to network conditions. Simulation models for MDRS and
QAMMD were built for various heterogeneous network scenarios. Additionally, multiple-source streaming based on QAMMD was implemented as a prototype and tested in an emulated network environment. Comparative tests show that MDRS and QAMMD perform significantly better than other approaches.
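The abstract does not spell out MDRS's utility function. Purely as an illustration of utility-driven replica selection with a mandatory minimum data set, a greedy sketch with an assumed utility of access frequency divided by item size could look like this:

```python
def select_replicas(items, min_set, budget):
    """Greedy sketch: always replicate the minimum data set, then add
    remaining items by descending utility until the storage budget is spent.
    items: {name: (access_freq, size)}; assumed utility = access_freq / size."""
    chosen, used = [], 0
    for name in min_set:                       # minimum data set is mandatory
        chosen.append(name)
        used += items[name][1]
    rest = [n for n in items if n not in min_set]
    rest.sort(key=lambda n: items[n][0] / items[n][1], reverse=True)
    for name in rest:
        if used + items[name][1] <= budget:    # replicate only if it fits
            chosen.append(name)
            used += items[name][1]
    return chosen

# Example: item "a" is mandatory; "b" has high utility, "c" is too large.
print(select_replicas({"a": (10, 5), "b": (2, 1), "c": (1, 10)},
                      min_set=["a"], budget=7))  # → ['a', 'b']
```

The real MDRS would evaluate such a function per device across the personal network rather than against a single storage budget.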
Multimedia interaction and access based on emotions:automating video elicited emotions recognition and visualization
Doctoral thesis, Informatics (Computer Engineering), Universidade de Lisboa, Faculdade de Ciências, 2013.

Films are an excellent art form that exploits our affective, perceptual and intellectual
abilities. Technological developments and the trends for media convergence are turning
video into a dominant and pervasive medium, and online video is becoming a growing
entertainment activity on the web. Alongside, physiological measures are making it
possible to study additional ways to identify and use emotions in human-machine
interactions, multimedia retrieval and information visualization.
The work described in this thesis has two main objectives: to develop an Emotions
Recognition and Classification mechanism for video induced emotions; and to enable
Emotional Movie Access and Exploration. Regarding the first objective, we explore
recognition and classification mechanisms that allow videos to be classified and
indexed according to the emotions felt by users while watching them, and that
identify each user's emotional state in order to provide suitable access
mechanisms. Regarding the second objective, we
focus on emotional movie access and exploration mechanisms to find ways to access
and visualize videos based on their emotional properties and users’ emotions and
profiles. In this context, we designed a set of methods to access and watch the movies,
both at the level of the whole movie collection, and at the individual movies level.
The automatic recognition mechanism developed in this work allows for the detection
of physiological patterns, providing valid individual information about users'
emotions while they watch a specific movie. In addition, the user interface
representations and exploration mechanisms proposed and evaluated in this thesis show
that more perceptive, satisfactory and useful visual representations positively
influence the exploration of emotional information in movies.
Fundação para a Ciência e a Tecnologia (FCT): PROTEC SFRH/BD/49475/2009, LASIGE Multiannual Funding, and the VIRUS project (PTDC/EIAEIA/101012/2008).
A Design Framework for Engaging Collective Interaction Applications for Mobile Devices
The main objective of this research is to define the conceptual and technological key factors of engaging collective interaction applications for mobile devices. To address this problem, a throwaway prototyping software development method is utilized to study design issues. Furthermore, a conceptual framework is constructed in accordance with design science activities. This fundamentally exploratory research combines a literature review, the design and implementation of mobile-device-based prototypes, and empirical human-computer interaction studies, conducted during the period 2008-2012. All the applications described in this thesis were developed mainly for research purposes, in order to ensure that attention could be focused on the problem statement.
The thesis presents the design process of the novel Engaging Collective Interaction (ECI) framework, which can be used to design engaging collective interaction applications for mobile devices, e.g. for public events and co-creational spaces such as sports events, schools or exhibitions. The building and evaluating phases of design science combine existing knowledge with the results of the throwaway prototyping approach. The framework was thus constructed from the key factors identified across six developed and piloted prototypes. Finally, the framework was used to design and implement a collective sound sensing application in a classroom setting. The evaluation results indicated that the framework offered the knowledge needed to develop a purposeful application. Furthermore, the evolutionary and iterative framework-building process, combined with the throwaway prototyping process, can be presented as a novel Dual Process Prototyping (DPP) model. It is therefore claimed that: 1) ECI can be used to design engaging collective interaction applications for mobile devices; 2) DPP is an appropriate method for building a framework or a model.
This research indicates that the key factors of the presented framework are: collaborative control, gamification, playfulness, active spectatorship, continuous sensing, and collective experience. Further, the results supported the assumption that focusing on activity rather than technology has a positive impact on engagement. In conclusion, this research has shown that a framework for engaging collective interaction applications for mobile devices can be designed (ECI) and utilized to build an appropriate application, and that the framework design process can be presented as a novel model (DPP). The framework does not provide a step-by-step guide for designing applications, but it helps to refine the design of successful ones. Its overall benefit is that developers can pay attention to the factors of an engaging application at an early stage of design.
Design, deployment and assessment of a movie archive system in a film studies context
This thesis describes our work in developing a movie archive system for students of Film Studies at Dublin City University. In particular, our system uses several recent multimedia technologies to automatically process digital video content, while at the same time we use the usability engineering process to relate to the real tasks of real users in their real environments.
We investigate how real users take advantage of technologies in a movie browsing system. By designing, building, deploying and assessing the usage of a technology in a user-focused way, the overall impact of a movie browsing system can be determined holistically.
The application domain we work in is film studies, where students need to study movie content and analyse movie sequences. Our work began with the identification of user
needs through observations, focus groups and usability testing, followed by sketching and prototyping a web-based system. We then deployed the system to film studies classes over a semester, monitored usage, and gathered quantitative as well as qualitative data. Focused experiments were carried out to assess students' performance and satisfaction levels. Our findings show expected patterns of usage in a real-user setting outside the lab, but at the
same time highlight issues that need further investigation. In general, students found most of the provided features beneficial for their studies. Findings from the experiments show better performance in the essay assessments and higher satisfaction levels.
An interesting finding is that students are more engaged with the newly introduced software application and take longer to complete the same task than without its advanced
features. This phenomenon was rationalised using established learning theory from the psychology domain. In today's technologically oriented multimedia field, we attempted to bring a user-centred approach to end-user interactions throughout a three-year development process, and we identified benefits and challenges in trying to align the technical perspectives of novel multimedia features to real-world settings.