1,330 research outputs found
A survey of comics research in computer science
Graphic novels such as comics and mangas are well known all over the world.
The digital transition has started to change the way people read comics,
more and more on smartphones and tablets and less and less on paper. In
recent years, a wide variety of research about comics has been proposed and
might change the way comics are created, distributed and read in future years.
Early work focuses on low-level document image analysis: comic books are
complex, containing text, drawings, balloons, panels, onomatopoeia, etc.
Different fields of computer science covered research about user interaction
and content generation such as multimedia, artificial intelligence,
human-computer interaction, etc. with different sets of values. We propose in
this paper to review the previous research about comics in computer science, to
state what has been done and to give some insights about the main outlooks.
Interactive video retrieval using implicit user feedback.
PhD thesis. In recent years, the rapid development of digital technologies and the low
cost of recording media have led to a great increase in the availability of
multimedia content worldwide. This availability creates demand for the
development of advanced search engines. Traditionally, manual annotation of
video was one of the usual practices to support retrieval. However, the vast
amounts of multimedia content make such practices very expensive in terms of
human effort. At the same time, the availability of low cost wearable sensors
delivers a plethora of user-machine interaction data. Therefore, there is an
important challenge of exploiting implicit user feedback (such as user navigation
patterns and eye movements) during interactive multimedia retrieval sessions
with a view to improving video search engines. In this thesis, we focus on
automatically annotating video content by exploiting aggregated implicit
feedback of past users expressed as click-through data and gaze movements.
Towards this goal, we have conducted interactive video retrieval experiments, in
order to collect click-through and eye movement data in not strictly controlled
environments. First, we generate semantic relations between the multimedia
items by proposing a graph representation of aggregated past interaction data and
exploit them to generate recommendations, as well as to improve content-based
search. Then, we investigate the role of user gaze movements in interactive video
retrieval and propose a methodology for inferring user interest by employing
support vector machines and gaze movement-based features. Finally, we propose
an automatic video annotation framework, which combines query clustering into
topics by constructing gaze movement-driven random forests and temporally
enhanced dominant sets, as well as video shot classification for predicting the
relevance of viewed items with respect to a topic. The results show that
exploiting heterogeneous implicit feedback from past users is of added value for
future users of interactive video retrieval systems.
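As a rough illustration of the graph idea mentioned in this abstract, the sketch below builds a weighted co-click graph from past sessions and ranks neighbours as recommendations. It is not the thesis's method: the session format, the function names `build_cooccurrence_graph` and `recommend`, and the co-occurrence weighting are all assumptions made for the example.

```python
from collections import defaultdict
from itertools import combinations


def build_cooccurrence_graph(sessions):
    """Build an undirected graph whose edge weight counts how many
    sessions contained clicks on both items (hypothetical weighting)."""
    graph = defaultdict(lambda: defaultdict(int))
    for clicked in sessions:
        # Deduplicate clicks within a session before pairing them up.
        for a, b in combinations(sorted(set(clicked)), 2):
            graph[a][b] += 1
            graph[b][a] += 1
    return graph


def recommend(graph, item, k=3):
    """Return up to k neighbours of `item`, strongest edge first,
    breaking ties alphabetically."""
    neighbours = graph.get(item, {})
    ranked = sorted(neighbours.items(), key=lambda kv: (-kv[1], kv[0]))
    return [name for name, _ in ranked[:k]]


# Each session is the list of video shots one past user clicked (made-up IDs).
sessions = [["s1", "s2", "s3"], ["s1", "s2"], ["s2", "s4"]]
graph = build_cooccurrence_graph(sessions)
print(recommend(graph, "s2"))
```

A real system would likely weight edges by query context or dwell time rather than raw co-occurrence; the counting here is only the simplest possible stand-in.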
PersoNER: Persian named-entity recognition
© 1963-2018 ACL. Named-Entity Recognition (NER) is still a challenging task for languages with low digital resources. The main difficulties arise from the scarcity of annotated corpora and the consequent problematic training of an effective NER pipeline. To bridge this gap, in this paper we target the Persian language, which is spoken by a population of over a hundred million people worldwide. We first present and provide ArmanPerosNERCorpus, the first manually-annotated Persian NER corpus. Then, we introduce PersoNER, an NER pipeline for Persian that leverages a word embedding and a sequential max-margin classifier. The experimental results show that the proposed approach is capable of achieving interesting MUC7 and CoNLL scores while outperforming two alternatives based on a CRF and a recurrent neural network.
Access to recorded interviews: A research agenda
Recorded interviews form a rich basis for scholarly inquiry. Examples include oral histories, community memory projects, and interviews conducted for broadcast media. Emerging technologies offer the potential to radically transform the way in which recorded interviews are made accessible, but this vision will demand substantial investments from a broad range of research communities. This article reviews the present state of practice for making recorded interviews available and the state-of-the-art for key component technologies. A large number of important research issues are identified, and from that set of issues, a coherent research agenda is proposed.
Identifying related landmark tags in urban scenes using spatial and semantic clustering
There is considerable interest in developing landmark saliency models as a basis for describing urban landscapes, and in constructing wayfinding instructions, for text and spoken dialogue based systems. The challenge lies in knowing the truthfulness of such models; is what the model considers salient the same as what is perceived by the user? This paper presents a web-based experiment in which users were asked to tag and label the most salient features from urban images for the purposes of navigation and exploration. In order to rank landmark popularity in each scene it was necessary to determine which tags related to the same object (e.g. tags relating to a particular café). Existing clustering techniques did not perform well for this task, and it was therefore necessary to develop a new spatial-semantic clustering method which considered the proximity of nearby tags and the similarity of their label content. The annotation similarity was initially calculated using trigrams in conjunction with a synonym list, generating a set of networks formed from the links between related tags. These networks were used to build related word lists encapsulating conceptual connections (e.g. church tower related to clock) so that during a secondary pass of the data related network segments could be merged. This approach gives interesting insight into the partonomic relationships between the constituent parts of landmarks and the range and frequency of terms used to describe them. The knowledge gained from this will be used to help calibrate a landmark saliency model, and to gain a deeper understanding of the terms typically associated with different types of landmarks.
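The first pass described in this abstract (linking tags that are both spatially close and similarly labelled) can be sketched roughly as follows. This is a minimal illustration, not the paper's algorithm: it assumes character trigrams compared with Jaccard overlap, omits the synonym list and the secondary merging pass, and the thresholds `max_dist` and `min_sim` are hypothetical values chosen for the example.

```python
from itertools import combinations
from math import hypot


def trigrams(label):
    """Return the set of character trigrams of a padded, lower-cased label."""
    padded = f"  {label.lower().strip()} "
    return {padded[i:i + 3] for i in range(len(padded) - 2)}


def label_similarity(a, b):
    """Jaccard overlap of the two labels' trigram sets (0.0 to 1.0)."""
    ta, tb = trigrams(a), trigrams(b)
    return len(ta & tb) / len(ta | tb)


def cluster_tags(tags, max_dist=50.0, min_sim=0.4):
    """Group (x, y, label) tags that are both near each other and
    similarly labelled, using union-find over qualifying pairs."""
    parent = list(range(len(tags)))

    def find(i):
        # Follow parent links to the root, compressing the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(tags)), 2):
        (x1, y1, l1), (x2, y2, l2) = tags[i], tags[j]
        close = hypot(x1 - x2, y1 - y2) <= max_dist
        if close and label_similarity(l1, l2) >= min_sim:
            parent[find(i)] = find(j)

    clusters = {}
    for i, tag in enumerate(tags):
        clusters.setdefault(find(i), []).append(tag)
    return list(clusters.values())


# Made-up pixel coordinates and labels for one image.
tags = [(10, 10, "church tower"), (15, 12, "Church Tower"),
        (300, 300, "cafe"), (12, 11, "cafe")]
for group in cluster_tags(tags):
    print(group)
```

Requiring both conditions keeps two identically labelled tags at opposite ends of the image apart, which is the point of combining spatial and semantic evidence.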
Psychophysiology-based QoE assessment : a survey
We present a survey of psychophysiology-based assessment for quality of experience (QoE) in advanced multimedia technologies. We provide a classification of methods relevant to QoE and describe related psychological processes, experimental design considerations, and signal analysis techniques. We summarize multimodal techniques and discuss several important aspects of psychophysiology-based QoE assessment, including the synergies with psychophysical assessment and the need for standardized experimental design. This survey is not considered to be exhaustive but serves as a guideline for those interested to further explore this emerging field of research.
Privacy Intelligence: A Survey on Image Sharing on Online Social Networks
Image sharing on online social networks (OSNs) has become an indispensable
part of daily social activities, but it has also led to an increased risk of
privacy invasion. The recent image leaks from popular OSN services and the
abuse of personal photos using advanced algorithms (e.g. DeepFake) have
prompted the public to rethink individual privacy needs when sharing images on
OSNs. However, OSN image sharing itself is relatively complicated, and systems
currently in place to manage privacy in practice are labor-intensive yet fail
to provide personalized, accurate and flexible privacy protection. As a result,
a more intelligent environment for privacy-friendly OSN image sharing is in
demand. To fill the gap, we contribute a systematic survey of 'privacy
intelligence' solutions that target modern privacy issues related to OSN image
sharing. Specifically, we present a high-level analysis framework based on the
entire lifecycle of OSN image sharing to address the various privacy issues and
solutions facing this interdisciplinary field. The framework is divided into
three main stages: local management, online management and social experience.
At each stage, we identify typical sharing-related user behaviors, the privacy
issues generated by those behaviors, and review representative intelligent
solutions. The resulting analysis describes an intelligent privacy-enhancing
chain for closed-loop privacy management. We also discuss the challenges and
future directions existing at each stage, as well as in publicly available
datasets.
Comment: 32 pages, 9 figures. Under review.
Implicit image annotation by using gaze analysis
PhD thesis. Thanks to advances in technology, people are storing a massive amount of visual information in online databases. Today it is normal for a person to take a photo of an event with their smartphone and effortlessly upload it to a host domain. For later quick access, this enormous amount of data needs to be indexed by providing metadata for its content. The challenge is to provide suitable captions for the semantics of the visual content. This thesis investigates the possibility of extracting and using the valuable information stored inside a human's eye movements when interacting with digital visual content in order to provide information for image annotation implicitly. A non-intrusive framework is developed which is capable of inferring gaze movements to classify the images visited by a user into two classes when the user is searching for a Target Concept (TC) in the images. The first class is formed of the images that contain the TC and is called the TC+ class; the second class is formed of the images that do not contain the TC and is called the TC- class. By analysing the eye movements only, the developed framework was able to identify over 65% of the images that the subject users were searching for, with an accuracy of over 75%. This thesis shows that the existing information in gaze patterns can be employed to improve the machine's judgement of image content by assessment of human attention to the objects inside virtual environments.
European Commission funded Network of Excellence PetaMedi