Ranking Archived Documents for Structured Queries on Semantic Layers
Archived collections of documents (like newspaper and web archives) serve as
important information sources in a variety of disciplines, including Digital
Humanities, Historical Science, and Journalism. However, the absence of
efficient and meaningful exploration methods still remains a major hurdle in
the way of turning them into usable sources of information. A semantic layer is
an RDF graph that describes metadata and semantic information about a
collection of archived documents, which in turn can be queried through a
semantic query language (SPARQL). This allows running advanced queries by
combining metadata of the documents (like publication date) and content-based
semantic information (like entities mentioned in the documents). However, the
results returned by such structured queries can be numerous and moreover they
all equally match the query. In this paper, we deal with this problem and
formalize the task of "ranking archived documents for structured queries on
semantic layers". Then, we propose two ranking models for the problem at hand
which jointly consider: i) the relativeness of documents to entities, ii) the
timeliness of documents, and iii) the temporal relations among the entities.
The experimental results on a new evaluation dataset show the effectiveness of
the proposed models and allow us to understand their limitation.
Multimedia search without visual analysis: the value of linguistic and contextual information
This paper addresses the focus of this special issue by analyzing the potential contribution of linguistic content and other non-image aspects to the processing of audiovisual data. It summarizes the various ways in which linguistic content analysis contributes to enhancing the semantic annotation of multimedia content and, as a consequence, to improving the effectiveness of conceptual media access tools. A number of techniques are presented, including the time-alignment of textual resources, audio and speech processing, content reduction and reasoning tools, and the exploitation of surface features.
Mining semantics for culturomics: towards a knowledge-based approach
The massive amounts of text data made available through the Google Books digitization project have inspired a new field of big-data textual research. Named culturomics, this field has attracted the attention of a growing number of scholars over recent years. However, initial studies based on these data have been criticized for not referring to relevant work in linguistics and language technology. This paper provides some ideas, thoughts and first steps towards a new culturomics initiative, based this time on Swedish data, which pursues a more knowledge-based approach than previous work in this emerging field. The amount of new Swedish text produced daily and older texts being digitized in cultural heritage projects grows at an accelerating rate. These volumes of text being available in digital form have grown far beyond the capacity of human readers, leaving automated semantic processing of the texts as the only realistic option for accessing and using the information contained in them. The aim of our recently initiated research program is to advance the state of the art in language technology resources and methods for semantic processing of Big Swedish text and focus on the theoretical and methodological advancement of the state of the art in extracting and correlating information from large volumes of Swedish text using a combination of knowledge-based and statistical methods
Another time, another place : archival media content as temporal consciousness and collective memory
Internet-based video streaming services have arisen in the past decade not only to provide new ways of engaging with current media content, but also with media content of the past, including news archives, movies, and television shows. This ability to "dial up" the mediated past almost at will with a broadband Internet connection suggests new ways for viewers of such content to use it in constructing temporal consciousness, which refers to how someone experiences and perceives time; and temporal frameworks related to the online content. Likewise, online media archives can be used in the formation and preservation of collective memory. Utilizing a targeted focus group study of 18-30-year-olds and their reactions and memories triggered by viewing selected archival news and entertainment content found online, the study contained within this master's thesis proposes to explore elements of online media archives that might assist viewers in building a type of mediated temporal consciousness (time awareness and structuring through the consumption of media content) as well as collective memory. Consideration of these possible effects may better define the social value of media archives and their accessibility to future generations of potential viewers. Additionally, qualitative investigation of these concepts can help us to understand more about the mind's ability to connect media content with personal experience and memory, as well as understand more about new media's sociological and psychological significance as a depository for archival content. Without a method of preserving and presenting archival content, especially pre-digital content on aging, decaying source materials, large periods of time and history represented through news and other media content may become irrevocably lost.
Handmade films and artist-run labs. The chemical sites of film's counterculture
This article addresses handmade films and especially artist-run labs as sites of hands-on film culture
that reactivate moments and materials from media history. Drawing on existing research, discourses
and discussions with contemporary experimental filmmakers affiliated with labs or practicing their
work in relation to film lab infrastructure, we focus on these sites of creation, preservation and
circulation of technical knowledge about analog film. But instead of reinforcing the binary of
analog vs. digital, we argue that the various material practices from self-made apparatuses to
photochemistry and film emulsions are ways of understanding the multiple materials and layered
histories that define the post-digital culture of film. This focus links our discussion with some themes in
media archaeology (experimental media archaeology as a practice) and to current discussions about
labs as arts and humanities infrastructure for collective project and practice-based methods.
Bridging Vision and Language over Time with Neural Cross-modal Embeddings
Giving computers the ability to understand multimedia content is one of the goals
of Artificial Intelligence systems. While humans excel at this task, it remains a challenge,
requiring bridging vision and language, which inherently have heterogeneous
computational representations. Cross-modal embeddings are used to tackle this challenge,
by learning a common space that unifies these representations. However, to grasp
the semantics of an image, one must look beyond the pixels and consider its semantic
and temporal context, which are defined by images' textual descriptions and
time dimension, respectively. As such, external causes (e.g. emerging events) change the
way humans interpret and describe the same visual element over time, leading to the
evolution of visual-textual correlations.
In this thesis we investigate models that capture patterns of visual and textual interactions
over time, by incorporating time in cross-modal embeddings: 1) in a relative manner,
where by using pairwise temporal correlations to aid data structuring, we obtained a
model that provides better visual-textual correspondences on dynamic corpora, and 2) in
a diachronic manner, where the temporal dimension is fully preserved, thus capturing
visual-textual correlations evolution under a principled approach that jointly models
vision+language+time. Rich insights stemming from data evolution were extracted from
a large-scale dataset spanning 20 years. Additionally, towards improving the effectiveness of these
embedding learning models, we proposed a novel loss function that increases the expressiveness
of the standard triplet-loss, by making it adaptive to the data at hand. With our
adaptive triplet-loss, in which triplet-specific constraints are inferred and scheduled, we
achieved state-of-the-art performance on the standard cross-modal retrieval task.
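For readers unfamiliar with the baseline this abstract refers to, the following is a minimal sketch of the standard fixed-margin triplet loss only. It does not reproduce the thesis's adaptive variant, whose constraint inference and scheduling are not described here. The function names, the Euclidean distance, and the default margin of 0.2 are illustrative assumptions.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Standard triplet loss with a fixed margin (assumed value).

    The loss is zero once the negative is at least `margin` farther
    from the anchor than the positive is; otherwise it grows linearly.
    """
    d_pos = euclidean(anchor, positive)
    d_neg = euclidean(anchor, negative)
    return max(0.0, margin + d_pos - d_neg)


# Hypothetical embeddings: an image (anchor), its matching caption
# (positive), and an unrelated caption (negative) in a shared space.
image = [0.0, 0.0]
matching_caption = [0.0, 0.0]
unrelated_caption = [3.0, 4.0]
loss = triplet_loss(image, matching_caption, unrelated_caption, margin=1.0)
```

In this sketch the margin is a single constant for every triplet; an adaptive scheme, as the abstract describes it, would instead derive triplet-specific constraints from the data.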