46 research outputs found
Generating Semantic Snapshots of Newscasts Using Entity Expansion
TV newscasts report the latest event-related facts occurring in the world. Relying exclusively on them is, however, insufficient to fully grasp the context of the story being reported. In this paper, we propose an approach that retrieves and analyzes related documents from the Web to automatically generate semantic annotations that provide viewers and experts with comprehensive information about the news. We detect named entities in the retrieved documents that further disclose relevant concepts that were not explicitly mentioned in the original newscast. A ranking algorithm based on entity frequency, popularity peak analysis, and domain experts' rules sorts those annotations to generate what we call the Semantic Snapshot of a Newscast (NSS). We benchmark this method against a gold standard generated by domain experts and assessed via a user survey over five BBC newscasts. The experimental results show the robustness of our approach, which achieves an Average Normalized Discounted Cumulative Gain of 66.6%.
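The evaluation measure named in this abstract, Average Normalized Discounted Cumulative Gain, can be illustrated with a minimal sketch. The abstract does not state which DCG variant the authors used, so the linear gain and log2 discount below are assumptions, and the function names and relevance grades are hypothetical, not taken from the paper.

```python
import math


def dcg(relevances):
    """Discounted cumulative gain of a ranked list of relevance grades.

    Assumption: linear gain with a log2(rank + 1) discount; the paper
    may have used a different variant.
    """
    return sum(rel / math.log2(rank + 1)
               for rank, rel in enumerate(relevances, start=1))


def ndcg(relevances):
    """DCG normalized by the DCG of the ideal (descending) ordering."""
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0


def average_ndcg(rankings):
    """Mean NDCG over several ranked lists, e.g. one per newscast."""
    return sum(ndcg(r) for r in rankings) / len(rankings)


# Hypothetical relevance grades for the ranked entities of two newscasts.
example = [[3, 2, 1], [1, 2, 3]]
score = average_ndcg(example)
```

A ranking that already lists the most relevant entities first scores 1.0, and any other ordering of the same grades scores lower, which is what makes the averaged value comparable across newscasts.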
Deliverable D2.7 Final Linked Media Layer and Evaluation
This deliverable presents the evaluation of content annotation and content enrichment systems that are part of the final tool set developed within the LinkedTV consortium. The evaluations were performed on both the Linked News and Linked Culture trial content, as well as on other content annotated for this purpose. The evaluation spans three languages: German (Linked News), Dutch (Linked Culture) and English. Selected algorithms and tools were also subject to benchmarking in two international contests: MediaEval 2014 and TAC'14. Additionally, the Microposts 2015 NEEL Challenge is being organized with the support of LinkedTV.
A computational memory and processing model for prosody
Thesis (Ph.D.)--Massachusetts Institute of Technology, School of Architecture and Planning, Program in Media Arts & Sciences, 1999. Includes bibliographical references (p. 209-226). This thesis links processing in working memory to prosody in speech, and links different working memory capacities to different prosodic styles. It provides a causal account of prosodic differences and an architecture for reproducing them in synthesized speech. The implemented system mediates text-based information through a model of attention and working memory. The main simulation parameter of the memory model quantifies recall. Changing its value changes what counts as given and new information in a text, and therefore determines the intonation with which the text is uttered. Other aspects of search and storage in the memory model are mapped to the remainder of the continuous and categorical features of pitch and timing, producing prosody in three different styles: for small recall values, the exaggerated and sing-song melodies of children's speech; for mid-range values, an adult expressive style; for the largest values, the prosody of a speaker who is familiar with the text, and at times sounds bored or irritated. In addition, because the storage procedure is stochastic, the prosody varies from simulation to simulation, even for identical control parameters. As with human speech, no two renditions are alike. Informal feedback indicates that the stylistic differences are recognizable and that the prosody is improved over current offerings. A comparison with natural data shows clear and predictable trends, although not at significance. However, a comparison within the natural data also did not produce results at significance. One practical contribution of this work is a text mark-up schema consisting of relational annotations to grammatical structures. Another is the product: varied and plausible prosody in synthesized speech.
The main theoretical contribution is to show that resource-bound cognitive activity has prosodic correlates, thus providing a rationale for the individual and stylistic differences in melody and rhythm that are ubiquitous in human speech. By Janet Elizabeth Cahn. Ph.D.
Accessing spoken interaction through dialogue processing [online]
Summary
Our lives, our achievements, and our environment are all
currently documented through written language. The rapid
advance of technical capabilities for recording, storing, and
playing back audio, images, and video can be used to support,
supplement, or even replace the written documentation of human
communication, for example meetings. These new technologies can
enable us to capture information that would otherwise be lost,
to lower the cost of documentation, and to enrich high-quality
documents with audiovisual material. Indexing such recordings
is the core technology for realizing this potential. This work
presents effective alternatives to keyword-based indices that
restrict the search space and can in part be computed with
simple means.
Speech documents can be indexed at several levels:
stylistically, a document belongs to a particular database,
which can be determined automatically with high accuracy using
very simple features. This kind of classification can reduce
the search space by a factor on the order of 4-10. Applying
topical features for text classification to a news database
results in a reduction by a factor of 18. Because speech
documents can be very long, they must be divided into topical
segments. A new probabilistic approach and new features
(speaker initiative and style) deliver results comparable to or
better than traditional keyword-based approaches. These topical
segments can be characterized by the predominant activity
(storytelling, discussing, planning, ...), which can be
detected by a neural network. The detection rates are limited,
however, since humans too determine these activities only
imprecisely. A maximum search space reduction by a factor of 6
is theoretically possible on the data used. A topical
classification of these segments was also carried out on one
database, but the detection rates for this index are low.
At the level of individual utterances, dialogue acts such as
statements, questions, backchannels (aha, oh yes, really?, ...),
etc. can be recognized with a discriminatively trained hidden
Markov model. This procedure can be extended to recognize short
sequences such as question/answer games (dialogue games).
Dialogue acts and games can be used to build classifiers for
global speaking styles. Likewise, a user might remember a
particular dialogue act sequence and try to find it again in a
graphical representation.
In a study with very pessimistic assumptions, users were able
to identify one of four similar and equally probable
conversations with an accuracy of ~43% using a graphical
representation of activity.
Dialogue acts could be equally useful in this scenario, but
because of the small amount of data the user study could not
provide a conclusive answer. The study was, however, unable to
show any effect for detailed basic features such as formality
and speaker identity.
Abstract
Written language is one of our primary means for documenting our
lives, achievements, and environment. Our capabilities to
record, store and retrieve audio, still pictures, and video are
undergoing a revolution and may support, supplement or even
replace written documentation. This technology enables us to
record information that would otherwise be lost, lower the cost
of documentation and enhance high-quality documents with
original audiovisual material.
The indexing of the audio material is the key technology to
realize those benefits. This work presents effective
alternatives to keyword-based indices which restrict the search
space and may in part be calculated with very limited resources.
Indexing speech documents can be done at various levels:
Stylistically, a document belongs to a certain database, which can
be determined automatically with high accuracy using very simple
features. The resulting factor in search space reduction is on
the order of 4-10, while topic classification yielded a factor
of 18 in a news domain.
Since documents can be very long they need to be segmented into
topical regions. A new probabilistic segmentation framework as
well as new features (speaker initiative and style) prove to be
very effective compared to traditional keyword-based methods. At
the topical segment level, activities (storytelling, discussing,
planning, ...) can be detected using a machine learning approach
with limited accuracy; however, even human annotators do not
annotate them very reliably. A maximum search space reduction
factor of 6 is theoretically possible on the databases used. A
topical classification of these regions has been attempted
on one database; the detection accuracy for that index, however,
was very low.
At the utterance level, dialogue acts such as statements,
questions, backchannels (aha, yeah, ...), etc. are
recognized using a novel discriminatively trained HMM procedure.
The procedure can be extended to recognize short sequences such
as question/answer pairs, so-called dialogue games.
Dialogue acts and games are useful for building classifiers for
speaking style. Similarly, a user may remember a certain dialogue
act sequence and may search for it in a graphical
representation.
In a study with very pessimistic assumptions, users are able to
pick one out of four similar and equiprobable meetings correctly
with an accuracy of ~43% using graphical activity information.
Dialogue acts may be useful in this situation as well, but the
sample size did not allow final conclusions to be drawn. However, the
user study fails to show any effect for detailed basic features
such as formality or speaker identity.
Evolutionary dynamics of new media forms: the case of the open mobile web
This thesis is designed to improve our understanding of the evolutionary dynamics of
media forms, with a special historical focus on the recent processes of Web and mobile
convergence and the early development of the cross-platform Web. It aims to investigate
the dynamics that have underpinned the creation, evolution and conventionalisation of
new media forms in the open mobile Web following the launch of 3G mobile networks.
In theoretical terms the thesis explores the possibilities for the analytical
integration of evolutionary approaches that traditionally have shed light on the discrete
components of the evolutionary 'ensemble' that comprises media's textual forms, their
technologies and organisational systems. Among the theoretical pillars the study builds
on is, first, the cultural semiotic approach (Lotman) that is utilised for interpreting the
textual dynamics constituting the form evolution. Second, evolutionary economics
(Schumpeter, Freeman and others) is included for interpreting the market dynamics that
condition the formation of the media industries. Third, systems theoretical sociology
(Luhmann) is deployed in order to understand the broader dynamics of social organisation in late modernism. The integration of these approaches provides the conceptual
framework that focuses on the following phenomena: dialogic interchange among
industry sub-systems as enabling innovations and the emergence of new sub-systems; the
self-organisation of the sub-systems in the contingent environment; the role of memory
and systemic 'path-dependencies' in guiding the processes of self-organisation; and the
nature of the power relations that shape the dialogic processes.
The empirical study focuses on textual as well as organisational developments.
The semiotic analysis of mobile websites reveals the intertextual relations of the new
forms with other media domains, especially the desktop Web. The interviews with
representatives of industry stakeholders provide insights into the dialogic practices
between the parties engaged in designing the mobile Web, and how, via these practices,
the new platform, its media forms and institutional structures were shaped. The findings
point to the historical formation of two main industry sub-systems, 'infrastructure
enablers' and content providers, with different preferred alternatives for the design of
the cross-platform Web. The thesis demonstrates how the formation of these groups was
conditioned by their systemic path-dependencies, but also by the mesh of dialogic
relationships among them and by the resulting changes in the discursive constellations
framing the organisation of the industry and the norms for its media forms. The study
points to the first signs of the historically momentous emancipation of the mobile Web media forms, their shaking free of path-dependency on the desktop Web.
Transnational audiences and the reception of television news: a study of Mexicans in Los Angeles
This doctoral contribution borrows from the discursive practices of transnationalism and diaspora in order to articulate the concept of "transnational audiences" in the United States. The project identifies transnational audiences as formed by individuals and families whose lives straddle two national territories. It draws on the traditions of cultural studies and reception analysis as a strategy to explore the relation between media use and novel experiences of migration in a context of contemporary globalization. This conceptual background is the result of empirical research conducted in Los Angeles which investigated the television news reception of 67 informants of Mexican origin during three months in 2006. Relying on a range of qualitative research methods based in the domestic settings of the participants, the project found high levels of interest across a variety of news occurring in Los Angeles, the US, Mexico and further afield. During interviews, television news-viewing sessions and in daily written accounts, respondents constantly conveyed the idea of being directly impacted by a wide variety of events and developments in the news, regardless of geographic proximity. Heightened sensitivity to realities unfolding in nearby and distant places, it will be argued, would be a result of transnational communities' connections with different social, cultural, economic and political contexts. These links emerged in a variety of ways throughout the research activities. Notably, the interactions in which members of families engaged when discussing the news revealed the re-articulation (and possible subversion) of patriarchal structures regulating relationships between males and females.
At the same time, the research provides hints of a possible intertwining between the mediated and unmediated experiences of contributors to the study, who constantly informed their understanding of the news on the basis of interpersonal and mediated communication, knowledge of places and locations, and circumstances attached to opportunities and constraints related to aspects such as migration and citizen status. While in need of further systematization, this thesis's findings are relevant for they highlight the need to operationalize the transnational audience in ways which differentiate it from those media publics who are based in their countries of origin. At the same time, this intervention highlights the need to question or move forward from established forms of thinking about the media use of non-native peoples in the developed world. The project as a whole opens a window to explore an alternative academic vocabulary to the notions of "ethnic" and "minority" audiences, privileged in US scholarly endeavour.
Focus Mediocene
This issue, following an international conference held at the IKKM in September 2017, is devoted to what may very well be the broadest media-related topic possible, even if it is accessible only through exemplary and experimental approaches: Under the title of the »Mediocene«, it presents contributions which discuss the operations and functions that intertwine media and Planet Earth. The specific relation of media and Planet Earth likely found its most striking and iconic formula in the images of the earth from outer space in 1968/69, showing the earth, according to contemporaneous descriptions, in its brilliance and splendor as the »Blue Marble«, but also in its fragility and desperate loneliness against the black backdrop of the cosmic void. Not only the creation but also the incredible distribution of this image across the globe was already at the time clearly recognized as a media effect. In light of space flight and television technology, which had expanded the reach of observation, communication, and measurement beyond both the surface of the Earth and its atmosphere, it also became clearly evident that the Planet had been a product of the early telescope by the use of which Galileo found the visual proof for the Copernican world model. Nevertheless, the »Blue Marble« image of the planet conceives of Earth not only as a celestial body, but also as a global, ecological, and economic system. Satellite and spacecraft technology and imaging continue to move beyond Earth's orbit even as they enable precise, small-scale procedures of navigation and observation on the surface of the planet itself. These instruments of satellite navigation affect practices like agriculture, urban planning, and political decision-making.
Most recently, three-dimensional images featuring the planet's surface (generated from space by Synthetic Aperture Radar) or pictures from space probes have been circulating on the Web, altering politico-geographical practices and popular and scientific knowledge of the cosmos. Today, media not only participate in the shaping of the planet, but also take place on a planetary scale. Communication systems have been installed that operate all over the globe.