46 research outputs found

    Generating Semantic Snapshots of Newscasts Using Entity Expansion

    TV newscasts report about the latest event-related facts occurring in the world. Relying exclusively on them is, however, insufficient to fully grasp the context of the story being reported. In this paper, we propose an approach that retrieves and analyzes related documents from the Web to automatically generate semantic annotations that provide viewers and experts with comprehensive information about the news. We detect named entities in the retrieved documents that further disclose relevant concepts that were not explicitly mentioned in the original newscast. A ranking algorithm based on entity frequency, popularity peak analysis, and domain experts’ rules sorts those annotations to generate what we call a Semantic Snapshot of a Newscast (NSS). We benchmark this method against a gold standard generated by domain experts and assessed via a user survey over five BBC newscasts. Results of the experiments show the robustness of our approach, which achieves an Average Normalized Discounted Cumulative Gain of 66.6%.
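    The abstract reports its result as an Average Normalized Discounted Cumulative Gain (NDCG) over five newscasts. As a rough illustration of how such a figure can be computed, here is a minimal sketch that uses the common logarithmic-discount form of NDCG. The abstract does not say which NDCG variant the authors used, and the function names and relevance grades below are hypothetical.

```python
import math


def dcg(relevances):
    """Discounted cumulative gain of relevance grades listed in ranked order."""
    # Rank 1 is discounted by log2(2) = 1, rank 2 by log2(3), and so on.
    return sum(rel / math.log2(rank + 1)
               for rank, rel in enumerate(relevances, start=1))


def ndcg(relevances):
    """DCG normalized by the DCG of the ideal (descending) ordering.

    Assumption: the list contains every graded item, so sorting it
    yields the ideal ranking.
    """
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0


def average_ndcg(rankings):
    """Mean NDCG over several ranked lists, e.g. one per newscast."""
    return sum(ndcg(r) for r in rankings) / len(rankings)


# Hypothetical relevance grades for the ranked entities of two newscasts.
example = [[3, 2, 3, 0, 1], [1, 0, 2]]
print(f"Average NDCG: {average_ndcg(example):.3f}")
```

    A ranking that already lists entities in descending order of relevance scores 1.0; any deviation from that order lowers the score.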

    Deliverable D2.7 Final Linked Media Layer and Evaluation

    This deliverable presents the evaluation of content annotation and content enrichment systems that are part of the final tool set developed within the LinkedTV consortium. The evaluations were performed on both the Linked News and Linked Culture trial content, as well as on other content annotated for this purpose. The evaluation spans three languages: German (Linked News), Dutch (Linked Culture) and English. Selected algorithms and tools were also subject to benchmarking in two international contests: MediaEval 2014 and TAC’14. Additionally, the Microposts 2015 NEEL Challenge is being organized with the support of LinkedTV.

    A computational memory and processing model for prosody

    Thesis (Ph.D.)--Massachusetts Institute of Technology, School of Architecture and Planning, Program in Media Arts & Sciences, 1999. Includes bibliographical references (p. 209-226). This thesis links processing in working memory to prosody in speech, and links different working memory capacities to different prosodic styles. It provides a causal account of prosodic differences and an architecture for reproducing them in synthesized speech. The implemented system mediates text-based information through a model of attention and working memory. The main simulation parameter of the memory model quantifies recall. Changing its value changes what counts as given and new information in a text, and therefore determines the intonation with which the text is uttered. Other aspects of search and storage in the memory model are mapped to the remainder of the continuous and categorical features of pitch and timing, producing prosody in three different styles: for small recall values, the exaggerated and sing-song melodies of children's speech; for mid-range values, an adult expressive style; for the largest values, the prosody of a speaker who is familiar with the text, and at times sounds bored or irritated. In addition, because the storage procedure is stochastic, the prosody varies from simulation to simulation, even for identical control parameters. As with human speech, no two renditions are alike. Informal feedback indicates that the stylistic differences are recognizable and that the prosody is improved over current offerings. A comparison with natural data shows clear and predictable trends, although not at significance. However, a comparison within the natural data also did not produce results at significance. One practical contribution of this work is a text mark-up schema consisting of relational annotations to grammatical structures. Another is the product: varied and plausible prosody in synthesized speech. 
The main theoretical contribution is to show that resource-bound cognitive activity has prosodic correlates, thus providing a rationale for the individual and stylistic differences in melody and rhythm that are ubiquitous in human speech. By Janet Elizabeth Cahn. Ph.D.

    Accessing spoken interaction through dialogue processing [online]

    Written language is one of our primary means for documenting our lives, achievements, and environment. Our capabilities to record, store and retrieve audio, still pictures, and video are undergoing a revolution and may support, supplement or even replace written documentation. This technology enables us to record information that would otherwise be lost, lower the cost of documentation and enhance high-quality documents with original audiovisual material. The indexing of the audio material is the key technology to realize those benefits. 
This work presents effective alternatives to keyword-based indices which restrict the search space and may in part be calculated with very limited resources. Indexing speech documents can be done at various levels: Stylistically, a document belongs to a certain database, which can be determined automatically with high accuracy using very simple features. The resulting search space reduction factor is on the order of 4-10, while topic classification yielded a factor of 18 in a news domain. Since documents can be very long, they need to be segmented into topical regions. A new probabilistic segmentation framework as well as new features (speaker initiative and style) prove to be very effective compared to traditional keyword-based methods. At the topical segment level, activities (storytelling, discussing, planning, ...) can be detected using a machine learning approach with limited accuracy; however, even human annotators do not annotate them very reliably. A maximum search space reduction factor of 6 is theoretically possible on the databases used. A topical classification of these regions has been attempted on one database; the detection accuracy for that index, however, was very low. At the utterance level, dialogue acts such as statements, questions, backchannels (aha, yeah, ...), etc. are recognized using a novel discriminatively trained HMM procedure. The procedure can be extended to recognize short sequences such as question/answer pairs, so-called dialogue games. Dialogue acts and games are useful for building classifiers for speaking style. Similarly, a user may remember a certain dialogue act sequence and may search for it in a graphical representation. In a study with very pessimistic assumptions, users were able to pick one out of four similar and equiprobable meetings correctly with an accuracy of about 43% using graphical activity information. Dialogue acts may be useful in this situation as well, but the sample size did not allow final conclusions to be drawn. 
However, the user study fails to show any effect for detailed basic features such as formality or speaker identity.

    Evolutionary dynamics of new media forms: the case of the open mobile web

    This thesis is designed to improve our understanding of the evolutionary dynamics of media forms, with a special historical focus on the recent processes of Web and mobile convergence and the early development of the cross-platform Web. It aims to investigate the dynamics that have underpinned the creation, evolution and conventionalisation of new media forms in the open mobile Web following the launch of 3G mobile networks. In theoretical terms the thesis explores the possibilities for the analytical integration of evolutionary approaches that traditionally have shed light on the discrete components of the evolutionary ‘ensemble’ that comprises media’s textual forms, their technologies and organisational systems. Among the theoretical pillars the study builds on is, first, the cultural semiotic approach (Lotman) that is utilised for interpreting the textual dynamics constituting the form evolution. Second, evolutionary economics (Schumpeter, Freeman and others) is included for interpreting the market dynamics that condition the formation of the media industries. Third, systems theoretical sociology (Luhmann) is deployed in order to understand the broader dynamics of social organisation in late modernism. The integration of these approaches provides the conceptual framework that focuses on the following phenomena: dialogic interchange among industry sub-systems as enabling innovations and the emergence of new sub-systems; the self-organisation of the sub-systems in the contingent environment; the role of memory and systemic ‘path-dependencies’ in guiding the processes of self-organisation; and the nature of the power relations that shape the dialogic processes. The empirical study focuses on textual as well as organisational developments. The semiotic analysis of mobile websites reveals the intertextual relations of the new forms with other media domains, especially the desktop Web. 
The interviews with representatives of industry stakeholders provide insights into the dialogic practices between the parties engaged in designing the mobile Web, and how, via these practices, the new platform, its media forms and institutional structures were shaped. The findings point to the historical formation of two main industry sub-systems – ‘infrastructure enablers’ and content providers – with different preferred alternatives for the design of the cross-platform Web. The thesis demonstrates how the formation of these groups was conditioned by their systemic path-dependencies, but also by the mesh of dialogic relationships among them and by the resulting changes in the discursive constellations framing the organisation of the industry and the norms for its media forms. The study points to the first signs of the historically momentous emancipation of the mobile Web media forms, their shaking free of path-dependency on the desktop Web.

    Transnational audiences and the reception of television news: a study of Mexicans in Los Angeles

    This doctoral contribution borrows from the discursive practices of transnationalism and diaspora in order to articulate the concept of "transnational audiences" in the United States. The project identifies transnational audiences as formed by individuals and families whose lives straddle two national territories. It draws on the traditions of cultural studies and reception analysis as a strategy to explore the relation between media use and novel experiences of migration in a context of contemporary globalization. This conceptual background is the result of empirical research conducted in Los Angeles which investigated the television news reception of 67 informants of Mexican origin during three months in 2006. Relying on a range of qualitative research methods based in the domestic settings of the participants, the project found high levels of interest across a variety of news occurring in Los Angeles, the US, Mexico and further afield. During interviews, television news-viewing sessions and in daily written accounts, respondents constantly conveyed the idea of being directly impacted by a wide variety of events and developments in the news, regardless of geographic proximity. Heightened sensitivity to realities unfolding in nearby and distant places, it will be argued, would be a result of transnational communities’ connections with different social, cultural, economic and political contexts. These links emerged in a variety of ways throughout the research activities. Notably, the interactions in which members of families engaged when discussing the news revealed the re-articulation – and possible subversion – of patriarchal structures regulating relationships between males and females. 
At the same time, the research provides hints of a possible intertwining between the mediated and unmediated experiences of contributors to the study, who constantly informed their understanding of the news on the basis of interpersonal and mediated communication, knowledge of places and locations, and circumstances attached to opportunities and constraints related to aspects such as migration and citizen status. While in need of further systematization, this thesis’ findings are relevant for they highlight the need to operationalize the transnational audience in ways which differentiate it from those media publics who are based in their countries of origin. At the same time, this intervention highlights the need to question or move forward from established forms of thinking about the media use of non-native peoples in the developed world. The project as a whole opens a window to explore an alternative academic vocabulary to the notions of "ethnic" and "minority" audiences, privileged in US scholarly endeavour.

    Focus Mediocene

    This issue, following an international conference held at the IKKM in September 2017, is devoted to what may very well be the broadest media-related topic possible, even if it is accessible only through exemplary and experimental approaches: Under the title of the »Mediocene«, it presents contributions which discuss the operations and functions that intertwine media and Planet Earth. The specific relation of media and Planet Earth likely found its most striking and iconic formula in the images of the earth from outer space in 1968/69, showing the earth, according to contemporaneous descriptions, in its brilliance and splendor as the »Blue Marble«, but also in its fragility and desperate loneliness against the black backdrop of the cosmic void. Not only the creation but also the incredible distribution of this image across the globe was already at the time clearly recognized as a media effect. In light of space flight and television technology, which had expanded the reach of observation, communication, and measurement beyond both the surface of the Earth and its atmosphere, it also became clearly evident that the Planet had been a product of the early telescope by the use of which Galileo found the visual proof for the Copernican world model. Nevertheless, the »Blue Marble« image of the planet conceives of Earth not only as a celestial body, but also as a global, ecological, and economic system. Satellite and spacecraft technology and imaging continue to move beyond Earth’s orbit even as they enable precise, small-scale procedures of navigation and observation on the surface of the planet itself. These instruments of satellite navigation affect practices like agriculture, urban planning, and political decision-making. 
Most recently, three-dimensional images featuring the planet’s surface (generated from space by Synthetic Aperture Radar) or pictures from space probes have been circulating on the Web, altering politico-geographical practices and popular and scientific knowledge of the cosmos. Today, media not only participate in the shaping of the planet, but also take place on a planetary scale. Communication systems have been installed that operate all over the globe.