Access to recorded interviews: A research agenda
Recorded interviews form a rich basis for scholarly inquiry. Examples include oral histories, community memory projects, and interviews conducted for broadcast media. Emerging technologies offer the potential to radically transform the way in which recorded interviews are made accessible, but this vision will demand substantial investments from a broad range of research communities. This article reviews the present state of practice for making recorded interviews available and the state of the art for key component technologies. A large number of important research issues are identified, and from that set of issues, a coherent research agenda is proposed.
Towards Affordable Disclosure of Spoken Word Archives
This paper presents and discusses ongoing work aiming at affordable disclosure of real-world spoken word archives in general, and in particular of a collection of recorded interviews with Dutch survivors of the World War II concentration camp Buchenwald. Given such collections, the least we want to be able to provide is search at different levels and a flexible way of presenting results. Strategies for automatic annotation based on speech recognition (supporting, e.g., within-document search) are outlined and discussed with respect to the Buchenwald interview collection. In addition, usability aspects of the spoken word search are discussed on the basis of our experiences with the online Buchenwald web portal. It is concluded that, although user feedback is generally fairly positive, automatic annotation performance is still far from satisfactory and requires additional research.
Interactive searching and browsing of video archives: using text and using image matching
Over the last number of decades much research work has been done in the general area of video and audio analysis. Initially the applications driving this included capturing video in digital form and then being able to store, transmit and render it, which involved a large effort to develop compression and encoding standards. The technology needed to do all this is now easily available and cheap, with applications of digital video processing now commonplace, ranging from CCTV (Closed Circuit TV) for security, to home capture of broadcast TV on home DVRs for personal viewing.
One consequence of the development in technology for creating, storing and distributing digital video is that there has been a huge increase in the volume of digital video, and this in turn has created a need for techniques to allow effective management of this video, and by that we mean content management. In the BBC, for example, the archives department receives approximately 500,000 queries per year and has over 350,000 hours of content in its library. Having huge archives of video information is of little benefit if we have no effective means of being able to locate video clips which are of relevance to whatever our information needs may be. In this chapter we report our work on developing two specific retrieval and browsing tools for digital video information. Both of these are based on an analysis of the captured video for the purpose of automatically structuring it into shots or higher level semantic units like TV news stories. Some also include analysis of the video for the automatic detection of features such as the presence or absence of faces. Both include some elements of searching, where a user specifies a query or information need, and browsing, where a user is allowed to browse through sets of retrieved video shots. We support the presentation of these tools with illustrations of actual video retrieval systems developed and working on hundreds of hours of video content.
Indexing, browsing and searching of digital video
Video is a communications medium that normally brings together moving pictures with a synchronised audio track into a discrete piece or pieces of information. The size of a “piece” of video can variously be referred to as a frame, a shot, a scene, a clip, a programme or an episode, and these are distinguished by their lengths and by their composition. We shall return to the definition of each of these in section 4 of this chapter. In modern society, video is ver
BlogForever D3.2: Interoperability Prospects
This report evaluates the interoperability prospects of the BlogForever platform. To this end, existing interoperability models are reviewed; a Delphi study is conducted to identify crucial aspects for the interoperability of web archives and digital libraries; technical interoperability standards and protocols are reviewed regarding their relevance for BlogForever; a simple approach to considering interoperability in specific usage scenarios is proposed; and a tangible approach is presented for developing a succession plan that would allow a reliable transfer of content from the current digital archive to other digital repositories.
Archiving Software Surrogates on the Web for Future Reference
Software has long been established as an essential aspect of the scientific process in mathematics and other disciplines. However, reliably referencing software in scientific publications is still challenging for various reasons. A crucial factor is that software dynamics with temporal versions or states are difficult to capture over time. We propose to archive and reference surrogates instead, which can be found on the Web and reflect the actual software to a remarkable extent. Our study shows that about a half of the webpages of software are already archived with almost all of them including some kind of documentation.
Comment: TPDL 2016, Hannover, Germany
Requirements for Provenance on the Web
From where did this tweet originate? Was this quote from the New York Times modified? Daily, we rely on data from the Web but often it is difficult or impossible to determine where it came from or how it was produced. This lack of provenance is particularly evident when people and systems deal with Web information or with any environment where information comes from sources of varying quality. Provenance is not captured pervasively in information systems. There are major technical, social, and economic impediments that stand in the way of using provenance effectively. This paper synthesizes requirements for provenance on the Web for a number of dimensions focusing on three key aspects of provenance: the content of provenance, the management of provenance records, and the uses of provenance information. To illustrate these requirements, we use three synthesized scenarios that encompass provenance problems faced by Web users today.
The U.S. Newspaper Industry in Transition
[Excerpt] The U.S. newspaper industry is suffering through what could be its worst financial crisis since the Great Depression. Advertising revenues are plummeting due to the severe economic downturn, while readership habits are changing as consumers turn to the Internet for free news and information. Some major newspaper chains are burdened by heavy debt loads. In the past year, seven major newspaper chains have declared bankruptcy, several big city papers have shut down, and many have laid off reporters and editors, imposed pay reductions, cut the size of the physical newspaper, or turned to Web-only publication.
As the problems intensify, there are growing concerns that the rapid decline of the newspaper industry will impact civic and social life. Already there are fewer newspaper reporters covering state capitols and city halls, while the number of states with newspapers covering Congress full-time has dwindled to 23 from the most recent peak of 35 in 1985.
As old-style, print newspapers decline, new journalism startups are developing around the country, aided by low entry costs on the Internet. The emerging ventures hold promise but do not have the experience, resources, and reach of shrinking mainstream newspapers.
Congress has begun debating whether the financial problems in the newspaper industry pose a public policy issue that warrants federal action. Whether a congressional response to the current turmoil is justified may depend on the current causes of the crisis. If the causes are related to significant technological shifts (the Internet, smart phones and electronic readers) or societal changes that are disruptive to established business models and means of news dissemination, the policy options may be quite limited, especially if new models of reporting (and, equally important, advertising) are beginning to emerge. Governmental policy actions to bolster existing businesses could stall or retard such a shift. In this case, policymakers might stand back and allow the market to realign news gathering and delivery, as it has many times in the past. If, on the other hand, the current crisis is related to the struggle of some major newspapers to survive the current recession, possible policy options to ensure the continuing availability of in-depth local and national news coverage by newspapers might include providing tax breaks, relaxing antitrust policy, tightening copyright law, providing general support for the practice of journalism by increasing funding for the Corporation for Public Broadcasting (CPB) or similar public programs, or helping newspapers reorganize as nonprofit organizations. Policymakers may also determine that some set of measures could ease the combination of social and technological transition and the recession-related financial distress of the industry.
Radio Oranje: Enhanced Access to a Historical Spoken Word Collection
Access to historical audio collections is typically very restricted: content is often only available on physical (analog) media and the metadata is usually limited to keywords, giving access at the level of relatively large fragments, e.g., an entire tape. Many spoken word heritage collections are now being digitized, which allows the introduction of more advanced search technology. This paper presents an approach that supports online access and search for recordings of historical speeches. A demonstrator has been built, based on the so-called Radio Oranje collection, which contains radio speeches by the Dutch Queen Wilhelmina that were broadcast during World War II. The audio has been aligned with its original 1940s manual transcriptions to create a time-stamped index that enables the speeches to be searched at the word level. Results are presented together with related photos from an external database.