Search CORE

5,145 research outputs found

DARIAH and the Benelux

Author: Backes Marianne
Chambers Sally
Hoogerwerf Maarten
Van der West Jan
Publication venue: Department of Applied Linguistics, Translators and Interpreters, University of Antwerp
Publication date: 01/01/2015
Field of study

Ghent University Academic Bibliography

Fourteenth Biennial Status Report: März 2017 - February 2019

Author
Publication venue: Max-Planck-Institut für Informatik
Publication date: 01/01/2019
Field of study

MPG.PuRe

Using Web Archives to Enrich the Live Web Experience Through Storytelling

Author: AlNoamany Yasmin
Publication venue: ODU Digital Commons
Publication date: 01/07/2016
Field of study

Much of our cultural discourse occurs primarily on the Web. Thus, Web preservation is a fundamental precondition for multiple disciplines. Archiving Web pages into themed collections is a method for ensuring these resources are available for posterity. Services such as Archive-It exists to allow institutions to develop, curate, and preserve collections of Web resources. Understanding the contents and boundaries of these archived collections is a challenge for most people, resulting in the paradox of the larger the collection, the harder it is to understand. Meanwhile, as the sheer volume of data grows on the Web, storytelling is becoming a popular technique in social media for selecting Web resources to support a particular narrative or story . In this dissertation, we address the problem of understanding the archived collections through proposing the Dark and Stormy Archive (DSA) framework, in which we integrate storytelling social media and Web archives. In the DSA framework, we identify, evaluate, and select candidate Web pages from archived collections that summarize the holdings of these collections, arrange them in chronological order, and then visualize these pages using tools that users already are familiar with, such as Storify. To inform our work of generating stories from archived collections, we start by building a baseline for the structural characteristics of popular (i.e., receiving the most views) human-generated stories through investigating stories from Storify. Furthermore, we checked the entire population of Archive-It collections for better understanding the characteristics of the collections we intend to summarize. We then filter off-topic pages from the collections the using different methods to detect when an archived page in a collection has gone off-topic. We created a gold standard dataset from three Archive-It collections to evaluate the proposed methods at different thresholds. From the gold standard dataset, we identified five behaviors for the TimeMaps (a list of archived copies of a page) based on the page’s aboutness. Based on a dynamic slicing algorithm, we divide the collection and cluster the pages in each slice. We then select the best representative page from each cluster based on different quality metrics (e.g., the replay quality, and the quality of the generated snippet from the page). At the end, we put the selected pages in chronological order and visualize them using Storify. For evaluating the DSA framework, we obtained a ground truth dataset of hand-crafted stories from Archive-It collections generated by expert archivists. We used Amazon’s Mechanical Turk to evaluate the automatically generated stories against the stories that were created by domain experts. The results show that the automatically generated stories by the DSA are indistinguishable from those created by human subject domain experts, while at the same time both kinds of stories (automatic and human) are easily distinguished from randomly generated storie

Old Dominion University

Hi, how can I help you?: Automating enterprise IT support help desks

Author: Aralikatte Rahul
Dechu Sampath
Gantayat Neelamadhav
Gupta Monika
Khare Shreya
Mani Senthil
Mitchell Barry
Sankaran Anush
Subramanian Hemamalini
Venkatarangan Hema
Publication venue
Publication date: 02/11/2017
Field of study

Question answering is one of the primary challenges of natural language understanding. In realizing such a system, providing complex long answers to questions is a challenging task as opposed to factoid answering as the former needs context disambiguation. The different methods explored in the literature can be broadly classified into three categories namely: 1) classification based, 2) knowledge graph based and 3) retrieval based. Individually, none of them address the need of an enterprise wide assistance system for an IT support and maintenance domain. In this domain the variance of answers is large ranging from factoid to structured operating procedures; the knowledge is present across heterogeneous data sources like application specific documentation, ticket management systems and any single technique for a general purpose assistance is unable to scale for such a landscape. To address this, we have built a cognitive platform with capabilities adopted for this domain. Further, we have built a general purpose question answering system leveraging the platform that can be instantiated for multiple products, technologies in the support domain. The system uses a novel hybrid answering model that orchestrates across a deep learning classifier, a knowledge graph based context disambiguation module and a sophisticated bag-of-words search system. This orchestration performs context switching for a provided question and also does a smooth hand-off of the question to a human expert if none of the automated techniques can provide a confident answer. This system has been deployed across 675 internal enterprise IT support and maintenance projects.Comment: To appear in IAAI 201

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

BCS SGAI SMA 2013: the BCS SGAI workshop on social media analysis

Author
Publication venue: M. Jeusfeld
Publication date: 01/01/2013
Field of study

Portsmouth University Research Portal (Pure)

Curating E-Mails; A life-cycle approach to the management and preservation of e-mail messages

Author: Pennock Mrs Maureen
Publication venue
Publication date: 01/01/2006
Field of study

E-mail forms the backbone of communications in many modern institutions and organisations and is a valuable type of organisational, cultural, and historical record. Successful management and preservation of valuable e-mail messages and collections is therefore vital if organisational accountability is to be achieved and historical or cultural memory retained for the future. This requires attention by all stakeholders across the entire life-cycle of the e-mail records. This instalment of the Digital Curation Manual reports on the several issues involved in managing and curating e-mail messages for both current and future use. Although there is no 'one-size-fits-all' solution, this instalment outlines a generic framework for e-mail curation and preservation, provides a summary of current approaches, and addresses the technical, organisational and cultural challenges to successful e-mail management and longer-term curation.

Mediating Color Filter Exploration with Color Theme Semantics Derived from Social Curation Data

Author: Jay Caroline
KIM TAEWOOK
MA XIAOJUAN
Reani Manuele
SUN ZHIDA
WU ZIMING
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2018
Field of study

The University of Manchester - Institutional Repository

Semi-Automatic, Inline and Collaborative Web Page Code Curations

Author: Fritz Thomas
Holmes Reid
Meyer André N
Rutishauser Roy Adrian
Publication venue
Publication date: 20/05/2023
Field of study

Software developers spend about a quarter of their workday using the web to fulfill various information needs. Searching for relevant information online can be time-consuming, yet acquired information is rarely systematically persisted for later reference. In this work, we introduce SALI, an approach for semi-automated linking web pages to source code locations inline with the source code. SALI helps developers naturally capture high-quality, explicit links between web pages and specific source code locations by suggesting links for curation within the IDE. Through two laboratory studies, we examined the developer’s ability to both curate and consume links between web pages and specific source code locations while performing software development tasks. The studies were performed with 20 subjects working on realistic software change tasks from widely-used open-source projects. Results showed that developers continuously and concisely curate web pages at meaningful locations in the code with little effort. Additionally, we showed that other developers could use these curations while performing new and different change tasks to speed up relevant information gathering within unfamiliar codebases by a factor of 2.4

ZORA