Evolutionary Subject Tagging in the Humanities; Supporting Discovery and Examination in Digital Cultural Landscapes
In this paper, we attempt to identify problematic issues for subject tagging in the humanities, particularly those associated with information objects in digital formats. In the third major section, we identify a number of assumptions behind the current practice of subject classification that we think should be challenged. We then propose features of classification systems that could increase their effectiveness. These emerged as recurrent themes in many of the conversations with scholars, consultants, and colleagues. Finally, we suggest next steps that we believe will help scholars and librarians develop better subject classification systems to support research in the humanities. NEH Office of Digital Humanities: Digital Humanities Start-Up Grant (HD-51166-10
Content and services issues for digital libraries
Describes the neglected area of e-collection building, focusing on the taxonomy of e-collections and on the possible range of online services
What Others Say About This Work? Scalable Extraction of Citation Contexts from Research Papers
This work presents a new, scalable solution to the problem of extracting citation contexts: the textual fragments surrounding citation references. These citation contexts can be used to navigate digital libraries of research papers and help users decide what to read. We have developed a prototype system that can retrieve, on demand, citation contexts from the full text of over 15 million research articles in the Mendeley catalog for a given reference research paper. The evaluation results show that our citation extraction system provides additional functionality over existing tools and runs two orders of magnitude faster, while providing a 9% improvement in F-measure over the current state of the art
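To make the notion of a citation context concrete, here is a minimal Python sketch. It is not the system described in the abstract: it assumes numeric citation markers such as `[3]` or `[3, 7]`, splits sentences naively on terminal punctuation, and the names `CITATION`, `citation_contexts`, and `window` are hypothetical, chosen only for illustration.

```python
import re

# Assumed marker style: numeric citations such as [3] or [3, 7].
CITATION = re.compile(r"\[(\d+(?:\s*,\s*\d+)*)\]")


def citation_contexts(text, ref_number, window=1):
    """Return each sentence citing `ref_number`, padded with `window`
    neighbouring sentences on either side."""
    # Naive sentence split: ., ! or ? followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    contexts = []
    for i, sentence in enumerate(sentences):
        # Collect every reference number cited in this sentence.
        cited = {
            int(number)
            for match in CITATION.finditer(sentence)
            for number in match.group(1).split(",")
        }
        if ref_number in cited:
            start = max(0, i - window)
            end = min(len(sentences), i + window + 1)
            contexts.append(" ".join(sentences[start:end]))
    return contexts
```

A production extractor would also need to resolve markers to bibliography entries and cope with author-year citation styles, which this sketch ignores.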
Classifying document types to enhance search and recommendations in digital libraries
In this paper, we address the problem of classifying documents available from
the global network of (open access) repositories according to their type. We
show that the metadata provided by repositories that would enable us to distinguish
research papers, theses and slides are missing in over 60% of cases. While
these metadata describing document types are useful in a variety of scenarios
ranging from research analytics to improving search and recommender (SR)
systems, this problem has not yet been sufficiently addressed in the context of
the repositories infrastructure. We have developed a new approach for
classifying document types using supervised machine learning based exclusively
on text-specific features. We achieve an F1-score of 0.96 using the random forest and
Adaboost classifiers, which are the best performing models on our data. By
analysing the SR system logs of the CORE [1] digital library aggregator, we
show that users are an order of magnitude more likely to click on research
papers and theses than on slides. This suggests that using document types as a
feature for ranking/filtering SR results in digital libraries has the potential
to improve user experience.
Comment: 12 pages, 21st International Conference on Theory and Practice of
Digital Libraries (TPDL), 2017, Thessaloniki, Greece
Servicing the federation : the case for metadata harvesting
The paper presents a comparative analysis of data harvesting and distributed computing as complementary models of service delivery within large-scale federated digital libraries. Informed by requirements of flexibility and scalability of federated services, the analysis focuses on the identification and assessment of model invariants. In particular, it abstracts over application domains, services, and protocol implementations. The analytical evidence produced shows that the harvesting model offers stronger guarantees of satisfying the identified requirements. In addition, it suggests a first characterisation of services based on their suitability to either model and thus indicates how they could be integrated in the context of a single federated digital library
Towards MKM in the Large: Modular Representation and Scalable Software Architecture
MKM has been defined as the quest for technologies to manage mathematical
knowledge. MKM "in the small" is well-studied, so the real problem is to scale
up to large, highly interconnected corpora: "MKM in the large". We contend that
advances in two areas are needed to reach this goal. We need representation
languages that support incremental processing of all primitive MKM operations,
and we need software architectures and implementations that implement these
operations scalably on large knowledge bases.
We present instances of both in this paper: the MMT framework for modular
theory-graphs that integrates meta-logical foundations, which forms the base of
the next OMDoc version; and TNTBase, a versioned storage system for XML-based
document formats. TNTBase becomes an MMT database by instantiating it with
special MKM operations for MMT.
Comment: To appear in The 9th International Conference on Mathematical
Knowledge Management: MKM 201
Research, relativity and relevance : can universal truths answer local questions
It is a commonplace that the internet has led to a globalisation of informatics and that this has had beneficial effects in terms of standards and interoperability. However, this necessary harmonisation has also led to a growing understanding that the trend carries an in-built assumption that "one size fits all". The paper explores the importance of local and national research in addressing global issues and the appropriateness of local solutions and applications. It concludes that federal and collegial solutions are to be preferred to imperial solutions
Digital libraries for creative communities
Digital library technologies have a great deal to offer to creative, design communities. They can enable large collections of text, images, music, video and other information objects to be organised and accessed in interesting and diverse ways. Ordinary people—people not traditionally viewed as 'creators' or 'designers'—can now conceive, assemble, build, and disseminate new information collections. This paper explores the development rationale behind the Greenstone digital library technology. We also examine three examples of creative new techniques for accessing and presenting information in digital libraries and stress the importance of tailoring information access to support the requirements of the users and application area