
    Advanced Knowledge Technologies at the Midterm: Tools and Methods for the Semantic Web

    In a celebrated essay on the new electronic media, Marshall McLuhan wrote in 1962:

        Our private senses are not closed systems but are endlessly translated into each other in that experience which we call consciousness. Our extended senses, tools, technologies, through the ages, have been closed systems incapable of interplay or collective awareness. Now, in the electric age, the very instantaneous nature of co-existence among our technological instruments has created a crisis quite new in human history. Our extended faculties and senses now constitute a single field of experience which demands that they become collectively conscious. Our technologies, like our private senses, now demand an interplay and ratio that makes rational co-existence possible. As long as our technologies were as slow as the wheel or the alphabet or money, the fact that they were separate, closed systems was socially and psychically supportable. This is not true now when sight and sound and movement are simultaneous and global in extent. (McLuhan 1962, p. 5, emphasis in original)

    Over forty years later, the seamless interplay that McLuhan demanded between our technologies is still barely visible. McLuhan's predictions of the spread, and increased importance, of electronic media have of course been borne out, and the worlds of business, science, and knowledge storage and transfer have been revolutionised. Yet the integration of electronic systems as open systems remains in its infancy.

    Advanced Knowledge Technologies (AKT) aims to address this problem: to create a view of knowledge and its management across its lifecycle, and to research and create the services and technologies that such unification will require. Halfway through its six-year span, the results are beginning to come through, and this paper explores some of the services, technologies and methodologies that have been developed. We hope to give a sense of the potential for the next three years, to discuss the insights and lessons learnt in the first phase of the project, and to articulate the challenges and issues that remain.

    The WWW provided the original context that made the AKT approach to knowledge management (KM) possible. When AKT was proposed in 1999, it brought together an interdisciplinary consortium with the technological breadth and complementarity needed to create the conditions for a unified approach to knowledge across its lifecycle. The combination of this expertise, and the time and space afforded the consortium by the IRC structure, suggested the opportunity for a concerted effort to develop an approach to advanced knowledge technologies based on the WWW as a basic infrastructure.

    The technological context of AKT changed for the better in the short period between the development of the proposal and the beginning of the project itself, with the emergence of the semantic web (SW), which foresaw much more intelligent manipulation and querying of knowledge.
    The opportunities that the SW provided, for example for more intelligent retrieval, put AKT at the centre of information technology innovation and knowledge management services; the AKT skill set would clearly be central to exploiting those opportunities.

    The SW, as an extension of the WWW, imposes an interesting set of constraints on the knowledge management services AKT tries to provide. As a medium for the semantically informed coordination of information, it has suggested a number of ways in which the objectives of AKT can be achieved, most obviously through the provision of knowledge management services delivered over the web, as opposed to the creation and provision of technologies to manage knowledge.

    AKT is working on the assumption that many web services will be developed and provided for users. The KM problem in the near future will be one of deciding which services are needed and of coordinating them. Many of these services will be largely or entirely legacies of the WWW, so their capabilities will vary. As well as providing useful KM services in their own right, AKT aims to exploit this opportunity by reasoning over services, brokering between them, and providing essential meta-services for SW knowledge service management.

    Ontologies will be a crucial tool for the SW. The AKT consortium brings together considerable expertise on ontologies, and ontologies were always going to be a key part of the strategy. All kinds of knowledge sharing and transfer activities will be mediated by ontologies, and ontology management will be an important enabling task. Different applications will need to cope with inconsistent ontologies, or with the problems that follow the automatic creation of ontologies (e.g. the merging of pre-existing ontologies to create a third). Ontology mapping, and the elimination of conflicts of reference, will be important tasks. All of these issues are discussed, along with our proposed technologies.

    Similarly, specifications of tasks will be used for the deployment of knowledge services over the SW, but in general it cannot be expected that there will be standards for task (or service) specifications in the medium term. The brokering meta-services that are envisaged will have to deal with this heterogeneity.

    The emerging picture of the SW is one of great opportunity, but it will not be a well-ordered, certain or consistent environment. It will comprise many repositories of legacy data, outdated and inconsistent stores, and requirements for common understandings across divergent formalisms. There is clearly a role for standards in bringing much of this context together, and AKT is playing a significant part in these efforts. But standards take time to emerge, they take political power to enforce, and they have been known to stifle innovation (in the short term). AKT is keen to understand the balance between principled inference and statistical processing of web content. Logical inference on the Web is tough: complex queries using traditional AI inference methods bring most distributed computer systems to their knees. Do we set up semantically well-behaved areas of the Web? Is any part of the Web in which semantic hygiene prevails interesting enough to reason in? These and many other questions need to be addressed if we are to provide effective knowledge technologies for our content on the web.
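    As an illustration of the ontology-merging problem mentioned above, the following is a minimal sketch using the Python rdflib library. The input files are hypothetical, and a plain triple union like this sidesteps exactly the hard parts (mapping and conflicting references) that the abstract identifies.

```python
# Naive ontology merge: a minimal sketch using rdflib.
# The input files and the resulting "third" ontology are hypothetical;
# real merging would also need to align classes and resolve conflicting
# references, which a plain union of triples does not attempt.
from rdflib import Graph

g1 = Graph().parse("ontology_a.owl", format="xml")  # hypothetical input
g2 = Graph().parse("ontology_b.owl", format="xml")  # hypothetical input

merged = g1 + g2  # set union of all triples from both graphs
merged.serialize("merged.owl", format="xml")
print(f"{len(g1)} + {len(g2)} triples -> {len(merged)} after union")
```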

    Bringing Structure into Summaries: Crowdsourcing a Benchmark Corpus of Concept Maps

    Concept maps can be used to concisely represent important information and to bring structure into large document collections. We therefore study a variant of multi-document summarization that produces summaries in the form of concept maps. However, suitable evaluation datasets for this task are currently missing. To close this gap, we present a newly created corpus of concept maps that summarize heterogeneous collections of web documents on educational topics. It was created using a novel crowdsourcing approach that allows us to efficiently determine important elements in large document collections. We release the corpus along with a baseline system and a proposed evaluation protocol to enable further research on this variant of summarization.
    Comment: Published at EMNLP 2017
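    A concept map of this kind can be viewed as a set of propositions, i.e. (concept, relation, concept) triples, and a generated map scored by proposition overlap with a reference map. The sketch below illustrates the idea with invented propositions and strict string matching; it is not the paper's released evaluation protocol.

```python
# Score a generated concept map against a reference map by proposition
# overlap. The propositions and the strict string matching are
# illustrative assumptions; the released protocol may differ.
reference = {("students", "learn from", "concept maps"),
             ("concept maps", "structure", "documents")}
generated = {("students", "learn from", "concept maps"),
             ("concept maps", "summarize", "web documents")}

tp = len(reference & generated)          # propositions found in both maps
precision = tp / len(generated)
recall = tp / len(reference)
f1 = 2 * precision * recall / (precision + recall) if tp else 0.0
print(f"P={precision:.2f} R={recall:.2f} F1={f1:.2f}")
```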

    KnowledgePuzzle: a browsing tool to adapt the web navigation process to the learner's mental model

    This article presents KnowledgePuzzle, a browsing tool for knowledge construction from the web. It aims to adapt the structure of web content to the learner's information needs, regardless of how the web content is originally delivered. Learners are provided with a meta-cognitive space (e.g., a concept mapping tool) that enables them to plan navigation paths and visualize the semantic processing of knowledge in their minds. Once the learner's viewpoint becomes visually represented, it is transformed into a layer of informative hyperlinks and annotations over previously visited pages. The attached layer explicitly structures the web content to accommodate the learner's interests by interlinking and annotating the chunks of information that make up the learner's knowledge. Finally, a hypertext version of the whole knowledge structure is generated to enable fast and easy reviewing. A discussion about the …
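    The transformation of a learner's concept map into a layer of hyperlinks and a reviewable hypertext can be pictured with a small sketch; the concepts, notes and URLs here are invented, and KnowledgePuzzle's actual data model is certainly richer.

```python
# Turn a learner's concept map (nodes annotated with the pages they came
# from) into a simple hypertext review page. All names and URLs are
# hypothetical placeholders for illustration.
concept_map = {
    "photosynthesis": {"source": "https://example.org/bio/photo",
                       "note": "Plants convert light into chemical energy."},
    "chlorophyll": {"source": "https://example.org/bio/chloro",
                    "note": "Pigment that absorbs light for photosynthesis."},
}

html = ["<html><body><h1>My knowledge review</h1>"]
for concept, info in concept_map.items():
    # each concept links back to the visited page it was drawn from
    html.append(f'<p><a href="{info["source"]}">{concept}</a>: '
                f'{info["note"]}</p>')
html.append("</body></html>")
print("\n".join(html))
```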

    Characterizing the Landscape of Musical Data on the Web: State of the Art and Challenges

    Musical data can be analysed, combined, transformed and exploited for diverse purposes. However, despite the proliferation of digital libraries and repositories for music, and of infrastructures and tools, such uses of musical data remain scarce. As an initial step to help fill this gap, we present a survey of the landscape of musical data on the Web, available as a Linked Open Dataset: the musoW dataset of catalogued musical resources. We present the dataset and the methodology and criteria for its creation and assessment. We map the identified dimensions and parameters to existing Linked Data vocabularies, present insights gained from SPARQL queries, and identify significant relations between resource features. We also present a thematic analysis of the original research questions associated with the surveyed resources and identify the extent to which the collected resources are Linked Data-ready.
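    Because the survey is published as Linked Open Data, insights of the kind described can be extracted with SPARQL. Below is a minimal sketch that queries a local dump with rdflib; the file name is an assumption, and dcterms:title is just one plausible vocabulary term such a mapping might use.

```python
# Query a local copy of a Linked Data dataset for resource titles.
# "musow.ttl" and the choice of dcterms:title are assumptions for
# illustration; the actual musoW dump may use other vocabularies too.
from rdflib import Graph

g = Graph().parse("musow.ttl", format="turtle")
q = """
PREFIX dcterms: <http://purl.org/dc/terms/>
SELECT ?resource ?title WHERE {
    ?resource dcterms:title ?title .
} LIMIT 10
"""
for row in g.query(q):
    print(row.resource, "->", row.title)
```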

    Building a semantically annotated corpus of clinical texts

    In this paper, we describe the construction of a semantically annotated corpus of clinical texts for use in the development and evaluation of systems for automatically extracting clinically significant information from the textual component of patient records. The paper details the sampling of textual material from a collection of 20,000 cancer patient records, the development of a semantic annotation scheme, the annotation methodology, the distribution of annotations in the final corpus, and the use of the corpus in the development of an adaptive information extraction system. The resulting corpus is the most richly semantically annotated resource for clinical text processing built to date, and its value has been demonstrated through its use in developing an effective information extraction system. The detailed presentation of our corpus construction and annotation methodology will be of value to others seeking to build high-quality semantically annotated corpora in biomedical domains.
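    A common way to realise such an annotation scheme is stand-off annotation, in which character spans and semantic types are stored separately from the text. The following sketch uses an invented sentence and a two-type inventory; the real scheme distinguishes many more clinical categories.

```python
# Stand-off semantic annotation: spans plus types kept apart from the
# text itself. The example sentence and type inventory are invented;
# the actual clinical scheme is far richer.
from collections import Counter
from dataclasses import dataclass

@dataclass
class Annotation:
    start: int      # character offset where the span begins
    end: int        # character offset where the span ends (exclusive)
    sem_type: str   # semantic category, e.g. "Condition" or "Drug"

text = "Patient started on tamoxifen for breast cancer."
annotations = [
    Annotation(19, 28, "Drug"),        # "tamoxifen"
    Annotation(33, 46, "Condition"),   # "breast cancer"
]

for a in annotations:
    print(f"{text[a.start:a.end]!r} -> {a.sem_type}")
# distribution of annotations over semantic types
print(Counter(a.sem_type for a in annotations))
```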

    Technologies to enhance self-directed learning from hypertext

    With the growing popularity of the World Wide Web, materials presented to learners in the form of hypertext have become a major instructional resource. Despite the potential of hypertext to facilitate access to learning materials, self-directed learning from hypertext is often associated with many concerns. Self-directed learners, because of their different viewpoints, may follow different navigation paths and thus have different interactions with knowledge. As a result, learners can end up disoriented or cognitively overloaded because of the potential gap between what they need and what actually exists on the Web. In addition, while a great deal of research has gone into supporting the task of finding web resources, less attention has been paid to supporting the interpretation of Web pages. The inability to interpret the content of pages leads learners to interrupt their current browsing activities to seek help from other people or from explanatory learning materials. Such interruptions can weaken learner engagement and lower motivation to learn. This thesis aims to promote self-directed learning from hypertext resources by proposing solutions to the above problems. It first presents KnowledgePuzzle, a tool that proposes a constructivist approach to learning from the Web. Its main contribution to Web-based learning is that self-directed learners are able to adapt the path of instruction and the structure of hypertext to their way of thinking, regardless of how the Web content is delivered. This can effectively reduce the gap between what they need and what exists on the Web. SWLinker is another system proposed in this thesis, with the aim of supporting the interpretation of Web pages through ontology-based semantic annotation. It is an extension to the Internet Explorer Web browser that automatically creates a semantic layer of explanatory information and instructional guidance over Web pages. It also aims to break the conventional view of Web browsing as an individual activity by leveraging the notion of ontology-based collaborative browsing. Both of the tools presented in this thesis were evaluated by students within the context of particular learning tasks. The results show that they effectively fulfilled the intended goals, facilitating learning from hypertext without introducing high overheads in terms of usability or browsing effort.
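    The SWLinker idea of overlaying explanatory information on pages can be approximated by matching ontology labels in page text, as in the sketch below; the two-entry "ontology" and its glosses are invented for illustration.

```python
# Approximate ontology-based semantic annotation: wrap known terms in a
# page with explanatory tooltips. The two-entry "ontology" is invented;
# SWLinker works against real ontologies inside the browser.
import re

ontology = {
    "hypertext": "Text displayed with links to other text.",
    "ontology": "A formal specification of a shared conceptualisation.",
}

def annotate(page_text: str) -> str:
    for label, gloss in ontology.items():
        # \b restricts matches to whole words; matching is case-insensitive
        page_text = re.sub(rf"\b({label})\b",
                           rf'<span title="{gloss}">\1</span>',
                           page_text, flags=re.IGNORECASE)
    return page_text

print(annotate("Learning from hypertext often requires an ontology."))
```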

    Designing a Web Application for Simple and Collaborative Video Annotation That Meets Teaching Routines and Educational Requirements

    Video annotation and analysis is an important activity for teaching with and about audiovisual media artifacts, because it helps students learn how to identify textual and formal connections in media products. But school teachers lack adequate tools for video annotation and analysis in media education that are easy to use, integrate into established teaching organization, and support quick collaborative work. To address these challenges, we followed a design-based research approach and conducted qualitative interviews with teachers to develop TRAVIS GO, a web application for simple and collaborative video annotation. TRAVIS GO allows for quick and easy use within established teaching settings. The web application provides basic analytical features in an adaptable work space. Key didactic features include tagging and commenting on posts, sharing and exporting projects, and working in live collaboration. Teachers can create assignments according to grade level, learning subject, and class size. Our work contributes further insights to the CSCW community about how to incorporate user demands into the development of educational tools.
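    A data model for such timestamped, taggable, commentable annotations might look like the following sketch; the field names are plausible assumptions rather than TRAVIS GO's actual schema.

```python
# Minimal model of collaborative video annotations with tags, comments,
# and JSON export for sharing. Field names are assumptions for
# illustration, not TRAVIS GO's real schema.
import json
from dataclasses import dataclass, field, asdict

@dataclass
class VideoAnnotation:
    start: float                 # seconds into the video
    end: float
    author: str
    text: str
    tags: list[str] = field(default_factory=list)
    comments: list[str] = field(default_factory=list)

project = [
    VideoAnnotation(12.0, 18.5, "teacher", "Establishing shot",
                    tags=["camera"]),
    VideoAnnotation(18.5, 25.0, "student1", "Cut to close-up",
                    tags=["editing"], comments=["Nice observation!"]),
]

# export the project for sharing, e.g. as a downloadable file
print(json.dumps([asdict(a) for a in project], indent=2))
```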

    Annotation Studio: Multimedia Annotation for Students

    Annotation Studio is a web-based annotation application that integrates a powerful set of textual interpretation tools behind an interface that makes using those tools intuitive for undergraduates. Building on students' new media literacies, this open-source application develops traditional humanistic skills including close reading, textual analysis, persuasive writing, and critical thinking. Initial features of the Annotation Studio prototype, supported by an NEH Start-Up Grant, include aligned multimedia annotation of written texts, user-defined sharing of annotations, and grouping of annotations by self-defined tags to support interpretation and argument development. The fully developed application will support annotation of image, video and audio documents; annotation visualization; export of texts with annotations; and a media repository. We will also identify best practices among faculty using Annotation Studio in a broad range of humanities classes across the country.
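    The grouping of annotations by self-defined tags that the description mentions can be sketched as follows; the annotations and tag names are invented examples.

```python
# Group text annotations by user-defined tags to support argument
# development. The annotations and tag names are invented examples.
from collections import defaultdict

annotations = [
    {"quote": "It was the best of times", "tag": "irony"},
    {"quote": "it was the worst of times", "tag": "irony"},
    {"quote": "the spring of hope", "tag": "imagery"},
]

by_tag = defaultdict(list)
for a in annotations:
    by_tag[a["tag"]].append(a["quote"])

for tag, quotes in by_tag.items():
    print(f"{tag}: {quotes}")
```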

    Mining semantics for culturomics: towards a knowledge-based approach

    The massive amounts of text data made available through the Google Books digitization project have inspired a new field of big-data textual research. Named culturomics, this field has attracted the attention of a growing number of scholars in recent years. However, initial studies based on these data have been criticized for not referring to relevant work in linguistics and language technology. This paper provides some ideas, thoughts and first steps towards a new culturomics initiative, this time based on Swedish data, which pursues a more knowledge-based approach than previous work in this emerging field. The amount of new Swedish text produced daily, together with older texts being digitized in cultural heritage projects, grows at an accelerating rate. These volumes of digital text have grown far beyond the capacity of human readers, leaving automated semantic processing as the only realistic option for accessing and using the information they contain. The aim of our recently initiated research program is to advance the state of the art in language technology resources and methods for the semantic processing of very large volumes of Swedish text, focusing on the theoretical and methodological advancement of the state of the art in extracting and correlating information from such text using a combination of knowledge-based and statistical methods.
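    The basic culturomics operation, tracking a word's relative frequency over time in a year-bucketed corpus, can be sketched as follows; the miniature Swedish corpus here stands in for billions of tokens.

```python
# Relative frequency of a word per year, the basic culturomics measure.
# The miniature year-bucketed corpus is invented and stands in for a
# corpus of billions of tokens.
corpus_by_year = {
    1990: "telefonen ringer och ringer".split(),
    2000: "mobilen ringer mobilen surrar".split(),
    2010: "mobilen surrar mobilen surrar mobilen".split(),
}

word = "mobilen"
for year, tokens in sorted(corpus_by_year.items()):
    freq = tokens.count(word) / len(tokens)  # occurrences per token
    print(f"{year}: {freq:.2f}")
```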