941 research outputs found

    AUGUR: Forecasting the Emergence of New Research Topics

    Being able to rapidly recognise new research trends is strategic for many stakeholders, including universities, institutional funding bodies, academic publishers and companies. The literature presents several approaches to identifying the emergence of new research topics, which rely on the assumption that the topic is already exhibiting a certain degree of popularity and is consistently referred to by a community of researchers. However, detecting the emergence of a new research area at an embryonic stage, i.e., before the topic has been consistently labelled by a community of researchers and associated with a number of publications, is still an open challenge. We address this issue by introducing Augur, a novel approach to the early detection of research topics. Augur analyses the diachronic relationships between research areas and is able to detect clusters of topics that exhibit dynamics correlated with the emergence of new research topics. We also present the Advanced Clique Percolation Method (ACPM), a new community detection algorithm developed specifically to support this task. Augur was evaluated on a gold standard of 1,408 debutant topics in the 2000-2011 interval and outperformed four alternative approaches in terms of both precision and recall.
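
    The abstract names the Advanced Clique Percolation Method (ACPM) but does not detail it. A minimal sketch of the classic k-clique percolation it builds on, using networkx (an assumption for illustration, not the authors' code), shows the basic idea of finding communities as unions of adjacent k-cliques:

        import networkx as nx
        from networkx.algorithms.community import k_clique_communities

        # Toy co-occurrence network of research topics; edges link topics
        # that appear together in publications (illustrative data only).
        G = nx.Graph()
        G.add_edges_from([
            ("semantic web", "ontologies"),
            ("semantic web", "linked data"),
            ("ontologies", "linked data"),
            ("linked data", "knowledge graphs"),
            ("ontologies", "knowledge graphs"),
            ("machine learning", "neural networks"),
        ])

        # Classic clique percolation: a community is the union of all k-cliques
        # reachable from one another through overlaps of k-1 nodes.
        for community in k_clique_communities(G, k=3):
            print(sorted(community))

    Here the two overlapping triangles merge into a single four-topic community, while the isolated edge is excluded; ACPM extends this family of methods with dynamics suited to detecting emerging topics.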

    Advanced Knowledge Technologies at the Midterm: Tools and Methods for the Semantic Web

    The University of Edinburgh and research sponsors are authorised to reproduce and distribute reprints and on-line copies for their purposes notwithstanding any copyright annotation hereon. The views and conclusions contained herein are the author's and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of other parties.

    In a celebrated essay on the new electronic media, Marshall McLuhan wrote in 1962: "Our private senses are not closed systems but are endlessly translated into each other in that experience which we call consciousness. Our extended senses, tools, technologies, through the ages, have been closed systems incapable of interplay or collective awareness. Now, in the electric age, the very instantaneous nature of co-existence among our technological instruments has created a crisis quite new in human history. Our extended faculties and senses now constitute a single field of experience which demands that they become collectively conscious. Our technologies, like our private senses, now demand an interplay and ratio that makes rational co-existence possible. As long as our technologies were as slow as the wheel or the alphabet or money, the fact that they were separate, closed systems was socially and psychically supportable. This is not true now when sight and sound and movement are simultaneous and global in extent." (McLuhan 1962, p.5, emphasis in original)

    Over forty years later, the seamless interplay that McLuhan demanded between our technologies is still barely visible. McLuhan's predictions of the spread, and increased importance, of electronic media have of course been borne out, and the worlds of business, science and knowledge storage and transfer have been revolutionised. Yet the integration of electronic systems as open systems remains in its infancy.

    Advanced Knowledge Technologies (AKT) aims to address this problem: to create a view of knowledge and its management across its lifecycle, and to research and create the services and technologies that such unification will require. Halfway through its six-year span, the results are beginning to come through, and this paper explores some of the services, technologies and methodologies that have been developed. We hope to give a sense of the potential for the next three years, to discuss the insights and lessons learnt in the first phase of the project, and to articulate the challenges and issues that remain.

    The WWW provided the original context that made the AKT approach to knowledge management (KM) possible. AKT was initially proposed in 1999; it brought together an interdisciplinary consortium with the technological breadth and complementarity to create the conditions for a unified approach to knowledge across its lifecycle. The combination of this expertise, and the time and space afforded the consortium by the IRC structure, suggested the opportunity for a concerted effort to develop an approach to advanced knowledge technologies, based on the WWW as a basic infrastructure. The technological context of AKT altered for the better in the short period between the development of the proposal and the beginning of the project itself, with the development of the semantic web (SW), which foresaw much more intelligent manipulation and querying of knowledge.
    The opportunities that the SW provided (e.g., more intelligent retrieval) put AKT at the centre of information technology innovation and knowledge management services; the AKT skill set would clearly be central to the exploitation of those opportunities. The SW, as an extension of the WWW, provides an interesting set of constraints on the knowledge management services AKT tries to provide. As a medium for the semantically-informed coordination of information, it has suggested a number of ways in which the objectives of AKT can be achieved, most obviously through the provision of knowledge management services delivered over the web, as opposed to the creation and provision of technologies to manage knowledge.

    AKT is working on the assumption that many web services will be developed and provided for users. The KM problem in the near future will be one of deciding which services are needed and of coordinating them. Many of these services will be largely or entirely legacies of the WWW, and so the capabilities of the services will vary. As well as providing useful KM services in their own right, AKT will aim to exploit this opportunity by reasoning over services, brokering between them, and providing essential meta-services for SW knowledge service management.

    Ontologies will be a crucial tool for the SW. The AKT consortium brings together a great deal of expertise on ontologies, and ontologies were always going to be a key part of the strategy. All kinds of knowledge sharing and transfer activities will be mediated by ontologies, and ontology management will be an important enabling task. Different applications will need to cope with inconsistent ontologies, or with the problems that will follow the automatic creation of ontologies (e.g. the merging of pre-existing ontologies to create a third). Ontology mapping, and the elimination of conflicts of reference, will be important tasks. All of these issues are discussed along with our proposed technologies.

    Similarly, specifications of tasks will be used for the deployment of knowledge services over the SW, but in general it cannot be expected that there will be standards for task (or service) specifications in the medium term. The brokering meta-services that are envisaged will have to deal with this heterogeneity.

    The emerging picture of the SW is one of great opportunity, but it will not be a well-ordered, certain or consistent environment. It will comprise many repositories of legacy data, outdated and inconsistent stores, and requirements for common understandings across divergent formalisms. There is clearly a role for standards to play in bringing much of this context together, and AKT is playing a significant role in these efforts. But standards take time to emerge, they take political power to enforce, and they have been known to stifle innovation (in the short term). AKT is keen to understand the balance between principled inference and statistical processing of web content. Logical inference on the Web is tough: complex queries using traditional AI inference methods bring most distributed computer systems to their knees. Do we set up semantically well-behaved areas of the Web? Is any part of the Web in which semantic hygiene prevails interesting enough to reason in? These and many other questions need to be addressed if we are to provide effective knowledge technologies for our content on the web.
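
    As a concrete illustration of the kind of semantically-informed querying the SW enables, here is a minimal sketch using rdflib and a toy ontology (an assumption for illustration; this is not AKT's own tooling or data):

        from rdflib import Graph

        # Toy RDF data describing projects and their topics
        # (illustrative only; not an AKT resource).
        turtle = """
        @prefix ex: <http://example.org/> .
        ex:AKT  ex:investigates   ex:KnowledgeManagement .
        ex:AKT  ex:usesTechnology ex:Ontologies .
        ex:COIN ex:investigates   ex:DataIntegration .
        """

        g = Graph()
        g.parse(data=turtle, format="turtle")

        # A SPARQL query over the graph: which topics does each project investigate?
        query = """
        PREFIX ex: <http://example.org/>
        SELECT ?project ?topic
        WHERE { ?project ex:investigates ?topic . }
        """
        for project, topic in g.query(query):
            print(project, "->", topic)

    Structured queries of this kind, rather than keyword search over flat documents, are the capability the paper argues knowledge services can build on.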

    B!SON: A Tool for Open Access Journal Recommendation

    Finding a suitable open access journal to publish scientific work is a complex task: researchers have to navigate a constantly growing number of journals, institutional agreements with publishers, funders' conditions and the risk of predatory publishers. To help with these challenges, we introduce a web-based journal recommendation system called B!SON. It is developed based on a systematic requirements analysis, built on open data, gives publisher-independent recommendations and works across domains. It suggests open access journals based on the title, abstract and references provided by the user. The recommendation quality has been evaluated using a large test set of 10,000 articles. Development by two German scientific libraries ensures the longevity of the project.
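
    The abstract does not specify B!SON's matching method. A minimal sketch of one common approach to text-based journal recommendation, TF-IDF cosine similarity between a manuscript and journal profiles (an assumption for illustration, using scikit-learn; B!SON's actual index and scoring may differ):

        from sklearn.feature_extraction.text import TfidfVectorizer
        from sklearn.metrics.pairwise import cosine_similarity

        # Toy journal profiles built from titles/abstracts of past articles
        # (illustrative data only).
        journals = {
            "Journal of Web Semantics": "ontologies linked data semantic web reasoning",
            "Scientometrics": "citation analysis research trends bibliometrics",
            "NLP Open": "language models text mining corpora annotation",
        }

        manuscript = "detecting emerging research topics with citation networks"

        vectorizer = TfidfVectorizer()
        matrix = vectorizer.fit_transform(list(journals.values()) + [manuscript])

        # Similarity of the manuscript (last row) to each journal profile.
        scores = cosine_similarity(matrix[-1], matrix[:-1]).ravel()
        for name, score in sorted(zip(journals, scores), key=lambda x: -x[1]):
            print(f"{score:.3f}  {name}")

    A production system would add the reference-based signals the abstract mentions, but the ranking step follows the same pattern.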

    Improving National and Homeland Security through a proposed Laboratory for Information Globalization and Harmonization Technologies (LIGHT)

    A recent National Research Council study found that: "Although there are many private and public databases that contain information potentially relevant to counter terrorism programs, they lack the necessary context definitions (i.e., metadata) and access tools to enable interoperation with other databases and the extraction of meaningful and timely information" [NRC02, p.304, emphasis added]. That sentence succinctly describes the objectives of this project. Improved access and use of information are essential to better identify and anticipate threats, protect against and respond to threats, and enhance national and homeland security (NHS), as well as other national priority areas, such as Economic Prosperity and a Vibrant Civil Society (ECS) and Advances in Science and Engineering (ASE). This project focuses on the creation and contributions of a Laboratory for Information Globalization and Harmonization Technologies (LIGHT) with two interrelated goals:

    (1) Theory and Technologies: To research, design, develop, test, and implement theory and technologies for improving the reliability, quality, and responsiveness of automated mechanisms for reasoning about and resolving semantic differences that hinder the rapid and effective integration (int) of systems and data (dmc) across multiple autonomous sources, and the use of that information by public and private agencies involved in national and homeland security and the other national priority areas involving complex and interdependent social systems (soc). This work builds on our research on the COntext INterchange (COIN) project, which focused on the integration of diverse distributed heterogeneous information sources using ontologies, databases, context mediation algorithms, and wrapper technologies to overcome information representational conflicts. The COIN approach makes it substantially easier and more transparent for individual receivers (e.g., applications, users) to access and exploit distributed sources. Receivers specify their desired context to reduce ambiguities in the interpretation of information coming from heterogeneous sources. This approach significantly reduces the overhead involved in the integration of multiple sources, improves data quality, increases the speed of integration, and simplifies maintenance in an environment of changing source and receiver contexts, which will lead to an effective and novel distributed information grid infrastructure. This research also builds on our Global System for Sustainable Development (GSSD), an Internet platform for information generation, provision, and integration across multiple domains, regions, languages, and epistemologies relevant to international relations and national security.

    (2) National Priority Studies: To experiment with and test the developed theory and technologies on practical problems of data integration in national priority areas. Particular focus will be on national and homeland security, including data sources about conflict and war, modes of instability and threat, international and regional demographic, economic, and military statistics, money flows, and contextualizing terrorism defense and response. Although LIGHT will leverage the results of our successful prior research projects, this will be the first research effort to simultaneously and effectively address ontological and temporal information conflicts as well as dramatically enhance information quality.
    Addressing problems of national priorities in such rapidly changing complex environments requires extraction of observations from disparate sources, using different interpretations, at different points in time, for different purposes, with different biases, and for a wide range of different uses and users. This research will focus on integrating information both within individual domains and across multiple domains. Another innovation is the concept and implementation of Collaborative Domain Spaces (CDS), within which applications in a common domain can share, analyze, modify, and develop information. Applications can also span multiple domains via Linked CDSs. The PIs have considerable experience with these research areas and with the organization and management of such large-scale international and diverse research projects. The PIs come from three different Schools at MIT: Management; Engineering; and Humanities, Arts & Social Sciences. The faculty and graduate students come from about a dozen nationalities and diverse ethnic, racial, and religious backgrounds. The currently identified external collaborators come from over 20 different organizations and many different countries, industrialized as well as developing. Specific efforts are proposed to engage even more women, underrepresented minorities, and persons with disabilities. The anticipated results apply to any complex domain that relies on heterogeneous distributed data to address and resolve compelling problems. This initiative is supported by international collaborators from (a) scientific and research institutions, (b) business and industry, and (c) national and international agencies. Research products include a System for Harmonized Information Processing (SHIP), a software platform, and diverse applications in research and education, which are anticipated to significantly impact the way complex organizations, and society in general, understand and manage critical challenges in NHS, ECS, and ASE.
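
    The abstract describes COIN-style context mediation, in which receivers declare a context and values from heterogeneous sources are converted into it. A minimal sketch of that idea (hypothetical names and rates; not COIN's actual implementation):

        # Each source reports values in its own context (currency, scale);
        # a mediator converts them to the receiver's declared context.
        CURRENCY_TO_USD = {"USD": 1.0, "EUR": 1.1, "GBP": 1.3}  # illustrative rates
        SCALE = {"units": 1, "thousands": 1_000, "millions": 1_000_000}

        def mediate(value, source_ctx, receiver_ctx):
            """Convert a numeric value from the source context to the receiver context."""
            usd = value * SCALE[source_ctx["scale"]] * CURRENCY_TO_USD[source_ctx["currency"]]
            converted = usd / CURRENCY_TO_USD[receiver_ctx["currency"]]
            return converted / SCALE[receiver_ctx["scale"]]

        # Two sources report budget figures in different contexts.
        src_a = {"currency": "EUR", "scale": "millions"}
        src_b = {"currency": "GBP", "scale": "thousands"}
        receiver = {"currency": "USD", "scale": "millions"}

        print(mediate(3.2, src_a, receiver))    # EUR millions -> USD millions: 3.52
        print(mediate(2700, src_b, receiver))   # GBP thousands -> USD millions: 3.51

    The point of the approach is that each receiver states its context once, rather than hand-coding pairwise conversions between every source and every consumer.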

    Early Detection of Research Trends

    Being able to rapidly recognise new research trends is strategic for many stakeholders, including universities, institutional funding bodies, academic publishers and companies. The literature presents several approaches to identifying the emergence of new research topics, which rely on the assumption that the topic is already exhibiting a certain degree of popularity and is consistently referred to by a community of researchers. However, detecting the emergence of a new research area at an embryonic stage, i.e., before the topic has been consistently labelled by a community of researchers and associated with a number of publications, is still an open challenge. In this dissertation, we begin to address this challenge by performing a study of the dynamics preceding the creation of new topics. This study indicates that the emergence of a new topic is anticipated by a significant increase in the pace of collaboration between relevant research areas, which can be seen as the 'ancestors' of the new topic. Based on this understanding, we developed Augur, a novel approach for effectively detecting the emergence of new research topics. Augur analyses the diachronic relationships between research areas and is able to detect clusters of topics that exhibit dynamics correlated with the emergence of new research topics. We also present the Advanced Clique Percolation Method (ACPM), a new community detection algorithm developed specifically to support this task. Augur was evaluated on a gold standard of 1,408 debutant topics in the 2000-2011 timeframe and outperformed four alternative approaches in terms of both precision and recall.
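
    The dissertation's key observation is that a new topic is anticipated by an accelerating pace of collaboration between its 'ancestor' areas. A minimal sketch of how such acceleration could be measured from yearly topic co-occurrence counts (illustrative data and threshold, not Augur's actual statistics):

        # Flag topic pairs whose co-occurrence (e.g., papers sharing both
        # topics) grows unusually fast year over year.
        cooccurrence = {
            ("semantic web", "machine learning"): {2004: 3, 2005: 5, 2006: 12, 2007: 30},
            ("databases", "compilers"):           {2004: 4, 2005: 4, 2006: 5,  2007: 4},
        }

        def growth_rates(series):
            """Year-over-year relative growth of a count series."""
            years = sorted(series)
            return [
                (series[b] - series[a]) / series[a]
                for a, b in zip(years, years[1:])
                if series[a] > 0
            ]

        THRESHOLD = 0.5  # flag pairs averaging more than 50% growth per year
        for pair, series in cooccurrence.items():
            rates = growth_rates(series)
            avg = sum(rates) / len(rates)
            if avg > THRESHOLD:
                print(f"accelerating collaboration: {pair} (avg growth {avg:.0%})")

    In this toy example only the first pair is flagged; Augur applies far richer dynamics over clusters of topics, but the underlying signal is of this kind.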

    Bibliographic Control in the Digital Ecosystem

    With contributions from international experts, the book aims to explore the new boundaries of universal bibliographic control. Bibliographic control is radically changing because the bibliographic universe is radically changing: resources, agents, technologies, standards and practices. Among the main topics addressed: library cooperation networks; legal deposit; national bibliographies; new tools and standards (IFLA LRM, RDA, BIBFRAME); authority control and new alliances (Wikidata, Wikibase, identifiers); new ways of indexing resources (artificial intelligence); institutional repositories; the new book supply chain; “discoverability” in the IIIF digital ecosystem; the role of thesauri and ontologies in the digital ecosystem; and bibliographic control and search engines.

    Cross-Platform Text Mining and Natural Language Processing Interoperability - Proceedings of the LREC2016 conference

    No abstract available
