
    Natural language processing

    Beginning with the basic issues of NLP, this chapter aims to chart the major research activities in this area since the last ARIST chapter in 1996 (Haas, 1996), including: (i) natural language text processing systems (text summarization, information extraction, information retrieval, etc.), including domain-specific applications; (ii) natural language interfaces; (iii) NLP in the context of the WWW and digital libraries; and (iv) evaluation of NLP systems.

    Managing complexity in a distributed digital library

    As the capabilities of distributed digital libraries increase, managing organizational and software complexity becomes a key issue. How can collections and indexes be updated without impacting queries currently in progress? How can the system handle several user-interface clients for the same collections? Computer science professors and lectors from the University of Waikato have developed a software structure that successfully manages this complexity in the New Zealand Digital Library. This digital library has been a success in managing organizational and software complexity. The researchers' primary goal has been to minimize the effort required to keep the system operational and yet continue to expand its offerings

    Escaping the Trap of too Precise Topic Queries

    At the very center of digital mathematics libraries lie controlled vocabularies which qualify the topic of the documents. These topics are used when submitting a document to a digital mathematics library and when performing searches in a library. Searches are refined by the use of these topics, as they allow a precise classification of the area of mathematics a document addresses. However, there is a major risk that users employ overly precise topics to specify their queries: they may employ a topic that is only "close-by" and therefore fail to match the right resource. We call this the topic trap. Indeed, since 2009, this issue has appeared frequently on the i2geo.net platform. Other mathematics portals experience the same phenomenon. One approach to solving this issue is to introduce tolerance in the way queries are understood by the user, in particular by including fuzzy matches, but this introduces noise which may prevent the user from understanding how the search engine functions. In this paper, we propose a way to escape the topic trap by employing navigation between related topics and the count of search results for each topic. This supports the user in that a search for close-by topics is a click away from a previous search. This approach was realized with the i2geo search engine and is described in detail; the relation of being related is computed by textual analysis of the definitions of the concepts fetched from the Wikipedia encyclopedia. Comment: 12 pages, Conference on Intelligent Computer Mathematics 2013 Bath, U

    Abmash: Mashing Up Legacy Web Applications by Automated Imitation of Human Actions

    Many web-based business applications do not offer application programming interfaces (APIs) that enable other applications to access their data and functions programmatically. This makes their composition difficult (for instance, to synchronize data between two applications). To address this challenge, this paper presents Abmash, an approach to facilitate the integration of such legacy web applications by automatically imitating human interactions with them. By automatically interacting with the graphical user interface (GUI) of web applications, the system supports all forms of integration, including bi-directional interactions, and is able to interact with AJAX-based applications. Furthermore, the integration programs are easy to write since they deal with end-user, visual user-interface elements. The integration code is simple enough to be called a "mashup". Comment: Software: Practice and Experience (2013)

    A Word Sense-Oriented User Interface for Interactive Multilingual Text Retrieval

    In this paper we present an interface for supporting a user in an interactive cross-language search process using semantic classes. In order to enable users to access multilingual information, different problems have to be solved: disambiguating and translating the query words, as well as categorizing and presenting the results appropriately. Therefore, we first give a brief introduction to word sense disambiguation, cross-language text retrieval and document categorization and finally describe recent achievements of our research towards an interactive multilingual retrieval system. We focus especially on the problem of browsing and navigation of the different word senses in one source and possibly several target languages. In the last part of the paper, we discuss the developed user interface and its functionalities in more detail

    1st INCF Workshop on Sustainability of Neuroscience Databases

    The goal of the workshop was to discuss issues related to the sustainability of neuroscience databases, identify problems and propose solutions, and formulate recommendations to the INCF. The report summarizes the discussions of invited participants from the neuroinformatics community as well as from other disciplines where sustainability issues have already been approached. The recommendations for the INCF involve rating, ranking, and supporting database sustainability

    Social media analytics: a survey of techniques, tools and platforms

    This paper is written for (social science) researchers seeking to analyze the wealth of social media now available. It presents a comprehensive review of software tools for social networking media, wikis, really simple syndication (RSS) feeds, blogs, newsgroups, chat and news feeds. For completeness, it also includes introductions to social media scraping, storage, data cleaning and sentiment analysis. Although principally a review, the paper also provides a methodology and a critique of social media tools. Analyzing social media, in particular Twitter feeds for sentiment analysis, has become a major research and business activity due to the availability of web-based application programming interfaces (APIs) provided by Twitter, Facebook and news services. This has led to an ‘explosion’ of data services, software tools for scraping and analysis, and social media analytics platforms. It is also a research area undergoing rapid change and evolution due to commercial pressures and the potential for using social media data for computational (social science) research. Using a simple taxonomy, this paper provides a review of leading software tools and how to use them to scrape, cleanse and analyze the spectrum of social media. In addition, it discusses the requirement for an experimental computational environment for social media research and presents, as an illustration, the system architecture of a social media (analytics) platform built by University College London. The principal contribution of this paper is to provide an overview (including code fragments) for scientists seeking to utilize social media scraping and analytics in either their research or business. The data retrieval techniques presented in this paper are valid at the time of writing (June 2014), but they are subject to change since social media data scraping APIs are rapidly changing.