22,880 research outputs found
From Keyword Search to Exploration: How Result Visualization Aids Discovery on the Web
A key to the Web's success is the power of search. The elegant way in which search results are returned is usually remarkably effective. However, for exploratory search in which users need to learn, discover, and understand novel or complex topics, there is substantial room for improvement. Human-computer interaction researchers and web browser designers have developed novel strategies to improve Web search by enabling users to conveniently visualize, manipulate, and organize their Web search results. This monograph offers fresh ways to think about search-related cognitive processes and describes innovative design approaches to browsers and related tools. For instance, while keyword search presents users with results for specific information (e.g., what is the capital of Peru), other methods may let users see and explore the contexts of their requests for information (related or previous work, conflicting information), or the properties that associate groups of information assets (group legal decisions by lead attorney). We also consider both the traditional and novel ways in which these strategies have been evaluated. From our review of cognitive processes, browser design, and evaluations, we reflect on the future opportunities and new paradigms for exploring and interacting with Web search results.
Text Analytics for Android Project
Most advanced text analytics and text mining tasks include text classification, text clustering, ontology building, concept/entity extraction, document summarization, deriving patterns within the structured data, production of granular taxonomies, sentiment and emotion analysis, entity relation modelling, and interpretation of the output. Existing text analytics and text mining tools cannot develop text material alternatives (perform a multivariant design), perform multiple criteria analysis, automatically select the most effective variant according to different aspects (citation index of papers (Scopus, ScienceDirect, Google Scholar) and authors (Scopus, ScienceDirect, Google Scholar), Top 25 papers, impact factor of journals, supporting phrases, document name and contents, density of keywords), or calculate utility degree and market value. However, the Text Analytics for Android Project can perform the aforementioned functions. To the best of the authors' knowledge, these functions have not been previously implemented; thus this is the first attempt to do so. The Text Analytics for Android Project is briefly described in this article.
STARGATE : Static Repository Gateway and Toolkit. Final Project Report
STARGATE (Static Repository Gateway and Toolkit) was funded by the Joint Information Systems Committee (JISC) and is intended to demonstrate the ease of use of the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) Static Repository technology, and the potential benefits offered to publishers in making their metadata available in this way. This technology offers a simpler method of participating in many information discovery services than creating fully-fledged OAI-compliant repositories. It does this by allowing the infrastructure and technical support required to participate in OAI-based services to be shifted from the data provider (the journal) to a third party, and it allows a single third-party gateway provider to provide intermediation for many data providers (journals). Specifically, STARGATE has created a series of Static Repositories of publisher metadata provided by a selection of Library and Information Science journals. It has demonstrated the interoperability of these repositories by exposing their metadata via a Static Repository Gateway for harvesting and cross-searching by external service providers. The project has conducted a critical evaluation of the Static Repository approach in conjunction with the participating publishers and service providers. The technology works. The project has demonstrated that Static Repositories are easy to create and that the differences between fully-fledged and static OAI Repositories have no impact on the participation of small journal publishers in OAI-based services. The problems for a service that arise out of the use of Static Repositories are parallel to those created by any other repository dealing with journal articles. Problems arise from the diversity of metadata element sets provided by a given journal and the lack of specific metadata elements for the articles' volume and issue details.
Another issue for the use of publishers' metadata arises because the collection policies of some existing services allow only Open Access materials to be included in them. The project recommends that the use of Static Repositories continue to be explored, in particular as a flexible way to expose existing sets of structured information to OAI services and to create the opportunity to enhance the metadata as part of the process. The project further recommends that the publishing community consider the creation or adoption of an application profile for journal articles to support information discovery that can search by volume and issue. Significant further use of the Static Repository technology by small journal publishers will require the future creation and maintenance of a community-specific Static Repository Gateway. Further use will also require advocacy within the publishing community but might initially be most effectively kick-started through the creation of OAI repositories based on metadata held by the commercial services which publish or mediate access to electronic copies of journals on behalf of small publishers.
Clustering of Twitter technology tweets and the impact of stopwords on clusters
The year 2010 could be termed the year in which Twitter became completely mainstream. Twitter, which started as a means of communicating with friends, has become much more than that. Companies now use Twitter to promote their new products, and the movie industry uses it to promote movies. A lot of advertising and branding is now tied to Twitter and, most importantly, Twitter is the first place people search when news breaks. Be it the Mumbai attacks of 2008, the minor earthquakes in the Bay Area in 2010, or the Twitter revolution caused by the Iran elections, most viewers, tech-savvy or not, were following Twitter rather than mainstream news channels. In fact, most breaking news now appears on Twitter, because of its huge user base, rather than in the traditional mainstream media. The focus of this paper is to cluster the daily technology news tweets of prominent bloggers and news sites, using TF-IDF weighting and Apache Mahout, and to evaluate the effects of introducing and removing stop words on the quality of clustering. This project restricts itself to tweets in the English language.
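As a rough illustration of the idea in this abstract, the sketch below computes TF-IDF vectors for a few short texts with and without stop-word removal and compares them by cosine similarity, the kind of measure a clustering step might rely on. It is plain Python rather than Apache Mahout, and the stop-word list, the function names, and the particular TF-IDF variant (term frequency times log(N/df)) are all assumptions made for illustration, not details taken from the paper.

```python
import math
from collections import Counter

# Tiny illustrative stop-word list (an assumption, not the paper's list).
STOP_WORDS = {"the", "a", "is", "on", "of", "for", "and", "to", "in"}


def tokenize(text, remove_stop_words=False):
    """Lowercase and split on whitespace, optionally dropping stop words."""
    tokens = text.lower().split()
    if remove_stop_words:
        tokens = [t for t in tokens if t not in STOP_WORDS]
    return tokens


def tfidf_vectors(docs, remove_stop_words=False):
    """Return one sparse TF-IDF vector (term -> weight) per document."""
    tokenized = [tokenize(d, remove_stop_words) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for toks in tokenized for term in set(toks))
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        vectors.append(
            {t: (c / len(toks)) * math.log(n_docs / df[t]) for t, c in tf.items()}
        )
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)
```

With this weighting, a word that occurs in every text receives a weight of zero, so very common stop words contribute little either way; words that are frequent but not universal are where removal can change the similarities that drive clustering.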
Development and evaluation of clustering techniques for finding people
Typically in a large organisation much expertise and knowledge is held informally within employees' own memories. When employees leave an organisation many documented links that go through that person are broken and no mechanism is usually available to overcome these broken links. This match making problem is related to the problem of finding potential work partners in a large and distributed organisation. This paper reports a comparative investigation into using standard information retrieval techniques to group employees together based on their webpages. This information can, hopefully, be subsequently used to redirect broken links to people who worked closely with a departed employee or used to highlight people, say in different departments, who work on similar topics. The paper reports the design and positive results of an experiment conducted at Risø National Laboratory comparing four different IR searching and clustering approaches using real users' web pages.
Computer-based library or computer-based learning?
Traditionally, libraries have played the role of repository of published information resources and, more recently, gateway to online subscription databases. The library online catalog and digital library interface serve an intermediary function to help users locate information resources available through the library. With competition from Web search engines and Web portals of various kinds available for free, the library has to step up to play a more active role as guide and coach to help users make use of information resources for learning or to accomplish particular tasks. It is no longer sufficient for computer-based library systems to provide just search and access functions. They must provide the functionality and environment to support learning and become computer-based learning systems. This paper examines the kind of learning support that can be incorporated in library online catalogs and digital libraries, including 1) enhanced support for information browsing and synthesis through linking by shared meta-data, references and concepts; 2) visualization of related information; 3) adoption of Library 2.0 and social technologies; 4) adoption of Library 3.0 technologies including intelligent processing and text mining.
Detecting Family Resemblance: Automated Genre Classification.
This paper presents results in automated genre classification of digital documents in PDF format. It describes genre classification as an important ingredient in contextualising scientific data and in retrieving targeted material for improving research. The current paper compares the role of visual layout, stylistic features and language model features in clustering documents and presents results in retrieving five selected genres (Scientific Article, Thesis, Periodicals, Business Report, and Form) from a pool of materials populated with documents of the nineteen most popular genres found in our experimental data set.