869 research outputs found
Hybrid Information Retrieval Model For Web Images
The Big Bang of the Internet in the early 1990s dramatically increased the
number of images distributed and shared over the web. As a result, image
information retrieval systems were developed to index and retrieve image files
spread over the Internet. Most of these systems are keyword-based: they search
for images using their textual metadata and are therefore imprecise, because
describing an image in human language is inherently vague. There are also
content-based image retrieval systems, which search for images based on their
visual information. However, content-based systems are still immature and
not very effective, as they suffer from low retrieval recall and precision.
This paper proposes a new hybrid image information retrieval model for indexing
and retrieving web images published in HTML documents. The distinguishing mark
of the proposed model is that it is based on both graphical content and textual
metadata. The graphical content is denoted by color features and color
histogram of the image; while textual metadata are denoted by the terms that
surround the image in the HTML document, more particularly, the terms that
appear in the tags p, h1, and h2, in addition to the terms that appear in the
image's alt attribute, filename, and class-label. Moreover, this paper presents
a new term weighting scheme called VTF-IDF, short for Variable Term
Frequency-Inverse Document Frequency, which, unlike traditional schemes,
exploits the HTML tag structure and assigns an extra bonus weight to terms
that appear within particular HTML tags that are correlated with the
semantics of the image. Experiments conducted to evaluate the proposed IR model
showed a high retrieval precision rate that outpaced other current models.
Comment: LACSC - Lebanese Association for Computational Sciences,
http://www.lacsc.org/; International Journal of Computer Science & Emerging
Technologies (IJCSET), Vol. 3, No. 1, February 201
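The tag-aware weighting idea described in this abstract can be illustrated with a minimal sketch. The abstract does not give the actual VTF-IDF formula or bonus values, so everything here is an assumption: the `TAG_BONUS` table, the function names `variable_tf` and `vtf_idf`, and the choice to add the bonus to the term frequency before multiplying by a standard logarithmic IDF.

```python
import math
from collections import Counter

# Hypothetical bonus weights per term source; the abstract names the
# sources (p, h1, h2, alt, filename, class-label) but not their weights.
TAG_BONUS = {"alt": 3.0, "filename": 3.0, "class": 2.0,
             "h1": 2.0, "h2": 1.5, "p": 1.0}


def variable_tf(tagged_terms):
    """Sum a tag-dependent weight for each (term, tag) occurrence."""
    tf = Counter()
    for term, tag in tagged_terms:
        tf[term.lower()] += TAG_BONUS.get(tag, 1.0)
    return tf


def vtf_idf(docs):
    """Return one {term: weight} dict per document.

    docs is a list of documents, each a list of (term, tag) pairs
    collected from the text surrounding an image.
    """
    n_docs = len(docs)
    tfs = [variable_tf(doc) for doc in docs]
    # Document frequency: in how many documents each term occurs.
    df = Counter()
    for tf in tfs:
        df.update(tf.keys())
    return [
        {term: weight * math.log(n_docs / df[term]) for term, weight in tf.items()}
        for tf in tfs
    ]
```

Under these assumed weights, a term found in an image's alt attribute contributes three times as much as the same term in a plain paragraph, while a term that occurs in every document still receives a weight of zero from the IDF factor.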
Evaluating the retrieval effectiveness of Web search engines using a representative query sample
Search engine retrieval effectiveness studies are usually small-scale, using
only limited query samples. Furthermore, queries are selected by the
researchers. We address these issues by taking a random representative sample
of 1,000 informational and 1,000 navigational queries from a major German
search engine and comparing Google's and Bing's results based on this sample.
Jurors were found through crowdsourcing, and data was collected using specialised
software, the Relevance Assessment Tool (RAT). We found that while Google
outperforms Bing in both query types, the difference in the performance for
informational queries was rather low. However, for navigational queries, Google
found the correct answer in 95.3 per cent of cases whereas Bing only found the
correct answer 76.6 per cent of the time. We conclude that search engine
performance on navigational queries is of great importance, as users in this
case can clearly identify queries that have returned correct results. So,
performance on this query type may contribute to explaining user satisfaction
with search engines
Access to information in digital libraries : users and digital divide
Recognising the importance of information and knowledge in all spheres of human life, the recently held World Summit on Information Society came up with a plan of action for building a global information society. The goal of the world information society initiatives is the same as that of digital library research and development: to make information and knowledge accessible to everyone in the world. Digital libraries have progressed very rapidly over the past ten or so years. This paper addresses the two most important aspects of the information society: information users and the digital divide. Findings of some large-scale studies on human information behaviour on the web and in digital libraries are discussed. The major findings of a study on access to electronic resources by university students are then presented. It is proposed that a one-stop window approach with a task-based information organisation and access system may be the way forward.
WAQS : a web-based approximate query system
The Web is often viewed as a gigantic database that holds vast stores of information and provides ubiquitous accessibility to end-users. Since its inception, the Internet has experienced explosive growth both in the number of users and the amount of content available on it. However, searching for information on the Web has become increasingly difficult. Although query languages have long been part of database management systems, the standard query language, Structured Query Language (SQL), is not suitable for Web content retrieval.
In this dissertation, a new technique for document retrieval on the Web is presented. This technique is designed to allow detailed retrieval and hence reduce the number of matches returned by typical search engines. The main objective of this technique is to allow the query to be based not just on keywords but also on the location of the keywords within the logical structure of a document. In addition, the technique provides approximate search capabilities based on the notion of Distance and Variable Length Don't Cares. The proposed techniques have been implemented in a system called Web-Based Approximate Query System, which contains an SQL-like query language called Web-Based Approximate Query Language.
Web-Based Approximate Query Language has also been integrated with EnviroDaemon, an environmental domain-specific search engine. It provides EnviroDaemon with more detailed searching capabilities than keyword-based search alone. Implementation details, technical results and future work are presented in this dissertation.
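Approximate matching with a distance bound and variable length don't cares can be sketched with a generic dynamic program. This is not the dissertation's algorithm or query syntax, which the abstract does not describe; the function names `vldc_distance` and `approx_match` and the use of `*` as the wildcard are assumptions made for illustration.

```python
def vldc_distance(pattern, text, wildcard="*"):
    """Edit distance between pattern and text, where each wildcard in the
    pattern may absorb any run of text characters (including none) at no cost.
    """
    rows, cols = len(pattern) + 1, len(text) + 1
    dist = [[0] * cols for _ in range(rows)]
    # An empty pattern needs one insertion per text character.
    for j in range(1, cols):
        dist[0][j] = j
    for i in range(1, rows):
        is_wild = pattern[i - 1] == wildcard
        # Against empty text, a wildcard costs nothing; any other
        # pattern character must be deleted.
        dist[i][0] = dist[i - 1][0] + (0 if is_wild else 1)
        for j in range(1, cols):
            if is_wild:
                # Wildcard matches nothing, or absorbs one more text character.
                dist[i][j] = min(dist[i - 1][j], dist[i][j - 1])
            else:
                substitution = dist[i - 1][j - 1] + (pattern[i - 1] != text[j - 1])
                dist[i][j] = min(substitution, dist[i - 1][j] + 1, dist[i][j - 1] + 1)
    return dist[-1][-1]


def approx_match(pattern, text, max_distance=1):
    """True if text matches pattern within max_distance edits."""
    return vldc_distance(pattern, text) <= max_distance
```

For example, the pattern `env*ment` matches `environment` exactly, because the wildcard absorbs `iron`, while `env*mant` matches it only when one edit is allowed.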
Multimedia Chinese Web Search Engines: A Survey
The objective of this paper is to explore the state of multimedia search functionality on major general and dedicated Web search engines in the Chinese language. The authors studied: a) how many Chinese Web search engines presently make use of multimedia searching, and b) the type of multimedia search functionality available. Specifically, the following were examined: a) multimedia features - features allowing multimedia search; and b) extent of personalization - the extent to which a search engine Web site allows users to control multimedia search. Overall, Chinese Web search engines offer limited multimedia searching functionality. The significance of the study is based on two factors: a) little research has been conducted on Chinese Web search engines, and b) the instrument used in the study and the results obtained by this research could help users, Web designers, and Web search engine developers. By and large, general Web search engines support more multimedia features than specialized ones.
Concept hierarchy across languages in text-based image retrieval: a user evaluation
The University of Sheffield participated in Interactive ImageCLEF 2005 with a comparative user
evaluation of two interfaces: one displaying search results as a list, the other organizing retrieved images into
a hierarchy of concepts displayed on the interface as an interactive menu. Data was analysed with respect to
effectiveness (number of images retrieved), efficiency (time needed) and user satisfaction (opinions from
questionnaires). Effectiveness and efficiency were calculated at both 5 minutes (CLEF condition) and at final
time. The list was marginally more effective than the menu at 5 minutes (no statistical significance) but the
two were equal at final time showing the menu needs more time to be effectively used. The list was more efficient
at both 5 minutes and final time, although the difference was not statistically significant. Users preferred
the menu (75% vs. 25% for the list) indicating it to be an interesting and engaging feature. An inspection
of the logs showed that 11% of effective terms (i.e. no stop-words, single terms) were not translated and
that another 5% were mistranslated. Some of those terms were used by all participants and were fundamental
for some of the tasks. Untranslated and mistranslated terms negatively affected the search, hierarchy generation
and results display. More work has to be carried out to test the system under different settings, e.g. using
a dictionary instead of MT, which appears to be ineffective in translating users' queries that are rarely
grammatically correct. The evaluation also indicated directions for a new interface design that allows the user
to check query translation (in both input and output) and that incorporates visual content image retrieval to
improve result organization
How people find videos
At present very little is known about how people locate and view videos 'in the wild'. This study draws a rich picture of everyday video seeking strategies and video information needs, based on an ethnographic study of New Zealand university students. These insights into the participants' activities and motivations suggest potentially useful facilities for a video digital library
Finding video on the web
At present very little is known about how people locate and view videos. This study draws a rich picture of everyday video seeking strategies and video information needs, based on an ethnographic study of New Zealand university students. These insights into the participants' activities and motivations suggest potentially useful facilities for a video digital library
Template Mining for Information Extraction from Digital Documents
published or submitted for publication