13,922 research outputs found
Unifying an Introduction to Artificial Intelligence Course through Machine Learning Laboratory Experiences
This paper presents work on a collaborative project funded by the National Science Foundation that incorporates machine learning as a unifying theme to teach fundamental concepts typically covered in the introductory Artificial Intelligence courses. The project involves the development of an adaptable framework for the presentation of core AI topics. This is accomplished through the development, implementation, and testing of a suite of adaptable, hands-on laboratory projects that can be closely integrated into the AI course. Through the design and implementation of learning systems that enhance commonly-deployed applications, our model acknowledges that intelligent systems are best taught through their application to challenging problems. The goals of the project are to (1) enhance the student learning experience in the AI course, (2) increase student interest and motivation to learn AI by providing a framework for the presentation of the major AI topics that emphasizes the strong connection between AI and computer science and engineering, and (3) highlight the bridge that machine learning provides between AI technology and modern software engineering
An Experimental Digital Library Platform - A Demonstrator Prototype for the DigLib Project at SICS
Within the framework of the Digital Library project at SICS, this thesis describes the implementation of a demonstrator prototype of a digital library (DigLib); an experimental platform integrating several functions in one common interface. It includes descriptions of the structure and formats of the digital library collection, the tailoring of the search engine Dienst, the construction of a keyword extraction tool, and the design and development of the interface. The platform was realised through sicsDAIS, an agent interaction and presentation system, and is to be used for testing and evaluating various tools for information seeking. The platform supports various user interaction strategies by providing: search in bibliographic records (Dienst); an index of keywords (the Keyword Extraction Function (KEF)); and browsing through the hierarchical structure of the collection. KEF was developed for this thesis work, and extracts and presents keywords from Swedish documents. Although based on a comparatively simple algorithm, KEF contributes by supplying a long-felt want in the area of Information Retrieval. Evaluations of the tasks and the interface still remain to be done, but the digital library is very much up and running. By implementing the platform through sicsDAIS, DigLib can deploy additional tools and search engines without interfering with already running modules. If wanted, agents providing other services than SICS can supply, can be plugged in
CHORUS Deliverable 2.1: State of the Art on Multimedia Search Engines
Based on the information provided by European projects and national initiatives related to multimedia search as well as domains experts that participated in the CHORUS Think-thanks and workshops, this document reports on the state of the art related to multimedia content search from, a technical, and socio-economic perspective.
The technical perspective includes an up to date view on content based indexing and retrieval technologies, multimedia search in the context of mobile devices and peer-to-peer networks, and an overview of current evaluation and benchmark inititiatives to measure the performance of multimedia search engines.
From a socio-economic perspective we inventorize the impact and legal consequences of these technical advances and point out future directions of research
An Improved PageRank Method based on Genetic Algorithm for Web Search
AbstractWeb search engine has become a very important tool for finding information efficiently from the massive Web data. Based on PageRank algorithm, a genetic PageRank algorithm (GPRA) is proposed. With the condition of preserving PageRank algorithm advantages, GPRA takes advantage of genetic algorithm so as to solve web search. Experimental results have shown that GPRA is superior to PageRank algorithm and genetic algorithm on performance
Overview of VideoCLEF 2009: New perspectives on speech-based multimedia content enrichment
VideoCLEF 2009 offered three tasks related to enriching video content for improved multimedia access in a multilingual environment. For each task, video data (Dutch-language television, predominantly documentaries) accompanied by speech recognition transcripts were provided.
The Subject Classification Task involved automatic tagging of videos with subject theme labels. The best performance was achieved by approaching subject tagging as an information retrieval task and using both speech recognition transcripts and archival metadata. Alternatively, classifiers were trained using either the training data provided or data collected from Wikipedia or via general Web search. The Affect Task involved detecting narrative peaks, defined as points where viewers perceive heightened dramatic tension. The task was carried out on the “Beeldenstorm” collection containing 45 short-form documentaries on the visual arts. The best runs exploited affective vocabulary and audience directed speech. Other approaches included using topic changes, elevated speaking pitch, increased speaking intensity and radical visual changes. The Linking Task, also called “Finding Related Resources Across Languages,” involved linking video to material on the same subject in a different language.
Participants were provided with a list of multimedia anchors (short video segments) in the Dutch-language “Beeldenstorm” collection and were expected to return target pages drawn from English-language Wikipedia. The best performing methods used the transcript of the
speech spoken during the multimedia anchor to build a query to search an index of the Dutch language Wikipedia. The Dutch Wikipedia pages returned were used to identify related English pages. Participants also experimented with pseudo-relevance feedback, query translation and methods that targeted proper names
Towards an Information Retrieval Theory of Everything
I present three well-known probabilistic models of information retrieval in tutorial style: The binary independence probabilistic model, the language modeling approach, and Google's page rank. Although all three models are based on probability theory, they are very different in nature. Each model seems well-suited for solving certain information retrieval problems, but not so useful for solving others. So, essentially each model solves part of a bigger puzzle, and a unified view on these models might be a first step towards an Information Retrieval Theory of Everything
Information Retrieval Models
Many applications that handle information on the internet would be completely\ud
inadequate without the support of information retrieval technology. How would\ud
we find information on the world wide web if there were no web search engines?\ud
How would we manage our email without spam filtering? Much of the development\ud
of information retrieval technology, such as web search engines and spam\ud
filters, requires a combination of experimentation and theory. Experimentation\ud
and rigorous empirical testing are needed to keep up with increasing volumes of\ud
web pages and emails. Furthermore, experimentation and constant adaptation\ud
of technology is needed in practice to counteract the effects of people that deliberately\ud
try to manipulate the technology, such as email spammers. However,\ud
if experimentation is not guided by theory, engineering becomes trial and error.\ud
New problems and challenges for information retrieval come up constantly.\ud
They cannot possibly be solved by trial and error alone. So, what is the theory\ud
of information retrieval?\ud
There is not one convincing answer to this question. There are many theories,\ud
here called formal models, and each model is helpful for the development of\ud
some information retrieval tools, but not so helpful for the development others.\ud
In order to understand information retrieval, it is essential to learn about these\ud
retrieval models. In this chapter, some of the most important retrieval models\ud
are gathered and explained in a tutorial style
Toward Entity-Aware Search
As the Web has evolved into a data-rich repository, with the standard "page view," current search engines are becoming increasingly inadequate for a wide range of query tasks. While we often search for various data "entities" (e.g., phone number, paper PDF, date), today's engines only take us indirectly to pages. In my Ph.D. study, we focus on a novel type of Web search that is aware of data entities inside pages, a significant departure from traditional document retrieval. We study the various essential aspects of supporting entity-aware Web search. To begin with, we tackle the core challenge of ranking entities, by distilling its underlying conceptual model Impression Model and developing a probabilistic ranking framework, EntityRank, that is able to seamlessly integrate both local and global information in ranking. We also report a prototype system built to show the initial promise of the proposal. Then, we aim at distilling and abstracting the essential computation requirements of entity search. From the dual views of reasoning--entity as input and entity as output, we propose a dual-inversion framework, with two indexing and partition schemes, towards efficient and scalable query processing. Further, to recognize more entity instances, we study the problem of entity synonym discovery through mining query log data. The results we obtained so far have shown clear promise of entity-aware search, in its usefulness, effectiveness, efficiency and scalability
Why People Search for Images using Web Search Engines
What are the intents or goals behind human interactions with image search
engines? Knowing why people search for images is of major concern to Web image
search engines because user satisfaction may vary as intent varies. Previous
analyses of image search behavior have mostly been query-based, focusing on
what images people search for, rather than intent-based, that is, why people
search for images. To date, there is no thorough investigation of how different
image search intents affect users' search behavior.
In this paper, we address the following questions: (1)Why do people search
for images in text-based Web image search systems? (2)How does image search
behavior change with user intent? (3)Can we predict user intent effectively
from interactions during the early stages of a search session? To this end, we
conduct both a lab-based user study and a commercial search log analysis.
We show that user intents in image search can be grouped into three classes:
Explore/Learn, Entertain, and Locate/Acquire. Our lab-based user study reveals
different user behavior patterns under these three intents, such as first click
time, query reformulation, dwell time and mouse movement on the result page.
Based on user interaction features during the early stages of an image search
session, that is, before mouse scroll, we develop an intent classifier that is
able to achieve promising results for classifying intents into our three intent
classes. Given that all features can be obtained online and unobtrusively, the
predicted intents can provide guidance for choosing ranking methods immediately
after scrolling
- …