16,059 research outputs found
Math Search for the Masses: Multimodal Search Interfaces and Appearance-Based Retrieval
We summarize math search engines and search interfaces produced by the
Document and Pattern Recognition Lab in recent years, and in particular the min
math search interface and the Tangent search engine. Source code for both
systems are publicly available. "The Masses" refers to our emphasis on creating
systems for mathematical non-experts, who may be looking to define unfamiliar
notation, or browse documents based on the visual appearance of formulae rather
than their mathematical semantics.Comment: Paper for Invited Talk at 2015 Conference on Intelligent Computer
Mathematics (July, Washington DC
Overview of VideoCLEF 2009: New perspectives on speech-based multimedia content enrichment
VideoCLEF 2009 offered three tasks related to enriching video content for improved multimedia access in a multilingual environment. For each task, video data (Dutch-language television, predominantly documentaries) accompanied by speech recognition transcripts were provided.
The Subject Classification Task involved automatic tagging of videos with subject theme labels. The best performance was achieved by approaching subject tagging as an information retrieval task and using both speech recognition transcripts and archival metadata. Alternatively, classifiers were trained using either the training data provided or data collected from Wikipedia or via general Web search. The Affect Task involved detecting narrative peaks, defined as points where viewers perceive heightened dramatic tension. The task was carried out on the âBeeldenstormâ collection containing 45 short-form documentaries on the visual arts. The best runs exploited affective vocabulary and audience directed speech. Other approaches included using topic changes, elevated speaking pitch, increased speaking intensity and radical visual changes. The Linking Task, also called âFinding Related Resources Across Languages,â involved linking video to material on the same subject in a different language.
Participants were provided with a list of multimedia anchors (short video segments) in the Dutch-language âBeeldenstormâ collection and were expected to return target pages drawn from English-language Wikipedia. The best performing methods used the transcript of the
speech spoken during the multimedia anchor to build a query to search an index of the Dutch language Wikipedia. The Dutch Wikipedia pages returned were used to identify related English pages. Participants also experimented with pseudo-relevance feedback, query translation and methods that targeted proper names
CHORUS Deliverable 2.1: State of the Art on Multimedia Search Engines
Based on the information provided by European projects and national initiatives related to multimedia search as well as domains experts that participated in the CHORUS Think-thanks and workshops, this document reports on the state of the art related to multimedia content search from, a technical, and socio-economic perspective.
The technical perspective includes an up to date view on content based indexing and retrieval technologies, multimedia search in the context of mobile devices and peer-to-peer networks, and an overview of current evaluation and benchmark inititiatives to measure the performance of multimedia search engines.
From a socio-economic perspective we inventorize the impact and legal consequences of these technical advances and point out future directions of research
Massive ontology interface
This paper describes the Massive Ontology Interface (MOI), a web portal which facilitates interaction with a large ontology (over 200,000 concepts and 1.6M assertions) that is built automatically using OpenCyc as a backbone. The aim of the interface is to simplify interaction with the massive amounts of information and guide the user towards understanding the ontologyâs data. Using either a text or graph-based representation, users can discuss and edit the ontology. Social elements utilizing gamification techniques are included to encourage users to create and collaborate on stored knowledge as part of a web community.
An evaluation by 30 users comparing MOI with OpenCycâs original interface showed significant improvements in user understanding of the ontology, although full testing of the interfaceâs social elements lies in the future
A Taxonomy of Hyperlink Hiding Techniques
Hidden links are designed solely for search engines rather than visitors. To
get high search engine rankings, link hiding techniques are usually used for
the profitability of black industries, such as illicit game servers, false
medical services, illegal gambling, and less attractive high-profit industry,
etc. This paper investigates hyperlink hiding techniques on the Web, and gives
a detailed taxonomy. We believe the taxonomy can help develop appropriate
countermeasures. Study on 5,583,451 Chinese sites' home pages indicate that
link hidden techniques are very prevalent on the Web. We also tried to explore
the attitude of Google towards link hiding spam by analyzing the PageRank
values of relative links. The results show that more should be done to punish
the hidden link spam.Comment: 12 pages, 2 figure
- âŠ