5,606 research outputs found
Towards Query Logs for Privacy Studies: On Deriving Search Queries from Questions
Translating verbose information needs into crisp search queries is a
phenomenon that is ubiquitous but hardly understood. Insights into this process
could be valuable in several applications, including synthesizing large
privacy-friendly query logs from public Web sources which are readily available
to the academic research community. In this work, we take a step towards
understanding query formulation by tapping into the rich potential of community
question answering (CQA) forums. Specifically, we sample natural language (NL)
questions spanning diverse themes from the Stack Exchange platform, and conduct
a large-scale conversion experiment where crowdworkers submit search queries
they would use when looking for equivalent information. We provide a careful
analysis of this data, accounting for possible sources of bias during
conversion, along with insights into user-specific linguistic patterns and
search behaviors. We release a dataset of 7,000 question-query pairs from this
study to facilitate further research on query understanding. Comment: ECIR 2020 Short Paper
Semantic Grounding Strategies for Tagbased Recommender Systems
Recommender systems usually operate on similarities between recommended items
or users. Tag-based recommender systems utilize similarities on tags. The tags
are, however, mostly free-form phrases entered by users. Therefore, similarities computed
without their semantic groundings might lead to less relevant recommendations.
In this paper, we study a semantic grounding used for tag similarity calculus.
We show a comprehensive analysis of semantic grounding given by 20 ontologies
from different domains. Among other things, the study reveals that currently
available OWL ontologies are very narrow and the percentage of similarity
expansions is rather small. WordNet scores slightly better because it is broader, but
not by much, as it does not support several semantic relationships. Furthermore,
the study reveals that even with such a small number of expansions, the recommendations
change considerably. Comment: 13 pages, 5 figures
RiPLE: Recommendation in Peer-Learning Environments Based on Knowledge Gaps and Interests
Various forms of Peer-Learning Environments are increasingly being used in
post-secondary education, often to help build repositories of student generated
learning objects. However, large classes can result in an extensive repository,
which can make it more challenging for students to search for suitable objects
that both reflect their interests and address their knowledge gaps. Recommender
Systems for Technology Enhanced Learning (RecSysTEL) offer a potential solution
to this problem by providing sophisticated filtering techniques to help
students to find the resources that they need in a timely manner. Here, a new
RecSysTEL for Recommendation in Peer-Learning Environments (RiPLE) is
presented. The approach uses a collaborative filtering algorithm based upon
matrix factorization to create personalized recommendations for individual
students that address their interests and their current knowledge gaps. The
approach is validated using both synthetic and real data sets. The results are
promising, indicating RiPLE is able to provide sensible personalized
recommendations for both regular and cold-start users under reasonable
assumptions about parameters and user behavior. Comment: 25 pages, 7 figures. The paper is accepted for publication in the
Journal of Educational Data Mining
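As a rough illustration of the matrix-factorization idea mentioned in the RiPLE abstract, the sketch below fits latent user and item factors to a handful of observed ratings with stochastic gradient descent. It is a generic textbook formulation, not the RiPLE algorithm: the abstract does not describe how interests and knowledge gaps are encoded, so the function names, the rating triples, and all hyperparameters here are assumptions made for the example.

```python
import random


def factorize(ratings, n_users, n_items, k=2, steps=2000, lr=0.01, reg=0.02, seed=0):
    """Fit latent factors to (user, item, rating) triples by SGD.

    Generic regularized matrix factorization; hyperparameters are
    illustrative assumptions, not values from the RiPLE paper.
    """
    rng = random.Random(seed)
    # Small random starting factors for every user and item.
    P = [[rng.uniform(0, 0.5) for _ in range(k)] for _ in range(n_users)]
    Q = [[rng.uniform(0, 0.5) for _ in range(k)] for _ in range(n_items)]
    for _ in range(steps):
        for u, i, r in ratings:
            err = r - sum(P[u][f] * Q[i][f] for f in range(k))
            for f in range(k):
                pu, qi = P[u][f], Q[i][f]
                # Gradient step on squared error with L2 regularization.
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q


def predict(P, Q, u, i):
    """Predicted score of item i for user u: dot product of their factors."""
    return sum(pf * qf for pf, qf in zip(P[u], Q[i]))


if __name__ == "__main__":
    # Hypothetical observations: (student, learning object, rating).
    observed = [(0, 0, 5), (0, 1, 3), (1, 0, 4), (1, 2, 1), (2, 1, 2), (2, 2, 5)]
    P, Q = factorize(observed, n_users=3, n_items=3)
    # Score an unobserved pair to rank candidate recommendations.
    print(round(predict(P, Q, 0, 2), 2))
```

Unobserved (student, object) pairs can then be ranked by their predicted score; a cold-start user with no observations would need some additional assumption about their initial factors, which this sketch does not address.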
Modeling an ontology on accessible evacuation routes for emergencies
Providing alert communication in emergency situations is vital to reduce the number of victims. However, this is a challenging goal for researchers and professionals due to the diverse pool of prospective users, e.g. people with disabilities as well as other vulnerable groups. Moreover, in the event of an emergency situation, many people could become vulnerable because of exceptional circumstances such as stress, an unknown environment or even visual impairment (e.g. fire causing smoke). Within this scope, a crucial activity is to notify affected people about safe places and available evacuation routes. In order to address this need, we propose to extend an ontology, called SEMA4A (Simple EMergency Alert 4 [for] All), developed in a previous work for managing knowledge about accessibility guidelines, emergency situations and communication technologies. In this paper, we introduce a semi-automatic technique for knowledge acquisition and modeling on accessible evacuation routes. We introduce a use case to show applications of the ontology and conclude with an evaluation involving several experts in evacuation procedures. © 2014 Elsevier Ltd. All rights reserved
Social Search with Missing Data: Which Ranking Algorithm?
Online social networking tools are extremely popular, but can miss potential discoveries latent in the social 'fabric'. Matchmaking services which can do naive profile matching with old database technology are too brittle in the absence of key data, and even modern ontological markup, though powerful, can be onerous at data-input time. In this paper, we present a system called BuddyFinder which can automatically identify buddies who can best match a user's search requirements specified in a term-based query, even in the absence of stored user-profiles. We deploy and compare five statistical measures, namely, our own CORDER, mutual information (MI), phi-squared, improved MI and Z score, and two TF/IDF based baseline methods to find online users who best match the search requirements based on 'inferred profiles' of these users in the form of scavenged web pages. These measures identify statistically significant relationships between online users and a term-based query. Our user evaluation on two groups of users shows that BuddyFinder can find users highly relevant to search queries, and that CORDER achieved the best average ranking correlations among all seven algorithms and improved the performance of both baseline methods
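Two of the association measures named in the BuddyFinder abstract, phi-squared and mutual information, can be sketched from a 2x2 co-occurrence table. The abstract does not give the exact formulations used, so this example assumes the standard phi-squared coefficient and the pointwise form of mutual information; the cell names a, b, c, d and the page counts are hypothetical.

```python
import math


def phi_squared(a, b, c, d):
    """Phi-squared association for a 2x2 contingency table.

    Assumed cell layout (hypothetical, for illustration):
      a = pages mentioning both the user and the query term
      b = pages mentioning the user only
      c = pages mentioning the term only
      d = pages mentioning neither
    """
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return (a * d - b * c) ** 2 / denom if denom else 0.0


def pointwise_mi(a, b, c, d):
    """Pointwise mutual information (base 2) of user and term co-occurrence."""
    n = a + b + c + d
    if a == 0:
        return float("-inf")  # the pair never co-occurs
    return math.log2(a * n / ((a + b) * (a + c)))


if __name__ == "__main__":
    # Independent occurrence: both measures report no association.
    print(phi_squared(5, 5, 5, 5), pointwise_mi(5, 5, 5, 5))
    # Perfect co-occurrence in a balanced table.
    print(phi_squared(10, 0, 0, 10), pointwise_mi(10, 0, 0, 10))
```

Ranking candidate users by either score against a query term gives one ordering per measure, and those orderings are what a rank-correlation evaluation would compare.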
Enhanced Search for Educational Resources - A Perspective and a Prototype from ccLearn
Users of search tools who seek educational materials on the Internet are typically presented with either a web-scale search (e.g., Google or Yahoo) or a specialized, site-specific tool. The specialized search tools often rely upon custom data fields, such as user-entered ratings, to provide additional value. As currently designed, these systems are generally too labor intensive to manage and scale up beyond a single site or set of resources. However, custom (or structured) data of some form is necessary if search outcomes for educational materials are to be improved. For example, design criteria and evaluative metrics are crucial attributes for educational resources, and these currently require human labeling and verification. Thus, one challenge is to design a search tool that capitalizes on available structured data (also called metadata) but is not crippled if the data are missing. This information should be amenable to repurposing by anyone, which means that it must be archived in a manner that can be discovered and leveraged easily. In this paper, we describe the extent to which DiscoverEd, a prototype developed by ccLearn, meets the design challenge of a scalable, enhanced search platform for educational resources. We then explore some of the key challenges regarding enhanced search for topic-specific Internet resources generally. We conclude by illustrating some possible future developments and third-party enhancements to the DiscoverEd prototype
Revyu: Linking reviews and ratings into the Web of Data
Revyu is a live, publicly accessible reviewing and rating Web site, designed to be usable by humans whilst transparently generating machine-readable RDF metadata for the Semantic Web, based on user input. The site uses Semantic Web specifications such as RDF and SPARQL, and the latest Linked Data best practices to create a major node in a potentially Web-wide ecosystem of reviews and related data. Throughout the implementation of Revyu design decisions have been made that aim to minimize the burden on users, by maximizing the reuse of external data sources, and allowing less structured human input (in the form of Web 2.0-style tagging) from which stronger semantics can later be derived. Links to external sources such as DBpedia are exploited to create human-oriented mashups at the HTML level, whilst links are also made in RDF to ensure Revyu plays a first class role in the blossoming Web of Data. In this paper we document design decisions made during the implementation of Revyu, discuss the techniques used for linking Revyu data with external sources, and outline how data from the site is being used to infer the trustworthiness of reviewers as sources of information and recommendations