217 research outputs found
Finding Support Documents with a Logistic Regression Approach
Entity retrieval finds the relevant results for a user's information needs at a finer unit called "entity". To retrieve such entities, people usually first locate a small set of support documents that contain answer entities, and then detect the answer entities in this set. In the literature, support documents are viewed as relevant documents, and finding them is treated as a conventional document retrieval problem. In this paper, we argue that finding support documents and finding relevant documents, although they sound similar, have important differences. Further, we propose a logistic regression approach to find support documents. Our experimental results show that the logistic regression method performs significantly better than a baseline system that treats support document finding as a conventional document retrieval problem.
Benchmarking the Privacy-Preserving People Search
People search is an important topic in information retrieval. Many previous
studies on this topic employed social networks to boost search performance by
incorporating either local network features (e.g. the common connections
between the querying user and candidates in social networks), or global network
features (e.g. the PageRank), or both. However, the available social network
information can be restricted because of the privacy settings of involved
users, which in turn would affect the performance of people search. Therefore,
in this paper, we focus on the privacy issues in people search. We propose
simulating different privacy settings with a public social network due to the
unavailability of privacy-concerned networks. Our study examines the influences
of privacy concerns on the local and global network features, and their impacts
on the performance of people search. Our results show that: 1) the privacy
concerns of different people in the networks have different influences. People
with higher association (i.e. higher degree in a network) have much greater
impacts on the performance of people search; 2) local network features are more
sensitive to the privacy concerns, especially when such concerns come from
high-association people in the network who are also related to the querying user.
As the first study on this topic, we hope to generate further discussions on
these issues. Comment: 4 pages, 5 figures
References to graphical objects in interactive multimodal queries
This thesis describes a computational model for interpreting natural language expressions in an interactive multimodal query system integrating both natural language text
and graphic displays. The primary concern of the model is to interpret expressions that
might involve graphical attributes, and expressions whose referents could be objects
on the screen. Graphical objects on the screen are used to visualise entities in the application domain
and their attributes (in short, domain entities and domain attributes). This is why
graphical objects are treated as descriptions of those domain entities/attributes in
the literature. However, graphical objects and their attributes are visible during the
interaction, and are thus known by the participants of the interaction. Therefore, they
themselves should be part of the mutual knowledge of the interaction. This poses some interesting problems in language processing. As part of the mutual
knowledge, graphical attributes could be used in expressions, and graphical objects
could be referred to by expressions. In consequence, there could be ambiguities about
whether an attribute in an expression belongs to a graphical object or to a domain
entity. There could also be ambiguities about whether the referent of an expression is
a graphical object or a domain entity. The main contributions of this thesis consist of analysing the above ambiguities, designing,
implementing and testing a computational model and a demonstration system
for resolving these ambiguities. Firstly, a structure and corresponding terminology are
set up, so these ambiguities can be clarified as ambiguities derived from referring to
different databases, the screen or the application domain (source ambiguities). Secondly, a meaning representation language is designed which explicitly represents the
information about which database an attribute/entity comes from. Several linguistic
regularities inside and among referring expressions are described so that they can be
used as heuristics in the ambiguity resolution. Thirdly, a computational model based
on constraint satisfaction is constructed to resolve simultaneously some reference ambiguities and source ambiguities. Then, a demonstration system integrating natural
language text and graphics is implemented, whose core is the computational model. This thesis ends with an evaluation of the computational model. It provides some
concrete evidence about the advantages and disadvantages of the above approach.
Enhancing Clinical Decision Support Systems with Public Knowledge Bases
With the vast amount of biomedical literature available online, doctors have the benefit of consulting the literature before making clinical decisions, but they face the daunting task of finding needles in haystacks. In this situation, it would help doctors if an effective clinical decision support system could generate accurate queries and return a manageable number of highly useful articles. Existing studies have shown the usefulness of patients' diagnosis information in such a scenario, but the diagnosis is missing in most cases. Furthermore, existing diagnosis prediction systems mainly focus on predicting a small range of diseases with well-formatted features, and it is still a great challenge to perform large-scale automatic diagnosis prediction based on noisy patient medical records. In this paper, we propose automatic diagnosis prediction methods for enhancing retrieval in a clinical decision support system, where the prediction is based on evidence automatically collected from publicly accessible online knowledge bases such as Wikipedia and the Semantic MEDLINE Database (SemMedDB). The assumption is that relevant diseases and their corresponding symptoms co-occur more frequently in these knowledge bases. Our methods' performance was evaluated using test collections from the Clinical Decision Support (CDS) track in TREC 2014, 2015 and 2016. The results show that our best method can automatically predict diagnoses with about 65.56% usefulness, and that such predictions can significantly improve biomedical literature retrieval. Our methods generate retrieval results comparable to those of state-of-the-art methods, which rely on much more complicated techniques and some manually crafted medical knowledge. One possible direction for future work is to apply these methods in collaboration with real doctors.
Users' Perceived Difficulties and Corresponding Reformulation Strategies in Voice Search
We report on users' perceptions of query input errors and query reformulation strategies in voice search. The perceptions were collected through a controlled experiment. Our results reveal that: 1) users faced obstacles during a voice search that can be related to system recognition errors and topic complexity; 2) users naturally develop different strategies while dealing with varying types of words that are problematic for systems to recognize.
Finding cultural heritage images through a Dual-Perspective Navigation Framework
With the increasing volume of digital images, improving techniques for image findability is receiving heightened attention. The cultural heritage sector, with its vast resource of images, has realized the value of social tags and started using tags in parallel with controlled vocabularies to increase the odds of users finding images of interest. The research presented in this paper develops the Dual-Perspective Navigation Framework (DPNF), which integrates controlled vocabularies and social tags to represent the aboutness of an item more comprehensively, in order that the information scent can be maximized to facilitate resource findability.
DPNF utilizes the mechanisms of faceted browsing and tag-based navigation to offer a seamless interaction between experts' subject headings and public tags during image search. In a controlled user study, participants effectively completed more exploratory tasks with the DPNF interface than with the tag-only interface. DPNF is more efficient than both single-descriptor interfaces (subject heading-only and tag-only interfaces). Participants needed significantly less time, fewer interface interactions, and less backtracking to complete an exploratory task, without extra workload. In addition, participants were more satisfied with the DPNF interface than with the others. The findings of this study can assist interface designers struggling with what information is most helpful to users and can facilitate searching tasks. The framework also maximizes end users' chances of finding target images by engaging image information from two sources: the professionals' description of items in a collection and the crowd's assignment of social tags.
Toward a conceptual framework for data sharing practices in social sciences: A profile approach. In the proceedings of the ASIS&T 2016 Annual Meeting
This paper investigates the landscape of data-sharing practices in social sciences via the data-sharing profile approach. Guided by two pre-existing conceptual frameworks, Knowledge Infrastructure (KI) and the Theory of Remote Scientific Collaboration (TORSC), we design and test a profile tool that consists of four overarching dimensions for capturing social scientists' data practices, namely: 1) data characteristics, 2) perceived technical infrastructure, 3) perceived organizational context, and 4) individual characteristics. To ensure that the instrument can be applied in real and practical terms, we conduct a case study by collecting responses from 93 early-career social scientists at two research universities in the Pittsburgh Area, U.S. The results suggest that there is no significant difference, in general, among scholars who prefer quantitative, mixed-method, or qualitative research methods in terms of research activities and data-sharing practices. We also confirm that there is a gap between participants' attitudes about research openness and their actual sharing behaviors, highlighting the need to study the "barrier" in addition to the "incentive" of research data sharing.