
217 research outputs found

Finding Support Documents with a Logistic Regression Approach

Author: He Daqing
Li Qi
Publication venue: 'American College of Medical Physics (ACMP)'
Publication date: 28/07/2011

Entity retrieval finds the relevant results for a user’s information needs at a finer unit called an “entity”. To retrieve such entities, people usually first locate a small set of support documents that contain answer entities, and then detect the answer entities within this set. In the literature, support documents are viewed as relevant documents, and finding them is treated as a conventional document retrieval problem. In this paper, we argue that finding support documents and finding relevant documents, although they sound similar, have important differences. Further, we propose a logistic regression approach to finding support documents. Our experimental results show that the logistic regression method performs significantly better than a baseline system that treats support document finding as a conventional document retrieval problem.

D-Scholarship@Pitt

Benchmarking the Privacy-Preserving People Search

Author: Han Shuguang
He Daqing
Yue Zhen
Publication date: 19/09/2014

People search is an important topic in information retrieval. Many previous studies on this topic employed social networks to boost search performance by incorporating either local network features (e.g. the common connections between the querying user and candidates in social networks), or global network features (e.g. PageRank), or both. However, the available social network information can be restricted by the privacy settings of the users involved, which in turn affects the performance of people search. Therefore, in this paper, we focus on the privacy issues in people search. Because privacy-concerned networks are unavailable, we propose simulating different privacy settings with a public social network. Our study examines the influence of privacy concerns on the local and global network features, and their impact on the performance of people search. Our results show that: 1) the privacy concerns of different people in the network have different influences, and people with higher association (i.e. higher degree in the network) have a much greater impact on the performance of people search; 2) local network features are more sensitive to privacy concerns, especially when such concerns come from highly associated people in the network who are also related to the querying user. As the first study on this topic, we hope to generate further discussion of these issues.

Comment: 4 pages, 5 figures

arXiv.org e-Print Archive

CiteSeerX

References to graphical objects in interactive multimodal queries

Author: He Daqing
Publication venue: The University of Edinburgh
Publication date: 01/01/2001

This thesis describes a computational model for interpreting natural language expressions in an interactive multimodal query system integrating both natural language text and graphic displays. The primary concern of the model is to interpret expressions that might involve graphical attributes, and expressions whose referents could be objects on the screen. Graphical objects on the screen are used to visualise entities in the application domain and their attributes (in short, domain entities and domain attributes). This is why graphical objects are treated as descriptions of those domain entities/attributes in the literature. However, graphical objects and their attributes are visible during the interaction, and are thus known by the participants of the interaction. Therefore, they themselves should be part of the mutual knowledge of the interaction. This poses some interesting problems in language processing. As part of the mutual knowledge, graphical attributes could be used in expressions, and graphical objects could be referred to by expressions. In consequence, there could be ambiguities about whether an attribute in an expression belongs to a graphical object or to a domain entity. There could also be ambiguities about whether the referent of an expression is a graphical object or a domain entity. The main contributions of this thesis consist of analysing the above ambiguities, designing, implementing and testing a computational model and a demonstration system for resolving these ambiguities. Firstly, a structure and corresponding terminology are set up, so these ambiguities can be clarified as ambiguities derived from referring to different databases, the screen or the application domain (source ambiguities). Secondly, a meaning representation language is designed which explicitly represents the information about which database an attribute/entity comes from. Several linguistic regularities inside and among referring expressions are described so that they can be used as heuristics in the ambiguity resolution. Thirdly, a computational model based on constraint satisfaction is constructed to resolve simultaneously some reference ambiguities and source ambiguities. Then, a demonstration system integrating natural language text and graphics is implemented, whose core is the computational model. This thesis ends with an evaluation of the computational model. It provides some concrete evidence about the advantages and disadvantages of the above approach.

Edinburgh Research Archive

Enhancing Clinical Decision Support Systems with Public Knowledge Bases

Author: He Daqing
Zhang Danchen

With the vast amount of biomedical literature available online, doctors have the benefit of consulting the literature before making clinical decisions, but they face the daunting task of finding needles in haystacks. In this situation, it would help doctors if an effective clinical decision support system could generate accurate queries and return a manageable number of highly useful articles. Existing studies have shown the usefulness of patients’ diagnosis information in such a scenario, but the diagnosis is missing in most cases. Furthermore, existing diagnosis prediction systems mainly focus on predicting a small range of diseases with well-formatted features, and it is still a great challenge to perform large-scale automatic diagnosis prediction based on noisy patient medical records. In this paper, we propose automatic diagnosis prediction methods for enhancing retrieval in a clinical decision support system, where the prediction is based on evidence automatically collected from publicly accessible online knowledge bases such as Wikipedia and the Semantic MEDLINE Database (SemMedDB). The assumption is that relevant diseases and their corresponding symptoms co-occur more frequently in these knowledge bases. Our methods’ performance was evaluated using test collections from the Clinical Decision Support (CDS) track in TREC 2014, 2015 and 2016. The results show that our best method can automatically predict diagnoses with about 65.56% usefulness, and that such predictions can significantly improve biomedical literature retrieval. Our methods generate retrieval results comparable to those of state-of-the-art methods, which use much more complicated techniques and some manually crafted medical knowledge. One possible direction for future work is to apply these methods in collaboration with real doctors.

D-Scholarship@Pitt

Users’ Perceived Difficulties and Corresponding Reformulation Strategies in Voice Search

Author: He Daqing
Jeng Wei
Jiang Jiepu
Publication date: 03/10/2013

We report on users’ perceptions of query input errors and query reformulation strategies in voice search. The perceptions were collected through a controlled experiment. Our results reveal that: 1) users faced obstacles during voice search that can be related to system recognition errors and topic complexity; 2) users naturally develop different strategies for dealing with the various types of words that are problematic for systems to recognize.

D-Scholarship@Pitt

Finding cultural heritage images through a Dual-Perspective Navigation Framework

Author: Brusilovsky Peter
He Daqing
Lin Yi-Ling
Publication venue: 'Elsevier BV'
Publication date: 01/09/2016

With the increasing volume of digital images, improving techniques for image findability is receiving heightened attention. The cultural heritage sector, with its vast resource of images, has realized the value of social tags and started using tags in parallel with controlled vocabularies to increase the odds of users finding images of interest. The research presented in this paper develops the Dual-Perspective Navigation Framework (DPNF), which integrates controlled vocabularies and social tags to represent the aboutness of an item more comprehensively, so that the information scent can be maximized to facilitate resource findability. DPNF utilizes the mechanisms of faceted browsing and tag-based navigation to offer a seamless interaction between experts’ subject headings and public tags during image search. In a controlled user study, participants effectively completed more exploratory tasks with the DPNF interface than with the tag-only interface. DPNF is more efficient than both single-descriptor interfaces (subject heading-only and tag-only interfaces): participants needed significantly less time, fewer interface interactions, and less backtracking to complete an exploratory task, without an extra workload. In addition, participants were more satisfied with the DPNF interface than with the others. The findings of this study can assist interface designers struggling with what information is most helpful to users and can facilitate searching tasks. The framework also maximizes end users’ chances of finding target images by engaging image information from two sources: the professionals’ description of items in a collection and the crowd’s assignment of social tags.

Crossref

D-Scholarship@Pitt

Understanding qualitative data sharing practices in social sciences

Author: He Daqing
Jeng Wei
Oh Jung Sun
Publication date: 01/02/2016

D-Scholarship@Pitt

Toward a conceptual framework for data sharing practices in social sciences: A profile approach. In the proceedings of the ASIS&T 2016 Annual Meeting

Author: He Daqing
Jeng Wei
Oh Jung Sun
Publication venue: 'Royal College of Obstetricians & Gynaecologists (RCOG)'
Publication date: 01/01/2016

This paper investigates the landscape of data-sharing practices in social sciences via the data-sharing profile approach. Guided by two pre-existing conceptual frameworks, Knowledge Infrastructure (KI) and the Theory of Remote Scientific Collaboration (TORSC), we design and test a profile tool that consists of four overarching dimensions for capturing social scientists’ data practices, namely: 1) data characteristics, 2) perceived technical infrastructure, 3) perceived organizational context, and 4) individual characteristics. To ensure that the instrument can be applied in real and practical terms, we conduct a case study by collecting responses from 93 early-career social scientists at two research universities in the Pittsburgh Area, U.S. The results suggest that there is no significant difference, in general, among scholars who prefer quantitative, mixed-method, or qualitative research methods in terms of research activities and data-sharing practices. We also confirm that there is a gap between participants’ attitudes about research openness and their actual sharing behaviors, highlighting the need to study the “barrier” in addition to the “incentive” of research data sharing.

D-Scholarship@Pitt