    Simplifying Sparse Expert Recommendation by Revisiting Graph Diffusion

    Community Question Answering (CQA) websites have become valuable knowledge repositories where individuals exchange information by asking and answering questions. With an ever-increasing number of questions and high migration of users in and out of communities, a key challenge is to design effective strategies for recommending experts for new questions. In this paper, we propose a simple graph-diffusion expert recommendation model for CQA that can outperform state-of-the-art deep learning representatives and collaborative models. Our proposed method learns users' expertise in the context of both semantic and temporal information to capture their changing interest and activity levels with time. Experiments on five real-world datasets from the Stack Exchange network demonstrate that our approach outperforms competitive baseline methods. Further, experiments on cold-start users (users with a limited historical record) show our model achieves an average performance gain of ~30% compared to the best baseline method.

    Knowledge fixation and accretion: Longitudinal analysis of a social question-answering site

    © 2014, Emerald Group Publishing Limited. Purpose – The purpose of this paper is to investigate longitudinal features of an established social question-answering (Q&A) site to study how question-answer resources and other community features change over time. Design/methodology/approach – Statistical analysis and visualisation were performed on the full data dump from the Stack Overflow social Q&A site for programmers. Findings – The timing of answers is as strong a predictor of acceptance – a proxy for user satisfaction – as the structural features of provided answers sometimes associated with quality. While many question and answer exchanges are short-lived, there is a small yet interesting subset of questions where new answers receive community approval and which may end up being ranked more highly than early answers. Research limitations/implications – As a large-scale, data-oriented research study, this work says little about user motivations to find and contribute new knowledge to old questions or about the impact of the resource on the consumer. This will require complementary studies using qualitative and evaluative methods. Practical implications – While content contribution to social question-asking is largely undertaken within a very short time frame, content consumption usually extends over far longer periods. Methods and incentives by which content can be updated and maintained need to be considered. This work should be of interest to knowledge exchange community designers and managers. Originality/value – Few studies have looked at temporal patterns in social Q&A and how time and the moderation and voting systems employed may shape resource quality.

    Kansans in the Middle of the Pandemic: Risk Perception, Knowledge, Compliance with Preventive Measures, and Primary Sources of Information about COVID-19

    Introduction. As we conduct this study, the world is in the grasp of a deadly pandemic. In less than six months since its first diagnosis in Wuhan, China, the COVID-19 infectious disease due to the novel coronavirus has infected over 5,000,000 people and claimed over 350,000 lives. In the United States, most of the cases are in large urban settings along the coasts, but the disease is slowly progressing through the mainland. Kansas, with its particular location in the Midwest United States, has seen a relatively small number of cases, but these are increasing. The Kansas government took radical measures to prevent the spread of the disease. According to the Health Beliefs Model, an individual’s perception of risk will dictate engagement with preventive behaviors. Knowledge about the disease and preventive measures drives the risk assessment. Knowledge is dependent on the sources of information used. This study explored these metrics in a sample of Kansans living in the times of the COVID-19 pandemic. Methods. A combination of snowball samples and random distribution through social media was used to recruit participants to an online survey. The risk and knowledge instrument was developed and validated by WHO Europe. Data collection lasted 96 hours. Results. The attitudes and behaviors of Kansans concerning COVID-19 were consistent with the state’s location in an area of the country with a relatively lower incidence of the disease. Participants had good knowledge about the disease and preventive measures and were willing to comply with recommendations from local authorities. Conclusion. Localized information sources that cater to the community are often primary, while social media is not a valuable source for information pertinent to COVID-19.

    GAUGING PUBLIC INTEREST FROM SERVER LOGS, SURVEYS AND INLINKS: A Multi-Method Approach to Analyze News Websites

    As the World Wide Web (the Web) has turned into a full-fledged medium to disseminate news, it is very important for journalism and information science researchers to investigate how Web users access online news reports and how to interpret such usage patterns. This doctoral thesis collected and analyzed Web server log statistics, online survey results, online reprints of the top 50 news reports, as well as external inlinks data of a leading comprehensive online newspaper (the People’s Daily Online) in China, one of the biggest Web/information markets in today’s world. The aim of the thesis was to explore various methods to gauge the public interest from a Webometrics perspective. A total of 129 days of Web server log statistics, including the top 50 Chinese and English news stories with the highest daily pageview numbers, the comments attracted by these news items and the emailed frequencies of the same stories, were collected from October 2007 to September 2008. These top 50 news items’ positions on the Chinese and English homepages and the top 50 queries submitted to the website search engine of the People’s Daily Online were also retrieved. Results of the two online surveys launched in March 2008 and March 2009 were collected after their respective closing dates. The external inlinks to the People’s Daily Online were retrieved by Yahoo! (Chinese and English versions), and the online reprints were retrieved by Google. Besides the general usage patterns identified from the top 50 news stories, this study, by conducting statistical tests on the data sets, also reveals the following findings. First, the editors’ choices and the readers’ favorites do not always match each other; thus the content of a news title is more important than its homepage position in attracting online visits. Second, the Chinese and English readers’ interests in the same events are different. Third, the pageview numbers and comments posted to the news items reflect the unfavorable attitudes of the Chinese people toward the United States and Japan, which might offer us a method to investigate the public interest in some other issues or nations after necessary modifications. More importantly, some publicly available data, such as the comments posted to the news stories and online survey results, further show that the pageview measure does reflect readers’ interests/needs truthfully, as proved by the strong correlations between the top news reports and relevant top queries. The external inlinks to the news websites and the online reprints of the top news items help us examine readers’ interests from other perspectives, as well as establish online profiles of the news websites. Such publicly accessible information could be an alternative data source for researchers to study readers’ interests when the Web server log data are not available. This doctoral thesis not only shows the usefulness of Web server log statistics, survey results, and other publicly accessible data in studying Web users’ information needs, but also offers practical suggestions for online news sites to improve their contents and homepage designs. However, no single method can draw a complete picture of the online news readers’ interests. The above-mentioned research methodologies should be employed together in order to draw more comprehensive conclusions. Future research is especially needed to investigate the continuously rapid growth of the “Mobile News Readers,” which poses both challenges and opportunities to the press industry in the 21st century.

    Exploring the Landscape of Natural Language Processing Research

    As an efficient approach to understand, generate, and process natural language texts, research in natural language processing (NLP) has exhibited a rapid spread and wide adoption in recent years. Given the increasing amount of research work in this area, several NLP-related approaches have been surveyed in the research community. However, a comprehensive study that categorizes established topics, identifies trends, and outlines areas for future research remains absent to this day. Contributing to closing this gap, we have systematically classified and analyzed research papers included in the ACL Anthology. As a result, we present a structured overview of the research landscape, provide a taxonomy of fields-of-study in NLP, analyze recent developments in NLP, summarize our findings, and highlight directions for future work. Comment: Accepted to the 14th International Conference on Recent Advances in Natural Language Processing (RANLP 2023).

    Hierarchical Expert Recommendation on Community Question Answering Platforms

    Community question answering (CQA) platforms, such as Stack Overflow, have become the primary source of answers to most questions on various topics. CQA platforms offer an opportunity for sharing and acquiring knowledge at a low cost, where users, many of whom are experts in a specific topic, can potentially provide high-quality solutions to a given question. Many recommendation methods have been proposed to match questions to potential good answerers. However, most existing methods have focused on modelling the user-question interaction (a user might answer multiple questions and a question might be answered by multiple users) using simple collaborative filtering approaches, overlooking the rich information in the question’s title and body when modelling the users’ expertise. This project fills the research gap by thoroughly examining machine learning and deep learning approaches that can be applied to the expert recommendation problem. It proposes a Hierarchical Expert Recommendation (HER) model, a deep learning recommender system that recommends experts to answer a given question in the CQA platform. Although choosing a deep learning solution over a machine learning solution for this problem can be justified considering the degree of complexity of the available datasets, we assess the performance of each family of methods and evaluate the trade-off between them to pick the best fit for our problem. We analyzed various machine learning algorithms to determine their performance in the expert recommendation problem, which narrows down the potential ways of tackling this problem using traditional recommendation methods. Furthermore, we investigate the recommendation models based on matrix factorization to establish the baselines for our proposed model and shed light on the weaknesses and strengths of matrix-based solutions, which shape our final deep learning model. In the last section, we introduce the Hierarchical Expert Recommendation System (HER) that utilizes hierarchical attention-based neural networks to represent the questions better and ultimately model the users’ expertise through user-question interactions. We conducted extensive experiments on a large real-world Stack Overflow dataset and benchmarked HER against the state-of-the-art baselines. The results show that HER outperforms the state-of-the-art baselines in recommending experts to answer questions in Stack Overflow.
