3 research outputs found

    Zipf's law and the diversity of biology newsgroups

    Usenet newsgroups provide a popular means of scientific communication. We demonstrate striking order in the diversity of biology newsgroups: submissions to newsgroups obey a form of Zipf's law, a simple power law for the frequency of posts as a function of the rank, by posting, of contributors. We show that a simple stochastic process, due to Günther et al. (1992, 1996), Levitin and Schapiro (1993), and Schapiro (1994), accounts for this pattern and reproduces many of the properties of newsgroups. This model successfully predicts the relative contribution from each poster in terms of the size of the newsgroup, that is, its number of posters and total posts.
    Peer Reviewed
    http://deepblue.lib.umich.edu/bitstream/2027.42/43673/1/11192_2004_Article_5116106.pd
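    The rank-frequency power law mentioned in this abstract can be illustrated with a minimal sketch. It assumes the generic form f(r) ∝ r^(-a) and estimates the exponent a with an ordinary least-squares fit on log-log axes. The function name `zipf_exponent` and the sample counts are hypothetical, and this is not the stochastic model of the cited authors, which the abstract does not specify.

```python
import math


def zipf_exponent(post_counts):
    """Estimate a in f(r) ~ r**(-a) from per-contributor post counts.

    Illustrative helper (hypothetical name): ranks contributors by post
    count, then fits a straight line to log(frequency) vs. log(rank).
    """
    # Rank 1 is the most prolific contributor; ignore zero counts.
    freqs = sorted((c for c in post_counts if c > 0), reverse=True)
    if len(freqs) < 2:
        raise ValueError("need at least two contributors with posts")

    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)

    # Ordinary least-squares slope on the log-log points.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return -sxy / sxx  # negate so a decaying power law gives a > 0


# Made-up data that follows f(r) = 1200 / r exactly, so the fitted
# exponent should come out close to 1 by construction.
counts = [1200 / r for r in range(1, 51)]
print(round(zipf_exponent(counts), 3))
```

    A least-squares fit on log-log axes is the simplest way to read off an exponent, but it can be biased on real, noisy count data; it is used here only to make the shape of the law concrete.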

    GAUGING PUBLIC INTEREST FROM SERVER LOGS, SURVEYS AND INLINKS

    As the World Wide Web (the Web) has turned into a full-fledged medium to disseminate news, it is very important for journalism and information science researchers to investigate how Web users access online news reports and how to interpret such usage patterns. This doctoral thesis collected and analyzed Web server log statistics, online survey results, online reprints of the top 50 news reports, as well as external inlinks data of a leading comprehensive online newspaper (the People’s Daily Online) in China, one of the biggest Web/information markets in today’s world. The aim of the thesis was to explore various methods to gauge the public interest from a Webometrics perspective. A total of 129 days of Web server log statistics, including the top 50 Chinese and English news stories with the highest daily pageview numbers, the comments attracted by these news items and the emailed frequencies of the same stories, were collected from October 2007 to September 2008. These top 50 news items’ positions on the Chinese and English homepages and the top 50 queries submitted to the website search engine of the People’s Daily Online were also retrieved. Results of the two online surveys launched in March 2008 and March 2009 were collected after their respective closing dates. The external inlinks to the People’s Daily Online were retrieved by Yahoo! (Chinese and English versions), and the online reprints were retrieved by Google. Besides the general usage patterns identified from the top 50 news stories, this study, by conducting statistical tests on the data sets, also reveals the following findings. First, the editors’ choices and the readers’ favorites do not always match each other; thus the content of a news title is more important than its homepage position in attracting online visits. Second, the Chinese and English readers’ interests in the same events are different. Third, the pageview numbers and comments posted to the news items reflect the unfavorable attitudes of the Chinese people toward the United States and Japan, which might offer us a method to investigate the public interest in some other issues or nations after necessary modifications. More importantly, some publicly available data, such as the comments posted to the news stories and online survey results, further show that the pageview measure does reflect readers’ interests/needs truthfully, as proved by the strong correlations between the top news reports and relevant top queries. The external inlinks to the news websites and the online reprints of the top news items help us examine readers’ interests from other perspectives, as well as establish online profiles of the news websites. Such publicly accessible information could be an alternative data source for researchers to study readers’ interests when the Web server log data are not available. This doctoral thesis not only shows the usefulness of Web server log statistics, survey results, and other publicly accessible data in studying Web users’ information needs, but also offers practical suggestions for online news sites to improve their contents and homepage designs. However, no single method can draw a complete picture of the online news readers’ interests. The above-mentioned research methodologies should be employed together in order to reach more comprehensive conclusions. Future research is especially needed to investigate the continuously rapid growth of the “Mobile News Readers,” which poses both challenges and opportunities to the press industry in the 21st century.

    GAUGING PUBLIC INTEREST FROM SERVER LOGS, SURVEYS AND INLINKS A Multi-Method Approach to Analyze News Websites

    As the World Wide Web (the Web) has turned into a full-fledged medium to disseminate news, it is very important for journalism and information science researchers to investigate how Web users access online news reports and how to interpret such usage patterns. This doctoral thesis collected and analyzed Web server log statistics, online survey results, online reprints of the top 50 news reports, as well as external inlinks data of a leading comprehensive online newspaper (the People’s Daily Online) in China, one of the biggest Web/information markets in today’s world. The aim of the thesis was to explore various methods to gauge the public interest from a Webometrics perspective. A total of 129 days of Web server log statistics, including the top 50 Chinese and English news stories with the highest daily pageview numbers, the comments attracted by these news items and the emailed frequencies of the same stories, were collected from October 2007 to September 2008. These top 50 news items’ positions on the Chinese and English homepages and the top 50 queries submitted to the website search engine of the People’s Daily Online were also retrieved. Results of the two online surveys launched in March 2008 and March 2009 were collected after their respective closing dates. The external inlinks to the People’s Daily Online were retrieved by Yahoo! (Chinese and English versions), and the online reprints were retrieved by Google. Besides the general usage patterns identified from the top 50 news stories, this study, by conducting statistical tests on the data sets, also reveals the following findings. First, the editors’ choices and the readers’ favorites do not always match each other; thus the content of a news title is more important than its homepage position in attracting online visits. Second, the Chinese and English readers’ interests in the same events are different. Third, the pageview numbers and comments posted to the news items reflect the unfavorable attitudes of the Chinese people toward the United States and Japan, which might offer us a method to investigate the public interest in some other issues or nations after necessary modifications. More importantly, some publicly available data, such as the comments posted to the news stories and online survey results, further show that the pageview measure does reflect readers’ interests/needs truthfully, as proved by the strong correlations between the top news reports and relevant top queries. The external inlinks to the news websites and the online reprints of the top news items help us examine readers’ interests from other perspectives, as well as establish online profiles of the news websites. Such publicly accessible information could be an alternative data source for researchers to study readers’ interests when the Web server log data are not available. This doctoral thesis not only shows the usefulness of Web server log statistics, survey results, and other publicly accessible data in studying Web users’ information needs, but also offers practical suggestions for online news sites to improve their contents and homepage designs. However, no single method can draw a complete picture of the online news readers’ interests. The above-mentioned research methodologies should be employed together in order to reach more comprehensive conclusions. Future research is especially needed to investigate the continuously rapid growth of the “Mobile News Readers,” which poses both challenges and opportunities to the press industry in the 21st century.