11,913 research outputs found

    The role of handbooks in knowledge creation and diffusion: A case of science and technology studies

    Get PDF
    Genre is considered to be an important element in scholarly communication and in the practice of scientific disciplines. However, scientometric studies have typically focused on a single genre, the journal article. The goal of this study is to understand the role that handbooks play in knowledge creation and diffusion and their relationship with the genre of journal articles, particularly in highly interdisciplinary and emergent social science and humanities disciplines. To shed light on these questions we focused on handbooks and journal articles published over the last four decades belonging to the research area of Science and Technology Studies (STS), broadly defined. To get a detailed picture we used the full-text of five handbooks (500,000 words) and a well-defined set of 11,700 STS articles. We confirmed the methodological split of STS into qualitative and quantitative (scientometric) approaches. Even when the two traditions explore similar topics (e.g., science and gender) they approach them from different starting points. The change in cognitive foci in both handbooks and articles partially reflects the changing trends in STS research, often driven by technology. Using text similarity measures we found that, in the case of STS, handbooks play no special role in either focusing the research efforts or marking their decline. In general, they do not represent the summaries of research directions that have emerged since the previous edition of the handbook.Comment: Accepted for publication in Journal of Informetric

    Towards Real-Time, Country-Level Location Classification of Worldwide Tweets

    Get PDF
    In contrast to much previous work that has focused on location classification of tweets restricted to a specific country, here we undertake the task in a broader context by classifying global tweets at the country level, which is so far unexplored in a real-time scenario. We analyse the extent to which a tweet's country of origin can be determined by making use of eight tweet-inherent features for classification. Furthermore, we use two datasets, collected a year apart from each other, to analyse the extent to which a model trained from historical tweets can still be leveraged for classification of new tweets. With classification experiments on all 217 countries in our datasets, as well as on the top 25 countries, we offer some insights into the best use of tweet-inherent features for an accurate country-level classification of tweets. We find that the use of a single feature, such as the use of tweet content alone -- the most widely used feature in previous work -- leaves much to be desired. Choosing an appropriate combination of both tweet content and metadata can actually lead to substantial improvements of between 20\% and 50\%. We observe that tweet content, the user's self-reported location and the user's real name, all of which are inherent in a tweet and available in a real-time scenario, are particularly useful to determine the country of origin. We also experiment on the applicability of a model trained on historical tweets to classify new tweets, finding that the choice of a particular combination of features whose utility does not fade over time can actually lead to comparable performance, avoiding the need to retrain. However, the difficulty of achieving accurate classification increases slightly for countries with multiple commonalities, especially for English and Spanish speaking countries.Comment: Accepted for publication in IEEE Transactions on Knowledge and Data Engineering (IEEE TKDE

    De retibus socialibus et legibus momenti

    Get PDF
    Online Social Networks (OSNs) are a cutting edge topic. Almost everybody --users, marketers, brands, companies, and researchers-- is approaching OSNs to better understand them and take advantage of their benefits. Maybe one of the key concepts underlying OSNs is that of influence which is highly related, although not entirely identical, to those of popularity and centrality. Influence is, according to Merriam-Webster, "the capacity of causing an effect in indirect or intangible ways". Hence, in the context of OSNs, it has been proposed to analyze the clicks received by promoted URLs in order to check for any positive correlation between the number of visits and different "influence" scores. Such an evaluation methodology is used in this paper to compare a number of those techniques with a new method firstly described here. That new method is a simple and rather elegant solution which tackles with influence in OSNs by applying a physical metaphor.Comment: Changes made for third revision: Brief description of the dataset employed added to Introduction. Minor changes to the description of preparation of the bit.ly datasets. Minor changes to the captions of Tables 1 and 3. Brief addition in the Conclusions section (future line of work added). Added references 16 and 18. Some typos and grammar polishe

    Growing Story Forest Online from Massive Breaking News

    Full text link
    We describe our experience of implementing a news content organization system at Tencent that discovers events from vast streams of breaking news and evolves news story structures in an online fashion. Our real-world system has distinct requirements in contrast to previous studies on topic detection and tracking (TDT) and event timeline or graph generation, in that we 1) need to accurately and quickly extract distinguishable events from massive streams of long text documents that cover diverse topics and contain highly redundant information, and 2) must develop the structures of event stories in an online manner, without repeatedly restructuring previously formed stories, in order to guarantee a consistent user viewing experience. In solving these challenges, we propose Story Forest, a set of online schemes that automatically clusters streaming documents into events, while connecting related events in growing trees to tell evolving stories. We conducted extensive evaluation based on 60 GB of real-world Chinese news data, although our ideas are not language-dependent and can easily be extended to other languages, through detailed pilot user experience studies. The results demonstrate the superior capability of Story Forest to accurately identify events and organize news text into a logical structure that is appealing to human readers, compared to multiple existing algorithm frameworks.Comment: Accepted by CIKM 2017, 9 page

    Laws and Limits of Econometrics

    Get PDF
    We start by discussing some general weaknesses and limitations of the econometric approach. A template from sociology is used to formulate six laws that characterize mainstream activities of econometrics and the scientific limits of those activities, we discuss some proximity theorems that quantify by means of explicit bounds how close we can get to the generating mechanism of the data and the optimal forecasts of next period observations using a finite number of observations. The magnitude of the bound depends on the characteristics of the model and the trajectory of the observed data. The results show that trends are more elusive to model than stationary processes in the sense that the proximity bounds are larger. By contrast, the bounds are of smaller order for models that are unidentified or nearly unidentified, so that lack or near lack of identification may not be as fatal to the use of a model in practice as some recent results on inference suggest, we look at one possible future of econometrics that involves the use of advanced econometric methods interactively by way of a web browser. With these methods users may access a suite of econometric methods and data sets online. They may also upload data to remote servers and by simple web browser selections initiate the implementation of advanced econometric software algorithms, returning the results online and by file and graphics downloads.Activities and limitations of econometrics, automated modeling, nearly unidentified models, nonstationarity, online econometrics, policy analysis, prediction, quantitative bounds, trends, unit roots, weak instruments

    Studying Social Networks at Scale: Macroscopic Anatomy of the Twitter Social Graph

    Get PDF
    Twitter is one of the largest social networks using exclusively directed links among accounts. This makes the Twitter social graph much closer to the social graph supporting real life communications than, for instance, Facebook. Therefore, understanding the structure of the Twitter social graph is interesting not only for computer scientists, but also for researchers in other fields, such as sociologists. However, little is known about how the information propagation in Twitter is constrained by its inner structure. In this paper, we present an in-depth study of the macroscopic structure of the Twitter social graph unveiling the highways on which tweets propagate, the specific user activity associated with each component of this macroscopic structure, and the evolution of this macroscopic structure with time for the past 6 years. For this study, we crawled Twitter to retrieve all accounts and all social relationships (follow links) among accounts; the crawl completed in July 2012 with 505 million accounts interconnected by 23 billion links. Then, we present a methodology to unveil the macroscopic structure of the Twitter social graph. This macroscopic structure consists of 8 components defined by their connectivity characteristics. Each component group users with a specific usage of Twitter. For instance, we identified components gathering together spammers, or celebrities. Finally, we present a method to approximate the macroscopic structure of the Twitter social graph in the past, validate this method using old datasets, and discuss the evolution of the macroscopic structure of the Twitter social graph during the past 6 years.Comment: ACM Sigmetrics 2014 (2014

    Plantarium : Human-Vegetal Ecologies

    No full text
    • 

    corecore