Science Models as Value-Added Services for Scholarly Information Systems
The paper introduces scholarly Information Retrieval (IR) as a further
dimension that should be considered in the science modeling debate. The IR use
case is seen as a validation model of the adequacy of science models in
representing and predicting structure and dynamics in science. Particular
conceptualizations of scholarly activity and structures in science are used as
value-added search services to improve retrieval quality: a co-word model
depicting the cognitive structure of a field (used for query expansion), the
Bradford law of information concentration, and a model of co-authorship
networks (both used for re-ranking search results). An evaluation of
retrieval quality with these science-model-driven services in place showed that
the proposed models indeed have a beneficial effect on retrieval quality.
From an IR perspective, the models studied are therefore verified as expressive
conceptualizations of central phenomena in science. This shows
that the IR perspective can contribute significantly to a better understanding
of scholarly structures and activities.
Comment: 26 pages, to appear in Scientometrics
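As a concrete illustration of the first of these services, the sketch below shows co-word-based query expansion in miniature: terms that frequently co-occur with a query term in document descriptors become expansion candidates. The toy corpus and the expand helper are hypothetical, not the paper's implementation.

```python
from collections import Counter
from itertools import combinations

# Toy corpus: each document is reduced to its set of descriptor terms.
docs = [
    {"information retrieval", "query expansion", "co-word analysis"},
    {"information retrieval", "ranking", "co-word analysis"},
    {"bibliometrics", "co-word analysis", "science models"},
]

# Count how often two terms are assigned to the same document.
cooc = Counter()
for terms in docs:
    for a, b in combinations(sorted(terms), 2):
        cooc[(a, b)] += 1

def expand(query_term, top_k=3):
    """Return the terms that co-occur most strongly with query_term."""
    scores = Counter()
    for (a, b), n in cooc.items():
        if a == query_term:
            scores[b] += n
        elif b == query_term:
            scores[a] += n
    return [term for term, _ in scores.most_common(top_k)]

print(expand("information retrieval"))
# -> ['co-word analysis', 'query expansion', 'ranking']
```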
StakeNet: using social networks to analyse the stakeholders of large-scale software projects
Many software projects fail because they overlook stakeholders or involve the wrong representatives of significant groups. Unfortunately, existing methods in stakeholder analysis are likely to omit stakeholders and to treat all stakeholders as equally influential. To identify and prioritise stakeholders, we have developed StakeNet, which consists of three main steps: identify stakeholders and ask them to recommend other stakeholders and stakeholder roles, build a social network whose nodes are stakeholders and whose links are recommendations, and prioritise the stakeholders using a variety of social network measures. To evaluate StakeNet, we conducted one of the first empirical studies of requirements stakeholders on a software project for a 30,000-user system. Using the data collected from surveying and interviewing 68 stakeholders, we show that StakeNet identifies stakeholders and their roles with high recall, and prioritises them accurately. StakeNet uncovers a critical stakeholder role overlooked in the project, whose omission significantly impacted project success.
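A minimal sketch of the StakeNet pipeline under stated assumptions: the recommendation pairs are invented, the networkx library is assumed, and betweenness centrality and PageRank stand in for the paper's "variety of social network measures".

```python
import networkx as nx

# Hypothetical survey data: (recommender, recommended) stakeholder pairs.
recommendations = [
    ("alice", "bob"), ("alice", "carol"), ("bob", "carol"),
    ("carol", "alice"), ("dave", "alice"),
]

# Step 2: build a social network whose nodes are stakeholders
# and whose directed links are recommendations.
G = nx.DiGraph()
G.add_edges_from(recommendations)

# Step 3: prioritise stakeholders with social network measures.
betweenness = nx.betweenness_centrality(G)
pagerank = nx.pagerank(G)

for s in sorted(G, key=pagerank.get, reverse=True):
    print(f"{s}: pagerank={pagerank[s]:.3f}, betweenness={betweenness[s]:.3f}")
```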
Structuring Wikipedia Articles with Section Recommendations
Sections are the building blocks of Wikipedia articles. They enhance
readability and can be used as a structured entry point for creating and
expanding articles. Structuring a new or already existing Wikipedia article
with sections is a hard task for humans, especially for newcomers and less
experienced editors, as it requires significant knowledge of what a
well-written article looks like for each possible topic. Motivated by this need, the
present paper defines the problem of section recommendation for Wikipedia
articles and proposes several approaches for tackling it. Our systems can help
editors by recommending what sections to add to already existing or newly
created Wikipedia articles. Our basic paradigm is to generate recommendations
by sourcing sections from articles that are similar to the input article. We
explore several ways of defining similarity for this purpose (based on topic
modeling, collaborative filtering, and Wikipedia's category system). We use
both automatic and human evaluation approaches for assessing the performance of
our recommendation system, concluding that the category-based approach works
best, achieving precision@10 of about 80% in the human evaluation.
Comment: SIGIR '18 camera-ready
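As a rough illustration of the best-performing, category-based approach, the sketch below scores candidate sections by how many categories their source articles share with the input article. The articles, categories, and sections are made up for illustration.

```python
from collections import Counter

# Made-up articles with their Wikipedia categories and section titles.
articles = {
    "Lion": {"categories": {"Mammals", "Felines"},
             "sections": ["Description", "Habitat", "Diet"]},
    "Tiger": {"categories": {"Mammals", "Felines"},
              "sections": ["Description", "Habitat", "Conservation"]},
    "Sparrow": {"categories": {"Birds"},
                "sections": ["Description", "Distribution"]},
}

def recommend_sections(input_categories, top_k=5):
    """Score sections from articles whose categories overlap the input's."""
    scores = Counter()
    for article in articles.values():
        overlap = len(article["categories"] & input_categories)
        for section in article["sections"]:
            scores[section] += overlap
    return [s for s, n in scores.most_common(top_k) if n > 0]

# Recommend sections for a new article categorised under Mammals/Felines.
print(recommend_sections({"Mammals", "Felines"}))
```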
Data-Driven Application Maintenance: Views from the Trenches
In this paper, we present our experience from the design, development, and pilot
deployments of a data-driven, machine-learning-based application maintenance
solution. We implemented a proof of concept to address a spectrum of
interrelated problems encountered in application maintenance projects including
duplicate incident ticket identification, assignee recommendation, theme
mining, and mapping of incidents to business processes. In the context of IT
services, these problems are frequently encountered, yet automation and
optimization remain limited. Despite long-standing research on
mining and analysis of software repositories, such research outputs are not
well adopted in practice due to the constraints these solutions impose on their
users. We discuss the need to design pragmatic solutions with low barriers to
adoption that address the right level of problem complexity with respect to the
underlying business constraints and the nature of the data.
Comment: An earlier version of this paper appeared in the proceedings of the 4th International Workshop on Software Engineering Research and Industrial Practice (SER&IP), IEEE Press, pp. 48-54, 2017
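To make one of these sub-problems concrete, here is a minimal sketch of duplicate incident ticket identification via TF-IDF cosine similarity, assuming scikit-learn. The tickets and the 0.5 threshold are illustrative; the paper does not specify its solution at this level of detail.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical incident tickets.
tickets = [
    "Payroll report fails with timeout error",
    "Timeout when generating the payroll report",
    "Password reset email never arrives",
]

# Represent each ticket as a TF-IDF vector and compare all pairs.
vectors = TfidfVectorizer(stop_words="english").fit_transform(tickets)
sim = cosine_similarity(vectors)

THRESHOLD = 0.5  # assumed cutoff for flagging a pair as duplicates
for i in range(len(tickets)):
    for j in range(i + 1, len(tickets)):
        if sim[i, j] >= THRESHOLD:
            print(f"Possible duplicates: {i} and {j} (similarity {sim[i, j]:.2f})")
```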
Graph-based Features for Automatic Online Abuse Detection
While online communities have become increasingly important over the years,
the moderation of user-generated content is still performed mostly manually.
Automating this task is an important step in reducing the financial cost
associated with moderation, but the majority of automated approaches strictly
based on message content are highly vulnerable to intentional obfuscation. In
this paper, we discuss methods for extracting conversational networks based on
raw multi-participant chat logs, and we study the contribution of graph
features to a classification system that aims to determine if a given message
is abusive. The conversational graph-based system yields unexpectedly high
performance, with results comparable to those previously obtained with a
content-based approach.
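A toy sketch of one plausible extraction heuristic, assuming networkx: each message's author is linked to the authors of the K preceding messages, with edge weights accumulating repeated interactions. The window heuristic, chat log, and feature set are assumptions; the paper's extraction methods are more elaborate.

```python
import networkx as nx

# Toy chat log of (author, message) pairs.
log = [
    ("u1", "hi all"), ("u2", "hey u1"), ("u3", "hello"),
    ("u1", "how's it going?"), ("u2", "good, you?"),
]
K = 2  # assumed proximity window

# Link each message's author to the authors of the K preceding messages.
G = nx.Graph()
for i, (author, _) in enumerate(log):
    for prev_author, _ in log[max(0, i - K):i]:
        if prev_author != author:
            weight = G.get_edge_data(author, prev_author, {"weight": 0})["weight"]
            G.add_edge(author, prev_author, weight=weight + 1)

# Graph features describing the author of a message to be classified.
author = "u2"
features = {
    "weighted_degree": G.degree(author, weight="weight"),
    "betweenness": nx.betweenness_centrality(G, weight="weight")[author],
}
print(features)
```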
Large-Margin Determinantal Point Processes
Determinantal point processes (DPPs) offer a powerful approach to modeling
diversity in many applications where the goal is to select a diverse subset. We
study the problem of learning the parameters (the kernel matrix) of a DPP from
labeled training data. We make two contributions. First, we show how to
reparameterize a DPP's kernel matrix with multiple kernel functions, thus
enhancing modeling flexibility. Second, we propose a novel parameter estimation
technique based on the principle of large margin separation. In contrast to the
state-of-the-art method of maximum likelihood estimation, our large-margin loss
function explicitly models errors in selecting the target subsets, and it can
be customized to trade off different types of errors (precision vs. recall).
Extensive empirical studies validate our contributions, including applications
on challenging document and video summarization, where flexibility in modeling
the kernel matrix and balancing different errors is indispensable.
Comment: 15 pages
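For concreteness, here is a small numpy sketch of the object being learned: an L-ensemble DPP with kernel L assigns a subset Y the probability det(L_Y) / det(L + I). The random toy kernel is illustrative; the paper learns the kernel from labeled subsets rather than fixing it by hand.

```python
import numpy as np

# Any positive semi-definite matrix is a valid DPP (L-ensemble) kernel.
rng = np.random.default_rng(0)
B = rng.normal(size=(4, 3))
L = B @ B.T  # toy 4x4 kernel over the ground set {0, 1, 2, 3}

def dpp_prob(L, Y):
    """P(Y) = det(L_Y) / det(L + I), with L_Y the submatrix indexed by Y."""
    L_Y = L[np.ix_(Y, Y)]
    return np.linalg.det(L_Y) / np.linalg.det(L + np.eye(len(L)))

# Similar items (rows of B) make det(L_Y) small, so the DPP favours
# diverse subsets; compare a pair against a singleton.
print(dpp_prob(L, [0, 1]))
print(dpp_prob(L, [2]))
```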