    Identification-method research for open-source software ecosystems

    In recent years, open-source software (OSS) development has grown, with many developers around the world working on different OSS projects. A variety of open-source software ecosystems have emerged, for instance GitHub, StackOverflow, and SourceForge. GitHub, one of the most prominent social-programming and code-hosting sites, has gathered numerous open-source projects and developers on a single virtual collaboration platform. Since GitHub is itself a large open-source community, it hosts collections of software projects that are developed together and coevolve. The central challenge is identifying the relationships between these projects, i.e., project relevance. Software-ecosystem identification is the basis of other studies of ecosystems, so extracting useful information from GitHub and identifying software ecosystems is particularly important; it is also a research area in symmetry. In this paper, a Topic-based Project Knowledge Metrics Framework (TPKMF) is proposed. By collecting a multisource dataset of an open-source ecosystem, project-relevance analysis of the open-source software is carried out as the basis of software-ecosystem identification. We then use our Spectral Clustering algorithm based on Core Project (CP-SC) to identify software-ecosystem projects and, from these, software ecosystems. We verify that most software ecosystems contain a core software project with which most other projects are associated. Furthermore, we analyze the characteristics of the ecosystems and find that interactive information has a greater impact on project relevance. Finally, we summarize the Topic-based Project Knowledge Metrics Framework.
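
    The CP-SC algorithm is only named in this abstract, not specified; the following is a minimal sketch of the general idea under stated assumptions: spectral clustering over a precomputed project-relevance matrix, with each cluster's core project taken to be its most strongly connected member. The relevance values, project names, and core-selection heuristic are all illustrative, not the authors'.

    ```python
    # Minimal sketch (not the paper's CP-SC implementation): spectral clustering
    # over a hypothetical project-relevance matrix, then choosing each cluster's
    # "core" project as the member most strongly related to the rest.
    import numpy as np
    from sklearn.cluster import SpectralClustering

    # Hypothetical symmetric relevance matrix for six projects (values in [0, 1]).
    relevance = np.array([
        [1.0, 0.9, 0.8, 0.1, 0.0, 0.1],
        [0.9, 1.0, 0.7, 0.0, 0.1, 0.0],
        [0.8, 0.7, 1.0, 0.1, 0.0, 0.1],
        [0.1, 0.0, 0.1, 1.0, 0.8, 0.9],
        [0.0, 0.1, 0.0, 0.8, 1.0, 0.7],
        [0.1, 0.0, 0.1, 0.9, 0.7, 1.0],
    ])
    projects = ["p0", "p1", "p2", "p3", "p4", "p5"]

    labels = SpectralClustering(
        n_clusters=2, affinity="precomputed", random_state=0
    ).fit_predict(relevance)

    for k in sorted(set(labels)):
        members = [i for i, lab in enumerate(labels) if lab == k]
        # The member with the highest total relevance to its own cluster is
        # treated as the ecosystem's core project.
        core = max(members, key=lambda i: relevance[i, members].sum())
        print(f"ecosystem {k}: {[projects[i] for i in members]} core={projects[core]}")
    ```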

    Users and Assessors in the Context of INEX: Are Relevance Dimensions Relevant?

    The main aspects of XML retrieval are identified by analysing and comparing the following two behaviours: the behaviour of the assessor when judging the relevance of returned document components, and the behaviour of users when interacting with components of XML documents. We argue that the two INEX relevance dimensions, Exhaustivity and Specificity, are not orthogonal; indeed, an empirical analysis of each dimension reveals that the grades of the two dimensions are correlated with each other. By analysing the level of agreement between the assessor and the users, we aim to identify the best units of retrieval. The results of our analysis show that the highest level of agreement is on highly relevant and on non-relevant document components, suggesting that only the end points of the INEX 10-point relevance scale are perceived in the same way by both the assessor and the users. We propose a new definition of relevance for XML retrieval and argue that its corresponding relevance scale would be a better choice for INEX.
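
    The correlation claim lends itself to a small check; below is a minimal sketch, with made-up grade pairs rather than actual INEX judgments, of how one might test whether Exhaustivity and Specificity grades move together.

    ```python
    # Minimal sketch (illustrative grades, not INEX data): testing whether the
    # Exhaustivity and Specificity grades assigned to document components are
    # correlated rather than orthogonal.
    from scipy.stats import spearmanr

    # Hypothetical (exhaustivity, specificity) grades on a 0-3 scale,
    # one pair per judged document component.
    exhaustivity = [0, 0, 1, 1, 2, 2, 3, 3, 3, 2]
    specificity = [0, 1, 1, 2, 1, 2, 2, 3, 3, 3]

    rho, p = spearmanr(exhaustivity, specificity)
    # A high rank correlation would support the abstract's argument that the
    # two dimensions are not orthogonal.
    print(f"Spearman rho = {rho:.2f}, p = {p:.4f}")
    ```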

    Digitometric Services for Open Archives Environments

    We describe “digitometric” services and tools that add value to open-access eprint archives using the Open Archives Initiative (OAI) Protocol for Metadata Harvesting. Celestial is an OAI cache and gateway tool. Citebase Search enhances OAI-harvested metadata with linked references harvested from the full-text to provide a web service for citation navigation and research impact analysis. Digitometrics builds on data harvested using OAI to provide advanced visualisation and hypertext navigation for the research community. Together these services provide a modular, distributed architecture for building a “semantic web” for the research literature.
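
    All three services sit on top of the OAI Protocol for Metadata Harvesting; as a point of reference, here is a minimal sketch of a ListRecords harvest in Python. The repository endpoint is a placeholder, not one of the services named above.

    ```python
    # Minimal sketch of an OAI-PMH ListRecords request, the harvesting protocol
    # these services build on. The endpoint URL is a placeholder.
    import urllib.request
    import xml.etree.ElementTree as ET

    OAI = "{http://www.openarchives.org/OAI/2.0/}"
    DC = "{http://purl.org/dc/elements/1.1/}"

    url = ("https://example.org/oai"  # placeholder repository endpoint
           "?verb=ListRecords&metadataPrefix=oai_dc")
    with urllib.request.urlopen(url) as response:
        tree = ET.parse(response)

    # Print the Dublin Core title of every harvested record.
    for record in tree.iter(f"{OAI}record"):
        for title in record.iter(f"{DC}title"):
            print(title.text)
    ```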

    The Royal Birth of 2013: Analysing and Visualising Public Sentiment in the UK Using Twitter

    Analysis of information retrieved from microblogging services such as Twitter can provide valuable insight into public sentiment in a geographic region. This insight can be enriched by visualising information in its geographic context. Two underlying approaches to sentiment analysis are dictionary-based and machine learning. The former is popular for public sentiment analysis, while the latter has found limited use for aggregating public sentiment from Twitter data. The research presented in this paper aims to extend the machine learning approach to aggregating public sentiment. To this end, a framework for analysing and visualising public sentiment from a Twitter corpus is developed. A dictionary-based approach and a machine learning approach are implemented within the framework and compared using one UK case study, namely the royal birth of 2013. The case study validates the feasibility of the framework for analysis and rapid visualisation. One observation is that there is good correlation between the results produced by the popular dictionary-based approach and the machine learning approach when large volumes of tweets are analysed. However, for rapid analysis to be possible, faster methods need to be developed using big data techniques and parallel methods. (Comment: http://www.blessonv.com/research/publicsentiment/; 9 pages; submitted to IEEE BigData 2013: Workshop on Big Humanities, October 2013.)
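
    For concreteness, the two approaches named above can be sketched as follows; the lexicon, training tweets, and labels are tiny made-up examples, not the paper's framework.

    ```python
    # Minimal sketch of the two approaches compared in the paper: a dictionary
    # (lexicon) score versus a trained classifier. All data here is made up.
    positive = {"great", "joy", "love", "happy"}
    negative = {"sad", "awful", "angry", "hate"}

    def dictionary_score(tweet: str) -> int:
        """Count positive words minus negative words."""
        words = tweet.lower().split()
        return sum(w in positive for w in words) - sum(w in negative for w in words)

    print(dictionary_score("great joy at the royal birth"))  # positive score

    # Machine-learning counterpart: a bag-of-words Naive Bayes classifier.
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.naive_bayes import MultinomialNB
    from sklearn.pipeline import make_pipeline

    train_tweets = ["what a happy day", "love this news",
                    "awful weather", "so sad today"]
    train_labels = ["pos", "pos", "neg", "neg"]

    model = make_pipeline(CountVectorizer(), MultinomialNB())
    model.fit(train_tweets, train_labels)
    print(model.predict(["happy news today"]))  # -> ['pos']
    ```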

    The Statistical Analysis of Star Clusters

    We review a range of statistical methods for analyzing the structures of star clusters, and derive a new measure, Q, which both quantifies, and distinguishes between, a (relatively smooth) large-scale radial density gradient and multi-scale (fractal) sub-clustering. Q is derived from the normalised correlation length and the normalised edge length of the minimal spanning tree for each cluster.
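
    A minimal sketch of computing a Q-type measure follows; the normalisation choices (cluster radius and a circular cluster area) follow common conventions and are assumptions here, not details taken from the abstract.

    ```python
    # Minimal sketch of a Q-type measure: the ratio of the normalised mean
    # edge length of the minimal spanning tree to the normalised correlation
    # length (mean pairwise separation). Normalisation choices are assumptions.
    import numpy as np
    from scipy.sparse.csgraph import minimum_spanning_tree
    from scipy.spatial.distance import pdist, squareform

    def q_measure(points: np.ndarray) -> float:
        n = len(points)
        mst = minimum_spanning_tree(squareform(pdist(points)))
        mean_edge = mst.sum() / (n - 1)

        centre = points.mean(axis=0)
        radius = np.max(np.linalg.norm(points - centre, axis=1))
        area = np.pi * radius**2  # assume a circular cluster area

        m_bar = mean_edge / (np.sqrt(n * area) / (n - 1))  # normalised MST edge length
        s_bar = pdist(points).mean() / radius              # normalised correlation length
        return m_bar / s_bar

    rng = np.random.default_rng(0)
    # Uniform 2D scatter (no gradient, no sub-clustering) gives Q around 0.7.
    print(q_measure(rng.uniform(size=(200, 2))))
    ```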

    A methodology for analysing and evaluating narratives in annual reports: a comprehensive descriptive profile and metrics for disclosure quality attributes

    There is a consensus that the business reporting model needs to expand to serve the changing information needs of the market and provide the information required for enhanced corporate transparency and accountability. Worldwide, regulators view narrative disclosures as the key to achieving the desired step-change in the quality of corporate reporting. In recent years, accounting researchers have increasingly focused their efforts on investigating disclosure, and it is now recognised that there is an urgent need to develop disclosure metrics to facilitate research into voluntary disclosure and quality [Core, J. E. (2001). A review of the empirical disclosure literature. Journal of Accounting and Economics, 31(3), 441–456]. This paper responds to this call and contributes in two principal ways. First, the paper introduces to the academic literature a comprehensive four-dimensional framework for the holistic content analysis of accounting narratives and presents a computer-assisted methodology for implementing this framework. This procedure provides a rich descriptive profile of a company's narrative disclosures based on the coding of topic and three type attributes. Second, the paper explores the complex concept of quality, and the problematic nature of quality measurement. It makes a preliminary attempt to identify some of the attributes of quality (such as relative amount of disclosure and topic spread), suggests observable proxies for these, and offers a tentative summary measure of disclosure quality.
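
    To make the proxies concrete, here is a minimal sketch, with an invented coding scheme and weights (not the paper's), of deriving relative amount of disclosure and topic spread from coded narrative text units, then combining them into a tentative summary score.

    ```python
    # Minimal sketch (invented coding scheme and weights): two quality proxies
    # named in the abstract - relative amount of disclosure and topic spread -
    # computed from coded narrative text units, then combined.
    from collections import Counter

    # Hypothetical coded text units from one annual report: (topic, word count).
    coded_units = [
        ("strategy", 120), ("strategy", 80), ("environment", 60),
        ("risk", 150), ("risk", 40), ("employees", 30),
    ]

    words_by_topic = Counter()
    for topic, words in coded_units:
        words_by_topic[topic] += words
    total_words = sum(words_by_topic.values())

    BENCHMARK_WORDS = 10_000  # assumed benchmark narrative length
    TOPIC_SCHEME_SIZE = 8     # assumed number of topics in the coding scheme

    amount = min(total_words / BENCHMARK_WORDS, 1.0)  # relative amount of disclosure
    spread = len(words_by_topic) / TOPIC_SCHEME_SIZE  # topic spread

    quality = 0.5 * amount + 0.5 * spread  # tentative equal-weight summary measure
    print(f"amount={amount:.2f} spread={spread:.2f} quality={quality:.2f}")
    ```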