
    What to Fix? Distinguishing between design and non-design rules in automated tools

    Technical debt (design shortcuts taken to optimize for delivery speed) is a critical part of long-term software costs. Consequently, automatically detecting technical debt is a high priority for software practitioners. Software quality tool vendors have responded to this need by positioning their tools to detect and manage technical debt. While these tools bundle a number of rules, it is hard for users to understand which rules identify design issues, as opposed to syntactic quality. This is important, since previous studies have revealed that the most significant technical debt is related to design issues. Other research has focused on comparing these tools on open source projects, but these comparisons have not looked at whether the rules were relevant to design. We conducted an empirical study using a structured categorization approach, manually classifying 466 software quality rules from three industry tools: CAST, SonarQube, and NDepend. We found that most of these rules were easily labeled as either not design (55%) or design (19%); the remainder (26%) resulted in disagreements among the labelers. Our results are a first step toward formalizing a definition of a design rule, in order to support automatic detection.
    Comment: Long version of a short paper accepted at the International Conference on Software Architecture 2017 (Gothenburg, SE).
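
    The three-way split reported above lends itself to a small illustration. The sketch below is my own (the rule names and three-labeler votes are hypothetical, not the paper's data or tooling): it tallies per-rule consensus into the design / not design / disagreement categories the study uses.

```python
# Illustrative sketch, not the paper's tooling: tally unanimous labels
# and disagreements across labelers, per rule.
from collections import Counter

# Hypothetical rules mapped to the votes of three labelers.
labels = {
    "sonarqube:S1067": ["not_design", "not_design", "not_design"],
    "cast:AvoidCyclicDependencies": ["design", "design", "design"],
    "ndepend:MethodTooLong": ["design", "not_design", "not_design"],
}

def consensus(votes):
    """Return the unanimous label, or 'disagreement' if labelers differ."""
    return votes[0] if len(set(votes)) == 1 else "disagreement"

outcome = Counter(consensus(v) for v in labels.values())
total = sum(outcome.values())
for category, count in outcome.items():
    print(f"{category}: {count} ({100 * count / total:.0f}%)")
```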

    Exploring The Value Of Folksonomies For Creating Semantic Metadata

    Finding good keywords to describe resources is an ongoing problem: typically we select such words manually from a thesaurus of terms, or they are created using automatic keyword extraction techniques. Folksonomies are an increasingly well populated source of unstructured tags describing web resources. This paper explores the value of folksonomy tags as a potential source of keyword metadata by examining the relationship between folksonomies, community-produced annotations, and keywords extracted by machines. The experiment was carried out in two ways: subjectively, by asking two human indexers to evaluate the quality of the keywords generated by both systems; and automatically, by measuring the percentage of overlap between the folksonomy tag set and the machine-generated keyword set. The results of this experiment show that folksonomy tags agree more closely with the human-generated keywords than do those generated automatically. The results also show that the trained indexers preferred the semantics of folksonomy tags to those of automatically extracted keywords. These results can be considered evidence of the close relationship between folksonomies and the human indexer’s mindset, demonstrating that the folksonomies used in the del.icio.us bookmarking service are a potential source of semantic metadata for annotating web resources.
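
    The automatic half of the evaluation, measuring set overlap, is simple to sketch. The Python below is an assumed illustration (the function name and the example tags and keywords are hypothetical, not the paper's code): it computes the percentage of folksonomy tags that also appear in the machine-extracted keyword set for the same resource.

```python
# Assumed illustration of the overlap measurement, not the paper's code.
def overlap_percentage(folksonomy_tags, machine_keywords):
    """Percentage of folksonomy tags also found among the machine keywords."""
    tags = {t.lower() for t in folksonomy_tags}
    keywords = {k.lower() for k in machine_keywords}
    if not tags:
        return 0.0
    return 100.0 * len(tags & keywords) / len(tags)

# Hypothetical example resource
tags = ["semantic-web", "metadata", "folksonomy", "tagging"]
keywords = ["metadata", "tagging", "thesaurus"]
print(f"{overlap_percentage(tags, keywords):.1f}% overlap")  # 50.0% overlap
```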

    Machine learning challenges in theoretical HEP

    In these proceedings we perform a brief review of machine learning (ML) applications in theoretical High Energy Physics (HEP-TH). We start the discussion by defining and then classifying machine learning tasks in theoretical HEP. We then discuss some of the most popular and recent published approaches with focus on a relevant case study topic: the determination of parton distribution functions (PDFs) and related tools. Finally, we provide an outlook about future applications and developments due to the synergy between ML and HEP-TH.Comment: 7 pages, 3 figures, in proceedings of the 18th International Workshop on Advanced Computing and Analysis Techniques in Physics Research (ACAT 2017
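
    To make the case study concrete: PDF determination can be framed as a regression problem. The toy sketch below is my illustration, not from the proceedings; it fits the classic parameterization f(x) = A * x^a * (1 - x)^b to hypothetical pseudo-data by least squares, whereas approaches such as NNPDF replace the fixed functional form with a neural network.

```python
# Toy sketch (not from the proceedings): fit a classic PDF-style
# functional form to noisy pseudo-data by least squares.
import numpy as np
from scipy.optimize import curve_fit

def pdf_form(x, A, a, b):
    """Classic PDF-style parameterization on x in (0, 1)."""
    return A * x**a * (1 - x)**b

rng = np.random.default_rng(0)
x = np.linspace(0.05, 0.95, 40)
truth = pdf_form(x, 2.0, -0.3, 3.0)
data = truth + rng.normal(scale=0.02, size=x.size)  # noisy pseudo-data

params, _ = curve_fit(pdf_form, x, data, p0=[1.0, 0.0, 1.0])
print("fitted A, a, b:", np.round(params, 2))
```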

    Expanding the Usage of Web Archives by Recommending Archived Webpages Using Only the URI

    Web archives are a window to view past versions of webpages. When a user requests a webpage on the live Web, such as http://tripadvisor.com/where_to_travel/, the webpage may not be found, which results in a HyperText Transfer Protocol (HTTP) 404 response. The user may then search for the webpage in a Web archive, such as the Internet Archive. Unfortunately, if this page has never been archived, the user will not be able to view it, nor will the user learn of other webpages in the archive with similar content, such as the archived webpage http://classy-travel.net. Similarly, if the user requests the webpage http://hokiesports.com/football/ from the Internet Archive, the user will find only the requested webpage and will not learn of other webpages in the archive with similar content, such as the archived webpage http://techsideline.com. In this research, we build a model for selecting and ranking possible recommended webpages at a Web archive. The goal is to enhance both HTTP 404 and HTTP 200 responses by surfacing webpages in the archive that the user may not know existed. First, we detect semantics in the requested Uniform Resource Identifier (URI). Next, we classify the URI using an ontology, such as DMOZ or any website directory. Finally, we filter and rank candidates based on several features, such as archival quality, webpage popularity, temporal similarity, and content similarity. We measure the performance of each step using different techniques, including calculating the F1 score for the different tokenization methods and for the classification. We tested the model using human evaluation to determine whether we could classify and find recommendations for a sample of requests from the Internet Archive’s Wayback Machine access log. Overall, when the full categorization was used, reviewers agreed with 80.3% of the recommendations, far more often than they chose “do not agree” or “I do not know”. When only the first-level category was used, reviewers agreed with just 25.5% of the recommendations, indicating that deep-level categorization improves the quality of the recommendations.
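
    The first and last steps of the pipeline (URI tokenization and feature-based ranking) can be sketched compactly. The Python below is an assumed illustration, not the dissertation's code: the feature weights and candidate scores are hypothetical, not trained values. It tokenizes a requested URI and ranks candidate archived pages by a weighted sum of the four named features.

```python
# Assumed sketch of URI tokenization and weighted candidate ranking.
import re
from urllib.parse import urlparse

def tokenize_uri(uri):
    """Split a URI's host and path into lowercase word tokens."""
    parsed = urlparse(uri)
    text = f"{parsed.netloc} {parsed.path}"
    return [t for t in re.split(r"[^a-z0-9]+", text.lower()) if t]

# Hypothetical feature scores in [0, 1] per candidate, hand-picked weights.
weights = {"archival_quality": 0.3, "popularity": 0.2,
           "temporal_similarity": 0.2, "content_similarity": 0.3}

candidates = {
    "http://classy-travel.net": {"archival_quality": 0.9, "popularity": 0.6,
                                 "temporal_similarity": 0.7,
                                 "content_similarity": 0.8},
    "http://techsideline.com": {"archival_quality": 0.7, "popularity": 0.8,
                                "temporal_similarity": 0.5,
                                "content_similarity": 0.4},
}

def score(features):
    """Weighted sum of a candidate's feature scores."""
    return sum(weights[name] * value for name, value in features.items())

print(tokenize_uri("http://tripadvisor.com/where_to_travel/"))
for uri, feats in sorted(candidates.items(), key=lambda kv: -score(kv[1])):
    print(f"{score(feats):.2f}  {uri}")
```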

    Assessing Consistency and Fairness in Sentencing: A Comparative Study in Three States

    Summarizes a study of sentencing guidelines in Michigan, Minnesota, and Virginia, comparing levels of predictability and judicial discretion under different guideline systems, their effectiveness in limiting discriminatory disparities, and lessons learned.