
    From a Domain Analysis to the Specification and Detection of Code and Design Smells

    Code and design smells are recurring design problems in software systems that must be identified to avoid their possible negative consequences on development and maintenance. Consequently, several smell detection approaches and tools have been proposed in the literature. However, so far, they allow the detection of predefined smells, while the detection of new smells, or of smells adapted to the context of the analysed systems, is possible only by implementing new detection algorithms manually. Moreover, previous approaches do not explain the transition from specifications of smells to their detection. Finally, the validation of existing approaches and tools has been limited to a few proprietary systems and a reduced number of smells. In this paper, we introduce an approach to automate the generation of detection algorithms from specifications written using a domain-specific language. This language is defined from a thorough domain analysis. It allows the specification of smells using high-level domain-related abstractions, and the adaptation of these specifications to the context of the analysed systems. We specify 10 smells, automatically generate their detection algorithms using templates, and validate the algorithms in terms of precision and recall on Xerces v2.7.0 and GanttProject v1.10.2, two open-source object-oriented systems. We also compare the detection results with those of a previous approach, iPlasma.
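
    The abstract does not reproduce the paper's domain-specific language, so as a rough illustration of the underlying idea, generating a detection check from a declarative smell specification, here is a minimal Python sketch. The metric names, thresholds, and Blob-style rule are illustrative assumptions, not the paper's DSL.

```python
# Minimal sketch: a declarative smell "specification" interpreted by a
# generic detector. Metric names and thresholds are assumptions, not the
# paper's specification language.

from dataclasses import dataclass

@dataclass
class ClassMetrics:
    name: str
    methods: int        # number of methods declared
    attributes: int     # number of attributes declared
    cohesion: float     # LCOM-style cohesion in [0, 1] (hypothetical)

# A specification is a named set of metric predicates (Blob-like smell).
BLOB_SPEC = {
    "methods": lambda v: v > 20,
    "attributes": lambda v: v > 15,
    "cohesion": lambda v: v < 0.3,
}

def detect(spec: dict, classes: list[ClassMetrics]) -> list[str]:
    """Return names of classes satisfying every predicate in the spec."""
    return [
        c.name
        for c in classes
        if all(pred(getattr(c, metric)) for metric, pred in spec.items())
    ]

if __name__ == "__main__":
    sample = [
        ClassMetrics("Parser", methods=12, attributes=6, cohesion=0.7),
        ClassMetrics("GodController", methods=42, attributes=30, cohesion=0.1),
    ]
    print(detect(BLOB_SPEC, sample))  # ['GodController']
```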

    Text Mining Infrastructure in R

    During the last decade text mining has become a widely used discipline utilizing statistical and machine learning methods. We present the tm package which provides a framework for text mining applications within R. We give a survey on text mining facilities in R and explain how typical application tasks can be carried out using our framework. We present techniques for count-based analysis methods, text clustering, text classification and string kernels.
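
    The tm framework itself is an R package, which this listing does not reproduce; the following Python sketch, using scikit-learn as a stand-in, only mirrors the count-based workflow the abstract mentions (corpus, preprocessing, document-term matrix). None of the names below come from tm's API.

```python
# Analogous count-based text mining pipeline in Python (assumption: the
# scikit-learn API stands in for tm's corpus transformations).

from sklearn.feature_extraction.text import CountVectorizer

corpus = [
    "Text mining extracts structure from raw text.",
    "Machine learning methods classify and cluster text.",
]

# Lowercasing and stop-word removal stand in for tm's preprocessing step.
vectorizer = CountVectorizer(lowercase=True, stop_words="english")
dtm = vectorizer.fit_transform(corpus)  # sparse document-term matrix

print(vectorizer.get_feature_names_out())
print(dtm.toarray())
```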

    Structured text retrieval by means of affordances and genre

    This paper offers a proposal for some preliminary research on the retrieval of structured text, such as extensible mark-up language (XML). We believe that capturing the way in which a reader perceives the meaning of documents, especially genres of text, may have implications for information retrieval (IR) and, in particular, for cognitive IR and relevance. Previous research on shallow features of structured text has shown that categorization by form is possible. Gibson's theory of affordances and genre offer the reader the meaning and purpose of a text, through its structure, before the reader has even begun to read it, and should therefore provide a good basis for the deep skimming and categorization of texts. We believe that Gibson's affordances will aid the user to locate, examine and utilize shallow or deep features of genres and retrieve relevant output. Our proposal puts forward two hypotheses, with a list of research questions to test them, and culminates in experiments studying human categorization behaviour when viewing the structures of emails and web documents. Finally, we will examine the effectiveness of adding structural layout cues to a Yahoo discussion forum, which is rich in structure but currently treated as a bag-of-words and searchable only through a Boolean search engine.
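
    As a hypothetical illustration of the kind of shallow structural features such categorization-by-form experiments might use, here is a small Python sketch that profiles the layout-bearing tags of a document; the tag set and feature choice are assumptions, not the authors' experimental design.

```python
# Sketch: shallow structural profile of a document as a feature vector
# for genre categorization. Tag choices are illustrative assumptions.

from collections import Counter
from html.parser import HTMLParser

class StructureProfiler(HTMLParser):
    """Counts layout-bearing tags as a crude structural signature."""
    LAYOUT_TAGS = {"h1", "h2", "h3", "p", "ul", "ol", "li",
                   "table", "blockquote", "a"}

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        if tag in self.LAYOUT_TAGS:
            self.counts[tag] += 1

def structural_features(html: str) -> dict:
    profiler = StructureProfiler()
    profiler.feed(html)
    return dict(profiler.counts)

if __name__ == "__main__":
    doc = "<h1>FAQ</h1><ul><li>Q: ...</li><li>A: ...</li></ul><p>Intro</p>"
    print(structural_features(doc))  # {'h1': 1, 'ul': 1, 'li': 2, 'p': 1}
```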

    Holistic recommender systems for software engineering

    The knowledge possessed by developers is often not sufficient to overcome a programming problem. Short of talking to teammates, when available, developers often gather additional knowledge from development artifacts (e.g., project documentation), as well as online resources. The web has become an essential component of the modern developer’s daily life, providing a plethora of information from sources like forums, tutorials, Q&A websites, API documentation, and even video tutorials. Recommender Systems for Software Engineering (RSSE) provide developers with assistance to navigate the information space, automatically suggest useful items, and reduce the time required to locate the needed information. Current RSSEs consider development artifacts as containers of homogeneous information in the form of pure text. However, text is a means to represent heterogeneous information provided by, for example, natural language, source code, interchange formats (e.g., XML, JSON), and stack traces. Interpreting the information from a purely textual point of view misses the intrinsic heterogeneity of the artifacts, thus leading to a reductionist approach. We propose the concept of Holistic Recommender Systems for Software Engineering (H-RSSE), i.e., RSSEs that go beyond the textual interpretation of the information contained in development artifacts. Our thesis is that modeling and aggregating information in a holistic fashion enables novel and advanced analyses of development artifacts. To validate our thesis we developed a framework to extract, model and analyze information contained in development artifacts in a reusable meta-information model. We show how RSSEs benefit from a meta-information model, since it enables customized and novel analyses built on top of our framework. The information can thus be reinterpreted from a holistic point of view, preserving its multi-dimensionality and opening the path towards the concept of holistic recommender systems for software engineering.
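
    A minimal sketch of the holistic idea, under assumed fragment types and heuristics rather than the thesis's actual meta-information model: instead of treating an artifact as one bag of words, split it into typed fragments (natural language, code, stack trace, interchange format).

```python
# Sketch: classify each line of a development artifact into a fragment
# type instead of treating the whole artifact as plain text. The types
# and regex heuristics are illustrative assumptions.

import json
import re

STACK_FRAME = re.compile(r"^\s*at\s+[\w.$]+\(.*\)$")   # Java-style frame
CODE_HINT = re.compile(r"[;{}]|\bdef\b|\bclass\b|=")    # crude code cue

def classify_line(line: str) -> str:
    stripped = line.strip()
    if STACK_FRAME.match(line):
        return "stacktrace"
    try:
        json.loads(stripped)
        return "json"
    except ValueError:
        pass
    if CODE_HINT.search(stripped):
        return "code"
    return "text"

artifact = """The service crashes on startup.
    at com.example.Main.run(Main.java:42)
{"level": "ERROR", "msg": "boom"}
int retries = 3;
"""

for line in artifact.splitlines():
    if line.strip():
        print(classify_line(line), "|", line.strip())
```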

    Global Diffusion of the Internet XV: Web 2.0 Technologies, Principles, and Applications: A Conceptual Framework from Technology Push and Demand Pull Perspective

    Web 2.0, the current Internet evolution, can be described by several key features of an expanded Web that is more interactive; allows easy social interactions through participation and collaboration from a variety of human sectors; responds more immediately to users' queries and needs; is easier to search; and provides a faster, smoother, more realistic and engaging user search capability, often with automatic updates to users. The purpose of this study is three-fold. First, the primary goal is to propose a conceptual Web 2.0 framework that provides a better understanding of the Web 2.0 concept by classifying its current key components in a holistic manner. Second, using several selected key components from the conceptual framework, this study conducts case analyses of Web 2.0 applications to discuss how they have adopted the selected key features (i.e., participation, collaboration, rich user experience, social networking, semantics, and interactivity and responsiveness) of the conceptual Web 2.0 framework. Finally, the study provides an insightful discussion of some challenges and opportunities provided by Web 2.0 to education, business, and social life.

    Web Data Extraction, Applications and Techniques: A Survey

    Web Data Extraction is an important problem that has been studied by means of different scientific tools and in a broad range of applications. Many approaches to extracting data from the Web have been designed to solve specific problems and operate in ad-hoc domains. Other approaches, instead, heavily reuse techniques and algorithms developed in the field of Information Extraction. This survey aims at providing a structured and comprehensive overview of the literature in the field of Web Data Extraction. We provide a simple classification framework in which existing Web Data Extraction applications are grouped into two main classes, namely applications at the Enterprise level and at the Social Web level. At the Enterprise level, Web Data Extraction techniques emerge as a key tool for performing data analysis in Business and Competitive Intelligence systems as well as for business process re-engineering. At the Social Web level, Web Data Extraction techniques make it possible to gather the large amount of structured data continuously generated and disseminated by Web 2.0, Social Media and Online Social Network users, which offers unprecedented opportunities to analyze human behavior at a very large scale. We also discuss the potential for cross-fertilization, i.e., the possibility of reusing Web Data Extraction techniques originally designed for a given domain in other domains.
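
    As a toy example of a "wrapper" in the Web Data Extraction sense, the following Python sketch encodes extraction rules against an assumed, well-formed page layout; the field names and page structure are illustrative, not drawn from the survey.

```python
# Toy wrapper: extraction rules anchored on a repeated container element.
# The page layout ("div.product" with "name"/"price" spans) is an assumed
# example, and real pages would need a tolerant HTML parser.

import xml.etree.ElementTree as ET

PAGE = """
<html><body>
  <div class="product"><span class="name">Widget</span><span class="price">9.99</span></div>
  <div class="product"><span class="name">Gadget</span><span class="price">19.50</span></div>
</body></html>
"""

def extract_products(page: str) -> list[dict]:
    root = ET.fromstring(page)
    records = []
    for div in root.iter("div"):
        if div.get("class") != "product":
            continue
        # One rule per field: map each span's class to its text content.
        records.append({span.get("class"): span.text for span in div.iter("span")})
    return records

print(extract_products(PAGE))
# [{'name': 'Widget', 'price': '9.99'}, {'name': 'Gadget', 'price': '19.50'}]
```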

    Context Aware Textual Entailment

    In conversations, stories, news reporting, and other forms of natural language, understanding requires participants to make assumptions (hypotheses) based on background knowledge, a process called entailment. These assumptions may then be supported, contradicted, or refined as a conversation or story progresses, additional facts become known, and context changes. Often we do not know an aspect of the story with certainty but rather believe it to be so; i.e., what we know is associated with uncertainty or ambiguity. In this research, a method has been developed to identify the different contexts of the input raw text along with specific features of those contexts, such as time, location, and objects. The method includes a two-phase SVM classifier along with a voting mechanism in the second phase to identify the contexts. Rule-based algorithms were utilized to extract the context elements. This research also develops a new context-aware text representation. This representation maintains semantic aspects of sentences as well as textual contexts and context elements. The method can offer both a graph representation and a First-Order Logic (FOL) representation of the text. This research also extracts FOL and XML representations of a text or series of texts. The method includes entailment using background knowledge from external sources (VerbOcean and WordNet), resolution of conflicts between extracted clauses, and handling of the role of context in resolving uncertain truth.
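
    The paper's features, labels, and voting rule are not given in the abstract; the sketch below shows one plausible shape of a two-phase classifier with second-phase voting, using scikit-learn SVMs. The labels, training data, and majority-vote rule are all assumptions.

```python
# Sketch: phase 1 classifies each sentence into a context label; phase 2
# aggregates sentence predictions by majority vote. Labels and data are
# illustrative assumptions, not the paper's experimental setup.

from collections import Counter
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC

sentences = [
    ("The meeting starts at noon in Berlin.", "event"),
    ("She believed the package had arrived.", "belief"),
    ("The train leaves the station at 9 am.", "event"),
    ("He doubts the report is accurate.", "belief"),
]
texts, labels = zip(*sentences)

vec = TfidfVectorizer()
X = vec.fit_transform(texts)

# Phase 1: per-sentence context classifier.
clf = LinearSVC().fit(X, labels)

def document_context(doc_sentences: list[str]) -> str:
    """Phase 2: majority vote over phase-1 sentence predictions."""
    preds = clf.predict(vec.transform(doc_sentences))
    return Counter(preds).most_common(1)[0][0]

print(document_context(["The concert begins at 8 pm.",
                        "Doors open at the venue an hour early."]))
```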