
    Searching for rules to detect defective modules: A subgroup discovery approach

    Data mining methods in software engineering are becoming increasingly important as they can support several aspects of the software development life-cycle, such as quality. In this work, we present a data mining approach to induce rules, extracted from static software metrics, that characterise fault-prone modules. Due to the special characteristics of defect prediction data (class imbalance, inconsistency, redundancy), not all classification algorithms can deal with this task conveniently. To address these problems, Subgroup Discovery (SD) algorithms can be used to find groups of statistically different data given a property of interest. We propose EDER-SD (Evolutionary Decision Rules for Subgroup Discovery), an SD algorithm based on evolutionary computation that induces rules describing only fault-prone modules. Rules are a well-known model representation that can be easily understood and applied by project managers and quality engineers, and can thus help them to develop software systems that can be justifiably trusted. Contrary to other approaches in SD, our algorithm has the advantage of working with continuous variables, as the conditions of the rules are defined using intervals. We describe the rules obtained by applying our algorithm to seven publicly available datasets from the PROMISE repository, showing that they are capable of characterising subgroups of fault-prone modules. We also compare our results with three other well-known SD algorithms, and EDER-SD performs well in most cases.
    Ministerio de Educación y Ciencia TIN2007-68084-C02-00; Ministerio de Educación y Ciencia TIN2010-21715-C02-0
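    To illustrate the kind of output the abstract describes, an SD rule over continuous software metrics can be written as a conjunction of interval conditions; a module is covered by the rule when every metric falls inside its interval. The sketch below is illustrative only: the metric names and thresholds are hypothetical and are not taken from the paper.

    ```python
    # Hedged sketch: evaluating interval-based subgroup rules over static code
    # metrics. Metric names (loc, cyclomatic) and thresholds are illustrative.

    def rule_matches(module, rule):
        """A rule maps each metric name to a (low, high) interval.
        A module is covered if every listed metric lies inside its interval."""
        return all(low <= module[metric] <= high
                   for metric, (low, high) in rule.items())

    # Hypothetical rule: "large, complex modules tend to be fault-prone".
    rule = {"loc": (300, float("inf")), "cyclomatic": (10, float("inf"))}

    modules = [
        {"loc": 450, "cyclomatic": 14},  # covered by the rule
        {"loc": 120, "cyclomatic": 4},   # not covered
    ]

    flagged = [m for m in modules if rule_matches(m, rule)]
    print(len(flagged))  # 1
    ```

    An evolutionary SD algorithm such as EDER-SD would search over the interval bounds themselves, scoring candidate rules by how strongly their covered subgroup is enriched in fault-prone modules.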

    Establishment of an integrative multi-omics expression database CKDdb in the context of chronic kidney disease (CKD)

    Complex human traits such as chronic kidney disease (CKD) are a major health and financial burden in modern societies. Currently, the onset and progression of CKD at the molecular level is still not fully understood. Meanwhile, the prolific use of high-throughput omic technologies in disease biomarker discovery studies has yielded a vast amount of disjointed data that cannot be easily collated. We therefore aimed to develop a molecule-centric database featuring CKD-related experiments from the available literature. We established the Chronic Kidney Disease database CKDdb, an integrated and clustered information resource that covers multi-omic studies (microRNAs, genomics, peptidomics, proteomics and metabolomics) of CKD and related disorders, built by literature data mining and manual curation. The CKDdb database contains differential expression data for 49,395 molecule entries (redundant), of which 16,885 are unique molecules (non-redundant), drawn from 377 manually curated studies in 230 publications. The database was intentionally built to support disease pathway analysis through a systems approach, integrating all existing information to yield biological meaning, and therefore has the potential to provide an in-depth understanding of the key molecular events that modulate CKD pathogenesis.

    A sectoral analysis of Barbados’ GDP business cycle

    This paper has two main objectives. Firstly, to establish and characterise a reference cycle (based on real output) for Barbados over the quarterly period 1974-2003 using the Bry and Boschan algorithm. Secondly, to link this aggregate output cycle to the cycles of the individual sectors that comprise real output. The overriding conclusions are that the cycles of tourism and of wholesale and retail closely resemble the aggregate business cycle, while the non-sugar agriculture and fishing cycle is acyclical.
    Barbados; Gross Domestic Product; Business Cycle
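    At its core, the Bry and Boschan procedure dates cycle turning points as local maxima (peaks) and minima (troughs) of the series within a symmetric window, before applying further censoring rules. A minimal sketch of that turning-point step follows; the window width and the toy series are assumptions, and the full algorithm's smoothing, phase-length and peak/trough alternation rules are omitted.

    ```python
    # Minimal sketch of the turning-point step behind Bry-Boschan dating:
    # a peak (trough) at time t is a strict local maximum (minimum) of the
    # series within +/- k quarters. The full procedure adds smoothing and
    # censoring rules (minimum phase length, alternation of peaks/troughs).

    def turning_points(series, k=2):
        peaks, troughs = [], []
        for t in range(k, len(series) - k):
            window = series[t - k : t + k + 1]
            if series[t] == max(window) and window.count(series[t]) == 1:
                peaks.append(t)
            elif series[t] == min(window) and window.count(series[t]) == 1:
                troughs.append(t)
        return peaks, troughs

    # Toy quarterly output index with one clear peak and one clear trough.
    y = [100, 102, 105, 103, 99, 96, 98, 101, 104]
    print(turning_points(y))  # ([2], [5])
    ```

    Linking a sectoral cycle to the reference cycle then amounts to comparing the dated turning points (or cross-correlating the two series) to classify the sector as procyclical, countercyclical or acyclical.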

    The Computer Science Ontology: A Large-Scale Taxonomy of Research Areas

    Ontologies of research areas are important tools for characterising, exploring, and analysing the research landscape. Some fields of research are comprehensively described by large-scale taxonomies, e.g., MeSH in Biology and PhySH in Physics. Conversely, current Computer Science taxonomies are coarse-grained and tend to evolve slowly. For instance, the ACM classification scheme contains only about 2K research topics and its last version dates back to 2012. In this paper, we introduce the Computer Science Ontology (CSO), a large-scale, automatically generated ontology of research areas, which includes about 26K topics and 226K semantic relationships. It was created by applying the Klink-2 algorithm to a very large dataset of 16M scientific articles. CSO presents two main advantages over the alternatives: i) it includes a very large number of topics that do not appear in other classifications, and ii) it can be updated automatically by running Klink-2 on recent corpora of publications. CSO powers several tools adopted by the editorial team at Springer Nature and has been used to enable a variety of solutions, such as classifying research publications, detecting research communities, and predicting research trends. To facilitate the uptake of CSO, we have developed the CSO Portal, a web application that enables users to download, explore, and provide granular feedback on CSO at different levels. Users can use the portal to rate topics and relationships, suggest missing relationships, and visualise sections of the ontology. The portal will support the publication of, and access to, regular new releases of CSO, with the aim of providing a comprehensive resource to the various communities engaged with scholarly data.

    A new model to support the personalised management of a quality e-commerce service

    The paper presents a decision-aiding model to support the management of a high-quality e-commerce service. The approach focuses on the service quality aspects related to customer relationship management (CRM). Knowing the individual characteristics of a customer, it is possible to supply a personalised, high-quality service. A segmentation model, based on the "relationship evolution" between users and the Web site, is developed. The method permits the provision of a specific service management strategy for each user segment. Finally, some preliminary experimental results from a sport-clothing industry application are described.

    3D geological models and their hydrogeological applications: supporting urban development: a case study in Glasgow-Clyde, UK

    Urban planners and developers in some parts of the United Kingdom can now access geodata in an easy-to-retrieve and understandable format. 3D attributed geological framework models and associated GIS outputs, developed by the British Geological Survey (BGS), provide a predictive tool for planning site investigations for some of the UK's largest regeneration projects in the Thames and Clyde River catchments. Using the 3D models, planners can preview the properties of the subsurface with virtual cross-section and borehole tools in visualisation software, allowing critical decisions to be made before any expensive site investigation takes place, potentially saving time and money. The 3D models can integrate artificial deposits, superficial deposits and bedrock geology, and can be used to recognise major resources (such as water, thermal energy, and sand and gravel), for example in buried valleys, as well as for groundwater modelling and for assessing the impacts of underground mining. A preliminary groundwater recharge and flow model for a pilot area in Glasgow has been developed using the 3D geological models as a framework. This paper focuses on the River Clyde and the Glasgow conurbation, and in particular on the BGS's Clyde Urban Super-Project (CUSP), which supports major regeneration projects in and around the City of Glasgow in the West of Scotland.