    The Minimum Description Length Principle for Pattern Mining: A Survey

    This is about the Minimum Description Length (MDL) principle applied to pattern mining. The length of this description is kept to the minimum. Mining patterns is a core task in data analysis and, beyond issues of efficient enumeration, the selection of patterns constitutes a major challenge. The MDL principle, a model selection method grounded in information theory, has been applied to pattern mining with the aim of obtaining compact, high-quality sets of patterns. After giving an outline of relevant concepts from information theory and coding, as well as of work on the theory behind the MDL and similar principles, we review MDL-based methods for mining various types of data and patterns. Finally, we open a discussion on some issues regarding these methods, and highlight currently active related data analysis problems.

    Knowledge acquisition for effective and efficient use of engineering software

    The problem of effective and efficient use of engineering software can be thought of as a Pareto optimal problem. However, the complexity of modern engineering software precludes the possibility of acquiring complete knowledge of the software's Pareto optimal set. Instead, heuristic knowledge must be acquired. The thesis proposes that heuristic knowledge be acquired via a knowledge acquisition procedure. The use of a knowledge acquisition system, which may be computerised, forms an integral part of this procedure. Two examples of knowledge acquisition illustrate the use of the knowledge acquisition procedure.

    Keeping Research Data Safe 2: Final Report

    The first Keeping Research Data Safe study funded by JISC made a major contribution to understanding of long-term preservation costs for research data by developing a cost model and identifying cost variables for preserving research data in UK universities (Beagrie et al., 2008). However, it was completed over a very constrained timescale of four months with little opportunity to follow up other major issues or sources of preservation cost information it identified. It noted that digital preservation costs are notoriously difficult to address, in part because of the absence of good case studies and longitudinal information for digital preservation costs or cost variables. In January 2009 JISC issued an ITT for a study on the identification of long-lived digital datasets for the purposes of cost analysis. The aim of this work was to provide a larger body of material and evidence against which existing and future data preservation cost modelling exercises could be tested and validated. The proposal for the KRDS2 study was submitted in response by a consortium consisting of 4 partners involved in the original Keeping Research Data Safe study (Universities of Cambridge and Southampton, Charles Beagrie Ltd, and OCLC Research) and 4 new partners with significant data collections and interests in preservation costs (Archaeology Data Service, University of London Computer Centre, University of Oxford, and the UK Data Archive). A range of supplementary materials in support of this main report have been made available on the KRDS2 project website at http://www.beagrie.com/jisc.php. That website will be maintained and continuously updated with future work as a resource for KRDS users.

    Metadiscourse: What is it and where is it going?

    Metadiscourse – the ways in which writers and speakers interact through their use of language with readers and listeners – is a widely used term in current discourse analysis, pragmatics and language teaching. This interest has grown up over the past 40 years driven by a dual purpose. The first is a desire to understand the relationship between language and its contexts of use. That is, how individuals use language to orient to and interpret particular communicative situations, and especially how they draw on their understandings of these to make their intended meanings clear to their interlocutors. The second is to employ this knowledge in the service of language and literacy education. But while many researchers and teachers find it to be a conceptually rich and analytically powerful idea, it is not without difficulties of definition, categorisation and analysis. In this paper I explore the strengths and shortcomings of the concept and map its influence and directions through a state-of-the-art analysis of the main online academic databases and current published research.

    Finding Interpretable Class-Specific Patterns through Efficient Neural Search

    Discovering patterns in data that best describe the differences between classes allows us to hypothesize and reason about class-specific mechanisms. In molecular biology, for example, this bears promise of advancing the understanding of cellular processes differing between tissues or diseases, which could lead to novel treatments. To be useful in practice, methods that tackle the problem of finding such differential patterns have to be readily interpretable by domain experts, and scalable to extremely high-dimensional data. In this work, we propose a novel, inherently interpretable binary neural network architecture, DiffNaps, that extracts differential patterns from data. DiffNaps is scalable to hundreds of thousands of features and robust to noise, thus overcoming the limitations of current state-of-the-art methods in large-scale applications such as in biology. We show on synthetic and real-world data, including three biological applications, that, unlike its competitors, DiffNaps consistently yields accurate, succinct, and interpretable class descriptions.

    Gender, sex and sexuality in two open access communication journals published in Portugal: a critical overview of current discursive practices

    The links between gender, sex and sexuality and their relevance are theoretically and politically problematic (Richardson, 2007). One of the difficulties in understanding their interconnections is that these terms are often used differently and ambiguously by different authors (and even by the same authors). This article reports the results of an analysis of the articles published in open access communication journals with known impact factor, edited in Portugal and published between 2005 and 2012. The diverse conceptualisations of those three basic concepts and of their (inter)relationships within communication research are identified. The complexity and the intricate (and often implicit) nature of both the meanings of these categories and their relationships underlie and justify our attention and further research. What the findings suggest about the current communication research into gender issues published in the two journals surveyed is that the ‘Gender differences discourse’ (Sunderland, 2004) is the most pervasive discourse (also) in academic practice. Additionally, they show that gender and sex are mainly taken for a fact, not a question that is worth being studied. The editors of these journals, as well as the scholars submitting manuscripts, need to be more aware of the traditional nature of the theoretical and methodological choices that they make regarding gender- and sex-related issues, as well as of the relative lack of attention to sexuality as a research subject. Funding: Compete and Quadro de Referência Estratégica Nacional (QREN).

    Healthcare in Italy: expenditure determinants and regional differentials

    The aim of this work is to identify the determinants of health spending differentials among Italian regions, which could highlight the existence of potential margins for savings. The analysis exploits a dataset for the panel of the 21 Italian regions starting in the early 1990s and ending in 2006. After having controlled for standard healthcare demand indicators, spending differentials appear to be associated with differences in the degree of appropriateness of the treatments, supply structure and social capital indicators. These results suggest that savings could be achieved without reducing the amount of services supplied to citizens. This is particularly important in view of the expected rise in health spending associated with the forecast demographic developments. Keywords: government expenditure, health, regional variation