    Same Difference: Detecting Collusion by Finding Unusual Shared Elements

    Pam Green, Peter Lane, Austen Rainer, Sven-Bodo Scholz, Steve Bennett, ‘Same Difference: Detecting Collusion by Finding Unusual Shared Elements’, paper presented at the 5th International Plagiarism Conference, Sage Gateshead, Newcastle, UK, 17-18 July, 2012.
    Many academic staff will recognise that unusual shared elements in student submissions trigger suspicion of inappropriate collusion. These elements may be odd phrases, strange constructs, peculiar layout, or spelling mistakes. In this paper we review twenty-nine approaches to source-code plagiarism detection, showing that the majority focus on overall file similarity rather than on unusual shared elements, and that none directly measures these elements. We describe an approach to detecting similarity between files which focuses on these unusual similarities. The approach is token-based and therefore largely language independent, and is tested on a set of student assignments, each consisting of a mix of programming languages. We also introduce a technique for visualising one document in relation to another in the context of the group. This visualisation separates code which is unique to the document, code shared by just the two files, code shared by small groups, and uninteresting areas of the file. Peer reviewed.
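The idea of looking for token sequences that only a few submissions share can be illustrated with a minimal sketch. It is not the authors' implementation: the tokeniser, the n-gram length `n`, the `max_group` cut-off, and all function names are assumptions chosen for illustration.

```python
import re
from collections import defaultdict
from itertools import combinations


def tokenize(text):
    # Hypothetical tokeniser: identifiers, numbers, or single non-space symbols.
    return re.findall(r"[A-Za-z_]\w*|\d+|\S", text)


def ngrams(tokens, n):
    # Set of all contiguous token sequences of length n.
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def rare_shared_ngrams(files, n=3, max_group=2):
    """Map each pair of files to the n-grams that occur in at least two
    and at most max_group files of the whole group."""
    owners = defaultdict(set)
    for name, text in files.items():
        for gram in ngrams(tokenize(text), n):
            owners[gram].add(name)

    shared = defaultdict(set)
    for gram, names in owners.items():
        if 2 <= len(names) <= max_group:
            for pair in combinations(sorted(names), 2):
                shared[pair].add(gram)
    return dict(shared)


submissions = {
    "a": "int totl = x + y ;",
    "b": "int totl = p + q ;",
    "c": "int total = m + n ;",
}
result = rare_shared_ngrams(submissions)
```

In this toy group, the misspelled identifier `totl` appears only in files `a` and `b`, so the pair is reported with the single shared sequence `("int", "totl", "=")`. Sequences found in most files would be excluded by lowering or keeping `max_group` small.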

    SeqDoC: rapid SNP and mutation detection by direct comparison of DNA sequence chromatograms

    BACKGROUND: This paper describes SeqDoC, a simple, web-based tool to carry out direct comparison of ABI sequence chromatograms. This allows the rapid identification of single nucleotide polymorphisms (SNPs) and point mutations without the need to install or learn more complicated analysis software. RESULTS: SeqDoC produces a subtracted trace showing differences between a reference and test chromatogram, and is optimised to emphasise those characteristic of single base changes. It automatically aligns sequences, and produces straightforward graphical output. The use of direct comparison of the sequence chromatograms means that artefacts introduced by automatic base-calling software are avoided. Homozygous and heterozygous substitutions and insertion/deletion events are all readily identified. SeqDoC successfully highlights nucleotide changes missed by the Staden package 'tracediff' program. CONCLUSION: SeqDoC is ideal for small-scale SNP identification, for identification of changes in random mutagenesis screens, and for verification of PCR amplification fidelity. Differences are highlighted, not interpreted, allowing the investigator to make the ultimate decision on the nature of the change.
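The notion of a subtracted trace can be sketched in a few lines. This is a simplified illustration, not SeqDoC's algorithm: it assumes the two chromatograms are already aligned and normalised, represents each as a hypothetical dictionary of per-base intensity lists, and uses an arbitrary `threshold`.

```python
def difference_trace(reference, test):
    """Subtract the test trace from the reference trace, channel by channel.

    Both arguments map a base ("A", "C", "G", "T") to a list of intensities
    of equal length; alignment and normalisation are assumed done elsewhere.
    """
    return {
        base: [r - t for r, t in zip(reference[base], test[base])]
        for base in reference
    }


def flag_positions(diff, threshold):
    # Positions where any channel differs by more than the threshold.
    length = len(next(iter(diff.values())))
    return [
        i for i in range(length)
        if max(abs(diff[base][i]) for base in diff) > threshold
    ]


reference = {"A": [0, 9, 0], "C": [0, 0, 0], "G": [8, 0, 0], "T": [0, 0, 7]}
test = {"A": [0, 0, 0], "C": [0, 9, 0], "G": [8, 0, 0], "T": [0, 0, 7]}
diff = difference_trace(reference, test)
flagged = flag_positions(diff, threshold=5)
```

Here the middle position changes from an A peak to a C peak, so the A channel shows a positive difference and the C channel a negative one at that position. As in the abstract, the sketch only highlights the difference; interpreting it is left to the investigator.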

    Weather persistence on sub-seasonal to seasonal timescales: a methodological review

    Persistence is an important concept in meteorology. It refers to surface weather or the atmospheric circulation either remaining in approximately the same state (stationarity) or repeatedly occupying the same state (recurrence) over some prolonged period of time. Persistence can be found at many different timescales; however, the sub-seasonal to seasonal (S2S) timescale is especially relevant in terms of impacts and atmospheric predictability. For these reasons, S2S persistence has been attracting increasing attention from the scientific community. The dynamics responsible for persistence and their potential evolution under climate change are a notable focus of active research. However, one important challenge facing the community is how to define persistence, from both a qualitative and quantitative perspective. Despite a general agreement on the concept, many different definitions and perspectives have been proposed over the years, among which it is not always easy to find one’s way. The purpose of this review is to present and discuss existing concepts of weather persistence, associated methodologies and physical interpretations. In particular, we call attention to the fact that persistence can be defined as a global or as a local property of a system, with important implications in terms of methods but also impacts. We also highlight the importance of timescale and similarity metric selection, and illustrate some of the concepts using the example of summertime atmospheric circulation over Western Europe.
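The distinction between stationarity and recurrence can be made concrete with a small sketch. It assumes the circulation has already been classified into discrete daily states (the labels "B" and "Z" below are hypothetical regime names); the review itself covers many other definitions and metrics, and this is only one simple reading of the two terms.

```python
from itertools import groupby


def spell_lengths(states):
    """Return (state, length) for each run of consecutive identical states."""
    return [(state, sum(1 for _ in run)) for state, run in groupby(states)]


def longest_spell(states, target):
    # Stationarity-style measure: longest uninterrupted stay in the target state.
    return max((n for s, n in spell_lengths(states) if s == target), default=0)


def recurrence_count(states, target):
    # Recurrence-style measure: number of separate spells of the target state.
    return sum(1 for s, _ in spell_lengths(states) if s == target)


daily_states = ["B", "B", "B", "Z", "B", "B", "Z", "Z"]
spells = spell_lengths(daily_states)
longest_b = longest_spell(daily_states, "B")
returns_b = recurrence_count(daily_states, "B")
```

For this sequence the state "B" has a longest spell of 3 days and occurs in 2 separate spells, so the same period can look persistent in either sense depending on which measure is chosen.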

    The VirusBanker database uses a Java program to allow flexible searching through Bunyaviridae sequences

    BACKGROUND: Viruses of the Bunyaviridae have segmented negative-stranded RNA genomes and several of them cause significant disease. Many partial sequences have been obtained from the segments so that GenBank searches give complex results. Sequence databases usually use HTML pages to mediate remote sorting, but this approach can be limiting and may discourage a user from exploring a database. RESULTS: The VirusBanker database contains Bunyaviridae sequences and alignments and is presented as two spreadsheets generated by a Java program that interacts with a MySQL database on a server. Sequences are displayed in rows and may be sorted using information that is displayed in columns and includes data relating to the segment, gene, protein, species, strain, sequence length, terminal sequence and date and country of isolation. Bunyaviridae sequences and alignments may be downloaded from the second spreadsheet with titles defined by the user from the columns, or viewed when passed directly to the sequence editor, Jalview. CONCLUSION: VirusBanker allows large datasets of aligned nucleotide and protein sequences from the Bunyaviridae to be compiled and winnowed rapidly using criteria that are formulated heuristically.

    Persistent topology of the reionisation bubble network. I: Formalism & Phenomenology

    We present a new formalism for studying the topology of HII regions during the Epoch of Reionisation, based on persistent homology theory. With persistent homology, it is possible to follow the evolution of topological features over time. We introduce the notion of a persistence field as a statistical summary of persistence data and we show how these fields can be used to identify different stages of reionisation. We identify two new stages common to all bubble ionisation scenarios. Following an initial pre-overlap and subsequent overlap stage, the topology is first dominated by neutral filaments (filament stage) and then by enclosed patches of neutral hydrogen undergoing outside-in ionisation (patch stage). We study how these stages are affected by the degree of galaxy clustering. We also show how persistence fields can be used to study other properties of the ionisation topology, such as the bubble size distribution and the fractal-like topology of the largest ionised region. Comment: 18 pages, 12 figures, 1 table. Submitted to MNRAS.

    Time domain deconvolution in nonlinear elastoplastic soil deposits

    The paper presents an iterative procedure for the time domain deconvolution in nonlinear elastoplastic materials. The approach is intended for the generation of input motions for dynamic soil–structure interaction (DSSI) numerical analyses when the desired earthquake is specified at the surface of a nonlinear soil deposit. The main advantage is that the same constitutive model (or models) to be used in the DSSI simulation to characterise the soil deposit is also employed in the deconvolution procedure. Therefore, the desired surface motion is recovered from the free-field propagation of the resulting input motion at the base of the numerical model, accounting for the assumed constitutive behaviour of the ground. An application example is also presented, where the potential of the proposed approach is shown. Peer reviewed. Postprint (published version).