
    Space-efficient detection of unusual words

    Detecting all the strings that occur in a text more frequently or less frequently than expected according to an IID or a Markov model is a basic problem in string mining, yet current algorithms are based on data structures that are either space-inefficient or incur large slowdowns, and current implementations cannot scale to genomes or metagenomes in practice. In this paper we engineer an algorithm based on the suffix tree of a string to use just a small data structure built on the Burrows-Wheeler transform, and a stack of $O(\sigma^2 \log^2 n)$ bits, where $n$ is the length of the string and $\sigma$ is the size of the alphabet. The size of the stack is $o(n)$ except for very large values of $\sigma$. We further improve the algorithm by removing its time dependency on $\sigma$, by reporting only a subset of the maximal repeats and of the minimal rare words of the string, and by detecting and scoring candidate under-represented strings that do not occur in the string. Our algorithms are practical and work directly on the BWT, thus they can be immediately applied to a number of existing datasets that are available in this form, returning this string mining problem to a manageable scale. Comment: arXiv admin note: text overlap with arXiv:1502.0637

    Does Familiarity breed inattention? Why drivers crash on the roads they know best

    This paper describes our research into the nature of everyday driving, with a particular emphasis on the processes that govern driver behaviour in familiar, well-practiced situations. The research examined the development and maintenance of proceduralised driving habits in a high-fidelity driving simulator by paying 29 participants to drive a simulated road regularly over three months of testing. A range of measures, including detection task performance and driving performance, were collected over the course of 20 sessions. Performance from a yoked control group who experienced the same road scenarios in a single session was also measured. The data showed the development of stereotyped driving patterns and changes in what drivers noticed, indicative of inattentional blindness and “driving without awareness”. Extended practice also resulted in increased sensitivity for detecting changes to foveal road features associated with vehicle guidance and in improved performance on an embedded vehicle detection task (detection of a specific vehicle type). The changes in attentional focus and driving performance resulting from extended practice help explain why drivers are at increased risk of crashing on roads they know well. Identifying the features of familiar roads that attract driver attention, even when they are driving without awareness, can inform new interventions and designs for safer roads. The data also shed new light on a range of previous driver behaviour research, including a “Tandem Model” that includes both explicit and implicit processes involved in driving performance.

    Detecting fraud: Utilizing new technology to advance the audit profession
