    Adapting Sequence Models for Sentence Correction

    In a controlled experiment of sequence-to-sequence approaches for the task of sentence correction, we find that character-based models are generally more effective than word-based models and models that encode subword information via convolutions, and that modeling the output data as a series of diffs improves effectiveness over standard approaches. Our strongest sequence-to-sequence model improves over our strongest phrase-based statistical machine translation model, with access to the same data, by 6 M2 (0.5 GLEU) points. Additionally, in the data environment of the standard CoNLL-2014 setup, we demonstrate that modeling (and tuning against) diffs yields similar or better M2 scores with simpler models and/or significantly less data than previous sequence-to-sequence approaches. Comment: EMNLP 201
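
    As a rough illustration of what "modeling the output data as a series of diffs" can mean, the sketch below encodes a correction as the source tokens annotated with deletion and insertion tags, then recovers the corrected sentence from that encoding. The tag names (`<del>`, `<ins>`), the whitespace tokenization, and both function names are assumptions made for this example; the abstract does not specify the representation the authors used.

```python
from difflib import SequenceMatcher

# Hypothetical tag names chosen for this sketch only.
DEL_OPEN, DEL_CLOSE = "<del>", "</del>"
INS_OPEN, INS_CLOSE = "<ins>", "</ins>"


def to_diff_sequence(source, target):
    """Encode a correction as source tokens annotated with diff tags."""
    src, tgt = source.split(), target.split()
    matcher = SequenceMatcher(None, src, tgt, autojunk=False)
    out = []
    for op, i1, i2, j1, j2 in matcher.get_opcodes():
        if op == "equal":
            out.extend(src[i1:i2])
        if op in ("delete", "replace"):
            out.extend([DEL_OPEN] + src[i1:i2] + [DEL_CLOSE])
        if op in ("insert", "replace"):
            out.extend([INS_OPEN] + tgt[j1:j2] + [INS_CLOSE])
    return out


def apply_diff_sequence(tokens):
    """Recover the corrected sentence: drop deleted spans, keep inserted ones."""
    out, in_deletion = [], False
    for tok in tokens:
        if tok == DEL_OPEN:
            in_deletion = True
        elif tok == DEL_CLOSE:
            in_deletion = False
        elif tok in (INS_OPEN, INS_CLOSE):
            continue
        elif not in_deletion:
            out.append(tok)
    return " ".join(out)


encoded = to_diff_sequence("She go to school yesterday",
                           "She went to school yesterday")
decoded = apply_diff_sequence(encoded)
```

    Under this encoding an already-correct sentence maps to itself with no tags, which is one plausible reason a diff-style target could be easier to learn than a full rewrite.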

    There and back again: detecting regularity in human encounter communities

    Detecting communities that recur over time is a challenging problem due to the potential sparsity of encounter events at an individual scale and inherent uncertainty in human behavior. Existing methods for community detection in mobile human encounter networks ignore the presence of temporal patterns that lead to periodic components in the network. Daily and weekly routines are prevalent in human behavior and can serve as rich context for applications that rely on person-to-person encounters, such as mobile routing protocols and intelligent digital personal assistants. In this article, we present the design, implementation, and evaluation of an approach to decentralized periodic community detection that is robust to uncertainty and computationally efficient. This alternative approach has a novel periodicity detection method inspired by a neural synchrony measure used in the field of neurophysiology. We evaluate our approach and investigate human periodic encounter patterns using empirical datasets of inferred and direct-sensed encounters.

    Reachable but not receptive: enhancing smartphone interruptibility prediction by modelling the extent of user engagement with notifications

    Smartphone notifications frequently interrupt our daily lives, often at inopportune moments. We propose the decision-on-information-gain model, which extends the existing data collection convention to capture a range of interruptibility behaviour implicitly. Through a six-month in-the-wild study of 11,346 notifications, we find that this approach captures up to 125% more interruptibility cases. Secondly, we find different correlating contextual features for different behaviour using the approach and find that predictive models can be built with >80% precision for most users. However, we note discrepancies in performance across labelling, training, and evaluation methods, creating design considerations for future systems.

    Methods of isolation and identification of pathogenic and potential pathogenic bacteria from skins and tannery effluents

    Currently there is no standard protocol available within the leather industry to isolate and identify pathogenic bacteria from hides, skins or tannery effluent. This study was therefore carried out to identify simple but effective methods for isolation and identification of bacterial pathogens from the effluent and skins during leather processing. Identification methods based on both phenotypic and genotypic characteristics were investigated. Bacillus cereus and Pseudomonas aeruginosa were used as indicator bacteria to evaluate the isolation and identification methods. Decontaminated calfskins were inoculated with a pure culture of the above-mentioned bacterial species, followed by pre-tanning and chromium tanning processes. Effluent samples were collected and skins were swabbed at the end of each processing stage. Bacterial identification was carried out based on phenotypic characteristics, such as colony appearance on selective solid media, cell morphology following standard Gram-staining and spore-staining techniques, and biochemical reactions, e.g., the ability of a bacterial species to ferment particular sugars and to produce certain enzymes. Additionally, an identification system based on bacterial phenotypic characteristics, known as the Biolog® system, was applied. A pulsed-field gel electrophoresis (PFGE) method for bacterial DNA fingerprinting was also evaluated and used for the identification of the inoculated bacteria. The methods described in the study were found to be effective for the identification of pathogenic bacteria from skins and effluent.

    A chi-squared time-frequency discriminator for gravitational wave detection

    Searches for known waveforms in gravitational wave detector data are often done using matched filtering. When used on real instrumental data, matched filtering often does not perform as well as might be expected, because non-stationary and non-Gaussian detector noise produces large spurious filter outputs (events). This paper describes a chi-squared time-frequency test which is one way to discriminate such spurious events from the events that would be produced by genuine signals. The method works well only for broad-band signals. The case where the filter template does not exactly match the signal waveform is also considered, and upper bounds are found for the expected value of chi-squared. Comment: 18 pages, five figures, RevTex
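
    The abstract does not state the statistic itself, so the following is only a minimal sketch of one commonly quoted form of such a test. It assumes the template has been split into p frequency bands that each contribute equally to the expected matched-filter output, and that the filter is normalized so the total output has unit variance in Gaussian noise. The function name and the example numbers are hypothetical.

```python
def chi_squared_statistic(band_outputs):
    """Chi-squared discriminator over per-band matched-filter outputs.

    band_outputs: list of p real filter outputs z_j, one per frequency band.
    Assumption: bands were chosen so a signal matching the template is
    expected to contribute z / p to each band, where z = sum of all z_j.
    """
    p = len(band_outputs)
    z = sum(band_outputs)
    return p * sum((z_j - z / p) ** 2 for z_j in band_outputs)


# Same total filter output (z = 8) in both cases.
signal_like = chi_squared_statistic([2.0, 2.0, 2.0, 2.0])  # spread evenly
glitch_like = chi_squared_statistic([8.0, 0.0, 0.0, 0.0])  # one band only
```

    Both inputs give the same total filter output, but the statistic is zero when that output is spread evenly across bands and large when it is concentrated in one band, which is the behaviour a discriminator of this kind relies on.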

    Personality homophily and geographic distance in Facebook

    Personality homophily remains an understudied aspect of social networks, with the traditional focus concerning sociodemographic variables as the basis for assortativity, rather than psychological dispositions. We consider the effect of personality homophily on one of the biggest constraints to human social networks: geographic distance. We use the Big Five model of personality to make predictions for each of the five facets: Openness to experience, Conscientiousness, Extraversion, Agreeableness, and Neuroticism. Using a network of 313,669 Facebook users, we investigate the difference in geographic distance between homophilous pairs, in which both users scored similarly on a particular facet, and mixed pairs. In accordance with our hypotheses, we find that pairs of open and conscientious users are geographically further apart than mixed pairs. Pairs of extraverts, on the other hand, tend to be geographically closer together. We find mixed results for the Neuroticism facet, and no significant effects for the Agreeableness facet. The results are discussed in the context of personality homophily and the impact of geographic distance on social connections.

    Retweeting beyond expectation: Inferring interestingness in Twitter

    Online social networks such as Twitter have emerged as an important mechanism for individuals to share information and post user-generated content. However, filtering interesting content from the large volume of messages received through Twitter places a significant cognitive burden on users. Motivated by this problem, we develop a new automated mechanism to detect personalised interestingness, and investigate this for Twitter. Instead of undertaking semantic content analysis and matching of tweets, our approach considers the human response to content, in terms of whether the content is sufficiently stimulating to get repeatedly chosen by users for forwarding (retweeting). This approach involves machine learning against features that are relevant to a particular user and their network, to obtain an expected level of retweeting for a user and a tweet. Tweets observed to be above this expected level are classified as interesting. We implement the approach in Twitter and evaluate it using comparative human tweet assessment in two forms: through aggregated assessment using Mechanical Turk, and through a web-based experiment for Twitter users. The results provide confidence that the approach is effective in identifying the more interesting tweets from a user’s timeline. This has important implications for reduction of cognitive burden: the results show that timelines can be considerably shortened while maintaining a high degree of confidence that more interesting tweets will be retained. In conclusion, we discuss how the technique could be applied to mitigate possible filter bubble effects.