    ER-AE: Differentially-private Text Generation for Authorship Anonymization

    Most privacy protection studies for textual data focus on removing explicit sensitive identifiers. However, personal writing style, a strong indicator of authorship, is often neglected. Recent studies on writing style anonymization can only output numeric vectors, which are difficult for recipients to interpret. We propose a novel text generation model with the exponential mechanism for authorship anonymization. By augmenting the semantic information through a REINFORCE training reward function, the model can generate differentially-private text that has close semantics and a similar grammatical structure to the original text while removing personal traits of the writing style. It does not assume any conditioned labels or parallel text data for training. We evaluate the performance of the proposed model on the real-life peer reviews dataset and the Yelp review dataset. The results suggest that our model outperforms the state of the art on semantic preservation, authorship obfuscation, and stylometric transformation.

    Arabic authorship attribution: An extensive study on twitter posts

    © 2018 ACM. Law enforcement faces problems in tracing the true identity of offenders in cybercrime investigations. Most offenders mask their true identity, impersonate people of high authority, or use identity deception and obfuscation tactics to avoid detection and traceability. To address the problem of anonymity, authorship analysis is used to identify individuals by their writing styles without knowing their actual identities. Most authorship studies are dedicated to English due to its widespread use over the Internet, but recent cyber-attacks such as the distribution of Stuxnet indicate that Internet crimes are not limited to a certain community, language, culture, ideology, or ethnicity. To effectively investigate cybercrime and to address the problem of anonymity in online communication, there is a pressing need to study authorship analysis of languages such as Arabic, Chinese, Turkish, and so on. Arabic, the focus of this study, is the fourth most widely used language on the Internet. This study investigates the authorship of Arabic text, especially very short texts such as Twitter posts. We benchmark the performance of a profile-based approach that uses n-grams as features and compare it with state-of-the-art instance-based classification techniques. Then we adapt an event-visualization tool that was developed for English to accommodate both Arabic and English and visualize the attribution evidence. In addition, we investigate the relative effect of the training set, the length of tweets, and the number of authors on authorship classification accuracy. Finally, we show that diacritics have an insignificant effect on the attribution process and that part-of-speech tags are less effective than character-level and word-level n-grams.

    Pluvio: Assembly Clone Search for Out-of-domain Architectures and Libraries through Transfer Learning and Conditional Variational Information Bottleneck

    The practice of code reuse is crucial in software development for a faster and more efficient development lifecycle. In reality, however, code reuse practices lack proper control, resulting in issues such as vulnerability propagation and intellectual property infringements. Assembly clone search, a critical shift-right defence mechanism, has been effective in identifying vulnerable code resulting from reuse in released executables. Recent studies on assembly clone search demonstrate a trend towards using machine learning-based methods to match assembly code variants produced by different toolchains. However, these methods are limited to what they learn from a small number of toolchain variants used in training, rendering them inapplicable to unseen architectures and their corresponding compilation toolchain variants. This paper presents the first study on the problem of assembly clone search with unseen architectures and libraries. We propose incorporating human common knowledge through large-scale pre-trained natural language models, in the form of transfer learning, into current learning-based approaches for assembly clone search. Transfer learning can aid in addressing the limitations of the existing approaches, as it can bring in broader knowledge from human experts in assembly code. We further address the sequence limit issue by proposing a reinforcement learning agent to remove unnecessary and redundant tokens. Coupled with a new Variational Information Bottleneck learning strategy, the proposed system minimizes the reliance on potential indicators of architectures and optimization settings, for a better generalization of unseen architectures. We simulate the unseen architecture clone search scenarios and the experimental results show the effectiveness of the proposed approach against the state-of-the-art solutions.
    Comment: 13 pages and 4 figures. This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible.

    Computerised Clinical Reminders Use in an Integrated Healthcare System

    Objective: To examine levels of routine computerised clinical reminder use in a nationwide sample of primary care physicians and to identify factors influencing reminder use. Design: Cross-sectional study using a self-administered questionnaire. Setting: The United States Veterans Health Administration (VHA). Methods: Survey responses from 461 primary care physicians sampled from across the VHA were analysed. We asked physicians how many computerised clinical reminders they use per patient per visit and when they typically use computerised clinical reminders in their clinics. Measured physician characteristics included age, gender, year of medical degree, number of days in clinic per week, and attitudes towards computerised clinical reminders (measured on Likert-like scales). We used multivariable linear regression to determine factors associated with greater use of computerised clinical reminders per patient per visit. Results: Average computerised clinical reminder use per patient visit was 4.2 (SD = 2.5). Eighty-six percent of physicians resolve reminders during the visit. In a multivariable regression model, a higher score on the team factors scale is associated with use of more reminders (increase of 0.24 reminders for each unit increase on the team factors scale, or one extra reminder for each four-unit increase on the team factors scale). Working more days in clinic is associated with use of more reminders per patient visit (increase of 0.13 reminders for each extra half-day of clinic per week, or about one additional reminder for physicians working ten half-days per week versus physicians working two half-days per week). Academic facility affiliation is associated with one less reminder used per patient visit as compared with no affiliation. Conclusions: Most United States Veterans Health Administration primary care physicians use computerised clinical reminders, typically during the patient visit. Strategies to increase reminder use should focus on improving physicians’ understanding of their role in completing reminder-related tasks and improving usability for users such as physicians who work in clinic less frequently.

    Convergence of TOR-nitrogen and Snf1-glucose signaling pathways onto Gln3

    Carbon and nitrogen are two basic nutrient sources for cellular organisms. They supply precursors for energy metabolism and metabolic biosynthesis. In the yeast Saccharomyces cerevisiae, distinct sensing and signaling pathways have been described that regulate gene expression in response to the quality of carbon and nitrogen sources, respectively. Gln3 is a GATA-type transcription factor of nitrogen catabolite-repressible (NCR) genes. Previous observations indicate that the quality of nitrogen sources controls the phosphorylation and cytoplasmic retention of Gln3 via the target of rapamycin (TOR) protein. In this study, we show that glucose also regulates Gln3 phosphorylation and subcellular localization, which is mediated by Snf1, the yeast homolog of AMP-dependent protein kinase and a cytoplasmic glucose sensor. Our data show that glucose and nitrogen signaling pathways converge onto Gln3, which may be critical for both nutrient sensing and starvation responses.

    Use of a Human Skin-Grafted Nude Mouse Model for the Evaluation of Topical Retinoic Acid Treatment

    Cultured human keratinocytes and artificial dermal equivalents maintained in vitro do not perfectly mimic the terminal differentiation patterns and response to drugs observed in intact human skin. We have made use of human skin grafted onto nude mice to demonstrate that such grafts maintain the pattern of pharmacologic responsiveness to all-trans retinoic acid previously reported in human subjects. The use of a quantitative polymerase chain reaction method to measure induction of a retinoic acid responsive gene, cytoplasmic retinoic acid binding protein II, has made it possible to generate objective data suitable for investigations of drug efficacy. This method of using grafted human skin has potentially broad applicability for the investigation of topical drugs in a number of therapeutic fields.

    Personal and Ambient Air Pollution is Associated with Increased Exhaled Nitric Oxide in Children with Asthma

    BACKGROUND: Research has shown associations between pediatric asthma outcomes and airborne particulate matter (PM). The importance of particle components remains to be determined. METHODS: We followed a panel of 45 schoolchildren with persistent asthma living in Southern California. Subjects were monitored over 10 days with offline fractional exhaled nitric oxide (FeNO), a biomarker of airway inflammation. Personal active sampler exposures included continuous particulate matter < 2.5 μm in aerodynamic diameter (PM2.5), 24-hr PM2.5 elemental and organic carbon (EC, OC), and 24-hr nitrogen dioxide (NO2). Ambient exposures included PM2.5, PM2.5 EC and OC, and NO2. Data were analyzed with mixed models controlling for personal temperature, humidity and 10-day period. RESULTS: The strongest positive associations were between FeNO and 2-day average pollutant concentrations. Per interquartile range pollutant increase, these were: for 24 μg/m³ personal PM2.5, 1.1 ppb FeNO [95% confidence interval (CI), 0.1–1.9]; for 0.6 μg/m³ personal EC, 0.7 ppb FeNO (95% CI, 0.3–1.1); for 17 ppb personal NO2, 1.6 ppb FeNO (95% CI, 0.4–2.8). Larger associations were found for ambient EC and smaller associations for ambient NO2. Ambient PM2.5 and personal and ambient OC were significant only in subjects taking inhaled corticosteroids (ICS) alone. Subjects taking both ICS and antileukotrienes showed no significant associations. Distributed lag models showed personal PM2.5 in the preceding 5 hr was associated with FeNO. In two-pollutant models, the most robust associations were for personal and ambient EC and NO2, and for personal but not ambient PM2.5. CONCLUSION: PM associations with airway inflammation in asthmatics may be missed using ambient particle mass, which may not sufficiently represent causal pollutant components from fossil fuel combustion.