2,123 research outputs found

    TRIDEnT: Building Decentralized Incentives for Collaborative Security

    Full text link
    Sophisticated mass attacks, especially when exploiting zero-day vulnerabilities, have the potential to cause destructive damage to organizations and critical infrastructure. To timely detect and contain such attacks, collaboration among the defenders is critical. By correlating real-time detection information (alerts) from multiple sources (collaborative intrusion detection), defenders can detect attacks and take the appropriate defensive measures in time. However, although the technical tools to facilitate collaboration exist, real-world adoption of such collaborative security mechanisms is still underwhelming. This is largely due to a lack of trust and participation incentives for companies and organizations. This paper proposes TRIDEnT, a novel collaborative platform that aims to enable and incentivize parties to exchange network alert data, thus increasing their overall detection capabilities. TRIDEnT allows parties that may be in a competitive relationship, to selectively advertise, sell and acquire security alerts in the form of (near) real-time peer-to-peer streams. To validate the basic principles behind TRIDEnT, we present an intuitive game-theoretic model of alert sharing, that is of independent interest, and show that collaboration is bound to take place infinitely often. Furthermore, to demonstrate the feasibility of our approach, we instantiate our design in a decentralized manner using Ethereum smart contracts and provide a fully functional prototype.Comment: 28 page

    OnionBots: Subverting Privacy Infrastructure for Cyber Attacks

    Full text link
    Over the last decade botnets survived by adopting a sequence of increasingly sophisticated strategies to evade detection and take overs, and to monetize their infrastructure. At the same time, the success of privacy infrastructures such as Tor opened the door to illegal activities, including botnets, ransomware, and a marketplace for drugs and contraband. We contend that the next waves of botnets will extensively subvert privacy infrastructure and cryptographic mechanisms. In this work we propose to preemptively investigate the design and mitigation of such botnets. We first, introduce OnionBots, what we believe will be the next generation of resilient, stealthy botnets. OnionBots use privacy infrastructures for cyber attacks by completely decoupling their operation from the infected host IP address and by carrying traffic that does not leak information about its source, destination, and nature. Such bots live symbiotically within the privacy infrastructures to evade detection, measurement, scale estimation, observation, and in general all IP-based current mitigation techniques. Furthermore, we show that with an adequate self-healing network maintenance scheme, that is simple to implement, OnionBots achieve a low diameter and a low degree and are robust to partitioning under node deletions. We developed a mitigation technique, called SOAP, that neutralizes the nodes of the basic OnionBots. We also outline and discuss a set of techniques that can enable subsequent waves of Super OnionBots. In light of the potential of such botnets, we believe that the research community should proactively develop detection and mitigation methods to thwart OnionBots, potentially making adjustments to privacy infrastructure.Comment: 12 pages, 8 figure

    Exploring Linguistic Constraints in Nlp Applications

    Get PDF
    The key argument of this dissertation is that the success of an Natural Language Processing (NLP) application depends on a proper representation of the corresponding linguistic problem. This theme is raised in the context that the recent progress made in our field is widely credited to the effective use of strong engineering techniques. However, the intriguing power of highly lexicalized models shown in many NLP applications is not only an achievement by the development in machine learning, but also impossible without the extensive hand-annotated data resources made available, which are originally built with very deep linguistic considerations. More specifically, we explore three linguistic aspects in this dissertation: the distinction between closed-class vs. open-class words, long-tail distributions in vocabulary study and determinism in language models. The first two aspects are studied in unsupervised tasks, unsupervised part-of-speech (POS) tagging and morphology learning, and the last one is studied in supervised tasks, English POS tagging and Chinese word segmentation. Each linguistic aspect under study manifests itself in a (different) way to help improve performance or efficiency in some NLP application

    Integrating Personality Psychology into Economics

    Get PDF
    This paper reviews the problems and potential benefits of integrating personality psychology into economics. Economists have much to learn from and contribute to personality psychology.personality psychology, behavioral economics, identification, causality

    PLAZA 4.0 : an integrative resource for functional, evolutionary and comparative plant genomics

    Get PDF
    PLAZA (https://bioinformatics.psb.ugent.be/plaza) is a plant-oriented online resource for comparative, evolutionary and functional genomics. The PLAZA platform consists of multiple independent instances focusing on different plant clades, while also providing access to a consistent set of reference species. Each PLAZA instance contains structural and functional gene annotations, gene family data and phylogenetic trees and detailed gene colinearity information. A user-friendly web interface makes the necessary tools and visualizations accessible, specific for each data type. Here we present PLAZA 4.0, the latest iteration of the PLAZA framework. This version consists of two new instances (Dicots 4.0 and Monocots 4.0) providing a large increase in newly available species, and offers access to updated and newly implemented tools and visualizations, helping users with the ever-increasing demands for complex and in-depth analyzes. The total number of species across both instances nearly doubles from 37 species in PLAZA 3.0 to 71 species in PLAZA 4.0, with a much broader coverage of crop species (e.g. wheat, palm oil) and species of evolutionary interest (e.g. spruce, Marchantia). The new PLAZA instances can also be accessed by a programming interface through a RESTful web service, thus allowing bioinformaticians to optimally leverage the power of the PLAZA platform

    Text Generation with Efficient (Soft) Q-Learning

    Full text link
    Maximum likelihood estimation (MLE) is the predominant algorithm for training text generation models. This paradigm relies on direct supervision examples, which is not applicable to many applications, such as generating adversarial attacks or generating prompts to control language models. Reinforcement learning (RL) on the other hand offers a more flexible solution by allowing users to plug in arbitrary task metrics as reward. Yet previous RL algorithms for text generation, such as policy gradient (on-policy RL) and Q-learning (off-policy RL), are often notoriously inefficient or unstable to train due to the large sequence space and the sparse reward received only at the end of sequences. In this paper, we introduce a new RL formulation for text generation from the soft Q-learning perspective. It further enables us to draw from the latest RL advances, such as path consistency learning, to combine the best of on-/off-policy updates, and learn effectively from sparse reward. We apply the approach to a wide range of tasks, including learning from noisy/negative examples, adversarial attacks, and prompt generation. Experiments show our approach consistently outperforms both task-specialized algorithms and the previous RL methods. On standard supervised tasks where MLE prevails, our approach also achieves competitive performance and stability by training text generation from scratch.Comment: Code available at https://github.com/HanGuo97/soft-Q-learning-for-text-generatio

    Annotating patient clinical records with syntactic chunks and named entities: the Harvey corpus

    Get PDF
    The free text notes typed by physicians during patient consultations contain valuable information for the study of disease and treatment. These notes are difficult to process by existing natural language analysis tools since they are highly telegraphic (omitting many words), and contain many spelling mistakes, inconsistencies in punctuation, and non-standard word order. To support information extraction and classification tasks over such text, we describe a de-identified corpus of free text notes, a shallow syntactic and named entity annotation scheme for this kind of text, and an approach to training domain specialists with no linguistic background to annotate the text. Finally, we present a statistical chunking system for such clinical text with a stable learning rate and good accuracy, indicating that the manual annotation is consistent and that the annotation scheme is tractable for machine learning
    • …
    corecore