666 research outputs found

    A Very Brief Introduction to Machine Learning With Applications to Communication Systems

    Given the unprecedented availability of data and computing resources, there is widespread renewed interest in applying data-driven machine learning methods to problems for which the development of conventional engineering solutions is challenged by modelling or algorithmic deficiencies. This tutorial-style paper starts by addressing the questions of why and when such techniques can be useful. It then provides a high-level introduction to the basics of supervised and unsupervised learning. For both supervised and unsupervised learning, exemplifying applications to communication networks are discussed by distinguishing tasks carried out at the edge and at the cloud segments of the network at different layers of the protocol stack.

    Token and Type Constraints for Cross-Lingual Part-of-Speech Tagging

    We consider the construction of part-of-speech taggers for resource-poor languages. Recently, manually constructed tag dictionaries from Wiktionary and dictionaries projected via bitext have been used as type constraints to overcome the scarcity of annotated data in this setting. In this paper, we show that additional token constraints can be projected from a resource-rich source language to a resource-poor target language via word-aligned bitext. We present several models to this end; in particular a partially observed conditional random field model, where coupled token and type constraints provide a partial signal for training. Averaged across eight previously studied Indo-European languages, our model achieves a 25% relative error reduction over the prior state of the art. We further present successful results on seven additional languages from different families, empirically demonstrating the applicability of coupled token and type constraints across a diverse set of languages.

    Efficient Path Prediction for Semi-Supervised and Weakly Supervised Hierarchical Text Classification

    Hierarchical text classification has many real-world applications. However, labeling a large number of documents is costly. In practice, we can use semi-supervised learning or weakly supervised learning (e.g., dataless classification) to reduce the labeling cost. In this paper, we propose a path cost-sensitive learning algorithm to utilize the structural information and further make use of unlabeled and weakly labeled data. We use a generative model to leverage the large amount of unlabeled data and introduce path constraints into the learning algorithm to incorporate the structural information of the class hierarchy. The posterior probabilities of both unlabeled and weakly labeled data can be incorporated with path-dependent scores. Since we add a structure-sensitive cost to the learning algorithm to constrain the classification to be consistent with the class hierarchy and do not need to reconstruct the feature vectors for different structures, we can significantly reduce the computational cost compared to structural output learning. Experimental results on two hierarchical text classification benchmarks show that our approach is not only effective but also efficient in handling semi-supervised and weakly supervised hierarchical text classification. Comment: Accepted by 2019 World Wide Web Conference (WWW19)

    A Critical Analysis Of The State-Of-The-Art On Automated Detection Of Deceptive Behavior In Social Media

    Recently, a large body of research has been devoted to examining user behavioral patterns and the business implications of social media. However, relatively little research has been conducted regarding users’ deceptive activities in social media; these deceptive activities may hinder the effective application of the data collected from social media to perform e-marketing and initiate business transformation in general. One of the main contributions of this paper is the critical analysis of the possible forms of deceptive behavior in social media and the state-of-the-art technologies for automated deception detection in social media. Based on the proposed taxonomy of major deception types, the assumptions, advantages, and disadvantages of the popular deception detection methods are analyzed. Our critical analysis shows that deceptive behavior may evolve over time, making it difficult for the existing methods to effectively detect social media spam. Accordingly, another main contribution of this paper is the design and development of a generic framework to combat dynamic deceptive activities in social media. The managerial implication of our research is that business managers or marketers will develop better insights about the possible deceptive behavior in social media before they tap into social media to collect and generate market intelligence. Moreover, they can apply the proposed adaptive deception detection framework to more effectively combat the ever-increasing and evolving deceptive activities in social media.