218 research outputs found

    Multilabel Classification for News Article Using Long Short-Term Memory

    Multilabel text classification is the task of categorizing text into one or more categories. As in other machine learning settings, multilabel classification performance is limited when labeled data are scarce, which makes it difficult to capture semantic relationships. This case requires a multi-label text classification technique that can assign four labels to news articles, and Deep Learning is proposed as the method for the task. Seven Long Short-Term Memory (LSTM) models are compared on a large-scale dataset, covering 1-layer, 2-layer, and 3-layer LSTM architectures as well as Bidirectional LSTM variants, to show that LSTM can achieve good performance in multi-label text classification. The results show that the 2-layer LSTM model reached a training accuracy of 96% and the highest testing accuracy of all models at 94.3%. Model 3, the 1-layer LSTM, obtained average precision, recall, and f1-score equal to its 94% training accuracy, indicating that it performs well in both the training and testing processes. Among the seven proposed models, model 3 with a 1-layer LSTM is the best.
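    A minimal sketch of the kind of model the abstract describes: a 1-layer LSTM with a sigmoid output over four independent labels. The vocabulary size, embedding dimension, and placeholder data below are assumptions, not details from the paper.

```python
# A hedged sketch of a 1-layer LSTM multilabel classifier (Keras); all sizes
# and the placeholder data are assumptions, not taken from the paper.
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, LSTM, Dense

VOCAB_SIZE = 20000  # assumed vocabulary size
NUM_LABELS = 4      # four labels, as stated in the abstract

model = Sequential([
    Embedding(VOCAB_SIZE, 128),               # token ids -> dense vectors
    LSTM(64),                                 # the single LSTM layer
    Dense(NUM_LABELS, activation="sigmoid"),  # one independent score per label
])
# binary_crossentropy treats each label as its own binary task, the standard
# choice for multilabel (as opposed to multiclass) classification.
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# Placeholder data: token-id sequences and multi-hot label vectors.
x_train = np.random.randint(0, VOCAB_SIZE, size=(32, 200))
y_train = np.random.randint(0, 2, size=(32, NUM_LABELS)).astype("float32")
model.fit(x_train, y_train, epochs=1, batch_size=8)
```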

    A Survey on Extreme Multi-label Learning

    Multi-label learning has attracted significant attention from both academia and industry in recent decades. Although existing multi-label learning algorithms achieve good performance on various tasks, they implicitly assume that the size of the target label space is not huge, which can be restrictive in real-world scenarios. Moreover, it is infeasible to adapt them directly to extremely large label spaces because of the compute and memory overhead. eXtreme Multi-label Learning (XML) has therefore become an important task, and many effective approaches have been proposed. To provide a full picture of XML, we conduct a survey study in this paper. We first give a formal definition of XML from the perspective of supervised learning. Then, based on different model architectures and challenges of the problem, we provide a thorough discussion of the advantages and disadvantages of each category of methods. To support empirical studies, we collect abundant resources related to XML, including code implementations and useful tools. Lastly, we propose possible research directions in XML, such as new evaluation metrics, the tail label problem, and weakly supervised XML.
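    Since the survey highlights evaluation metrics, a brief illustration may help: the most widely used metric in XML is precision@k, which scores only the k highest-ranked labels per instance. A minimal sketch (variable names are illustrative, not from the survey):

```python
# Precision@k: fraction of an instance's top-k predicted labels that are
# actually relevant, averaged over instances. Standard in XML evaluation.
import numpy as np

def precision_at_k(scores: np.ndarray, labels: np.ndarray, k: int) -> float:
    """scores: (n, L) real-valued predictions; labels: (n, L) binary ground truth."""
    topk = np.argsort(-scores, axis=1)[:, :k]        # indices of the k highest scores
    hits = np.take_along_axis(labels, topk, axis=1)  # 1 where a top-k label is relevant
    return float(hits.mean())

# Toy example with a tiny label space (real XML label spaces are enormous).
scores = np.array([[0.9, 0.1, 0.7, 0.2]])
labels = np.array([[1, 0, 0, 1]])
print(precision_at_k(scores, labels, k=2))  # 0.5: one of the top-2 labels is relevant
```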

    New Directions in Online Learning: Boosting, Partial Information, and Non-Stationarity

    Online learning, where a learning algorithm fits a model on-the-fly from streaming data, has become an important research area in machine learning. Batch learning, where the entire data set must be available to the learning algorithm, is not always a suitable paradigm for the big data era. In many practical situations, such as online ad prediction or the control of self-driving cars, data instances naturally arrive in a sequential manner, and researchers want to update their models in an online fashion. This dissertation pursues several topics at the frontier of online learning research.

    Chapters 2 and 3 begin with online boosting, which studies how to combine multiple online weak learners into a stronger learner. Chapter 2 considers online multi-class classification problems. Chapter 3 focuses on the more challenging multi-label ranking problem, where there are multiple correct labels and the learner outputs a ranking of labels by relevance. In both chapters, an optimal algorithm and an adaptive algorithm are proposed. The optimal algorithms require a minimal number of weak learners to attain the desired accuracy. The adaptive algorithms are practically more useful, since they do not require a priori knowledge about the strength of the weak learners and are more computationally efficient; they are not statistically optimal but still come with reasonable performance guarantees. Empirical results on real data sets support the theoretical findings, and the proposed boosting algorithms outperform existing competitors on benchmark data sets.

    Chapter 4 considers the partial information setting, where the learner does not receive the true labels. Partial feedback is common in practice, as obtaining complete feedback can be costly. The chapter revisits the boosting algorithms of Chapters 2 and 3 and extends them to work with partial feedback. Although the learner receives much less information, comparable performance guarantees can be made.

    Chapters 5 and 6 move on to another interesting area in online learning: restless bandit problems. Unlike classical (stochastic) multi-armed bandit problems, where the reward distributions are unknown but stationary, in restless bandit problems the distributions can change over time. This extra layer of complexity allows more complicated models to be studied, but the analysis becomes even more difficult. In restless bandit problems, each arm has a state that evolves according to an unknown Markov process, and the reward distribution depends on the arm's current state. This setting can be viewed as a sub-class of reinforcement learning, and its inherent partial observability makes the analysis very challenging. The well-known Thompson Sampling algorithm is analyzed, and a Bayesian regret bound is derived for it. Chapter 5 considers the episodic case, where the system periodically resets; Chapter 6 extends the analysis to the more challenging non-episodic (i.e., infinite time horizon) case. In both settings, Thompson Sampling algorithms (with slight modifications) enjoy sub-linear regret bounds, and empirical results on simulated data support this fact. The experiments also suggest that the algorithm can be used in the frequentist setting, even though the theoretical bounds are only shown for the Bayesian regret.
    PhD, Statistics, University of Michigan, Horace H. Rackham School of Graduate Studies
    https://deepblue.lib.umich.edu/bitstream/2027.42/155110/1/yhjung_1.pd
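    For readers unfamiliar with Thompson Sampling, the sketch below shows the posterior-sampling idea in the much simpler stationary Bernoulli bandit setting; the restless bandits analyzed in Chapters 5 and 6 add hidden Markov state dynamics on top of this. All parameters here are made up for illustration.

```python
# Thompson Sampling for a stationary Bernoulli bandit: a minimal sketch of
# the posterior-sampling idea, far simpler than the restless-bandit setting
# analyzed in the dissertation. True arm means are made up for illustration.
import numpy as np

rng = np.random.default_rng(0)
true_means = np.array([0.3, 0.5, 0.7])  # unknown to the learner
n_arms = len(true_means)
alpha = np.ones(n_arms)                 # Beta(1, 1) prior on each arm's mean
beta = np.ones(n_arms)

for t in range(1000):
    theta = rng.beta(alpha, beta)       # sample one plausible mean per arm
    arm = int(np.argmax(theta))         # play the arm with the largest sample
    reward = rng.random() < true_means[arm]
    alpha[arm] += reward                # posterior update for the chosen arm
    beta[arm] += 1 - reward

print("posterior means:", alpha / (alpha + beta))  # concentrates near the best arm
```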

    Human-in-the-Loop Learning From Crowdsourcing and Social Media

    Computational social studies using public social media data have become more and more popular because of the large amount of user-generated data available. The richness of social media data, coupled with its noise and subjectivity, raises significant challenges for computationally studying social issues in a feasible and scalable manner. As a result, machine learning problems are often subjective or ambiguous when humans are involved: humans solving the same problem may come to legitimate but completely different conclusions based on their personal experiences and beliefs. When building supervised learning models, particularly with crowdsourced training data, multiple annotations per data item are usually reduced to a single label representing the ground truth, which inevitably hides a rich source of diversity and subjectivity in opinions about the labels. Label distribution learning instead associates with each data item a probability distribution over that item's labels, and can thus preserve the diversity of opinions, beliefs, etc. that conventional learning hides or ignores. We propose a human-in-the-loop learning framework to model and study large volumes of unlabeled, subjective social media data with less human effort. We study various annotation tasks given to crowdsourced annotators and methods for aggregating their contributions in a manner that preserves subjectivity and disagreement. We introduce a strategy for learning label distributions with only five-to-ten labels per item by aggregating human-annotated labels over multiple, semantically related data items. We conduct experiments using our learning framework on data related to two subjective social issues (work and employment, and suicide prevention) that touch many people worldwide. Our methods can be applied to a broad variety of problems, particularly social problems. Our experimental results suggest that specific label aggregation methods can help provide reliable, representative semantics at the population level.
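    One way to read the label-aggregation idea: instead of collapsing five-to-ten annotations per item to a single majority label, normalize the votes into a distribution, and pool votes across semantically related items to smooth it. The sketch below illustrates only this idea; the label set and vote counts are hypothetical, and the dissertation's actual aggregation methods may differ.

```python
# A hedged sketch of label aggregation for label distribution learning:
# normalize raw annotator votes per item, and pool votes across semantically
# related items. The label set and votes below are hypothetical.
from collections import Counter

LABELS = ["supportive", "neutral", "dismissive"]  # illustrative label set

def label_distribution(annotations):
    """Turn a list of annotator votes into a probability distribution."""
    counts = Counter(annotations)
    total = sum(counts.values())
    return {lab: counts[lab] / total for lab in LABELS}

# Five-to-ten labels per item, as in the abstract; two related items pooled.
item_a = ["supportive", "supportive", "neutral", "supportive", "neutral"]
item_b = ["supportive", "neutral", "neutral", "supportive", "supportive"]

print(label_distribution(item_a))            # per-item distribution
print(label_distribution(item_a + item_b))   # pooled over related items
```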