12 research outputs found

Predicting Events Surrounding the Egyptian Revolution of 2011 Using Learning Algorithms on Micro Blog Data

Authors: Boecking Benedikt; Hall Margeret A.; Schneider Jeff
Publication venue: DigitalCommons@UNO
Publication date: 01/01/2014

We aim to predict political activities in Egypt that influence or reflect societal-scale behavior and beliefs by applying learning algorithms to Twitter data. We focus on capturing domestic events in Egypt from November 2009 to November 2013. To this end, we study underlying communication patterns by evaluating content-based and meta-data information in classification tasks without targeting specific keywords or users. Classification is done using Support Vector Machines (SVM) and Support Distribution Machines (SDM). Latent Dirichlet Allocation (LDA) is used to create content-based input patterns for the classifiers, while bags of Twitter meta-information are used with the SDM to classify meta-data features. The experiments reveal that user-centric approaches based on meta-data can outperform methods employing content-based input, despite the use of well-established natural language processing algorithms. The results show that distributions over user-centric meta-information provide an important signal when detecting and predicting events.

The University of Nebraska, Omaha

Support vector clustering of time series data with alignment kernels

Authors: Boecking Benedikt; Chalup Stephan K.; Seese Detlef; Wong Aaron S. W.
Publication venue: Elsevier
Publication date: 01/01/2014

Time series clustering is an important data mining topic and a challenging task due to the sequences’ potentially very complex structures. In the present study we experimentally investigate the combination of support vector clustering with a triangular alignment kernel by evaluating it on an artificial time series benchmark dataset. The experiments lead to meaningful segmentations of the data, thereby providing an example that clustering time series with specific kernels is possible without pre-processing of the data. Comparing our results with those of other approaches, we find that the clustering quality is competitive.

University of Newcastle's Digital Repository

Generative Modeling Helps Weak Supervision (and Vice Versa)

Authors: Boecking Benedikt; Dubrawski Artur; Ermon Stefano; Neiswanger Willie; Roberts Nicholas; Sala Frederic
Publication date: 01/06/2022

Many promising applications of supervised machine learning face hurdles in the acquisition of labeled data in sufficient quantity and quality, creating an expensive bottleneck. To overcome such limitations, techniques that do not depend on ground truth labels have been studied, including weak supervision and generative modeling. While these techniques would seem to be usable in concert, improving one another, how to build an interface between them is not well understood. In this work, we propose a model fusing programmatic weak supervision and generative adversarial networks and provide theoretical justification motivating this fusion. The proposed approach captures discrete latent variables in the data alongside the label estimate derived from weak supervision. Alignment of the two allows for better modeling of sample-dependent accuracies of the weak supervision sources, improving the estimate of unobserved labels. It is the first approach to enable data augmentation through weakly supervised synthetic images and pseudolabels. Additionally, its learned latent variables can be inspected qualitatively. The model outperforms baseline weak supervision label models on a number of multiclass image classification datasets, improves the quality of generated images, and further improves end-model performance through data augmentation with synthetic samples.

arXiv.org e-Print Archive

Ordinal Programmatic Weak Supervision and Crowdsourcing for Estimating Cognitive States (Student Abstract)

Authors: Boecking Benedikt; Clark Torin K.; Dubrawski Artur; Gisolfi Nicholas; Kintz Jacob R.; Pradeep Prakruthi
Publication venue: Association for the Advancement of Artificial Intelligence
Publication date: 06/09/2023

Crowdsourcing and weak supervision offer methods to efficiently label large datasets. Our work builds on existing weak supervision models to accommodate ordinal target classes, in an effort to recover ground truth from weak, external labels. We define a parameterized factor function and show that our approach improves over other baselines.

Association for the Advancement of Artificial Intelligence: AAAI Publications