A Full Probabilistic Model for Yes/No Type Crowdsourcing in Multi-Class Classification
Crowdsourcing has become widely used in supervised scenarios where training
sets are scarce and difficult to obtain. Most crowdsourcing models in the
literature assume labelers can provide answers to full questions. In
classification contexts, full questions require a labeler to discern among all
possible classes. Unfortunately, discernment is not always easy in realistic
scenarios. Labelers may not be experts in differentiating all classes. In this
work, we provide a full probabilistic model for a shorter type of query. Our
shorter queries only require "yes" or "no" responses. Our model estimates a
joint posterior distribution of matrices related to labelers' confusions and
the posterior probability of the class of every object. We developed an
approximate inference approach using Monte Carlo sampling and Black Box
Variational Inference, and we provide the derivation of the necessary
gradients. We built two realistic crowdsourcing scenarios to test our model.
The first scenario queries for irregular astronomical time-series. The second
scenario relies on the image classification of animals. We achieved results
that are comparable with those of full query crowdsourcing. Furthermore, we
show that modeling labelers' failures plays an important role in estimating
true classes. Finally, we provide the community with two real datasets obtained
from our crowdsourcing experiments. All our code is publicly available.
Comment: SIAM International Conference on Data Mining (SDM19), 9 official
pages, 5 supplementary pages
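
To make the yes/no query idea concrete, the following is a minimal sketch of a single Bayesian update of the class posterior after one yes/no answer. It is a simplified illustration and not the paper's full model, which also infers the labelers' confusion matrices. The function name `update_posterior`, the table `p_yes`, and all numbers are assumptions made for this example.

```python
def update_posterior(prior, p_yes, answer):
    """Return the class posterior after one yes/no answer.

    prior  : list of prior probabilities over classes (sums to 1)
    p_yes  : p_yes[c] = assumed probability that the labeler answers "yes"
             to the asked question when the true class is c
    answer : True for "yes", False for "no"
    """
    # Likelihood of the observed answer under each candidate true class.
    likelihood = [p if answer else 1.0 - p for p in p_yes]
    unnormalized = [pr * lk for pr, lk in zip(prior, likelihood)]
    total = sum(unnormalized)
    if total == 0.0:
        raise ValueError("answer has zero probability under the model")
    return [u / total for u in unnormalized]


# Hypothetical setup: three classes, uniform prior, and the question
# "is this object of class 0?" answered by an imperfect labeler.
prior = [1 / 3, 1 / 3, 1 / 3]
p_yes_for_class0_question = [0.9, 0.2, 0.1]
posterior = update_posterior(prior, p_yes_for_class0_question, answer=True)
```

A "yes" raises the posterior of class 0 without ruling out the other classes, because the assumed labeler sometimes answers "yes" for them as well.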
Support Vector Machines and Radon's Theorem
A support vector machine (SVM) is an algorithm which finds a hyperplane that
optimally separates labeled data points in $\mathbb{R}^n$ into positive and
negative classes. The data points on the margin of this separating hyperplane
are called support vectors. We study the possible configurations of support
vectors for points in general position. In particular, we connect the possible
configurations to Radon's theorem, which provides guarantees for when a set of
points can be divided into two classes (positive and negative) whose convex
hulls intersect. If the positive and negative support vectors in a generic SVM
configuration are projected to the separating hyperplane, then these projected
points will form a Radon configuration. Further, with a particular type of
general position, we show there are at most $n+1$ support vectors. This can be
used to test the level of machine precision needed in a support vector machine
implementation. We also show the projections of the convex hulls of the support
vectors intersect in a single Radon point, and under a small enough
perturbation, the points labeled as support vectors remain labeled as support
vectors. We furthermore consider computations studying the expected number of
support vectors for randomly generated data.
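
The projection statement above can be checked on a small hand-picked example in the plane. The sketch below is an illustration only: the three support vectors, the hyperplane `w`, `b`, and the helper names are assumptions chosen so that the line y = 0 is the maximum-margin separator. It projects the support vectors onto that line and tests whether the convex hulls of the two projected classes intersect.

```python
def decision_value(x, w, b):
    """Signed value w.x + b; support vectors sit at +1 or -1."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def project_to_hyperplane(x, w, b):
    """Orthogonally project point x onto the hyperplane {z : w.z + b = 0}."""
    t = decision_value(x, w, b) / sum(wi * wi for wi in w)
    return tuple(xi - t * wi for xi, wi in zip(x, w))


# Hypothetical configuration: separating line y = 0 in the plane.
w, b = (0.0, 1.0), 0.0
positive_svs = [(-1.0, 1.0), (2.0, 1.0)]
negative_svs = [(0.0, -1.0)]

# All three points lie exactly on the margins.
margins_ok = (all(decision_value(x, w, b) == 1.0 for x in positive_svs)
              and all(decision_value(x, w, b) == -1.0 for x in negative_svs))

pos_proj = [project_to_hyperplane(x, w, b) for x in positive_svs]
neg_proj = [project_to_hyperplane(x, w, b) for x in negative_svs]

# The hyperplane is the x-axis here, so the projected points are described
# by their first coordinate. In one dimension, hulls are intervals.
pos_x = sorted(p[0] for p in pos_proj)
neg_x = neg_proj[0][0]
hulls_intersect = pos_x[0] <= neg_x <= pos_x[-1]
```

In this example the projected negative support vector falls between the two projected positive ones, so the three projected points form a Radon configuration on the line, and the single intersection point is the projection of the negative support vector.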
Predicting outcomes in crowdfunding campaigns with textual, visual, and linguistic signals
This paper introduces a neural network and natural language processing approach to predict the outcome of crowdfunding startup pitches using text, speech, and video metadata in 20,188 crowdfunding campaigns. Our study emphasizes the need to understand crowdfunding from an investor’s perspective. Linguistic styles in crowdfunding campaigns that aim to trigger excitement or are aimed at inclusiveness are better predictors of campaign success than firm-level determinants. On the contrary, higher uncertainty perceptions about the state of product development may substantially reduce evaluations of new products and reduce purchasing intentions among potential funders. Our findings emphasize that positive psychological language is salient in environments where objective information is scarce and where investment preferences are taste based. Employing enthusiastic language or showing the product in action may capture an individual’s attention. Using all technology and design-related crowdfunding campaigns launched on Kickstarter, our study underscores the need to align potential consumers’ expectations with the visualization and presentation of the crowdfunding campaign.
On Model- and Data-based Approaches to Structural Health Monitoring
Structural Health Monitoring (SHM) is the term applied to the process of periodically monitoring the state of a structural system with the aim of diagnosing damage in the structure. Over the course of the past several decades there has been ongoing interest in approaches to the problem of SHM. This attention has been sustained by the belief that SHM will allow substantial economic and life-safety benefits to be realised across a wide range of applications. Several numerical and laboratory implementations have been successfully demonstrated. However, despite this research effort, real-world applications of SHM as originally envisaged are somewhat rare. Numerous technical barriers to the broader application of SHM methods have been identified, namely: severe restrictions on the availability of damaged-state data in real-world scenarios; difficulties associated with the numerical modelling of physical systems; and limited understanding of the physical effect of system inputs (including environmental and operational loads). This thesis focuses on the roles of law-based and data-based modelling in current applications of SHM. First, established approaches to model-based SHM are introduced, with the aid of an exemplar ‘wingbox’ structure. The study highlights the degree of difficulty associated with applying model-updating-based methods and with producing numerical models capable of accurately predicting changes in structural response due to damage. These difficulties motivate the investigation of non-deterministic, predictive modelling of structural responses taking into account both experimental and modelling uncertainties. Secondly, a data-based approach to multiple-site damage location is introduced, which may allow the quantity of experimental data required for classifier training to be drastically reduced.
A conclusion of the above research is the identification of hybrid approaches, in which a forward-mode law-based model informs a data-based damage identification scheme, as an area for future work.
The P-Norm Push: A Simple Convex Ranking Algorithm that Concentrates at the Top of the List
We are interested in supervised ranking algorithms that perform especially well near the top of the
ranked list, and are only required to perform sufficiently well on the rest of the list. In this work,
we provide a general form of convex objective that gives high-scoring examples more importance.
This “push” near the top of the list can be chosen arbitrarily large or small, based on the preference
of the user. We choose ℓp-norms to provide a specific type of push; if the user sets p larger, the
objective concentrates harder on the top of the list. We derive a generalization bound based on
the p-norm objective, working around the natural asymmetry of the problem. We then derive a
boosting-style algorithm for the problem of ranking with a push at the top. The usefulness of the
algorithm is illustrated through experiments on repository data. We prove that the minimizer of the
algorithm’s objective is unique in a specific sense. Furthermore, we illustrate how our objective is
related to quality measurements for information retrieval.
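
As a rough illustration of how a p-norm can concentrate an objective at the top of the list, the sketch below evaluates one possible form of such an objective: for each negative example, sum an exponential loss over all positive examples, raise that sum to the power p, and add up the results. The exponential loss, the function name `p_norm_push_objective`, and the scores are assumptions made for this example, since the abstract does not fix them.

```python
import math


def p_norm_push_objective(pos_scores, neg_scores, p):
    """Sum over negatives of (sum over positives of exp(-(s_pos - s_neg)))**p.

    A negative example that scores above many positives produces a large
    inner sum, and a larger p makes that term dominate the total.
    """
    total = 0.0
    for s_neg in neg_scores:
        inner = sum(math.exp(-(s_pos - s_neg)) for s_pos in pos_scores)
        total += inner ** p
    return total


def top_negative_share(pos_scores, neg_scores, p):
    """Fraction of the objective contributed by the highest-scoring negative."""
    worst = max(neg_scores)
    worst_term = p_norm_push_objective(pos_scores, [worst], p)
    return worst_term / p_norm_push_objective(pos_scores, neg_scores, p)


# Hypothetical scores: one negative is ranked above the only positive,
# the other negative is ranked far below it.
pos_scores = [1.0]
neg_scores = [2.0, -2.0]
share_p1 = top_negative_share(pos_scores, neg_scores, p=1)
share_p4 = top_negative_share(pos_scores, neg_scores, p=4)
```

With these scores, the share of the objective attributable to the highly ranked negative grows as p increases, which is the "push" toward the top of the list described above.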
Efficient Resources Provisioning Based on Load Forecasting in Cloud
Cloud providers should ensure QoS while maximizing resource utilization. One optimal strategy is to allocate resources in a timely, fine-grained manner according to the application’s actual resource demand. A necessary precondition of this strategy is obtaining future load information in advance. We propose a multi-step-ahead load forecasting method, KSwSVR, based on statistical learning theory, which is suited to the complex and dynamic characteristics of the cloud computing environment. It integrates an improved support vector regression algorithm and a Kalman smoother. Public trace data taken from multiple types of resources were used to verify its prediction accuracy, stability, and adaptability in comparison with AR, BPNN, and standard SVR. Subsequently, based on the predicted results, a simple and efficient strategy is proposed for resource provisioning. A CPU allocation experiment indicated that it can effectively reduce resource consumption while meeting service level agreement requirements.
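
The noise-reduction step in a pipeline of this kind can be sketched with a scalar Kalman filter. The code below is not KSwSVR: it is only a forward-pass filter under an assumed random-walk load model, without the backward smoothing pass or the regression stage. The function name, the variance parameters, and the sample load values are assumptions made for this example.

```python
def kalman_filter_1d(observations, process_var, measurement_var,
                     initial_estimate=0.0, initial_var=1.0):
    """Filter a noisy scalar series under a random-walk state model.

    process_var     : assumed variance of the step-to-step load change
    measurement_var : assumed variance of the measurement noise
    """
    estimate, var = initial_estimate, initial_var
    filtered = []
    for z in observations:
        # Predict: the random-walk model keeps the estimate unchanged,
        # while the uncertainty grows by the process variance.
        var += process_var
        # Update: blend prediction and observation using the Kalman gain.
        gain = var / (var + measurement_var)
        estimate += gain * (z - estimate)
        var *= (1.0 - gain)
        filtered.append(estimate)
    return filtered


# Hypothetical CPU load samples in percent.
load = [10.0, 12.0, 11.0, 13.0]
smoothed_load = kalman_filter_1d(load, process_var=0.1, measurement_var=1.0,
                                 initial_estimate=10.0)
```

A smaller `process_var` relative to `measurement_var` trusts the model more and yields a smoother series. In a forecasting pipeline, the filtered series would then be passed to the regression model in place of the raw trace.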
Cognitive Machine Individualism in a Symbiotic Cybersecurity Policy Framework for the Preservation of Internet of Things Integrity: A Quantitative Study
This quantitative study examined the complex nature of modern cyber threats to propose the establishment of cyber as an interdisciplinary field of public policy initiated through the creation of a symbiotic cybersecurity policy framework. For the public good (and maintaining ideological balance), there must be recognition that public policies are at a transition point where the digital public square is a tangible reality that is more than a collection of technological widgets. The academic contribution of this research project is the fusion of humanistic principles with Internet of Things (IoT) technologies that alters our perception of the machine from an instrument of human engineering into a thinking peer to elevate cyber from technical esoterism into an interdisciplinary field of public policy. The contribution to the US national cybersecurity policy body of knowledge is a unified policy framework (manifested in the symbiotic cybersecurity policy triad) that could transform cybersecurity policies from network-based to entity-based. A correlation archival data design was used with the frequency of malicious software attacks as the dependent variable and diversity of intrusion techniques as the independent variable for RQ1. For RQ2, the frequency of detection events was the dependent variable and diversity of intrusion techniques was the independent variable. Self-determination Theory is the theoretical framework as the cognitive machine can recognize, self-endorse, and maintain its own identity based on a sense of self-motivation that is progressively shaped by the machine’s ability to learn. The transformation of cyber policies from technical esoterism into an interdisciplinary field of public policy starts with the recognition that the cognitive machine is an independent consumer of, advisor into, and influenced by public policy theories, philosophical constructs, and societal initiatives.